public class GoogleHadoopFileSystem extends GoogleHadoopFileSystemBase
This implementation sacrifices a small amount of cross-bucket interoperability in favor of more straightforward FileSystem semantics and compatibility with existing Hadoop applications. In particular, it is not subject to bucket-naming constraints, and files are allowed to be placed in root.
Nested classes/interfaces inherited from class GoogleHadoopFileSystemBase:
GoogleHadoopFileSystemBase.Counter, GoogleHadoopFileSystemBase.GcsFileChecksumType, GoogleHadoopFileSystemBase.ListStatusFileNotFoundBehavior, GoogleHadoopFileSystemBase.OutputStreamType, GoogleHadoopFileSystemBase.ParentTimestampUpdateIncludePredicate

Fields inherited from class GoogleHadoopFileSystemBase:
AUTHENTICATION_PREFIX, BLOCK_SIZE_DEFAULT, BLOCK_SIZE_KEY, BUFFERSIZE_DEFAULT, BUFFERSIZE_KEY, counters, DEFAULT_FILTER, defaultBlockSize, ENABLE_GCE_SERVICE_ACCOUNT_AUTH_KEY, GCE_BUCKET_DELETE_ENABLE_DEFAULT, GCE_BUCKET_DELETE_ENABLE_KEY, GCS_APPLICATION_NAME_SUFFIX_DEFAULT, GCS_APPLICATION_NAME_SUFFIX_KEY, GCS_BATCH_THREADS, GCS_BATCH_THREADS_DEFAULT, GCS_CLIENT_ID_KEY, GCS_CLIENT_SECRET_KEY, GCS_COPY_BATCH_THREADS, GCS_COPY_BATCH_THREADS_DEFAULT, GCS_COPY_MAX_REQUESTS_PER_BATCH, GCS_COPY_MAX_REQUESTS_PER_BATCH_DEFAULT, GCS_CREATE_SYSTEM_BUCKET_DEFAULT, GCS_CREATE_SYSTEM_BUCKET_KEY, GCS_ENABLE_COPY_WITH_REWRITE_DEFAULT, GCS_ENABLE_COPY_WITH_REWRITE_KEY, GCS_ENABLE_FLAT_GLOB_DEFAULT, GCS_ENABLE_FLAT_GLOB_KEY, GCS_ENABLE_INFER_IMPLICIT_DIRECTORIES_DEFAULT, GCS_ENABLE_INFER_IMPLICIT_DIRECTORIES_KEY, GCS_ENABLE_MARKER_FILE_CREATION_DEFAULT, GCS_ENABLE_MARKER_FILE_CREATION_KEY, GCS_ENABLE_PERFORMANCE_CACHE_DEFAULT, GCS_ENABLE_PERFORMANCE_CACHE_KEY, GCS_ENABLE_REPAIR_IMPLICIT_DIRECTORIES_DEFAULT, GCS_ENABLE_REPAIR_IMPLICIT_DIRECTORIES_KEY, GCS_FILE_SIZE_LIMIT_250GB, GCS_FILE_SIZE_LIMIT_250GB_DEFAULT, GCS_GENERATION_READ_CONSISTENCY_DEFAULT, GCS_GENERATION_READ_CONSISTENCY_KEY, GCS_HTTP_CONNECT_TIMEOUT_DEFAULT, GCS_HTTP_CONNECT_TIMEOUT_KEY, GCS_HTTP_MAX_RETRY_DEFAULT, GCS_HTTP_MAX_RETRY_KEY, GCS_HTTP_READ_TIMEOUT_DEFAULT, GCS_HTTP_READ_TIMEOUT_KEY, GCS_HTTP_TRANSPORT_DEFAULT, GCS_HTTP_TRANSPORT_KEY, GCS_INPUTSTREAM_FADVISE_DEFAULT, GCS_INPUTSTREAM_FADVISE_KEY, GCS_INPUTSTREAM_FAST_FAIL_ON_NOT_FOUND_ENABLE_DEFAULT, GCS_INPUTSTREAM_FAST_FAIL_ON_NOT_FOUND_ENABLE_KEY, GCS_INPUTSTREAM_INPLACE_SEEK_LIMIT_DEFAULT, GCS_INPUTSTREAM_INPLACE_SEEK_LIMIT_KEY, GCS_INPUTSTREAM_MIN_RANGE_REQUEST_SIZE_DEFAULT, GCS_INPUTSTREAM_MIN_RANGE_REQUEST_SIZE_KEY, GCS_MARKER_FILE_PATTERN_KEY, GCS_MAX_LIST_ITEMS_PER_CALL, GCS_MAX_LIST_ITEMS_PER_CALL_DEFAULT, GCS_MAX_REQUESTS_PER_BATCH, GCS_MAX_REQUESTS_PER_BATCH_DEFAULT, GCS_MAX_WAIT_MILLIS_EMPTY_OBJECT_CREATE_DEFAULT, GCS_MAX_WAIT_MILLIS_EMPTY_OBJECT_CREATE_KEY, GCS_OUTPUTSTREAM_TYPE_DEFAULT, GCS_OUTPUTSTREAM_TYPE_KEY, GCS_PARENT_TIMESTAMP_UPDATE_ENABLE_DEFAULT, GCS_PARENT_TIMESTAMP_UPDATE_ENABLE_KEY, GCS_PARENT_TIMESTAMP_UPDATE_EXCLUDES_DEFAULT, GCS_PARENT_TIMESTAMP_UPDATE_EXCLUDES_KEY, GCS_PARENT_TIMESTAMP_UPDATE_INCLUDES_DEFAULT, GCS_PARENT_TIMESTAMP_UPDATE_INCLUDES_KEY, GCS_PERFORMANCE_CACHE_DIR_METADATA_PREFETCH_LIMIT_DEFAULT, GCS_PERFORMANCE_CACHE_DIR_METADATA_PREFETCH_LIMIT_KEY, GCS_PERFORMANCE_CACHE_LIST_CACHING_ENABLE_DEFAULT, GCS_PERFORMANCE_CACHE_LIST_CACHING_ENABLE_KEY, GCS_PERFORMANCE_CACHE_MAX_ENTRY_AGE_MILLIS_DEFAULT, GCS_PERFORMANCE_CACHE_MAX_ENTRY_AGE_MILLIS_KEY, GCS_PROJECT_ID_KEY, GCS_PROXY_ADDRESS_DEFAULT, GCS_PROXY_ADDRESS_KEY, GCS_REQUESTER_PAYS_BUCKETS_KEY, GCS_REQUESTER_PAYS_MODE_KEY, GCS_REQUESTER_PAYS_PROJECT_ID_KEY, GCS_SYSTEM_BUCKET_KEY, GCS_WORKING_DIRECTORY_KEY, GHFS_ID, initUri, listStatusFileNotFoundBehavior, MR_JOB_HISTORY_DONE_DIR_KEY, MR_JOB_HISTORY_INTERMEDIATE_DONE_DIR_KEY, PATH_CODEC_DEFAULT, PATH_CODEC_KEY, PATH_CODEC_USE_LEGACY_ENCODING, PATH_CODEC_USE_URI_ENCODING, pathCodec, PERMISSIONS_TO_REPORT_DEFAULT, PERMISSIONS_TO_REPORT_KEY, PROPERTIES_FILE, REPLICATION_FACTOR_DEFAULT, SERVICE_ACCOUNT_AUTH_EMAIL_KEY, SERVICE_ACCOUNT_AUTH_KEYFILE_KEY, systemBucket, UNKNOWN_VERSION, VERSION, VERSION_PROPERTY, WRITE_BUFFERSIZE_DEFAULT, WRITE_BUFFERSIZE_KEY

| Constructor and Description |
|---|
| GoogleHadoopFileSystem()<br>Constructs an instance of GoogleHadoopFileSystem; the internal GoogleCloudStorageFileSystem will be set up with config settings when initialize() is called. |
| Modifier and Type | Method and Description |
|---|---|
| protected void | checkPath(org.apache.hadoop.fs.Path path) |
| protected void | configureBuckets(GoogleCloudStorageFileSystem gcsFs, java.lang.String systemBucketName, boolean createSystemBucket)<br>Validates and possibly creates the system bucket. |
| org.apache.hadoop.fs.Path | getDefaultWorkingDirectory()<br>Gets the default value of working directory. |
| org.apache.hadoop.fs.Path | getFileSystemRoot()<br>Returns the Hadoop path representing the root of the FileSystem associated with this FileSystemDescriptor. |
| java.net.URI | getGcsPath(org.apache.hadoop.fs.Path hadoopPath)<br>Translates a "gs:/" style hadoopPath (or relative path which is not fully-qualified) into the appropriate GCS path which is compatible with the underlying GcsFs or gsutil. |
| org.apache.hadoop.fs.Path | getHadoopPath(java.net.URI gcsPath)<br>Validates GCS Path belongs to this file system. |
| protected java.lang.String | getHomeDirectorySubpath()<br>Override to allow a homedir subpath which sits directly on our FileSystem root. |
| java.lang.String | getScheme()<br>As the global-rooted FileSystem, our hadoop-path "scheme" is exactly equal to the general GCS scheme. |
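The two path-translation methods summarized above, getGcsPath and getHadoopPath, can be illustrated with a small standalone sketch. This is not the connector's implementation: the class name PathTranslationSketch, the methods toGcsPath and toHadoopPath, and the rootBucket parameter are all hypothetical, and the sketch assumes a file system rooted in a single bucket, with relative paths resolved against the root rather than a working directory. It uses only java.net.URI so that it runs without Hadoop on the classpath.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical illustration only; not the GCS connector's actual code. */
public class PathTranslationSketch {
    static final String SCHEME = "gs";

    /**
     * Maps a Hadoop-style path such as "gs:/dir/file", "/dir/file" or "dir/file"
     * onto an assumed root bucket, yielding "gs://rootBucket/dir/file".
     */
    public static URI toGcsPath(String rootBucket, String hadoopPath) {
        URI parsed = URI.create(hadoopPath);
        String scheme = parsed.getScheme();
        if (scheme != null && !SCHEME.equals(scheme)) {
            throw new IllegalArgumentException("Unexpected scheme: " + scheme);
        }
        String authority = parsed.getAuthority();
        if (authority != null && !rootBucket.equals(authority)) {
            throw new IllegalArgumentException("Path is outside bucket " + rootBucket);
        }
        String path = parsed.getPath();
        if (path == null) {
            // Opaque URIs such as "gs:dir" carry no hierarchical path.
            throw new IllegalArgumentException("Not a hierarchical path: " + hadoopPath);
        }
        if (!path.startsWith("/")) {
            // Simplifying assumption: relative paths resolve against the root.
            path = "/" + path;
        }
        try {
            return new URI(SCHEME, rootBucket, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /**
     * Checks that a fully-qualified GCS URI belongs to the assumed root bucket
     * and returns its path component relative to the file system root.
     */
    public static String toHadoopPath(String rootBucket, URI gcsPath) {
        if (!SCHEME.equals(gcsPath.getScheme())
                || !rootBucket.equals(gcsPath.getAuthority())) {
            throw new IllegalArgumentException(
                gcsPath + " does not belong to bucket " + rootBucket);
        }
        String path = gcsPath.getPath();
        return (path == null || path.isEmpty()) ? "/" : path;
    }

    public static void main(String[] args) {
        URI gcs = toGcsPath("my-bucket", "gs:/logs/app.txt");
        System.out.println(gcs);
        System.out.println(toHadoopPath("my-bucket", gcs));
    }
}
```

In this sketch, a URI from a different bucket is rejected with an IllegalArgumentException, which mirrors the "belongs to this file system" validation described for getHadoopPath. How the real class reports such a mismatch is not stated on this page.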
Methods inherited from class GoogleHadoopFileSystemBase:
append, checkOpenUnchecked, close, completeLocalOutput, concat, copyFromLocalFile, copyFromLocalFile, copyToLocalFile, create, createCounterMap, delete, delete, deleteOnExit, getCanonicalServiceName, getContentSummary, getDefaultBlockSize, getDefaultPort, getDefaultReplication, getDelegationToken, getFileChecksum, getFileStatus, getGcsFs, getHadoopScheme, getHomeDirectory, getUri, getUsed, getWorkingDirectory, getXAttr, getXAttrs, getXAttrs, globStatus, globStatus, initialize, initialize, listStatus, listXAttrs, makeQualified, mkdirs, open, processDeleteOnExit, removeXAttr, rename, setListStatusFileNotFoundBehavior, setOwner, setPermission, setTimes, setVerifyChecksum, setWorkingDirectory, startLocalOutput

Methods inherited from class org.apache.hadoop.fs.FileSystem:
addFileSystemForTesting, append, append, clearStatistics, closeAll, closeAllForUGI, copyFromLocalFile, copyFromLocalFile, copyToLocalFile, create, create, create, create, create, create, create, create, create, create, createNewFile, createNonRecursive, createNonRecursive, exists, get, get, get, getAllStatistics, getBlockSize, getCacheSize, getCanonicalUri, getDefaultBlockSize, getDefaultReplication, getDefaultUri, getFileBlockLocations, getLength, getLocal, getName, getNamed, getReplication, getStatistics, getStatistics, isDirectory, isFile, listStatus, listStatus, listStatus, mkdirs, mkdirs, moveFromLocalFile, moveFromLocalFile, moveToLocalFile, open, printStatistics, setDefaultUri, setDefaultUri, setReplication

public GoogleHadoopFileSystem()

protected void configureBuckets(GoogleCloudStorageFileSystem gcsFs, java.lang.String systemBucketName, boolean createSystemBucket) throws java.io.IOException
Sets and validates the root bucket.
configureBuckets in class GoogleHadoopFileSystemBase
gcsFs - GoogleCloudStorageFileSystem to configure buckets
systemBucketName - Name of system bucket
createSystemBucket - Whether or not to create systemBucketName if it does not exist.
java.io.IOException - if systemBucketName is invalid or cannot be found and createSystemBucket is false.

protected void checkPath(org.apache.hadoop.fs.Path path)
checkPath in class GoogleHadoopFileSystemBase

protected java.lang.String getHomeDirectorySubpath()
getHomeDirectorySubpath in class GoogleHadoopFileSystemBase

public org.apache.hadoop.fs.Path getHadoopPath(java.net.URI gcsPath)
getHadoopPath in class GoogleHadoopFileSystemBase
gcsPath - Fully-qualified GCS path, of the form gs://

public java.net.URI getGcsPath(org.apache.hadoop.fs.Path hadoopPath)
getGcsPath in class GoogleHadoopFileSystemBase
hadoopPath - Hadoop path.

public java.lang.String getScheme()
getScheme in interface FileSystemDescriptor
getScheme in class GoogleHadoopFileSystemBase

public org.apache.hadoop.fs.Path getFileSystemRoot()
Returns the Hadoop path representing the root of the FileSystem associated with this FileSystemDescriptor.
getFileSystemRoot in interface FileSystemDescriptor
getFileSystemRoot in class GoogleHadoopFileSystemBase

public org.apache.hadoop.fs.Path getDefaultWorkingDirectory()
getDefaultWorkingDirectory in class GoogleHadoopFileSystemBase

Copyright © 2019. All rights reserved.