@InterfaceAudience.LimitedPrivate(value="Extensions")
@InterfaceStability.Unstable
public interface S3AStore
extends ClientManager, org.apache.hadoop.fs.statistics.IOStatisticsSource, ObjectInputStreamFactory, org.apache.hadoop.fs.PathCapabilities, org.apache.hadoop.service.Service
The ClientManager interface is used to create the AWS clients;
the base implementation forwards to the implementation of this interface
passed in at construction time.
The interface extends the Hadoop Service interface
and follows its lifecycle: it MUST NOT be used until
Service.init(Configuration) has been invoked.
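The lifecycle rule above can be illustrated with a minimal sketch. The class below is a hypothetical stand-in written only for this example: it does not use or reproduce the Hadoop Service API, and the state names and method names are assumptions chosen to mirror the idea that an instance must be initialized before any operation is invoked.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a service-style store; not the Hadoop Service API. */
class LifecycleGuardSketch {

  /** Simplified lifecycle states, assumed for illustration only. */
  enum State { NOTINITED, INITED, STARTED }

  private final AtomicReference<State> state =
      new AtomicReference<>(State.NOTINITED);

  /** Move from NOTINITED to INITED; any other starting state is an error. */
  void init() {
    if (!state.compareAndSet(State.NOTINITED, State.INITED)) {
      throw new IllegalStateException("Cannot init from state " + state.get());
    }
  }

  /** Move from INITED to STARTED; any other starting state is an error. */
  void start() {
    if (!state.compareAndSet(State.INITED, State.STARTED)) {
      throw new IllegalStateException("Cannot start from state " + state.get());
    }
  }

  /** Example operation which refuses to run before init() has been called. */
  String describe() {
    State current = state.get();
    if (current == State.NOTINITED) {
      throw new IllegalStateException("Store used before init()");
    }
    return "store in state " + current;
  }

  State getState() {
    return state.get();
  }

  public static void main(String[] args) {
    LifecycleGuardSketch store = new LifecycleGuardSketch();
    try {
      store.describe();
    } catch (IllegalStateException expected) {
      System.out.println("Rejected: " + expected.getMessage());
    }
    store.init();
    store.start();
    System.out.println(store.describe());
  }
}
```

In real code the equivalent guard comes from the Hadoop service lifecycle itself; the sketch only shows why callers should sequence initialization before first use.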
ObjectInputStreamFactory.StreamFactoryCallbacks

| Modifier and Type | Method | Description |
|---|---|---|
| Duration | acquireReadCapacity(int capacity) | Acquire read capacity for operations. |
| Duration | acquireWriteCapacity(int capacity) | Acquire write capacity for operations. |
| ClientManager | clientManager() | |
| software.amazon.awssdk.services.s3.model.CompleteMultipartUploadResponse | completeMultipartUpload(software.amazon.awssdk.services.s3.model.CompleteMultipartUploadRequest request) | Complete a multipart upload. |
| File | createTemporaryFileForWriting(String pathStr, long size, org.apache.hadoop.conf.Configuration conf) | Demand create the directory allocator, then create a temporary file. |
| Map.Entry<Duration,Optional<software.amazon.awssdk.services.s3.model.DeleteObjectResponse>> | deleteObject(software.amazon.awssdk.services.s3.model.DeleteObjectRequest request) | Delete an object. |
| Map.Entry<Duration,software.amazon.awssdk.services.s3.model.DeleteObjectsResponse> | deleteObjects(software.amazon.awssdk.services.s3.model.DeleteObjectsRequest deleteRequest) | Perform a bulk object delete operation against S3. |
| org.apache.hadoop.fs.LocalDirAllocator | getDirectoryAllocator() | Get the directory allocator. |
| org.apache.hadoop.fs.statistics.DurationTrackerFactory | getDurationTrackerFactory() | |
| software.amazon.awssdk.core.ResponseInputStream<software.amazon.awssdk.services.s3.model.GetObjectResponse> | getRangedS3Object(String key, long start, long end) | Retrieves a specific byte range of an S3 object as a stream. |
| RequestFactory | getRequestFactory() | |
| S3AStatisticsContext | getStatisticsContext() | |
| StoreContext | getStoreContext() | |
| default boolean | hasCapability(String capability) | The StreamCapabilities is part of ObjectInputStreamFactory. |
| software.amazon.awssdk.services.s3.model.HeadObjectResponse | headObject(String key, ChangeTracker changeTracker, Invoker changeInvoker, S3AFileSystemOperations fsHandler, String operation) | Performs a HEAD request on an S3 object to retrieve its metadata. |
| void | incrementPutCompletedStatistics(boolean success, long bytes) | At the end of a put/multipart upload operation, update the relevant counters and gauges. |
| void | incrementPutProgressStatistics(String key, long bytes) | Callback for use in progress callbacks from put/multipart upload events. |
| void | incrementPutStartStatistics(long bytes) | At the start of a put/multipart upload operation, update the relevant counters. |
| void | incrementReadOperations() | Increment read operations. |
| void | incrementWriteOperations() | Increment the write operation counter. |
| boolean | inputStreamHasCapability(String capability) | Return the capabilities of input streams created through the store. |
| org.apache.hadoop.fs.statistics.DurationTrackerFactory | nonNullDurationTrackerFactory(org.apache.hadoop.fs.statistics.DurationTrackerFactory factory) | Given a possibly null duration tracker factory, return a non-null one for use in tracking durations: either that or the FS tracker itself. |
| UploadInfo | putObject(software.amazon.awssdk.services.s3.model.PutObjectRequest putObjectRequest, File file, ProgressableProgressListener listener) | Start a transfer-manager managed async PUT of an object, incrementing the put requests and put bytes counters. |
| software.amazon.awssdk.services.s3.model.UploadPartResponse | uploadPart(software.amazon.awssdk.services.s3.model.UploadPartRequest request, software.amazon.awssdk.core.sync.RequestBody body, org.apache.hadoop.fs.statistics.DurationTrackerFactory durationTrackerFactory) | Upload part of a multi-partition file. |
| software.amazon.awssdk.transfer.s3.model.CompletedFileUpload | waitForUploadCompletion(String key, UploadInfo uploadInfo) | Wait for an upload to complete. |
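The three put statistics methods in the summary (incrementPutStartStatistics, incrementPutProgressStatistics, incrementPutCompletedStatistics) bracket a single upload. The following is a minimal sketch of how such bookkeeping might fit together. The counter and gauge fields are hypothetical names invented for this example, and the exact set of statistics updated by a real implementation is an assumption, not something this page specifies.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bookkeeping for one or more uploads; field names are illustrative. */
class PutStatisticsSketch {
  // Counters only ever increase.
  final AtomicLong putRequests = new AtomicLong();
  final AtomicLong putRequestsCompleted = new AtomicLong();
  final AtomicLong bytesUploaded = new AtomicLong();
  // Gauges go up at the start of an upload and back down at the end.
  final AtomicLong activeRequests = new AtomicLong();
  final AtomicLong bytesPending = new AtomicLong();

  /** Start of an upload: count the request and record the pending bytes. */
  void incrementPutStartStatistics(long bytes) {
    putRequests.incrementAndGet();
    activeRequests.incrementAndGet();
    if (bytes > 0) {
      bytesPending.addAndGet(bytes);
    }
  }

  /** Progress callback: record bytes uploaded so far; the key is for logging only. */
  void incrementPutProgressStatistics(String key, long bytes) {
    if (bytes > 0) {
      bytesUploaded.addAndGet(bytes);
    }
  }

  /** End of an upload: count successes and release the gauges. */
  void incrementPutCompletedStatistics(boolean success, long bytes) {
    if (success) {
      putRequestsCompleted.incrementAndGet();
    }
    activeRequests.decrementAndGet();
    if (bytes > 0) {
      bytesPending.addAndGet(-bytes);
    }
  }

  public static void main(String[] args) {
    PutStatisticsSketch stats = new PutStatisticsSketch();
    stats.incrementPutStartStatistics(100);
    stats.incrementPutProgressStatistics("example/key", 60);
    stats.incrementPutProgressStatistics("example/key", 40);
    stats.incrementPutCompletedStatistics(true, 100);
    System.out.println("uploaded=" + stats.bytesUploaded.get()
        + " pending=" + stats.bytesPending.get());
  }
}
```

The point of the split is that gauges return to their starting value whether or not the upload succeeded, while the completed counter only moves on success.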
Methods inherited from interface ClientManager: getOrCreateAsyncClient, getOrCreateAsyncS3ClientUnchecked, getOrCreateS3Client, getOrCreateS3ClientUnchecked, getOrCreateTransferManager, getOrCreateUnencryptedS3Client
Methods inherited from interface ObjectInputStreamFactory: bind, factoryRequirements, readObject, streamType

Duration acquireWriteCapacity(int capacity)
capacity - capacity to acquire.

Duration acquireReadCapacity(int capacity)
capacity - capacity to acquire.

StoreContext getStoreContext()

org.apache.hadoop.fs.statistics.DurationTrackerFactory getDurationTrackerFactory()

S3AStatisticsContext getStatisticsContext()

RequestFactory getRequestFactory()

ClientManager clientManager()

void incrementReadOperations()

void incrementWriteOperations()

void incrementPutStartStatistics(long bytes)
bytes - bytes in the request.

void incrementPutCompletedStatistics(boolean success, long bytes)
success - did the operation succeed?
bytes - bytes in the request.

void incrementPutProgressStatistics(String key, long bytes)
key - key to file that is being written (for logging)
bytes - bytes successfully uploaded.

org.apache.hadoop.fs.statistics.DurationTrackerFactory nonNullDurationTrackerFactory(org.apache.hadoop.fs.statistics.DurationTrackerFactory factory)
factory - factory.

@Retries.RetryRaw
Map.Entry<Duration,software.amazon.awssdk.services.s3.model.DeleteObjectsResponse> deleteObjects(software.amazon.awssdk.services.s3.model.DeleteObjectsRequest deleteRequest) throws MultiObjectDeleteException, software.amazon.awssdk.core.exception.SdkException, IOException
Perform a bulk object delete operation against S3. Increments the OBJECT_DELETE_REQUESTS and write operation statistics.
OBJECT_DELETE_OBJECTS is updated with the actual number
of objects deleted in the request.
Retry policy: retry untranslated; delete considered idempotent.
If the request is throttled, this is logged in the throttle statistics,
with the counter set to the number of keys, rather than the number
of invocations of the delete operation.
This is because S3 considers each key as one mutating operation on
the store when updating its load counters on a specific partition
of an S3 bucket.
If only the request was measured, this operation would under-report.
A write capacity proportional to the number of keys present in the request
is requested, and is re-requested on each retry so that retries are
themselves throttled. If the request is throttled, the time spent is
recorded in a duration IOStat named STORE_IO_RATE_LIMITED_DURATION.
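The capacity rule described above (one unit of write capacity per key, requested again on every retry) can be sketched as follows. Everything in this block is hypothetical: SimulatedLimiter, bulkDelete and Outcome are names invented for the example, the limiter only computes how long a caller would have to wait rather than sleeping, and failures are simulated by a counter instead of real S3 calls.

```java
import java.time.Duration;
import java.util.List;

/** Hypothetical illustration of key-proportional write capacity with retries. */
class WriteCapacitySketch {

  /** Toy limiter: reports the wait a caller would incur, without sleeping. */
  static final class SimulatedLimiter {
    private final int permitsPerSecond;
    private long available;

    SimulatedLimiter(int permitsPerSecond, long initialPermits) {
      this.permitsPerSecond = permitsPerSecond;
      this.available = initialPermits;
    }

    /** Acquire capacity; a shortfall is converted into a simulated wait time. */
    Duration acquire(int capacity) {
      if (capacity <= available) {
        available -= capacity;
        return Duration.ZERO;
      }
      long shortfall = capacity - available;
      available = 0;
      return Duration.ofMillis(shortfall * 1000L / permitsPerSecond);
    }
  }

  /** Number of attempts made and the total simulated rate-limited time. */
  static final class Outcome {
    final int attempts;
    final Duration rateLimited;

    Outcome(int attempts, Duration rateLimited) {
      this.attempts = attempts;
      this.rateLimited = rateLimited;
    }
  }

  /**
   * Simulated bulk delete: capacity equal to the number of keys is acquired
   * before every attempt, including retries.
   */
  static Outcome bulkDelete(List<String> keys, SimulatedLimiter limiter,
      int failuresBeforeSuccess, int maxAttempts) {
    Duration waited = Duration.ZERO;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      waited = waited.plus(limiter.acquire(keys.size()));
      if (attempt > failuresBeforeSuccess) {
        return new Outcome(attempt, waited);
      }
    }
    throw new IllegalStateException("Delete failed after " + maxAttempts + " attempts");
  }

  public static void main(String[] args) {
    SimulatedLimiter limiter = new SimulatedLimiter(100, 5);
    Outcome outcome = bulkDelete(List.of("a", "b", "c"), limiter, 1, 3);
    System.out.println("attempts=" + outcome.attempts
        + " rateLimitedMillis=" + outcome.rateLimited.toMillis());
  }
}
```

Because the retry acquires capacity again, a request that keeps failing consumes permits on every attempt, which is what slows down a retry storm.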
deleteRequest - keys to delete on the s3-backend
MultiObjectDeleteException - one or more of the keys could not be deleted.
software.amazon.awssdk.core.exception.SdkException - amazon-layer failure.
IOException - IO problems.

@Retries.RetryRaw
Map.Entry<Duration,Optional<software.amazon.awssdk.services.s3.model.DeleteObjectResponse>> deleteObject(software.amazon.awssdk.services.s3.model.DeleteObjectRequest request) throws software.amazon.awssdk.core.exception.SdkException
Delete an object. Increments the OBJECT_DELETE_REQUESTS statistics.
Retry policy: retry untranslated; delete considered idempotent. 404 errors other than bucket not found are swallowed; this can be raised by third party stores (GCS).
A write capacity of 1 (as this is a single-object delete) will be requested before
the delete call and re-requested on each retry so that retries are
themselves throttled. If the request is throttled, the time spent is
recorded in a duration IOStat named STORE_IO_RATE_LIMITED_DURATION.
If an exception is caught and swallowed, the response will be empty;
otherwise it will be the response from the delete operation.
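The return shape described above, a duration paired with an optional response that is empty when an error was swallowed, can be sketched with standard library types only. NotFoundException, timedDelete and summarize are hypothetical names for this example; no AWS SDK classes are used, and the supplier is assumed never to return null.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical illustration of a (duration, optional response) delete result. */
class DeleteResultSketch {

  /** Stand-in for a 404-style failure that the caller chooses to swallow. */
  static final class NotFoundException extends RuntimeException {
    NotFoundException(String message) {
      super(message);
    }
  }

  /**
   * Run the call, timing it. A NotFoundException is swallowed and reported
   * as an empty Optional; any other exception propagates to the caller.
   * The supplier is assumed to return a non-null response.
   */
  static <R> Map.Entry<Duration, Optional<R>> timedDelete(Supplier<R> call) {
    long start = System.nanoTime();
    Optional<R> response;
    try {
      response = Optional.of(call.get());
    } catch (NotFoundException swallowed) {
      response = Optional.empty();
    }
    return Map.entry(Duration.ofNanos(System.nanoTime() - start), response);
  }

  /** Turn a result into a short description, handling the empty case. */
  static <R> String summarize(Map.Entry<Duration, Optional<R>> result) {
    return result.getValue()
        .map(r -> "deleted: " + r)
        .orElse("no response (error swallowed)");
  }

  public static void main(String[] args) {
    System.out.println(summarize(timedDelete(() -> "response-1")));
    System.out.println(summarize(DeleteResultSketch.<String>timedDelete(() -> {
      throw new NotFoundException("object missing");
    })));
  }
}
```

Callers of an API with this shape should check the Optional before reading response fields, since an empty value signals a swallowed error rather than a failed call.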
request - request to make
software.amazon.awssdk.core.exception.SdkException - problems working with S3
IllegalArgumentException - if the request was rejected due to a mistaken attempt to delete the root directory.

@Retries.RetryRaw
software.amazon.awssdk.services.s3.model.HeadObjectResponse headObject(String key, ChangeTracker changeTracker, Invoker changeInvoker, S3AFileSystemOperations fsHandler, String operation) throws IOException
key - The S3 object key to perform the HEAD operation on
changeTracker - Tracks changes to the object's metadata across operations
changeInvoker - The invoker responsible for executing the HEAD request with retries
fsHandler - Handler for filesystem-level operations and configurations
operation - Description of the operation being performed for tracking purposes
IOException - If the HEAD request fails, object doesn't exist, or other I/O errors occur

@Retries.RetryRaw
software.amazon.awssdk.core.ResponseInputStream<software.amazon.awssdk.services.s3.model.GetObjectResponse> getRangedS3Object(String key, long start, long end) throws IOException
key - The S3 object key to retrieve
start - The starting byte position (inclusive) of the range to retrieve
end - The ending byte position (inclusive) of the range to retrieve
IOException - If the object cannot be retrieved or other I/O errors occur
See also: GetObjectResponse, for additional metadata about the retrieved object

@Retries.OnceRaw
software.amazon.awssdk.services.s3.model.UploadPartResponse uploadPart(software.amazon.awssdk.services.s3.model.UploadPartRequest request, software.amazon.awssdk.core.sync.RequestBody body, org.apache.hadoop.fs.statistics.DurationTrackerFactory durationTrackerFactory) throws software.amazon.awssdk.awscore.exception.AwsServiceException, UncheckedIOException
Retry Policy: none.
durationTrackerFactory - duration tracker factory for operation
request - the upload part request.
body - the request body.
software.amazon.awssdk.awscore.exception.AwsServiceException - on problems
UncheckedIOException - failure to instantiate the s3 client

@Retries.OnceRaw
UploadInfo putObject(software.amazon.awssdk.services.s3.model.PutObjectRequest putObjectRequest, File file, ProgressableProgressListener listener) throws IOException
Start a transfer-manager managed async PUT of an object, incrementing the put requests and put bytes counters. It does not update the other counters, as existing code does that as progress callbacks come in. Byte length is calculated from the file length or, if there is no file, from the content length of the header.
Because the operation is async, any stream supplied in the request must reference data (files, buffers) which stay valid until the upload completes.
Retry policy: N/A: the transfer manager is performing the upload.
Auditing: must be inside an audit span.
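The byte-length rule stated above (file length first, otherwise the declared content length) is small enough to show directly. This is a sketch under assumptions: uploadLength is a hypothetical helper, the content length is modelled as a nullable Long rather than a request header, and the -1 result for "unknown" is a convention invented for this example.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

/** Hypothetical illustration of choosing the byte length for an upload. */
class PutLengthSketch {

  /**
   * Pick the upload length: the file length when a file is supplied,
   * otherwise the declared content length. Returns -1 when neither is
   * known; that marker is an assumption made for this sketch.
   */
  static long uploadLength(File file, Long contentLength) {
    if (file != null) {
      return file.length();
    }
    if (contentLength != null) {
      return contentLength;
    }
    return -1L;
  }

  public static void main(String[] args) throws IOException {
    File temp = Files.createTempFile("put-length", ".bin").toFile();
    try {
      Files.write(temp.toPath(), new byte[] {1, 2, 3, 4, 5});
      // The file wins even though a different content length is declared.
      System.out.println("from file: " + uploadLength(temp, 42L));
      System.out.println("from header: " + uploadLength(null, 42L));
    } finally {
      temp.delete();
    }
  }
}
```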
putObjectRequest - the request
file - the file to be uploaded
listener - the progress listener for the request
IOException - if transfer manager creation failed.

@Retries.OnceTranslated
software.amazon.awssdk.transfer.s3.model.CompletedFileUpload waitForUploadCompletion(String key, UploadInfo uploadInfo) throws IOException
Wait for an upload to complete. Calls incrementPutCompletedStatistics(boolean, long) to update the statistics.
key - destination key
uploadInfo - upload to wait for
IOException - IO failure
CancellationException - if the wait() was cancelled

@Retries.OnceRaw
software.amazon.awssdk.services.s3.model.CompleteMultipartUploadResponse completeMultipartUpload(software.amazon.awssdk.services.s3.model.CompleteMultipartUploadRequest request)
request - request

org.apache.hadoop.fs.LocalDirAllocator getDirectoryAllocator()

File createTemporaryFileForWriting(String pathStr, long size, org.apache.hadoop.conf.Configuration conf) throws IOException
Demand create the directory allocator, then create a temporary file. Pass LocalDirAllocator.SIZE_UNKNOWN as the size if the size is unknown.
See also: LocalDirAllocator.createTmpFileForWrite(String, long, Configuration).
pathStr - prefix for the temporary file
size - the size of the file that is going to be written
conf - the Configuration object
IOException - IO problems

boolean inputStreamHasCapability(String capability)
capability - string to query the stream support for.

default boolean hasCapability(String capability)
Specified by: hasCapability in interface org.apache.hadoop.fs.StreamCapabilities
capability - string to query the stream support for.

Copyright © 2008–2025 Apache Software Foundation. All rights reserved.