public static interface TransformInput.Builder extends SdkPojo, CopyableBuilder<TransformInput.Builder,TransformInput>
| Modifier and Type | Method and Description |
|---|---|
| `TransformInput.Builder` | `compressionType(CompressionType compressionType)`<br>If your transform data is compressed, specify the compression type. |
| `TransformInput.Builder` | `compressionType(String compressionType)`<br>If your transform data is compressed, specify the compression type. |
| `TransformInput.Builder` | `contentType(String contentType)`<br>The multipurpose internet mail extension (MIME) type of the data. |
| `default TransformInput.Builder` | `dataSource(Consumer<TransformDataSource.Builder> dataSource)`<br>Describes the location of the channel data; that is, the S3 location of the input data that the model can consume. |
| `TransformInput.Builder` | `dataSource(TransformDataSource dataSource)`<br>Describes the location of the channel data; that is, the S3 location of the input data that the model can consume. |
| `TransformInput.Builder` | `splitType(SplitType splitType)`<br>The method to use to split the transform job's data files into smaller batches. |
| `TransformInput.Builder` | `splitType(String splitType)`<br>The method to use to split the transform job's data files into smaller batches. |
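As a sketch of how these builder methods combine, the snippet below builds a `TransformInput` for gzipped, newline-delimited CSV data. It assumes the AWS SDK for Java v2 SageMaker module is on the classpath; the bucket name and prefix are placeholders, not values from this page.

```java
import software.amazon.awssdk.services.sagemaker.model.CompressionType;
import software.amazon.awssdk.services.sagemaker.model.SplitType;
import software.amazon.awssdk.services.sagemaker.model.TransformInput;

// The dataSource(Consumer) overload creates the nested builders for us,
// so no explicit TransformDataSource.builder() call is needed.
TransformInput input = TransformInput.builder()
        .dataSource(ds -> ds.s3DataSource(s3 -> s3
                .s3DataType("S3Prefix")                      // read every object under the prefix
                .s3Uri("s3://amzn-s3-demo-bucket/input/"))) // placeholder bucket/prefix
        .contentType("text/csv")                             // MIME type sent with each HTTP call
        .compressionType(CompressionType.GZIP)               // SageMaker decompresses before transform
        .splitType(SplitType.LINE)                           // split records on newline boundaries
        .build();
```

The enum overloads (`CompressionType`, `SplitType`) are generally preferable to the `String` overloads because invalid values fail at compile time rather than at request time.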
Methods inherited from interface `SdkPojo`: `equalsBySdkFields`, `sdkFields`
Methods inherited from interface `CopyableBuilder`: `copy`
Methods inherited from interface `SdkBuilder`: `applyMutation`, `build`

`TransformInput.Builder dataSource(TransformDataSource dataSource)`

Describes the location of the channel data; that is, the S3 location of the input data that the model can consume.

Parameters:
dataSource - Describes the location of the channel data; that is, the S3 location of the input data that the model can consume.

`default TransformInput.Builder dataSource(Consumer<TransformDataSource.Builder> dataSource)`

Describes the location of the channel data; that is, the S3 location of the input data that the model can consume.

This is a convenience method that creates an instance of the `TransformDataSource.Builder`, avoiding the need to create one manually via `TransformDataSource.builder()`. When the `Consumer` completes, `SdkBuilder.build()` is called immediately and its result is passed to `dataSource(TransformDataSource)`.

Parameters:
dataSource - a consumer that will call methods on `TransformDataSource.Builder`
See Also:
`dataSource(TransformDataSource)`

`TransformInput.Builder contentType(String contentType)`
The multipurpose internet mail extension (MIME) type of the data. Amazon SageMaker uses the MIME type with each HTTP call to transfer data to the transform job.

Parameters:
contentType - The multipurpose internet mail extension (MIME) type of the data. Amazon SageMaker uses the MIME type with each HTTP call to transfer data to the transform job.

`TransformInput.Builder compressionType(String compressionType)`
If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses the data for the transform job accordingly. The default value is `None`.

Parameters:
compressionType - If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses the data for the transform job accordingly. The default value is `None`.
See Also:
`CompressionType`

`TransformInput.Builder compressionType(CompressionType compressionType)`

If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses the data for the transform job accordingly. The default value is `None`.

Parameters:
compressionType - If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses the data for the transform job accordingly. The default value is `None`.
See Also:
`CompressionType`

`TransformInput.Builder splitType(String splitType)`
The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the total size of each object is too large to fit in a single request. You can also use data splitting to improve performance by processing multiple concurrent mini-batches. The default value for `SplitType` is `None`, which indicates that input data files are not split, and request payloads contain the entire contents of an input object. Set the value of this parameter to `Line` to split records on a newline character boundary. `SplitType` also supports a number of record-oriented binary data formats. Currently, the supported record formats are:

- RecordIO
- TFRecord

When splitting is enabled, the size of a mini-batch depends on the values of the `BatchStrategy` and `MaxPayloadInMB` parameters. When the value of `BatchStrategy` is `MultiRecord`, Amazon SageMaker sends the maximum number of records in each request, up to the `MaxPayloadInMB` limit. If the value of `BatchStrategy` is `SingleRecord`, Amazon SageMaker sends individual records in each request.

Some data formats represent a record as a binary payload wrapped with extra padding bytes. When splitting is applied to a binary data format, padding is removed if the value of `BatchStrategy` is set to `SingleRecord`. Padding is not removed if the value of `BatchStrategy` is set to `MultiRecord`.

For more information about RecordIO, see Create a Dataset Using RecordIO in the MXNet documentation. For more information about TFRecord, see Consuming TFRecord data in the TensorFlow documentation.

Parameters:
splitType - The method to use to split the transform job's data files into smaller batches, as described above.
See Also:
`SplitType`

`TransformInput.Builder splitType(SplitType splitType)`
The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the total size of each object is too large to fit in a single request. You can also use data splitting to improve performance by processing multiple concurrent mini-batches. The default value for `SplitType` is `None`, which indicates that input data files are not split, and request payloads contain the entire contents of an input object. Set the value of this parameter to `Line` to split records on a newline character boundary. `SplitType` also supports the record-oriented binary formats RecordIO and TFRecord; see `splitType(String)` above for the full discussion of the supported record formats, `BatchStrategy` and `MaxPayloadInMB` interactions, and padding behavior.
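The `BatchStrategy` behavior described above can be sketched with a small, hypothetical helper. This is an illustration of the documented semantics, not SDK code; the class and method names are made up, and the size limit is given in bytes rather than `MaxPayloadInMB` megabytes for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: how already-split records are grouped into request
// payloads under the two BatchStrategy modes described in this page.
public class MiniBatchSketch {
    public enum BatchStrategy { MULTI_RECORD, SINGLE_RECORD }

    // Returns one List<String> per request payload.
    public static List<List<String>> batch(List<String> records,
                                           BatchStrategy strategy,
                                           int maxPayloadBytes) {
        List<List<String>> requests = new ArrayList<>();
        if (strategy == BatchStrategy.SINGLE_RECORD) {
            // SingleRecord: each request carries exactly one record.
            for (String r : records) {
                List<String> one = new ArrayList<>();
                one.add(r);
                requests.add(one);
            }
            return requests;
        }
        // MultiRecord: pack as many records as fit under the payload limit.
        List<String> current = new ArrayList<>();
        int size = 0;
        for (String r : records) {
            if (!current.isEmpty() && size + r.length() > maxPayloadBytes) {
                requests.add(current);        // limit reached; start a new request
                current = new ArrayList<>();
                size = 0;
            }
            current.add(r);
            size += r.length();
        }
        if (!current.isEmpty()) requests.add(current);
        return requests;
    }
}
```

With three 4-byte records and an 8-byte limit, `MULTI_RECORD` produces two requests (two records, then one) while `SINGLE_RECORD` produces three.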
Parameters:
splitType - The method to use to split the transform job's data files into smaller batches.
See Also:
`SplitType`

Copyright © 2022. All rights reserved.