@Generated(value="software.amazon.awssdk:codegen") public final class TransformInput extends Object implements SdkPojo, Serializable, ToCopyableBuilder<TransformInput.Builder,TransformInput>
Describes the input source of a transform job and the way the transform job consumes it.
| Modifier and Type | Class and Description |
|---|---|
| static interface | TransformInput.Builder |
| Modifier and Type | Method and Description |
|---|---|
| static TransformInput.Builder | builder() |
| CompressionType | compressionType() If your transform data is compressed, specify the compression type. |
| String | compressionTypeAsString() If your transform data is compressed, specify the compression type. |
| String | contentType() The Multipurpose Internet Mail Extension (MIME) type of the data. |
| TransformDataSource | dataSource() Describes the location of the channel data, that is, the S3 location of the input data that the model can consume. |
| boolean | equals(Object obj) |
| boolean | equalsBySdkFields(Object obj) |
| <T> Optional<T> | getValueForField(String fieldName, Class<T> clazz) |
| int | hashCode() |
| List<SdkField<?>> | sdkFields() |
| static Class<? extends TransformInput.Builder> | serializableBuilderClass() |
| SplitType | splitType() The method to use to split the transform job's data files into smaller batches. |
| String | splitTypeAsString() The method to use to split the transform job's data files into smaller batches. |
| TransformInput.Builder | toBuilder() |
| String | toString() Returns a string representation of this object. |
Methods inherited from class java.lang.Object: clone, finalize, getClass, notify, notifyAll, wait, wait, wait

public TransformDataSource dataSource()
Describes the location of the channel data, that is, the S3 location of the input data that the model can consume.
public String contentType()
The Multipurpose Internet Mail Extension (MIME) type of the data. Amazon SageMaker uses the MIME type with each HTTP call to transfer data to the transform job.
public CompressionType compressionType()
If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses
the data for the transform job accordingly. The default value is None.
If the service returns an enum value that is not available in the current SDK version, compressionType
will return CompressionType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available
from compressionTypeAsString().
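This fallback behavior is shared by all of the SDK's generated enums. The sketch below is a self-contained model of the pattern, not the real CompressionType class; the `Compression` enum, its values, and the `fromValue` helper are invented here to mirror the generated code.

```java
import java.util.Arrays;

// Minimal model of an SDK-generated enum: unrecognized service values map to
// UNKNOWN_TO_SDK_VERSION, while the raw string stays available separately.
enum Compression {
    NONE("None"),
    GZIP("Gzip"),
    UNKNOWN_TO_SDK_VERSION(null);

    private final String value;

    Compression(String value) { this.value = value; }

    static Compression fromValue(String raw) {
        return Arrays.stream(values())
                .filter(c -> raw != null && raw.equals(c.value))
                .findFirst()
                .orElse(UNKNOWN_TO_SDK_VERSION);
    }
}

public class EnumFallbackDemo {
    public static void main(String[] args) {
        // A value known to this "SDK version":
        System.out.println(Compression.fromValue("Gzip"));
        // A value the service added after this SDK version was generated:
        // fall back to UNKNOWN_TO_SDK_VERSION, but keep the raw string.
        String raw = "Zstd"; // hypothetical newer value, for illustration only
        Compression c = Compression.fromValue(raw);
        if (c == Compression.UNKNOWN_TO_SDK_VERSION) {
            System.out.println("raw value: " + raw);
        }
    }
}
```

This is why the `*AsString()` accessors exist: code that must round-trip values it does not recognize should read the raw string rather than the enum.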
See Also:
CompressionType

public String compressionTypeAsString()
If your transform data is compressed, specify the compression type. Amazon SageMaker automatically decompresses
the data for the transform job accordingly. The default value is None.
If the service returns an enum value that is not available in the current SDK version, compressionType
will return CompressionType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available
from compressionTypeAsString().
See Also:
CompressionType

public SplitType splitType()
The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the
total size of each object is too large to fit in a single request. You can also use data splitting to improve
performance by processing multiple concurrent mini-batches. The default value for SplitType is
None, which indicates that input data files are not split, and request payloads contain the entire
contents of an input object. Set the value of this parameter to Line to split records on a newline
character boundary. SplitType also supports a number of record-oriented binary data formats.
Currently, the supported record formats are:
RecordIO
TFRecord
When splitting is enabled, the size of a mini-batch depends on the values of the BatchStrategy and
MaxPayloadInMB parameters. When the value of BatchStrategy is MultiRecord,
Amazon SageMaker sends the maximum number of records in each request, up to the MaxPayloadInMB
limit. If the value of BatchStrategy is SingleRecord, Amazon SageMaker sends individual
records in each request.
Some data formats represent a record as a binary payload wrapped with extra padding bytes. When splitting is
applied to a binary data format, padding is removed if the value of BatchStrategy is set to
SingleRecord. Padding is not removed if the value of BatchStrategy is set to
MultiRecord.
For more information about RecordIO, see Create
a Dataset Using RecordIO in the MXNet documentation. For more information about TFRecord, see Consuming TFRecord data in the
TensorFlow documentation.
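The interaction between splitting, BatchStrategy, and MaxPayloadInMB can be illustrated with a small self-contained model. This is a hedged sketch of the batching rules described above, not SDK code: the `batch` method, the byte-length payload measure, and the record values are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class MiniBatchDemo {
    // Group already-split records into request payloads. multiRecord=true
    // models BatchStrategy=MultiRecord: pack as many records as fit under the
    // payload limit. multiRecord=false models SingleRecord: one per request.
    static List<List<String>> batch(List<String> records, boolean multiRecord, int maxPayloadBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int size = 0;
        for (String r : records) {
            if (!multiRecord) {
                batches.add(List.of(r)); // SingleRecord: each record is its own request
                continue;
            }
            if (!current.isEmpty() && size + r.length() > maxPayloadBytes) {
                batches.add(current);    // flush when the next record would exceed the limit
                current = new ArrayList<>();
                size = 0;
            }
            current.add(r);
            size += r.length();
        }
        if (!current.isEmpty()) batches.add(current);
        return batches;
    }

    public static void main(String[] args) {
        // SplitType=Line has already split the input object into records:
        List<String> records = List.of("aaaa", "bbbb", "cccc", "dddd");
        System.out.println(batch(records, true, 10));  // MultiRecord, 10-byte limit
        System.out.println(batch(records, false, 10)); // SingleRecord
    }
}
```

With SplitType set to None, by contrast, no such grouping occurs: each request payload is the entire contents of one input object.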
If the service returns an enum value that is not available in the current SDK version, splitType will
return SplitType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from
splitTypeAsString().
Returns:
The method to use to split the transform job's data files into smaller batches.
See Also:
SplitType

public String splitTypeAsString()
The method to use to split the transform job's data files into smaller batches. Splitting is necessary when the
total size of each object is too large to fit in a single request. You can also use data splitting to improve
performance by processing multiple concurrent mini-batches. The default value for SplitType is
None, which indicates that input data files are not split, and request payloads contain the entire
contents of an input object. Set the value of this parameter to Line to split records on a newline
character boundary. SplitType also supports a number of record-oriented binary data formats.
Currently, the supported record formats are:
RecordIO
TFRecord
When splitting is enabled, the size of a mini-batch depends on the values of the BatchStrategy and
MaxPayloadInMB parameters. When the value of BatchStrategy is MultiRecord,
Amazon SageMaker sends the maximum number of records in each request, up to the MaxPayloadInMB
limit. If the value of BatchStrategy is SingleRecord, Amazon SageMaker sends individual
records in each request.
Some data formats represent a record as a binary payload wrapped with extra padding bytes. When splitting is
applied to a binary data format, padding is removed if the value of BatchStrategy is set to
SingleRecord. Padding is not removed if the value of BatchStrategy is set to
MultiRecord.
For more information about RecordIO, see Create
a Dataset Using RecordIO in the MXNet documentation. For more information about TFRecord, see Consuming TFRecord data in the
TensorFlow documentation.
If the service returns an enum value that is not available in the current SDK version, splitType will
return SplitType.UNKNOWN_TO_SDK_VERSION. The raw value returned by the service is available from
splitTypeAsString().
Returns:
The method to use to split the transform job's data files into smaller batches.
See Also:
SplitType

public TransformInput.Builder toBuilder()
Specified by: toBuilder in interface ToCopyableBuilder<TransformInput.Builder,TransformInput>

public static TransformInput.Builder builder()
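toBuilder() supports the SDK's copy-and-modify idiom for immutable model objects. Below is a minimal self-contained analogue of the ToCopyableBuilder contract; the `Input` class and its fields are invented for illustration and are not the real TransformInput.

```java
// An immutable value class with a toBuilder() round-trip: copy an existing
// object into a builder, change one field, and build a new instance while
// leaving the original untouched.
public final class Input {
    private final String contentType;
    private final String splitType;

    private Input(Builder b) {
        this.contentType = b.contentType;
        this.splitType = b.splitType;
    }

    public static Builder builder() { return new Builder(); }

    public Builder toBuilder() {
        return new Builder().contentType(contentType).splitType(splitType);
    }

    public String contentType() { return contentType; }
    public String splitType() { return splitType; }

    public static final class Builder {
        private String contentType;
        private String splitType;
        public Builder contentType(String v) { this.contentType = v; return this; }
        public Builder splitType(String v) { this.splitType = v; return this; }
        public Input build() { return new Input(this); }
    }

    public static void main(String[] args) {
        Input original = Input.builder().contentType("text/csv").splitType("None").build();
        // Copy-and-modify: only splitType changes in the copy.
        Input modified = original.toBuilder().splitType("Line").build();
        System.out.println(original.splitType()); // the original keeps its value
        System.out.println(modified.splitType());
    }
}
```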
public static Class<? extends TransformInput.Builder> serializableBuilderClass()
public boolean equalsBySdkFields(Object obj)
Specified by: equalsBySdkFields in interface SdkPojo

public String toString()
Returns a string representation of this object.
Copyright © 2020. All rights reserved.