public static interface DataProcessing.Builder extends SdkPojo, CopyableBuilder<DataProcessing.Builder,DataProcessing>
| Modifier and Type | Method and Description |
|---|---|
| DataProcessing.Builder | inputFilter(String inputFilter) - A JSONPath expression used to select a portion of the input data to pass to the algorithm. |
| DataProcessing.Builder | joinSource(JoinSource joinSource) - Specifies the source of the data to join with the transformed data. |
| DataProcessing.Builder | joinSource(String joinSource) - Specifies the source of the data to join with the transformed data. |
| DataProcessing.Builder | outputFilter(String outputFilter) - A JSONPath expression used to select a portion of the joined dataset to save in the output file for a batch transform job. |
Inherited methods: equalsBySdkFields, sdkFields, copy, applyMutation, build
DataProcessing.Builder inputFilter(String inputFilter)
A JSONPath expression used to select a portion of the input data to pass to the algorithm. Use the
InputFilter parameter to exclude fields, such as an ID column, from the input. If you want
Amazon SageMaker to pass the entire input dataset to the algorithm, accept the default value $.
Examples: "$", "$[1:]", "$.features"
inputFilter - A JSONPath expression that selects the portion of the input data to pass to the algorithm. The default value, $, passes the entire input dataset.
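To illustrate what an expression such as "$[1:]" selects, the following sketch treats one CSV record as an array and drops its first column. The class InputFilterSketch and the helper sliceFrom are hypothetical names introduced for this illustration. SageMaker evaluates the JSONPath expression itself; this code only mimics the effect of a simple open-ended slice and is not the service's implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: mimics the effect of an open-ended
// JSONPath slice such as "$[1:]" on a single record viewed as an array.
class InputFilterSketch {

    // Returns the fields of `row` from index `start` to the end,
    // which is what "$[start:]" selects from an array.
    static List<String> sliceFrom(List<String> row, int start) {
        if (start < 0 || start > row.size()) {
            throw new IndexOutOfBoundsException(
                "start " + start + " is outside 0.." + row.size());
        }
        return row.subList(start, row.size());
    }

    public static void main(String[] args) {
        // The first column is an ID that the algorithm should not receive.
        List<String> record = Arrays.asList("id-42", "0.5", "1.7", "3.2");
        System.out.println(sliceFrom(record, 1)); // prints [0.5, 1.7, 3.2]
    }
}
```

In an actual batch transform job the same effect would come from passing "$[1:]" to inputFilter, with no client-side slicing needed.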
DataProcessing.Builder outputFilter(String outputFilter)
A JSONPath expression used to select a portion of the joined dataset to save in the output file for a
batch transform job. If you want Amazon SageMaker to store the entire input dataset in the output file, leave
the default value, $. If you specify indexes that aren't within the dimension size of the joined
dataset, you get an error.
Examples: "$", "$[0,5:]", "$['id','SageMakerOutput']"
outputFilter - A JSONPath expression that selects the portion of the joined dataset to save in the output file for a batch transform job. The default value is $. Indexes that aren't within the dimension size of the joined dataset cause an error.
DataProcessing.Builder joinSource(String joinSource)
Specifies the source of the data to join with the transformed data. The valid values are None
and Input. The default value is None, which specifies not to join the input with
the transformed data. If you want the batch transform job to join the original input data with the
transformed data, set JoinSource to Input. You can specify
OutputFilter as an additional filter to select a portion of the joined dataset and store it in
the output file.
For JSON or JSONLines objects, such as a JSON array, Amazon SageMaker adds the transformed data to the input
JSON object in an attribute called SageMakerOutput. The joined result for JSON must be a
key-value pair object. If the input is not a key-value pair object, Amazon SageMaker creates a new JSON file.
In the new JSON file, the input data is stored under the SageMakerInput key and the results
are stored in SageMakerOutput.
For CSV data, Amazon SageMaker takes each row as a JSON array and joins the transformed data with the input by appending each transformed row to the end of the input. The joined data has the original input data followed by the transformed data, and the output is a CSV file.
For information on how joining is applied, see Workflow for Associating Inferences with Input Records.
joinSource - The source of the data to join with the transformed data. The valid values are None and Input. The default value, None, specifies not to join the input with the transformed data.
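The JSON joining rule described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not SageMaker's implementation: the class JsonJoinSketch and its join method are hypothetical, a java.util.Map stands in for a key-value JSON object, and any other value stands in for non-object input such as a JSON array.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the documented JSON join behavior.
class JsonJoinSketch {

    // If `input` is a key-value object, the result keeps its attributes and
    // adds SageMakerOutput. Otherwise a new object is created that stores the
    // input under SageMakerInput and the results under SageMakerOutput.
    @SuppressWarnings("unchecked")
    static Map<String, Object> join(Object input, Object transformed) {
        Map<String, Object> joined = new LinkedHashMap<>();
        if (input instanceof Map) {
            joined.putAll((Map<String, Object>) input);
        } else {
            joined.put("SageMakerInput", input);
        }
        joined.put("SageMakerOutput", transformed);
        return joined;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("id", 42);
        record.put("features", Arrays.asList(0.5, 1.7));

        // Key-value input: SageMakerOutput is added as a new attribute.
        System.out.println(join(record, 0.93));

        // Array input: wrapped under SageMakerInput alongside SageMakerOutput.
        System.out.println(join(Arrays.asList(0.5, 1.7), 0.93));
    }
}
```

An OutputFilter such as "$['id','SageMakerOutput']" would then keep only those two attributes of the first joined object.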
See Also: JoinSource
DataProcessing.Builder joinSource(JoinSource joinSource)
Specifies the source of the data to join with the transformed data. The valid values are None
and Input. The default value is None, which specifies not to join the input with
the transformed data. If you want the batch transform job to join the original input data with the
transformed data, set JoinSource to Input. You can specify
OutputFilter as an additional filter to select a portion of the joined dataset and store it in
the output file.
For JSON or JSONLines objects, such as a JSON array, Amazon SageMaker adds the transformed data to the input
JSON object in an attribute called SageMakerOutput. The joined result for JSON must be a
key-value pair object. If the input is not a key-value pair object, Amazon SageMaker creates a new JSON file.
In the new JSON file, the input data is stored under the SageMakerInput key and the results
are stored in SageMakerOutput.
For CSV data, Amazon SageMaker takes each row as a JSON array and joins the transformed data with the input by appending each transformed row to the end of the input. The joined data has the original input data followed by the transformed data, and the output is a CSV file.
For information on how joining is applied, see Workflow for Associating Inferences with Input Records.
joinSource - The source of the data to join with the transformed data. The valid values are None and Input. The default value, None, specifies not to join the input with the transformed data.
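For CSV data, the join amounts to appending each transformed row to its input row. The sketch below shows that for a single record; the class CsvJoinSketch and its join method are hypothetical names used only for illustration, and the code assumes fields have already been split on commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the documented CSV join behavior.
class CsvJoinSketch {

    // Returns the original input fields followed by the transformed fields.
    static List<String> join(List<String> inputRow, List<String> transformedRow) {
        List<String> joined = new ArrayList<>(inputRow);
        joined.addAll(transformedRow);
        return joined;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("id-42", "0.5", "1.7");
        List<String> prediction = Arrays.asList("0.93");
        // Write the joined record back out as one CSV line.
        System.out.println(String.join(",", join(input, prediction))); // prints id-42,0.5,1.7,0.93
    }
}
```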
See Also: JoinSource
Copyright © 2021. All rights reserved.