public static interface TabularJobConfig.Builder extends SdkPojo, CopyableBuilder<TabularJobConfig.Builder,TabularJobConfig>
| Modifier and Type | Method and Description |
|---|---|
| TabularJobConfig.Builder | candidateGenerationConfig(CandidateGenerationConfig candidateGenerationConfig): The configuration information of how model candidates are generated. |
| default TabularJobConfig.Builder | candidateGenerationConfig(Consumer<CandidateGenerationConfig.Builder> candidateGenerationConfig): The configuration information of how model candidates are generated. |
| TabularJobConfig.Builder | completionCriteria(AutoMLJobCompletionCriteria completionCriteria): Sets the value of the CompletionCriteria property for this object. |
| default TabularJobConfig.Builder | completionCriteria(Consumer<AutoMLJobCompletionCriteria.Builder> completionCriteria): Sets the value of the CompletionCriteria property for this object. |
| TabularJobConfig.Builder | featureSpecificationS3Uri(String featureSpecificationS3Uri): A URL to the Amazon S3 data source containing selected features from the input data source to run an Autopilot job V2. |
| TabularJobConfig.Builder | generateCandidateDefinitionsOnly(Boolean generateCandidateDefinitionsOnly): Generates possible candidates without training the models. |
| TabularJobConfig.Builder | mode(AutoMLMode mode): The method that Autopilot uses to train the data. |
| TabularJobConfig.Builder | mode(String mode): The method that Autopilot uses to train the data. |
| TabularJobConfig.Builder | problemType(ProblemType problemType): The type of supervised learning problem available for the model candidates of the AutoML job V2. |
| TabularJobConfig.Builder | problemType(String problemType): The type of supervised learning problem available for the model candidates of the AutoML job V2. |
| TabularJobConfig.Builder | sampleWeightAttributeName(String sampleWeightAttributeName): If specified, this column name indicates which column of the dataset should be treated as sample weights for use by the objective metric during the training, evaluation, and the selection of the best model. |
| TabularJobConfig.Builder | targetAttributeName(String targetAttributeName): The name of the target variable in supervised learning, usually represented by 'y'. |
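Methods inherited from interface SdkPojo: equalsBySdkFields, sdkFields
Methods inherited from interface CopyableBuilder: copy
Methods inherited from interface SdkBuilder: applyMutation, build
TabularJobConfig.Builder candidateGenerationConfig(CandidateGenerationConfig candidateGenerationConfig)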
The configuration information of how model candidates are generated.
candidateGenerationConfig - The configuration information of how model candidates are generated.
default TabularJobConfig.Builder candidateGenerationConfig(Consumer<CandidateGenerationConfig.Builder> candidateGenerationConfig)
The configuration information of how model candidates are generated.
This is a convenience method that creates an instance of CandidateGenerationConfig.Builder,
avoiding the need to create one manually via CandidateGenerationConfig.builder().
When the Consumer completes, SdkBuilder.build() is called immediately
and its result is passed to candidateGenerationConfig(CandidateGenerationConfig).
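The convenience overload described above can be pictured with a small self-contained sketch. The classes below (GenConfig, JobConfigBuilder) and the "algorithm" field are hypothetical stand-ins, not SDK types; they only mirror the described sequence: create the nested builder, hand it to the Consumer, call build() as soon as the Consumer returns, and pass the result to the plain setter.

```java
import java.util.function.Consumer;

// Hypothetical stand-ins for an SDK model class and its builders; not the real SDK types.
final class ConsumerBuilderSketch {

    /** Immutable nested value, playing the role of CandidateGenerationConfig. */
    static final class GenConfig {
        final String algorithm;

        private GenConfig(Builder b) {
            this.algorithm = b.algorithm;
        }

        static Builder builder() {
            return new Builder();
        }

        static final class Builder {
            private String algorithm;

            Builder algorithm(String algorithm) {
                this.algorithm = algorithm;
                return this;
            }

            GenConfig build() {
                return new GenConfig(this);
            }
        }
    }

    /** Outer builder, playing the role of TabularJobConfig.Builder. */
    static final class JobConfigBuilder {
        GenConfig genConfig;

        // Plain setter: takes an already-built value.
        JobConfigBuilder genConfig(GenConfig genConfig) {
            this.genConfig = genConfig;
            return this;
        }

        // Convenience overload: create the nested builder, let the caller mutate it,
        // build it immediately, and delegate to the plain setter.
        JobConfigBuilder genConfig(Consumer<GenConfig.Builder> mutator) {
            GenConfig.Builder nested = GenConfig.builder();
            mutator.accept(nested);
            return genConfig(nested.build());
        }
    }

    public static void main(String[] args) {
        JobConfigBuilder viaConsumer = new JobConfigBuilder().genConfig(b -> b.algorithm("xgboost"));
        JobConfigBuilder viaSetter = new JobConfigBuilder()
                .genConfig(GenConfig.builder().algorithm("xgboost").build());
        // Both routes should leave the same nested value in the outer builder.
        System.out.println(viaConsumer.genConfig.algorithm.equals(viaSetter.genConfig.algorithm));
    }
}
```

The lambda form saves the explicit builder() and build() calls at the call site, which is the only difference between the two overloads in this sketch.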
candidateGenerationConfig - a consumer that will call methods on CandidateGenerationConfig.Builder
See also: candidateGenerationConfig(CandidateGenerationConfig)
TabularJobConfig.Builder completionCriteria(AutoMLJobCompletionCriteria completionCriteria)
Sets the value of the CompletionCriteria property for this object.
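completionCriteria - The new value for the CompletionCriteria property for this object.
default TabularJobConfig.Builder completionCriteria(Consumer<AutoMLJobCompletionCriteria.Builder> completionCriteria)
Sets the value of the CompletionCriteria property for this object.
This is a convenience method that creates an instance of AutoMLJobCompletionCriteria.Builder,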
avoiding the need to create one manually via AutoMLJobCompletionCriteria.builder().
When the Consumer completes, SdkBuilder.build() is called
immediately and its result is passed to completionCriteria(AutoMLJobCompletionCriteria).
completionCriteria - a consumer that will call methods on AutoMLJobCompletionCriteria.Builder
See also: completionCriteria(AutoMLJobCompletionCriteria)
TabularJobConfig.Builder featureSpecificationS3Uri(String featureSpecificationS3Uri)
A URL to the Amazon S3 data source containing selected features from the input data source to run an
Autopilot job V2. You can optionally list FeatureAttributeNames in JSON format, as shown below:
{ "FeatureAttributeNames":["col1", "col2", ...] }
You can also optionally specify the data type of each feature, in the format shown below:
{ "FeatureDataTypes":{"col1":"numeric", "col2":"categorical" ... } }
These column keys must not include the target column.
In ensembling mode, Autopilot supports only the following data types: numeric,
categorical, text, and datetime. In HPO mode, Autopilot supports
numeric, categorical, text, datetime, and sequence.
If only FeatureDataTypes is provided, the column keys (col1, col2, ...)
should be a subset of the column names in the input data.
If both FeatureDataTypes and FeatureAttributeNames are provided, then the column
keys should be a subset of the column names provided in FeatureAttributeNames.
The key name FeatureAttributeNames is fixed. The values listed in
["col1", "col2", ...] are case sensitive and should be a list of unique strings
that are a subset of the column names in the input data. The list of columns provided must not include
the target column.
featureSpecificationS3Uri - A URL to the Amazon S3 data source containing selected features from the input data source to run an
Autopilot job V2. The file format and its constraints are as described above.
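As an illustration of the constraints on the feature specification file, the following sketch assembles the JSON text and rejects inputs that break the stated rules. FeatureSpecSketch and its methods are hypothetical helpers, not part of the SDK, and placing both keys in a single JSON object is an assumption; the text above shows each key only on its own. Uploading the result to Amazon S3 is out of scope here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Sketch only: a hypothetical helper, not part of the SDK.
final class FeatureSpecSketch {

    /** Builds the JSON text after checking the constraints stated in the description. */
    static String toJson(List<String> featureNames, Map<String, String> featureTypes, String targetColumn) {
        // Neither key may mention the target column.
        if (featureNames.contains(targetColumn) || featureTypes.containsKey(targetColumn)) {
            throw new IllegalArgumentException("The target column must not be listed: " + targetColumn);
        }
        // FeatureAttributeNames must hold unique values.
        if (featureNames.stream().distinct().count() != featureNames.size()) {
            throw new IllegalArgumentException("FeatureAttributeNames must be unique");
        }
        // When both keys are given, FeatureDataTypes keys must be a subset of FeatureAttributeNames.
        if (!featureNames.isEmpty() && !featureNames.containsAll(featureTypes.keySet())) {
            throw new IllegalArgumentException("FeatureDataTypes keys must appear in FeatureAttributeNames");
        }
        StringJoiner names = new StringJoiner(", ", "[", "]");
        featureNames.forEach(n -> names.add(quote(n)));
        StringJoiner types = new StringJoiner(", ", "{", "}");
        featureTypes.forEach((k, v) -> types.add(quote(k) + ":" + quote(v)));
        // Assumption: both keys are written into one JSON object.
        return "{ \"FeatureAttributeNames\":" + names + ", \"FeatureDataTypes\":" + types + " }";
    }

    // Minimal JSON string quoting (escapes backslash and double quote only).
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    public static void main(String[] args) {
        Map<String, String> types = new LinkedHashMap<>();
        types.put("col1", "numeric");
        types.put("col2", "categorical");
        System.out.println(toJson(List.of("col1", "col2"), types, "y"));
    }
}
```

The sketch cannot check that the names are a subset of the columns in the input data, or that they match in case, because it never reads the dataset; those checks would need the actual header row.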
TabularJobConfig.Builder mode(String mode)
The method that Autopilot uses to train the data. You can either specify the mode manually or let Autopilot
choose for you based on the dataset size by selecting AUTO. In AUTO mode, Autopilot
chooses ENSEMBLING for datasets smaller than 100 MB, and HYPERPARAMETER_TUNING for
larger ones.
The ENSEMBLING mode uses a multi-stack ensemble model to predict classification and regression
tasks directly from your dataset. This machine learning mode combines several base models to produce an
optimal predictive model. It then uses a stacking ensemble method to combine predictions from contributing
members. A multi-stack ensemble model can provide better performance over a single model by combining the
predictive capabilities of multiple models. See Autopilot algorithm support for a list of algorithms supported by ENSEMBLING mode.
The HYPERPARAMETER_TUNING (HPO) mode uses the best hyperparameters to train the best version of
a model. HPO automatically selects an algorithm for the type of problem you want to solve. Then HPO finds the
best hyperparameters according to your objective metric. See Autopilot algorithm support for a list of algorithms supported by HYPERPARAMETER_TUNING
mode.
mode - The method that Autopilot uses to train the data: AUTO, ENSEMBLING, or
HYPERPARAMETER_TUNING, as described above.
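The size rule that AUTO applies can be written down as a one-line decision. The sketch below is a hypothetical helper, not SDK or service code. It rests on two assumptions that the text does not settle: that 1 MB means 1024 * 1024 bytes, and that a dataset of exactly 100 MB falls on the HYPERPARAMETER_TUNING side.

```java
// Sketch only: a hypothetical helper, not SDK or service code.
final class AutoModeSketch {

    enum Mode { ENSEMBLING, HYPERPARAMETER_TUNING }

    // Assumption: 1 MB = 1024 * 1024 bytes; the text does not define the unit precisely.
    static final long THRESHOLD_BYTES = 100L * 1024 * 1024;

    /** Mirrors the documented AUTO rule: smaller than 100 MB means ENSEMBLING, otherwise HPO. */
    static Mode resolveAuto(long datasetSizeBytes) {
        if (datasetSizeBytes < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        // Assumption: a dataset of exactly 100 MB is treated as one of the "larger ones".
        return datasetSizeBytes < THRESHOLD_BYTES ? Mode.ENSEMBLING : Mode.HYPERPARAMETER_TUNING;
    }

    public static void main(String[] args) {
        System.out.println(resolveAuto(5L * 1024 * 1024));        // a 5 MB dataset
        System.out.println(resolveAuto(2L * 1024 * 1024 * 1024)); // a 2 GB dataset
    }
}
```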
See also: AutoMLMode
TabularJobConfig.Builder mode(AutoMLMode mode)
The method that Autopilot uses to train the data. You can either specify the mode manually or let Autopilot
choose for you based on the dataset size by selecting AUTO. In AUTO mode, Autopilot
chooses ENSEMBLING for datasets smaller than 100 MB, and HYPERPARAMETER_TUNING for
larger ones.
The ENSEMBLING mode uses a multi-stack ensemble model to predict classification and regression
tasks directly from your dataset. This machine learning mode combines several base models to produce an
optimal predictive model. It then uses a stacking ensemble method to combine predictions from contributing
members. A multi-stack ensemble model can provide better performance over a single model by combining the
predictive capabilities of multiple models. See Autopilot algorithm support for a list of algorithms supported by ENSEMBLING mode.
The HYPERPARAMETER_TUNING (HPO) mode uses the best hyperparameters to train the best version of
a model. HPO automatically selects an algorithm for the type of problem you want to solve. Then HPO finds the
best hyperparameters according to your objective metric. See Autopilot algorithm support for a list of algorithms supported by HYPERPARAMETER_TUNING
mode.
mode - The method that Autopilot uses to train the data: AUTO, ENSEMBLING, or
HYPERPARAMETER_TUNING, as described above.
See also: AutoMLMode
TabularJobConfig.Builder generateCandidateDefinitionsOnly(Boolean generateCandidateDefinitionsOnly)
Generates possible candidates without training the models. A model candidate is a combination of data preprocessors, algorithms, and algorithm parameter settings.
generateCandidateDefinitionsOnly - Generates possible candidates without training the models. A model candidate is a combination of data
preprocessors, algorithms, and algorithm parameter settings.
TabularJobConfig.Builder problemType(String problemType)
The type of supervised learning problem available for the model candidates of the AutoML job V2. For more information, see Amazon SageMaker Autopilot problem types.
Either specify both the type of supervised learning problem in ProblemType and the
AutoMLJobObjective metric, or specify neither.
problemType - The type of supervised learning problem available for the model candidates of the AutoML job V2. For
more information, see Amazon SageMaker Autopilot problem types.
Specify it together with the AutoMLJobObjective metric, or specify neither.
See also: ProblemType
TabularJobConfig.Builder problemType(ProblemType problemType)
The type of supervised learning problem available for the model candidates of the AutoML job V2. For more information, see Amazon SageMaker Autopilot problem types.
Either specify both the type of supervised learning problem in ProblemType and the
AutoMLJobObjective metric, or specify neither.
problemType - The type of supervised learning problem available for the model candidates of the AutoML job V2. For
more information, see Amazon SageMaker Autopilot problem types.
Specify it together with the AutoMLJobObjective metric, or specify neither.
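The pairing rule between ProblemType and the AutoMLJobObjective metric is a both-or-neither check. A minimal sketch follows, assuming both settings are carried as nullable strings. ProblemTypeCheckSketch is a hypothetical helper; the SDK is not known to expose such a check, and the sample values in main are used only as plain strings.

```java
// Sketch only: a hypothetical pre-flight check, not an SDK method.
final class ProblemTypeCheckSketch {

    /** Returns true when both settings are present or both are absent. */
    static boolean isConsistent(String problemType, String objectiveMetricName) {
        return (problemType == null) == (objectiveMetricName == null);
    }

    public static void main(String[] args) {
        System.out.println(isConsistent("BinaryClassification", "F1")); // both set
        System.out.println(isConsistent(null, null));                   // neither set
        System.out.println(isConsistent("Regression", null));           // only one set
    }
}
```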
See also: ProblemType
TabularJobConfig.Builder targetAttributeName(String targetAttributeName)
The name of the target variable in supervised learning, usually represented by 'y'.
targetAttributeName - The name of the target variable in supervised learning, usually represented by 'y'.
TabularJobConfig.Builder sampleWeightAttributeName(String sampleWeightAttributeName)
If specified, this column name indicates which column of the dataset should be treated as sample weights for use by the objective metric during the training, evaluation, and the selection of the best model. This column is not considered as a predictive feature. For more information on Autopilot metrics, see Metrics and validation.
Sample weights should be numeric, non-negative, with larger values indicating which rows are more important than others. Data points that have invalid or no weight value are excluded.
Support for sample weights is available in Ensembling mode only.
sampleWeightAttributeName - If specified, this column name indicates which column of the dataset should be treated as sample
weights for use by the objective metric during the training, evaluation, and the selection of the best
model. The constraints on weight values and the Ensembling-only support described above apply.
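To check a weight column before submitting a job, the rule on sample weights can be approximated locally. The sketch below is a hypothetical helper and does not reproduce Autopilot's own validation. It assumes that "invalid" covers non-numeric text, negative numbers, NaN, and infinite values, and it returns the indices of the rows that would be kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: a hypothetical local check, not Autopilot's actual validation.
final class SampleWeightSketch {

    /** Returns the indices of rows whose weight is usable; all other rows would be excluded. */
    static List<Integer> usableRows(List<String> rawWeights) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < rawWeights.size(); i++) {
            String raw = rawWeights.get(i);
            if (raw == null || raw.trim().isEmpty()) {
                continue; // no weight value
            }
            try {
                double w = Double.parseDouble(raw.trim());
                // Assumption: negative, NaN, and infinite values count as invalid.
                // NaN fails the w >= 0 comparison, so it is dropped here as well.
                if (w >= 0 && !Double.isInfinite(w)) {
                    kept.add(i);
                }
            } catch (NumberFormatException e) {
                // Non-numeric weight: the row is excluded.
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("1.0", "", "-2", "abc", null, "0", "3.5");
        System.out.println(usableRows(column));
    }
}
```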
Copyright © 2023. All rights reserved.