public class TargetEncoder
extends java.lang.Object
| Modifier and Type | Class and Description |
|---|---|
| static class | TargetEncoder.AddNoiseTask |
| static class | TargetEncoder.DataLeakageHandlingStrategy |
| static class | TargetEncoder.SubtractCurrentRowForLeaveOneOutTask |
| Constructor and Description |
|---|
| TargetEncoder(java.lang.String[] columnNamesToEncode) |
| TargetEncoder(java.lang.String[] columnNamesToEncode, BlendingParams blendingParams) |
| Modifier and Type | Method and Description |
|---|---|
| water.fvec.Frame | applyTargetEncoding(water.fvec.Frame data, java.lang.String targetColumnName, java.util.Map<java.lang.String,water.fvec.Frame> targetEncodingMap, byte dataLeakageHandlingStrategy, boolean withBlendedAvg, boolean imputeNAs, long seed) |
| water.fvec.Frame | applyTargetEncoding(water.fvec.Frame data, java.lang.String targetColumnName, java.util.Map<java.lang.String,water.fvec.Frame> targetEncodingMap, byte dataLeakageHandlingStrategy, boolean withBlendedAvg, double noiseLevel, boolean imputeNAs, long seed) |
| water.fvec.Frame | applyTargetEncoding(water.fvec.Frame data, java.lang.String targetColumnName, java.util.Map<java.lang.String,water.fvec.Frame> targetEncodingMap, byte dataLeakageHandlingStrategy, java.lang.String foldColumn, boolean withBlendedAvg, boolean imputeNAs, long seed) |
| water.fvec.Frame | applyTargetEncoding(water.fvec.Frame data, java.lang.String targetColumnName, java.util.Map<java.lang.String,water.fvec.Frame> columnToEncodingMap, byte dataLeakageHandlingStrategy, java.lang.String foldColumnName, boolean withBlendedAvg, double noiseLevel, boolean imputeNAsWithNewCategory, long seed) Core method for applying pre-calculated encodings to the dataset. |
| java.util.Map<java.lang.String,water.fvec.Frame> | prepareEncodingMap(water.fvec.Frame data, java.lang.String targetColumnName, java.lang.String foldColumnName) |
| java.util.Map<java.lang.String,water.fvec.Frame> | prepareEncodingMap(water.fvec.Frame data, java.lang.String targetColumnName, java.lang.String foldColumnName, boolean imputeNAsWithNewCategory) |
public TargetEncoder(java.lang.String[] columnNamesToEncode,
BlendingParams blendingParams)
columnNamesToEncode - names of columns to apply target encoding to
blendingParams -
public TargetEncoder(java.lang.String[] columnNamesToEncode)
public java.util.Map<java.lang.String,water.fvec.Frame> prepareEncodingMap(water.fvec.Frame data,
java.lang.String targetColumnName,
java.lang.String foldColumnName,
boolean imputeNAsWithNewCategory)
targetColumnName - name of the target column
foldColumnName - name of the column that contains the fold number each row belongs to
imputeNAsWithNewCategory - set to `true` to impute NAs with a new category. TODO: this probably always needs to be `true`, because null values are not supported on the right side of the merge operation.
public java.util.Map<java.lang.String,water.fvec.Frame> prepareEncodingMap(water.fvec.Frame data,
java.lang.String targetColumnName,
java.lang.String foldColumnName)
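To make the idea of a prepared encoding map concrete, here is a minimal plain-Java sketch of the per-category statistics such a map could hold: a running sum of the target and a row count for each category. It does not use `water.fvec.Frame` and is not this class's implementation. The class name `EncodingMapSketch`, the placeholder key `NA_CATEGORY`, and the choice to skip NA rows when imputation is off are all assumptions made for illustration, and fold handling is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the TargetEncoder API.
final class EncodingMapSketch {

    // Assumed placeholder name for the imputed category.
    static final String NA_CATEGORY = "NA";

    /**
     * Returns category -> {sum of target values, number of rows}.
     * A null entry in categories stands for a missing value (NA).
     */
    static Map<String, double[]> prepare(String[] categories,
                                         double[] target,
                                         boolean imputeNAsWithNewCategory) {
        Map<String, double[]> stats = new LinkedHashMap<>();
        for (int i = 0; i < categories.length; i++) {
            String key = categories[i];
            if (key == null) {
                if (!imputeNAsWithNewCategory) {
                    continue; // assumption: NA rows are ignored when not imputed
                }
                key = NA_CATEGORY;
            }
            double[] s = stats.computeIfAbsent(key, k -> new double[2]);
            s[0] += target[i]; // running sum of the target
            s[1] += 1.0;       // running row count
        }
        return stats;
    }

    public static void main(String[] args) {
        Map<String, double[]> stats = prepare(
                new String[] {"a", "b", "a", null},
                new double[] {1, 0, 0, 1},
                true);
        stats.forEach((category, s) ->
                System.out.println(category + ": sum=" + s[0] + ", count=" + s[1]));
    }
}
```

Keeping the sum and the count separate, rather than storing the mean directly, lets a later step adjust them per row before dividing.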
public water.fvec.Frame applyTargetEncoding(water.fvec.Frame data,
java.lang.String targetColumnName,
java.util.Map<java.lang.String,water.fvec.Frame> columnToEncodingMap,
byte dataLeakageHandlingStrategy,
java.lang.String foldColumnName,
boolean withBlendedAvg,
double noiseLevel,
boolean imputeNAsWithNewCategory,
long seed)
data - dataset that will be used as a base for creation of encodings.
targetColumnName - name of the column with respect to which the encodings were computed.
columnToEncodingMap - map of the prepared encodings, keyed by column name.
dataLeakageHandlingStrategy - see TargetEncoder.DataLeakageHandlingStrategy. TODO: use a common interface for stronger type safety.
foldColumnName - name of the column that contains the fold number each row belongs to.
withBlendedAvg - whether to apply blending.
noiseLevel - amount of noise to add to the final encodings.
imputeNAsWithNewCategory - set to `true` to impute NAs with a new category.
seed - random seed; set a specific value for reproducibility, e.g. in tests.
public water.fvec.Frame applyTargetEncoding(water.fvec.Frame data,
java.lang.String targetColumnName,
java.util.Map<java.lang.String,water.fvec.Frame> targetEncodingMap,
byte dataLeakageHandlingStrategy,
java.lang.String foldColumn,
boolean withBlendedAvg,
boolean imputeNAs,
long seed)
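The `dataLeakageHandlingStrategy` parameter and the nested class `TargetEncoder.SubtractCurrentRowForLeaveOneOutTask` point to a leave-one-out mode. The sketch below shows the general leave-one-out technique in plain Java: each row is encoded with the mean target of the other rows in its category, so the row's own target does not leak into its encoding. The class name `LeaveOneOutSketch` and the fallback to the global mean for single-row categories are assumptions, not behaviour confirmed by this page.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of leave-one-out target encoding.
final class LeaveOneOutSketch {

    static double[] encode(String[] categories, double[] target) {
        int n = categories.length;
        Map<String, double[]> stats = new HashMap<>(); // category -> {sum, count}
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            double[] s = stats.computeIfAbsent(categories[i], k -> new double[2]);
            s[0] += target[i];
            s[1] += 1.0;
            total += target[i];
        }
        double prior = total / n; // global mean of the target

        double[] encoded = new double[n];
        for (int i = 0; i < n; i++) {
            double[] s = stats.get(categories[i]);
            double sumWithoutRow = s[0] - target[i]; // subtract the current row
            double countWithoutRow = s[1] - 1.0;
            // Assumption: a category seen only once falls back to the prior.
            encoded[i] = countWithoutRow > 0 ? sumWithoutRow / countWithoutRow : prior;
        }
        return encoded;
    }

    public static void main(String[] args) {
        double[] encoded = encode(
                new String[] {"a", "a", "b", "a"},
                new double[] {1, 0, 1, 1});
        for (double value : encoded) {
            System.out.println(value);
        }
    }
}
```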
public water.fvec.Frame applyTargetEncoding(water.fvec.Frame data,
java.lang.String targetColumnName,
java.util.Map<java.lang.String,water.fvec.Frame> targetEncodingMap,
byte dataLeakageHandlingStrategy,
boolean withBlendedAvg,
boolean imputeNAs,
long seed)
public water.fvec.Frame applyTargetEncoding(water.fvec.Frame data,
java.lang.String targetColumnName,
java.util.Map<java.lang.String,water.fvec.Frame> targetEncodingMap,
byte dataLeakageHandlingStrategy,
boolean withBlendedAvg,
double noiseLevel,
boolean imputeNAs,
long seed)
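The `withBlendedAvg`, `noiseLevel` and `seed` parameters can be illustrated with a short plain-Java sketch. Blending here uses a common sigmoid weighting between the per-category mean (posterior) and the global mean (prior). The parameter names `k` and `f`, the exact formula, and the uniform noise distribution are assumptions for illustration; this page does not state what `BlendingParams` contains or how `TargetEncoder.AddNoiseTask` draws its noise.

```java
import java.util.Random;

// Hypothetical illustration of blending and seeded noise.
final class BlendingAndNoiseSketch {

    /**
     * Weights the category mean against the global mean.
     * rows: number of rows in the category;
     * k: row count at which both means get equal weight (assumed name);
     * f: smoothing factor controlling how sharp the transition is (assumed name).
     */
    static double blend(double posterior, double prior, long rows, double k, double f) {
        double lambda = 1.0 / (1.0 + Math.exp((k - rows) / f));
        return lambda * posterior + (1.0 - lambda) * prior;
    }

    /** Adds uniform noise in [-noiseLevel, noiseLevel), reproducible for a given seed. */
    static double[] addNoise(double[] encodings, double noiseLevel, long seed) {
        Random rng = new Random(seed);
        double[] noisy = new double[encodings.length];
        for (int i = 0; i < encodings.length; i++) {
            noisy[i] = encodings[i] + (2.0 * rng.nextDouble() - 1.0) * noiseLevel;
        }
        return noisy;
    }

    public static void main(String[] args) {
        // A small category is pulled toward the prior; a large one keeps its own mean.
        System.out.println(blend(1.0, 0.3, 2, 20, 10));
        System.out.println(blend(1.0, 0.3, 200, 20, 10));
        double[] noisy = addNoise(new double[] {0.5, 0.5}, 0.01, 1234L);
        System.out.println(noisy[0] + " " + noisy[1]);
    }
}
```

Fixing the seed makes the noise repeatable, which matches the stated purpose of the `seed` parameter: reproducibility in tests.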