public abstract class JavaExecutionStrategy<T> extends ClusteringExecutionStrategy<T,org.apache.hudi.common.data.HoodieData<org.apache.hudi.common.model.HoodieRecord<T>>,org.apache.hudi.common.data.HoodieData<org.apache.hudi.common.model.HoodieKey>,org.apache.hudi.common.data.HoodieData<WriteStatus>>
Fields inherited from class ClusteringExecutionStrategy: recordType, writeConfig

| Constructor and Description |
|---|
| JavaExecutionStrategy(HoodieTable table, org.apache.hudi.common.engine.HoodieEngineContext engineContext, HoodieWriteConfig writeConfig) |

| Modifier and Type | Method and Description |
|---|---|
| protected BulkInsertPartitioner<List<org.apache.hudi.common.model.HoodieRecord<T>>> | getPartitioner(Map<String,String> strategyParams, org.apache.avro.Schema schema) Create BulkInsertPartitioner based on strategy params. |
| HoodieWriteMetadata<org.apache.hudi.common.data.HoodieData<WriteStatus>> | performClustering(org.apache.hudi.avro.model.HoodieClusteringPlan clusteringPlan, org.apache.avro.Schema schema, String instantTime) |
| abstract List<WriteStatus> | performClusteringWithRecordList(List<org.apache.hudi.common.model.HoodieRecord<T>> inputRecords, int numOutputGroups, String instantTime, Map<String,String> strategyParams, org.apache.avro.Schema schema, List<org.apache.hudi.common.model.HoodieFileGroupId> fileGroupIdList, boolean preserveHoodieMetadata) Execute clustering to write inputRecords into new files as defined by rules in strategy parameters. |
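
The method summary above describes performClusteringWithRecordList as writing inputRecords into numOutputGroups output file groups, with strategyParams naming the columns to sort by. The following is a minimal sketch of that general idea in plain Java, not the Hudi implementation. Everything in it is an assumption made for illustration: the class name ClusteringSketch, the parameter key "sort.columns", the comma-separated column format, records modeled as Map<String,String>, and the near-equal split into groups. A real subclass would write files and return WriteStatus objects instead.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in only: none of these names come from the Hudi API.
class ClusteringSketch {

    // Hypothetical parameter key; the real key is not stated in this document.
    static final String SORT_COLUMNS_KEY = "sort.columns";

    // Sorts a copy of the records by the columns listed in strategyParams,
    // then splits it into at most numOutputGroups contiguous groups.
    static List<List<Map<String, String>>> clusterRecords(
            List<Map<String, String>> inputRecords,
            int numOutputGroups,
            Map<String, String> strategyParams) {
        if (numOutputGroups <= 0) {
            throw new IllegalArgumentException("numOutputGroups must be positive");
        }
        List<Map<String, String>> sorted = new ArrayList<>(inputRecords);

        // Build a comparator that chains the configured columns in order.
        Comparator<String> nullSafe = Comparator.nullsFirst(Comparator.naturalOrder());
        Comparator<Map<String, String>> cmp = null;
        String columns = strategyParams.get(SORT_COLUMNS_KEY);
        if (columns != null) {
            for (String raw : columns.split(",")) {
                String col = raw.trim();
                if (col.isEmpty()) {
                    continue;
                }
                Comparator<Map<String, String>> next =
                        Comparator.comparing((Map<String, String> r) -> r.get(col), nullSafe);
                cmp = (cmp == null) ? next : cmp.thenComparing(next);
            }
        }
        if (cmp != null) {
            sorted.sort(cmp); // List.sort is stable, so ties keep input order
        }

        // Split into near-equal contiguous groups; earlier groups take the remainder.
        int groups = Math.min(numOutputGroups, Math.max(1, sorted.size()));
        int base = sorted.size() / groups;
        int extra = sorted.size() % groups;
        List<List<Map<String, String>>> out = new ArrayList<>();
        int start = 0;
        for (int g = 0; g < groups; g++) {
            int size = base + (g < extra ? 1 : 0);
            out.add(new ArrayList<>(sorted.subList(start, start + size)));
            start += size;
        }
        return out;
    }

    // Small helper to build a record from alternating key/value strings.
    static Map<String, String> record(String... keyValues) {
        Map<String, String> r = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            r.put(keyValues[i], keyValues[i + 1]);
        }
        return r;
    }
}
```

In this sketch each returned group plays the role of one output file group. Sorting before splitting keeps records with neighboring values of the sort columns in the same group, which is the usual motivation for sorting during clustering.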

Methods inherited from class ClusteringExecutionStrategy: getEngineContext, getHoodieTable, getWriteConfig

public JavaExecutionStrategy(HoodieTable table, org.apache.hudi.common.engine.HoodieEngineContext engineContext, HoodieWriteConfig writeConfig)

public HoodieWriteMetadata<org.apache.hudi.common.data.HoodieData<WriteStatus>> performClustering(org.apache.hudi.avro.model.HoodieClusteringPlan clusteringPlan, org.apache.avro.Schema schema, String instantTime)
Specified by: performClustering in class ClusteringExecutionStrategy<T,org.apache.hudi.common.data.HoodieData<org.apache.hudi.common.model.HoodieRecord<T>>,org.apache.hudi.common.data.HoodieData<org.apache.hudi.common.model.HoodieKey>,org.apache.hudi.common.data.HoodieData<WriteStatus>>

public abstract List<WriteStatus> performClusteringWithRecordList(List<org.apache.hudi.common.model.HoodieRecord<T>> inputRecords, int numOutputGroups, String instantTime, Map<String,String> strategyParams, org.apache.avro.Schema schema, List<org.apache.hudi.common.model.HoodieFileGroupId> fileGroupIdList, boolean preserveHoodieMetadata)

Parameters:
- inputRecords - List of HoodieRecord.
- numOutputGroups - Number of output file groups.
- instantTime - Clustering (replace commit) instant time.
- strategyParams - Strategy parameters containing columns to sort the data by when clustering.
- schema - Schema of the data including metadata fields.
- fileGroupIdList - File group id corresponding to each output group.
- preserveHoodieMetadata - Whether to preserve commit metadata while clustering.

Returns: List of WriteStatus.

protected BulkInsertPartitioner<List<org.apache.hudi.common.model.HoodieRecord<T>>> getPartitioner(Map<String,String> strategyParams, org.apache.avro.Schema schema)

Create BulkInsertPartitioner based on strategy params.

Parameters:
- strategyParams - Strategy parameters containing columns to sort the data by when clustering.
- schema - Schema of the data including metadata fields.

Copyright © 2023 The Apache Software Foundation. All rights reserved.