public class SplitGrouper extends Object
| Constructor and Description |
|---|
| SplitGrouper() |
| Modifier and Type | Method and Description |
|---|---|
| List<org.apache.tez.dag.api.TaskLocationHint> | createTaskLocationHints(org.apache.hadoop.mapred.InputSplit[] splits, boolean consistentLocations): Create task location hints from a set of input splits. |
| com.google.common.collect.Multimap<Integer,org.apache.hadoop.mapred.InputSplit> | generateGroupedSplits(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.mapred.InputSplit[] splits, float waves, int availableSlots, org.apache.hadoop.mapred.split.SplitLocationProvider locationProvider): Generate groups of splits, separated by schema evolution boundaries. |
| com.google.common.collect.Multimap<Integer,org.apache.hadoop.mapred.InputSplit> | generateGroupedSplits(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.mapred.InputSplit[] splits, float waves, int availableSlots, String inputName, boolean groupAcrossFiles, org.apache.hadoop.mapred.split.SplitLocationProvider locationProvider): Generate groups of splits, separated by schema evolution boundaries, OR, when used from the compactor, group splits based on the bucket number of the input files (in this case, splits for the same logical bucket but with different schemas end up in the same group). |
| com.google.common.collect.Multimap<Integer,org.apache.hadoop.mapred.InputSplit> | group(org.apache.hadoop.conf.Configuration conf, com.google.common.collect.Multimap<Integer,org.apache.hadoop.mapred.InputSplit> bucketSplitMultimap, int availableSlots, float waves, org.apache.hadoop.mapred.split.SplitLocationProvider splitLocationProvider): Group splits for each bucket separately, while evenly filling all the available slots with tasks. |
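The summaries above describe two ideas: splits are placed into integer-keyed buckets that break at schema evolution boundaries, and each bucket is then grouped separately so that the resulting tasks fill the available slots. The following is a minimal, self-contained sketch of those two ideas only. It does not call Hive, Tez, or Hadoop, and it is not the library's implementation: the `Split` class, the `schemaId` tag, and the byte-proportional formula `availableSlots * waves * bucketBytes / totalBytes` are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SplitGroupingSketch {

    /** Hypothetical stand-in for an input split: a path, a schema tag, and a length in bytes. */
    static final class Split {
        final String path;
        final String schemaId;
        final long length;

        Split(String path, String schemaId, long length) {
            this.path = path;
            this.schemaId = schemaId;
            this.length = length;
        }
    }

    /** Walks the splits in order and starts a new bucket whenever the schema tag changes. */
    static Map<Integer, List<Split>> bucketBySchemaBoundary(List<Split> splits) {
        Map<Integer, List<Split>> buckets = new LinkedHashMap<>();
        int bucketId = 0;
        Split previous = null;
        for (Split s : splits) {
            if (previous != null && !previous.schemaId.equals(s.schemaId)) {
                bucketId++; // schema changed: open the next bucket
            }
            buckets.computeIfAbsent(bucketId, k -> new ArrayList<>()).add(s);
            previous = s;
        }
        return buckets;
    }

    /**
     * Shares roughly availableSlots * waves tasks among the buckets in proportion
     * to their total bytes, with at least one task per bucket (assumed policy).
     */
    static Map<Integer, Integer> tasksPerBucket(Map<Integer, List<Split>> buckets,
                                                int availableSlots, float waves) {
        Map<Integer, Long> sizes = new LinkedHashMap<>();
        long total = 0;
        for (Map.Entry<Integer, List<Split>> e : buckets.entrySet()) {
            long size = 0;
            for (Split s : e.getValue()) {
                size += s.length;
            }
            sizes.put(e.getKey(), size);
            total += size;
        }
        Map<Integer, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Long> e : sizes.entrySet()) {
            int tasks = total == 0 ? 1 : (int) (availableSlots * waves * e.getValue() / total);
            result.put(e.getKey(), Math.max(1, tasks));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Split> splits = Arrays.asList(
                new Split("a1", "v1", 200L),
                new Split("a2", "v1", 100L),
                new Split("b1", "v2", 100L));
        Map<Integer, List<Split>> buckets = bucketBySchemaBoundary(splits);
        // 10 slots and 1.5 waves give a budget of 15 tasks to share by size.
        System.out.println(tasksPerBucket(buckets, 10, 1.5f)); // prints {0=11, 1=3}
    }
}
```

In this sketch, `waves` acts as a multiplier on the slot count: a value above 1 asks for more tasks than slots, so tasks run in several rounds. Whether SplitGrouper applies exactly this proportional rule is not stated on this page.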
public com.google.common.collect.Multimap<Integer,org.apache.hadoop.mapred.InputSplit> group(org.apache.hadoop.conf.Configuration conf, com.google.common.collect.Multimap<Integer,org.apache.hadoop.mapred.InputSplit> bucketSplitMultimap, int availableSlots, float waves, org.apache.hadoop.mapred.split.SplitLocationProvider splitLocationProvider) throws IOException

Throws: IOException

public List<org.apache.tez.dag.api.TaskLocationHint> createTaskLocationHints(org.apache.hadoop.mapred.InputSplit[] splits, boolean consistentLocations) throws IOException

Parameters:
splits - the actual splits
consistentLocations - whether to re-order locations for each split, if it's a file split

Throws: IOException

public com.google.common.collect.Multimap<Integer,org.apache.hadoop.mapred.InputSplit> generateGroupedSplits(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.mapred.InputSplit[] splits, float waves, int availableSlots, org.apache.hadoop.mapred.split.SplitLocationProvider locationProvider) throws Exception

Throws: Exception

public com.google.common.collect.Multimap<Integer,org.apache.hadoop.mapred.InputSplit> generateGroupedSplits(org.apache.hadoop.mapred.JobConf jobConf, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.mapred.InputSplit[] splits, float waves, int availableSlots, String inputName, boolean groupAcrossFiles, org.apache.hadoop.mapred.split.SplitLocationProvider locationProvider) throws Exception

Throws: Exception

Copyright © 2022 The Apache Software Foundation. All rights reserved.