public static class ListBucketingPruner.DynamicMultiDimensionalCollection extends Object
ListBucketingPruner.prune(org.apache.hadoop.hive.ql.parse.ParseContext, org.apache.hadoop.hive.ql.metadata.Partition, org.apache.hadoop.hive.ql.plan.ExprNodeDesc)
We use a HashMap to represent the dynamic multi-dimensional collection:
1. Key is List<Integer> representing the index path to the cell
2. value represents the cell (Boolean for use case #1, List<String> for case #2)
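The key/value layout described above can be sketched as follows. This is a minimal illustration, not Hive's actual implementation: the class name `IndexPathMapSketch`, the helper `path`, and the sample Boolean values are all hypothetical. It relies only on the fact that `java.util.List` defines `equals` and `hashCode` element-wise, so a freshly built index path finds the same cell.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IndexPathMapSketch {

    // Hypothetical helper: builds an index-path key from per-dimension indexes.
    public static List<Integer> path(Integer... indexes) {
        return Arrays.asList(indexes);
    }

    // Builds a tiny collection whose cells are Booleans (use case #1).
    // The Boolean values here are arbitrary sample data.
    public static Map<List<Integer>, Boolean> sampleCells() {
        Map<List<Integer>, Boolean> cells = new HashMap<>();
        cells.put(path(0, 0, 0), Boolean.TRUE);
        cells.put(path(1, 2, 1), Boolean.FALSE);
        return cells;
    }

    public static void main(String[] args) {
        Map<List<Integer>, Boolean> cells = sampleCells();
        // Lookup uses a new list with equal contents as the key.
        System.out.println(cells.get(path(1, 2, 1)));         // prints "false"
        System.out.println(cells.containsKey(path(0, 1, 0))); // prints "false"
    }
}
```

Because the keys are lists, the number of dimensions does not have to be fixed in advance; a path simply has one index per dimension.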
For example:
1. skewed column (list): C1, C2, C3
2. skewed value (list of list): (1,a,x), (2,b,x), (1,c,x), (2,a,y)
From skewed value, we calculate the unique skewed element for each skewed column:
C1: (1,2)
C2: (a,b,c)
C3: (x,y)
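The per-column calculation above can be sketched in a few lines. This is an illustrative version only, assuming every skewed value tuple lists its elements in skewed-column order; the class and method names (`UniqueSkewedValuesSketch`, `uniquePerColumn`) are hypothetical and this is not the code Hive ships.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UniqueSkewedValuesSketch {

    // For each column position, collect the distinct values in first-seen order.
    public static List<List<String>> uniquePerColumn(List<List<String>> skewedValues) {
        List<Set<String>> seen = new ArrayList<>();
        for (List<String> tuple : skewedValues) {
            for (int col = 0; col < tuple.size(); col++) {
                if (seen.size() <= col) {
                    // LinkedHashSet drops duplicates but keeps insertion order.
                    seen.add(new LinkedHashSet<>());
                }
                seen.get(col).add(tuple.get(col));
            }
        }
        List<List<String>> result = new ArrayList<>();
        for (Set<String> columnValues : seen) {
            result.add(new ArrayList<>(columnValues));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> skewedValues = Arrays.asList(
            Arrays.asList("1", "a", "x"),
            Arrays.asList("2", "b", "x"),
            Arrays.asList("1", "c", "x"),
            Arrays.asList("2", "a", "y"));
        // prints [[1, 2], [a, b, c], [x, y]]
        System.out.println(uniquePerColumn(skewedValues));
    }
}
```

The result is ordered by column position, so no column names are needed to interpret it.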
We store them in a list of lists. The skewed column names do not need to be stored, because
elements are matched to columns by position:
1. Skewed column (list): C1, C2, C3
2. Unique skewed elements for each skewed column (list of list):
(1,2,other), (a,b,c,other), (x,y,other)
3. index (0,1,2) (0,1,2,3) (0,1,2)
We use the indexes, starting at 0, to construct a HashMap representing the dynamic multi-dimensional
collection:
key (what skewed value the key represents) -> value (Boolean for use case #1, List<String> for case #2).
(0,0,0) (1,a,x)
(0,0,1) (1,a,y)
(0,1,0) (1,b,x)
(0,1,1) (1,b,y)
(0,2,0) (1,c,x)
(0,2,1) (1,c,y)
(1,0,0) (2,a,x)
(1,0,1) (2,a,y)
(1,1,0) (2,b,x)
(1,1,1) (2,b,y)
(1,2,0) (2,c,x)
(1,2,1) (2,c,y)
...

| Constructor and Description |
|---|
| DynamicMultiDimensionalCollection() |

| Modifier and Type | Method and Description |
|---|---|
| static List<List<String>> | flat(List<List<String>> uniqSkewedElements) Flattens a dynamic multi-dimensional collection. |
| static List<List<String>> | generateCollection(List<List<String>> values) Finds the complete skewed-element collection. |
| static List<List<String>> | uniqueElementsList(List<List<String>> values, String defaultDirName) Converts values to a unique element list. |
| static List<List<String>> | uniqueSkewedValueList(List<List<String>> values) Converts values to a unique skewed value list. |

public static List<List<String>> generateCollection(List<List<String>> values) throws SemanticException
Throws: SemanticException

public static List<List<String>> uniqueElementsList(List<List<String>> values, String defaultDirName)
Parameters: values - skewed value list

public static List<List<String>> uniqueSkewedValueList(List<List<String>> values)
ListBucketingPrunerUtils.evaluateExprOnCell(java.util.List<java.lang.String>, java.util.List<java.lang.String>, org.apache.hadoop.hive.ql.plan.ExprNodeDesc, java.util.List<java.util.List<java.lang.String>>)
For example:
1. skewed column (list): C1, C2, C3
2. skewed value (list of list): (1,a,x), (2,b,x), (1,c,x), (2,a,y)
Input: skewed value (list of list): (1,a,x), (2,b,x), (1,c,x), (2,a,y)
Output: Unique skewed value for each skewed column (list of list):
(1,2), (a,b,c), (x,y)
Output matches order of skewed column. Output can be read as:
C1 has unique skewed value list (1,2)
C2 has unique skewed value list (a,b,c)
C3 has unique skewed value list (x,y)
Parameters: values - skewed value list

public static List<List<String>> flat(List<List<String>> uniqSkewedElements) throws SemanticException
Parameters: uniqSkewedElements -
Throws: SemanticException

Copyright © 2024 The Apache Software Foundation. All rights reserved.