public class StatsSchemaHelper
extends Object
| Constructor and Description |
|---|
| StatsSchemaHelper(StructType dataSchema) |
| Modifier and Type | Method and Description |
|---|---|
| Column | getMaxColumn(Column column)<br>Given a logical column in the data schema provided when creating this, return the corresponding MAX column in the statistics schema that stores the MAX values for the provided logical column. |
| Column | getMinColumn(Column column)<br>Given a logical column in the data schema provided when creating this, return the corresponding MIN column in the statistics schema that stores the MIN values for the provided logical column. |
| Column | getNullCountColumn(Column column)<br>Given a logical column in the data schema provided when creating this, return the corresponding NULL_COUNT column in the statistics schema that stores the null count values for the provided logical column. |
| Column | getNumRecordsColumn()<br>Returns the NUM_RECORDS column in the statistics schema. |
| static StructType | getStatsSchema(StructType dataSchema)<br>Returns the expected statistics schema given a table schema. |
| static boolean | isSkippingEligibleLiteral(Literal literal)<br>Returns true if the given literal is skipping-eligible. |
| boolean | isSkippingEligibleMinMaxColumn(Column column)<br>Returns true if the given column is skipping-eligible using min/max statistics. |
| boolean | isSkippingEligibleNullCountColumn(Column column)<br>Returns true if the given column is skipping-eligible using null count statistics. |
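The mapping from a logical column to its MIN, MAX, and NULL_COUNT counterparts can be illustrated with a minimal, self-contained sketch. It does not use the real `StatsSchemaHelper`, `Column`, or `StructType` classes. The class `StatsPathSketch`, its methods, and the representation of a column as a list of name parts are all hypothetical. The top-level field names (`numRecords`, `minValues`, `maxValues`, `nullCount`) are assumed from the per-file statistics layout in the Delta protocol, and the sketch ignores any translation between logical and physical column names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not the library's implementation.
final class StatsPathSketch {
    // Assumed top-level fields of the statistics schema.
    static final String NUM_RECORDS = "numRecords";
    static final String MIN = "minValues";
    static final String MAX = "maxValues";
    static final String NULL_COUNT = "nullCount";

    // Prefix a logical column path with the statistics struct that holds its values.
    static List<String> statsPath(String statName, List<String> logicalPath) {
        List<String> path = new ArrayList<>();
        path.add(statName);
        path.addAll(logicalPath);
        return path;
    }

    static List<String> minPath(List<String> logicalPath) {
        return statsPath(MIN, logicalPath);
    }

    static List<String> maxPath(List<String> logicalPath) {
        return statsPath(MAX, logicalPath);
    }

    static List<String> nullCountPath(List<String> logicalPath) {
        return statsPath(NULL_COUNT, logicalPath);
    }

    // The record count is a single top-level field, not tied to any column.
    static List<String> numRecordsPath() {
        return List.of(NUM_RECORDS);
    }

    public static void main(String[] args) {
        List<String> column = List.of("address", "zip"); // nested logical column
        System.out.println(minPath(column));       // [minValues, address, zip]
        System.out.println(maxPath(column));       // [maxValues, address, zip]
        System.out.println(nullCountPath(column)); // [nullCount, address, zip]
        System.out.println(numRecordsPath());      // [numRecords]
    }
}
```

Under these assumptions, each per-column statistic mirrors the shape of the data schema beneath its own top-level struct, which is why a nested column keeps its full path after the prefix.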
public StatsSchemaHelper(StructType dataSchema)
public static boolean isSkippingEligibleLiteral(Literal literal)
public static StructType getStatsSchema(StructType dataSchema)
public Column getMinColumn(Column column)
Given a logical column in the data schema provided when creating this, return
the corresponding MIN column in the statistics schema that stores the MIN values for the
provided logical column.
public Column getMaxColumn(Column column)
Given a logical column in the data schema provided when creating this, return
the corresponding MAX column in the statistics schema that stores the MAX values for the
provided logical column.
public Column getNullCountColumn(Column column)
Given a logical column in the data schema provided when creating this, return
the corresponding NULL_COUNT column in the statistics schema that stores the null count values
for the provided logical column.
public Column getNumRecordsColumn()
public boolean isSkippingEligibleMinMaxColumn(Column column)
public boolean isSkippingEligibleNullCountColumn(Column column)
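This page does not describe how the MIN, MAX, NULL_COUNT, and NUM_RECORDS columns are consumed. The sketch below shows the general data-skipping idea they support, under stated assumptions: a single `int` column, statistics that are always present, and hypothetical names throughout (`SkippingSketch`, `FileStats`, and the `mayContain...` methods are not part of the library). Each check answers "could this file contain a matching row?", so a `false` result means the file can be skipped.

```java
// Hypothetical illustration of statistics-based file skipping; not the library's code.
final class SkippingSketch {
    // Per-file statistics for one int column (all fields assumed to be present).
    static final class FileStats {
        final long numRecords;
        final long nullCount;
        final int min;
        final int max;

        FileStats(long numRecords, long nullCount, int min, int max) {
            this.numRecords = numRecords;
            this.nullCount = nullCount;
            this.min = min;
            this.max = max;
        }
    }

    // col > value can match only if the file's MAX exceeds value.
    static boolean mayContainGreaterThan(FileStats s, int value) {
        return s.max > value;
    }

    // col < value can match only if the file's MIN is below value.
    static boolean mayContainLessThan(FileStats s, int value) {
        return s.min < value;
    }

    // col IS NULL can match only if at least one null was counted.
    static boolean mayContainNull(FileStats s) {
        return s.nullCount > 0;
    }

    // col IS NOT NULL can match only if not every record is null.
    static boolean mayContainNonNull(FileStats s) {
        return s.nullCount < s.numRecords;
    }

    public static void main(String[] args) {
        FileStats stats = new FileStats(100, 0, 10, 20);
        System.out.println(mayContainGreaterThan(stats, 25)); // false: file can be skipped
        System.out.println(mayContainGreaterThan(stats, 15)); // true: file must be read
        System.out.println(mayContainNull(stats));            // false: file can be skipped
        System.out.println(mayContainNonNull(stats));         // true: file must be read
    }
}
```

The checks are deliberately conservative: they may keep a file that turns out to contain no matching rows, but they never discard one that does.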