public class DataSkippingUtils
extends Object
| Constructor and Description |
|---|
| DataSkippingUtils() |
| Modifier and Type | Method and Description |
|---|---|
| static java.util.Optional<DataSkippingPredicate> | constructDataSkippingFilter(Predicate dataFilters, StructType dataSchema)<br>Constructs a data skipping filter to prune files using column statistics given a query data filter if possible. |
| static ColumnarBatch | parseJsonStats(TableClient tableClient, FilteredColumnarBatch scanFileBatch, StructType statsSchema)<br>Given a FilteredColumnarBatch of scan files and the statistics schema to parse, return the parsed JSON stats from the scan files. |
| static StructType | pruneStatsSchema(StructType schema, java.util.Set<Column> referencedLeafCols)<br>Prunes the given schema to only include the referenced leaf columns. |
public static ColumnarBatch parseJsonStats(TableClient tableClient, FilteredColumnarBatch scanFileBatch, StructType statsSchema)
Given a FilteredColumnarBatch of scan files and the statistics schema to parse, return the parsed JSON stats from the scan files.

public static StructType pruneStatsSchema(StructType schema, java.util.Set<Column> referencedLeafCols)
Prunes the given schema to only include the referenced leaf columns.
Parameters:
schema - the schema to prune
referencedLeafCols - set of leaf columns in schema

public static java.util.Optional<DataSkippingPredicate> constructDataSkippingFilter(Predicate dataFilters, StructType dataSchema)
Constructs a data skipping filter to prune files using column statistics given a query data filter if possible.
Parameters:
dataFilters - query filters on the data columns
dataSchema - the data schema of the table
Returns:
the data skipping filter as a DataSkippingPredicate, if one can be constructed
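
The idea behind a data skipping filter can be illustrated with a minimal sketch. It does not use the Delta Kernel API: FileStats, StatsPredicate, toSkippingFilter and filesToRead are hypothetical stand-ins, and the rewrite rules (for example, turning `col > v` into `max > v`) are an assumption about how min/max pruning is commonly done, not a description of what constructDataSkippingFilter does internally. The sketch mirrors the Optional return type: when a filter cannot be rewritten against statistics, nothing is returned and no file is skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Illustrative only: every type here is a hypothetical stand-in, not a Delta Kernel class. */
public class MinMaxSkippingSketch {

    /** Per-file min/max statistics for a single integer column (hypothetical). */
    static final class FileStats {
        final String path;
        final long min;
        final long max;

        FileStats(String path, long min, long max) {
            this.path = path;
            this.min = min;
            this.max = max;
        }
    }

    /** A predicate evaluated on file statistics instead of on rows. */
    interface StatsPredicate {
        boolean mayContainMatch(FileStats stats);
    }

    /**
     * Rewrites a data filter "col <op> value" into a predicate over min/max stats.
     * Returns Optional.empty() for operators this sketch cannot rewrite.
     */
    static Optional<StatsPredicate> toSkippingFilter(String op, long value) {
        StatsPredicate p;
        switch (op) {
            case ">":
                p = s -> s.max > value;   // some row can exceed value only if max does
                break;
            case "<":
                p = s -> s.min < value;   // some row can be below value only if min is
                break;
            case "=":
                p = s -> s.min <= value && value <= s.max;
                break;
            default:
                return Optional.empty();  // no skipping possible
        }
        return Optional.of(p);
    }

    /** Keeps every file that may contain a matching row; keeps all files if no filter exists. */
    static List<String> filesToRead(List<FileStats> files, String op, long value) {
        Optional<StatsPredicate> filter = toSkippingFilter(op, value);
        List<String> kept = new ArrayList<>();
        for (FileStats f : files) {
            if (!filter.isPresent() || filter.get().mayContainMatch(f)) {
                kept.add(f.path);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<FileStats> files = new ArrayList<>();
        files.add(new FileStats("part-0", 0, 9));
        files.add(new FileStats("part-1", 10, 19));
        files.add(new FileStats("part-2", 20, 29));

        System.out.println(filesToRead(files, ">", 15));  // prints [part-1, part-2]
        System.out.println(filesToRead(files, "!=", 15)); // prints [part-0, part-1, part-2]
    }
}
```

The stats predicate is conservative by design: it may keep a file that turns out to hold no matching rows, but it never drops a file that could hold one, so the original data filter still has to be applied to the rows that are read.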