Class DataSkippingUtils
Object
io.delta.kernel.internal.skipping.DataSkippingUtils
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static Optional<DataSkippingPredicate> | constructDataSkippingFilter(Predicate dataFilters, StructType dataSchema) | Constructs a data skipping filter to prune files using column statistics given a query data filter if possible.
static ColumnarBatch | parseJsonStats(Engine engine, FilteredColumnarBatch scanFileBatch, StructType statsSchema) | Given a FilteredColumnarBatch of scan files and the statistics schema to parse, return the parsed JSON stats from the scan files.
static StructType | pruneStatsSchema(StructType schema, Set<Column> referencedLeafCols) | Prunes the given schema to only include the referenced leaf columns.
-
Constructor Details
-
DataSkippingUtils
public DataSkippingUtils()
-
-
Method Details
-
parseJsonStats
public static ColumnarBatch parseJsonStats(Engine engine, FilteredColumnarBatch scanFileBatch, StructType statsSchema)
Given a FilteredColumnarBatch of scan files and the statistics schema to parse, return the parsed JSON stats from the scan files.
-
pruneStatsSchema
public static StructType pruneStatsSchema(StructType schema, Set<Column> referencedLeafCols)
Prunes the given schema to only include the referenced leaf columns. If a leaf column is a nested column, it must be referenced using the full column path, e.g. "C_0.C_1.C_leaf".
Parameters:
schema - the schema to prune
referencedLeafCols - set of leaf columns in schema
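The pruning rule can be illustrated with a minimal, self-contained sketch. It does not use the Kernel StructType or Column classes: the schema is modeled as a nested Map (field name to either a nested Map for a struct or a type-name String for a leaf), and leaf columns are referenced by dot-joined path strings. The class name PruneSchemaSketch and this representation are assumptions made purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for schema pruning; NOT the Kernel API.
// A schema is a map from field name to either a nested map (struct) or a type-name string (leaf).
class PruneSchemaSketch {

    // Keeps only the leaves whose full dot-joined path is in referencedLeafPaths.
    static Map<String, Object> prune(Map<String, Object> schema, Set<String> referencedLeafPaths) {
        return prune(schema, "", referencedLeafPaths);
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> prune(
            Map<String, Object> schema, String prefix, Set<String> referenced) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : schema.entrySet()) {
            String path = prefix.isEmpty() ? field.getKey() : prefix + "." + field.getKey();
            Object type = field.getValue();
            if (type instanceof Map) {
                // Struct field: recurse, and keep it only if some referenced leaf survives.
                Map<String, Object> nested = prune((Map<String, Object>) type, path, referenced);
                if (!nested.isEmpty()) {
                    result.put(field.getKey(), nested);
                }
            } else if (referenced.contains(path)) {
                // Leaf field: keep it only when its full path is referenced.
                result.put(field.getKey(), type);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> c1 = new LinkedHashMap<>();
        c1.put("C_leaf", "long");
        c1.put("C_other", "string");
        Map<String, Object> c0 = new LinkedHashMap<>();
        c0.put("C_1", c1);
        Map<String, Object> schema = new LinkedHashMap<>();
        schema.put("id", "integer");
        schema.put("C_0", c0);

        // Referencing only the nested leaf drops "id" and "C_0.C_1.C_other".
        System.out.println(prune(schema, Set.of("C_0.C_1.C_leaf")));
    }
}
```

Referencing the bare name "C_leaf" instead of the full path would match nothing in this sketch, which mirrors the full-path requirement stated above. Dot-joined strings are a simplification that breaks for field names containing dots.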
-
constructDataSkippingFilter
public static Optional<DataSkippingPredicate> constructDataSkippingFilter(Predicate dataFilters, StructType dataSchema)
Constructs a data skipping filter to prune files using column statistics given a query data filter if possible. The returned filter will evaluate to FALSE for any files that can be safely skipped. If the filter evaluates to NULL or TRUE, the file should not be skipped.
Parameters:
dataFilters - query filters on the data columns
dataSchema - the data schema of the table
Returns:
the data skipping filter to prune files, if it exists, as a DataSkippingPredicate
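The FALSE / NULL / TRUE contract can be sketched without any Kernel classes. The example below assumes, for illustration only, that a query filter "col > threshold" is checked against a file's maximum-value statistic for that column, and models a NULL result (statistic missing) as Optional.empty(). The class and method names are hypothetical and are not part of the Kernel API.

```java
import java.util.Optional;

// Hypothetical illustration of the three-valued skipping decision; NOT the Kernel API.
class SkippingDecisionSketch {

    // Evaluates "max(col) > threshold" against a file's statistics.
    // Optional.empty() stands for NULL: the statistic is missing, so nothing can be concluded.
    static Optional<Boolean> maxGreaterThan(Long maxValue, long threshold) {
        if (maxValue == null) {
            return Optional.empty();
        }
        return Optional.of(maxValue > threshold);
    }

    // A file is skipped only on a definite FALSE; NULL and TRUE both keep the file.
    static boolean canSkip(Optional<Boolean> filterResult) {
        return filterResult.isPresent() && !filterResult.get();
    }

    public static void main(String[] args) {
        long threshold = 5L; // query filter: col > 5
        System.out.println(canSkip(maxGreaterThan(3L, threshold)));   // max is 3: no row can match
        System.out.println(canSkip(maxGreaterThan(9L, threshold)));   // max is 9: rows may match
        System.out.println(canSkip(maxGreaterThan(null, threshold))); // no stats: must read the file
    }
}
```

Treating NULL the same as TRUE is the conservative choice: a file without usable statistics is always read, so skipping never changes query results.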
-