Class StatsSchemaHelper
Object
io.delta.kernel.internal.skipping.StatsSchemaHelper
Constructor Summary
Constructors
Method Summary
Modifier and Type, Method, Description
getMaxColumn(Column column): Given a logical column in the data schema provided when creating this instance, return the corresponding MAX column in the statistics schema that stores the MAX values for the provided logical column.
getMinColumn(Column column): Given a logical column in the data schema provided when creating this instance, return the corresponding MIN column in the statistics schema that stores the MIN values for the provided logical column.
getNullCountColumn(Column column): Given a logical column in the data schema provided when creating this instance, return the corresponding NULL_COUNT column in the statistics schema that stores the null count values for the provided logical column.
getNumRecordsColumn: Returns the NUM_RECORDS column in the statistics schema.
static StructType getStatsSchema(StructType dataSchema): Returns the expected statistics schema given a table schema.
static boolean isSkippingEligibleLiteral(Literal literal): Returns true if the given literal is skipping-eligible.
boolean isSkippingEligibleMinMaxColumn(Column column): Returns true if the given column is skipping-eligible using min/max statistics.
boolean isSkippingEligibleNullCountColumn: Returns true if the given column is skipping-eligible using null count statistics.
Constructor Details
StatsSchemaHelper
Method Details
isSkippingEligibleLiteral
Returns true if the given literal is skipping-eligible. Delta tracks min/max stats for a limited set of data types, and only literals of those types are skipping-eligible.
getStatsSchema
Returns the expected statistics schema given a table schema. Here is an example of a data schema along with the schema of the statistics that would be collected.

Data Schema:
{{{
|-- a: struct (nullable = true)
|    |-- b: struct (nullable = true)
|    |    |-- c: long (nullable = true)
}}}

Collected Statistics:
{{{
|-- stats: struct (nullable = true)
|    |-- numRecords: long (nullable = false)
|    |-- minValues: struct (nullable = false)
|    |    |-- a: struct (nullable = false)
|    |    |    |-- b: struct (nullable = false)
|    |    |    |    |-- c: long (nullable = true)
|    |-- maxValues: struct (nullable = false)
|    |    |-- a: struct (nullable = false)
|    |    |    |-- b: struct (nullable = false)
|    |    |    |    |-- c: long (nullable = true)
|    |-- nullCount: struct (nullable = false)
|    |    |-- a: struct (nullable = false)
|    |    |    |-- b: struct (nullable = false)
|    |    |    |    |-- c: long (nullable = true)
}}}
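The shape of the statistics schema in the example above can be illustrated with a small, self-contained sketch. It is not the Kernel implementation: the class `StatsSchemaSketch`, its methods, and the map-based schema model (a struct is a `Map`, a leaf is a type-name `String`) are all hypothetical and exist only for illustration. The sketch returns the fields that sit under `stats`, ignores nullability, and keeps every leaf under `minValues` and `maxValues`, even though min/max stats are tracked only for a limited set of data types. It also assumes that every `nullCount` leaf is a `long`, which matches the example but is not stated in general.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Delta Kernel.
final class StatsSchemaSketch {

    // Builds the fields found under "stats" for the given data schema.
    // Schema model (an assumption): Map = struct, String = leaf type name.
    static Map<String, Object> statsSchema(Map<String, Object> dataSchema) {
        Map<String, Object> stats = new LinkedHashMap<>();
        stats.put("numRecords", "long");
        stats.put("minValues", mapLeaves(dataSchema, false));
        stats.put("maxValues", mapLeaves(dataSchema, false));
        stats.put("nullCount", mapLeaves(dataSchema, true));
        return stats;
    }

    // Copies the nesting of the struct. When countsOnly is true, every leaf
    // type is replaced by "long" (assumed type of a null count); otherwise
    // the leaf keeps its own type.
    @SuppressWarnings("unchecked")
    static Map<String, Object> mapLeaves(Map<String, Object> struct, boolean countsOnly) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : struct.entrySet()) {
            Object type = field.getValue();
            if (type instanceof Map) {
                out.put(field.getKey(), mapLeaves((Map<String, Object>) type, countsOnly));
            } else {
                out.put(field.getKey(), countsOnly ? "long" : type);
            }
        }
        return out;
    }
}
```

Calling `StatsSchemaSketch.statsSchema` on a map that models `a.b.c: long` yields `numRecords` plus three copies of the `a.b.c` nesting, one each under `minValues`, `maxValues`, and `nullCount`.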
getMinColumn
Given a logical column in the data schema provided when creating this instance, return the corresponding MIN column in the statistics schema that stores the MIN values for the provided logical column.
getMaxColumn
Given a logical column in the data schema provided when creating this instance, return the corresponding MAX column in the statistics schema that stores the MAX values for the provided logical column.
getNullCountColumn
Given a logical column in the data schema provided when creating this instance, return the corresponding NULL_COUNT column in the statistics schema that stores the null count values for the provided logical column.
getNumRecordsColumn
Returns the NUM_RECORDS column in the statistics schema.
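One way to picture the four column getters is as path rewrites against the statistics layout shown under getStatsSchema. The sketch below is a rough model, not the Kernel code: `StatsColumnPathSketch` and its methods are hypothetical, a column is represented as a plain list of field names, paths are given relative to the `stats` struct, and any difference between logical and physical column names is ignored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not part of Delta Kernel.
final class StatsColumnPathSketch {

    // Path of the MIN values for a logical column, e.g. a.b.c -> minValues.a.b.c
    static List<String> minColumn(List<String> logicalPath) {
        return prefixed("minValues", logicalPath);
    }

    // Path of the MAX values for a logical column.
    static List<String> maxColumn(List<String> logicalPath) {
        return prefixed("maxValues", logicalPath);
    }

    // Path of the null count for a logical column.
    static List<String> nullCountColumn(List<String> logicalPath) {
        return prefixed("nullCount", logicalPath);
    }

    // The record count is a single top-level field of the stats struct.
    static List<String> numRecordsColumn() {
        return Collections.singletonList("numRecords");
    }

    // Prepends the stats field name to the column path without mutating the input.
    private static List<String> prefixed(String statsField, List<String> path) {
        List<String> out = new ArrayList<>(path.size() + 1);
        out.add(statsField);
        out.addAll(path);
        return out;
    }
}
```

Under these assumptions, the logical column `a.b.c` maps to `minValues.a.b.c`, `maxValues.a.b.c`, and `nullCount.a.b.c`, matching the nesting in the collected-statistics example.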
isSkippingEligibleMinMaxColumn
Returns true if the given column is skipping-eligible using min/max statistics. This means the column exists, is a leaf column, and is of a skipping-eligible data type.
isSkippingEligibleNullCountColumn
Returns true if the given column is skipping-eligible using null count statistics. This means the column exists and is a leaf column, as stats are collected only for leaf columns.
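The two column eligibility rules can be sketched as a path lookup followed by a type check. Everything in this example is hypothetical: the class `SkippingEligibilitySketch`, the map-based schema model (a struct is a `Map`, a leaf is a type-name `String`), and in particular the `MIN_MAX_TYPES` set, which is a placeholder because this page does not list the actual skipping-eligible data types.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Delta Kernel.
final class SkippingEligibilitySketch {

    // Placeholder set of types with min/max stats; the real set is not given here.
    static final Set<String> MIN_MAX_TYPES = new HashSet<>(Arrays.asList("long", "string"));

    // Returns the leaf type name at the given path, or null when the path
    // does not exist or ends at a struct instead of a leaf.
    static String leafType(Map<String, Object> schema, List<String> path) {
        Object current = schema;
        for (String name : path) {
            if (!(current instanceof Map)) {
                return null; // tried to descend below a leaf
            }
            current = ((Map<?, ?>) current).get(name);
            if (current == null) {
                return null; // no such field
            }
        }
        return current instanceof String ? (String) current : null;
    }

    // Null count rule: the column exists and is a leaf.
    static boolean nullCountEligible(Map<String, Object> schema, List<String> path) {
        return leafType(schema, path) != null;
    }

    // Min/max rule: the column exists, is a leaf, and has an eligible type.
    static boolean minMaxEligible(Map<String, Object> schema, List<String> path) {
        String type = leafType(schema, path);
        return type != null && MIN_MAX_TYPES.contains(type);
    }
}
```

In this model a struct column such as `a.b` fails both checks, while a leaf whose type is outside the placeholder set passes only the null count check.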