Package io.delta.kernel.internal.util
Class SchemaUtils
Object
io.delta.kernel.internal.util.SchemaUtils
Utility methods for schema related operations such as validating the schema has no duplicate
columns and the names contain only valid characters.
-
Method Summary
Modifier and Type, Method, Description

static List<String> casePreservingPartitionColNames(StructType tableSchema, List<String> partitionColumns)
    Delta expects partition column names to preserve the same case as the names in the schema.
static Map<String, Literal> casePreservingPartitionColNames(List<String> partitionColNames, Map<String, Literal> partitionValues)
    Convert the partition column names in the partitionValues map into the same case as the columns in the table metadata.
static int findColIndex(StructType schema, String colName)
    Search (case-insensitive) for the given colName in the schema and return its position in the schema.
static void validatePartitionColumns(StructType schema, List<String> partitionCols)
    Verify that the partition columns exist in the table schema and have a data type supported for partition columns.
static void validateSchema(StructType schema, boolean isColumnMappingEnabled)
    Validate the schema.
-
Method Details
-
validateSchema
Validate the schema. This method checks that the schema has no duplicate columns, that the names contain only valid characters, and that the data types are supported.
Parameters:
schema - the schema to validate
isColumnMappingEnabled - whether column mapping is enabled. When column mapping is enabled, column names in the schema can contain special characters that are allowed as column names in the Parquet file.
Throws:
IllegalArgumentException - if the schema is invalid
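The duplicate-column part of this check can be illustrated with a small standalone sketch. It is not the Kernel implementation: the class `DuplicateColumnCheckSketch` and its methods are hypothetical, it works on plain column names instead of a `StructType`, and it assumes names are compared case-insensitively. It does not cover the valid-character or data-type checks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the duplicate-column part of schema validation.
// It operates on plain column names, not on the Kernel's StructType.
class DuplicateColumnCheckSketch {

    // Returns every name that collides with an earlier name when compared
    // case-insensitively (an assumption made for this sketch).
    static List<String> findDuplicates(List<String> columnNames) {
        Set<String> seen = new HashSet<>();
        List<String> duplicates = new ArrayList<>();
        for (String name : columnNames) {
            if (!seen.add(name.toLowerCase(Locale.ROOT))) {
                duplicates.add(name);
            }
        }
        return duplicates;
    }

    // Mirrors the documented contract: an invalid schema raises
    // IllegalArgumentException.
    static void requireNoDuplicates(List<String> columnNames) {
        List<String> duplicates = findDuplicates(columnNames);
        if (!duplicates.isEmpty()) {
            throw new IllegalArgumentException("Duplicate column(s): " + duplicates);
        }
    }

    public static void main(String[] args) {
        requireNoDuplicates(Arrays.asList("a", "B")); // passes silently
        System.out.println(findDuplicates(Arrays.asList("a", "B", "b"))); // prints [b]
    }
}
```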
-
validatePartitionColumns
Verify that the partition columns exist in the table schema and have a data type supported for partition columns.
Parameters:
schema - the table schema
partitionCols - the partition column names to verify
-
casePreservingPartitionColNames
public static List<String> casePreservingPartitionColNames(StructType tableSchema, List<String> partitionColumns)
Delta expects partition column names to preserve the same case as the names in the schema. For example, given the schema (a INT, B STRING) and partition columns (b), the schema is stored as (a INT, B STRING) and the partition columns as (B). This method expects that the inputs are already validated (i.e. the schema contains all the partition columns).
-
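The (a INT, B STRING) example can be reproduced with a minimal sketch. This is an illustration under stated assumptions, not the Kernel code: the class `CasePreservingNamesSketch` is hypothetical, the schema is reduced to a list of column names instead of a `StructType`, and the exception thrown for an unknown column is a choice made for the sketch (the documented method expects pre-validated input).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: the schema is reduced to a list of column names.
class CasePreservingNamesSketch {

    // Rewrites each partition column name to the spelling used in the schema,
    // matching names case-insensitively.
    static List<String> casePreserving(List<String> schemaColumnNames,
                                       List<String> partitionColumns) {
        Map<String, String> lowerToSchemaName = new HashMap<>();
        for (String name : schemaColumnNames) {
            lowerToSchemaName.put(name.toLowerCase(Locale.ROOT), name);
        }
        List<String> result = new ArrayList<>();
        for (String partitionColumn : partitionColumns) {
            String schemaName =
                lowerToSchemaName.get(partitionColumn.toLowerCase(Locale.ROOT));
            if (schemaName == null) {
                // Sketch-only guard for input that was not validated beforehand.
                throw new IllegalArgumentException(
                    "Partition column not in schema: " + partitionColumn);
            }
            result.add(schemaName);
        }
        return result;
    }

    public static void main(String[] args) {
        // Schema (a INT, B STRING) with partition columns (b).
        System.out.println(
            casePreserving(Arrays.asList("a", "B"), Arrays.asList("b"))); // prints [B]
    }
}
```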
casePreservingPartitionColNames
public static Map<String, Literal> casePreservingPartitionColNames(List<String> partitionColNames, Map<String, Literal> partitionValues)
Convert the partition column names in the partitionValues map into the same case as the columns in the table metadata. Delta expects partition column names to preserve the same case as in the table schema.
Parameters:
partitionColNames - List of partition columns in the table metadata. The names preserve the case given by the connector when the table was created.
partitionValues - Map of partition column name to partition value. Each partition column name is converted to the same case as its equivalent column in partitionColNames. Column name comparison is case-insensitive.
Returns:
Rewritten partitionValues map with names case preserved.
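A rough sketch of the key rewrite follows. The class `CasePreservingPartitionValuesSketch` is hypothetical, `String` values stand in for the Kernel's `Literal` type, and failing on an unknown column name is an assumption made for the sketch, not documented behavior.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: String values stand in for the Kernel's Literal type.
class CasePreservingPartitionValuesSketch {

    // Returns a copy of partitionValues whose keys use the spelling found in
    // partitionColNames. Keys are matched case-insensitively.
    static Map<String, String> casePreserving(List<String> partitionColNames,
                                              Map<String, String> partitionValues) {
        Map<String, String> lowerToMetadataName = new HashMap<>();
        for (String name : partitionColNames) {
            lowerToMetadataName.put(name.toLowerCase(Locale.ROOT), name);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : partitionValues.entrySet()) {
            String metadataName =
                lowerToMetadataName.get(entry.getKey().toLowerCase(Locale.ROOT));
            if (metadataName == null) {
                // Sketch-only guard for a key with no matching partition column.
                throw new IllegalArgumentException(
                    "Unknown partition column: " + entry.getKey());
            }
            result.put(metadataName, entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("PART1", "x");
        values.put("part2", "y");
        // Keys are rewritten to "part1" and "Part2"; values are unchanged.
        System.out.println(casePreserving(Arrays.asList("part1", "Part2"), values));
    }
}
```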
-
findColIndex
Search (case-insensitive) for the given colName in the schema and return its position in the schema.
Parameters:
schema - StructType
colName - Name of the column whose index is needed.
Returns:
Valid index or -1 if not found.
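The lookup contract (case-insensitive match, -1 when absent) can be sketched as below. The class `FindColIndexSketch` is hypothetical and takes a list of field names in place of a `StructType`, so it illustrates the documented behavior and not the Kernel's own code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a list of field names stands in for StructType.
class FindColIndexSketch {

    // Returns the position of the first field whose name matches colName
    // ignoring case, or -1 if there is no such field.
    static int findColIndex(List<String> fieldNames, String colName) {
        for (int i = 0; i < fieldNames.size(); i++) {
            if (fieldNames.get(i).equalsIgnoreCase(colName)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("a", "B");
        System.out.println(findColIndex(fields, "b")); // prints 1
        System.out.println(findColIndex(fields, "c")); // prints -1
    }
}
```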
-