Packages

package commands


Type Members

  1. trait AlterDeltaTableCommand extends DeltaCommand

    A super trait for alter table commands that modify Delta tables.

  2. case class AlterTableAddColumnsDeltaCommand(table: DeltaTableV2, colsToAddWithPosition: Seq[QualifiedColType]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that adds columns to a Delta table. The syntax of using this command in SQL is:

    ALTER TABLE table_identifier
    ADD COLUMNS (col_name data_type [COMMENT col_comment], ...);
  3. case class AlterTableAddConstraintDeltaCommand(table: DeltaTableV2, name: String, exprText: String) extends LogicalPlan with AlterTableConstraintDeltaCommand with Product with Serializable

    Command to add a constraint to a Delta table. Currently only CHECK constraints are supported.

    Adding a constraint will scan all data in the table to verify the constraint currently holds.

    table

    The table to which the constraint should be added.

    name

    The name of the new constraint.

    exprText

    The contents of the new CHECK constraint, to be parsed and evaluated.

  4. case class AlterTableChangeColumnDeltaCommand(table: DeltaTableV2, columnPath: Seq[String], columnName: String, newColumn: StructField, colPosition: Option[ColumnPosition], syncIdentity: Boolean) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command to change a column of a Delta table, supporting changing the comment of a column and reordering columns.

    The syntax of using this command in SQL is:

    ALTER TABLE table_identifier
    CHANGE [COLUMN] column_old_name column_new_name column_dataType [COMMENT column_comment]
    [FIRST | AFTER column_name];
  5. case class AlterTableClusterByDeltaCommand(table: DeltaTableV2, clusteringColumns: Seq[Seq[String]]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    Command for altering clustering columns for clustered tables:

    ALTER TABLE .. CLUSTER BY (col1, col2, ...)
    ALTER TABLE .. CLUSTER BY NONE

    Note that the given clusteringColumns are empty when CLUSTER BY NONE is specified. Also, clusteringColumns are validated (e.g., duplication / existence check) in DeltaCatalog.alterTable().

  6. trait AlterTableConstraintDeltaCommand extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData
  7. case class AlterTableDropColumnsDeltaCommand(table: DeltaTableV2, columnsToDrop: Seq[Seq[String]]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that drops columns from a Delta table. The syntax of using this command in SQL is:

    ALTER TABLE table_identifier
    DROP COLUMN(S) (col_name_1, col_name_2, ...);
  8. case class AlterTableDropConstraintDeltaCommand(table: DeltaTableV2, name: String, ifExists: Boolean) extends LogicalPlan with AlterTableConstraintDeltaCommand with Product with Serializable

    Command to drop a constraint from a Delta table. No-op if a constraint with the given name doesn't exist.

    Currently only CHECK constraints are supported.

    table

    The table from which the constraint should be dropped

    name

    The name of the constraint to drop

  9. case class AlterTableDropFeatureDeltaCommand(table: DeltaTableV2, featureName: String, truncateHistory: Boolean = false) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that removes an existing feature from the table. The feature needs to implement the RemovableFeature trait.

    The syntax of the command is:

    ALTER TABLE t DROP FEATURE f [TRUNCATE HISTORY]

    The operation consists of two stages (see RemovableFeature):

    1) preDowngradeCommand. This command is responsible for removing any data and metadata related to the feature.
    2) Protocol downgrade. Removes the feature from the current version's protocol. During this stage we also validate whether all traces of the feature-to-be-removed are gone.

    For removing writer features the 2 steps above are sufficient. However, for removing reader+writer features we also need to ensure the history does not contain any traces of the removed feature. The user journey is the following:

    1) The user runs the remove feature command, which removes any traces of the feature from the latest version. The command reports partial success and states that the retention period must pass before a protocol downgrade is possible.

    2) After the retention period is over, the user runs the command again. The command re-checks the current state and the history; if everything is clean, it proceeds with the protocol downgrade. The TRUNCATE HISTORY option may be used here to automatically set the log retention period to a minimum of 24 hours before clearing the logs. That minimum is based on the expected duration of the longest-running transaction: it is the lowest retention period we can set without endangering concurrent transactions. If transactions do run for longer than this period while this command runs, data corruption can result.

    Note, legacy features can be removed as well, as long as the protocol supports Table Features. This will not downgrade protocol versions but only remove the feature from the supported features list. For example, removing legacyRWFeature from (3, 7, [legacyRWFeature], [legacyRWFeature]) will result in (3, 7, [], []) and not (1, 1).
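    For a reader+writer feature, the two-run journey above looks like the following in SQL; the table name and feature name are illustrative:

    ```sql
    -- First run: removes feature traces from the latest version; the command
    -- reports that the retention period must pass before the downgrade.
    ALTER TABLE events DROP FEATURE deletionVectors;

    -- Second run, after the retention period, optionally truncating history:
    ALTER TABLE events DROP FEATURE deletionVectors TRUNCATE HISTORY;
    ```
    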

  10. case class AlterTableReplaceColumnsDeltaCommand(table: DeltaTableV2, columns: Seq[StructField]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command to replace columns of a Delta table, supporting changing the comment of a column, reordering columns, and loosening nullabilities.

    The syntax of using this command in SQL is:

    ALTER TABLE table_identifier REPLACE COLUMNS (col_spec[, col_spec ...]);
  11. case class AlterTableSetLocationDeltaCommand(table: DeltaTableV2, location: String) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command to change the location of a Delta table. Effectively, this only changes the symlink in the Hive MetaStore from one Delta table to another.

    This command errors out if the new location is not a Delta table. By default, the new Delta table must have the same schema as the old table, but we have a SQL conf that allows users to bypass this schema check.

    The syntax of using this command in SQL is:

    ALTER TABLE table_identifier SET LOCATION 'path/to/new/delta/table';
  12. case class AlterTableSetPropertiesDeltaCommand(table: DeltaTableV2, configuration: Map[String, String]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that sets Delta table configuration.

    The syntax of this command is:

    ALTER TABLE table1 SET TBLPROPERTIES ('key1' = 'val1', 'key2' = 'val2', ...);
  13. case class AlterTableUnsetPropertiesDeltaCommand(table: DeltaTableV2, propKeys: Seq[String], ifExists: Boolean) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that unsets Delta table configuration. If ifExists is false, each individual key is checked for existence (a one-by-one operation, not an all-or-nothing check); otherwise, non-existent keys are ignored.

    The syntax of this command is:

    ALTER TABLE table1 UNSET TBLPROPERTIES [IF EXISTS] ('key1', 'key2', ...);
  14. abstract class CloneConvertedSource extends CloneSource

    A convertible non-delta table source to be cloned from

  15. class CloneDeltaSource extends CloneSource

    A delta table source to be cloned from

  16. case class CloneIcebergSource(tableIdentifier: TableIdentifier, sparkTable: Option[Table], tableSchema: Option[StructType], spark: SparkSession) extends CloneConvertedSource with Product with Serializable

    An Iceberg table source to be cloned from

  17. case class CloneParquetSource(tableIdentifier: TableIdentifier, catalogTable: Option[CatalogTable], spark: SparkSession) extends CloneConvertedSource with Product with Serializable

    A parquet table source to be cloned from

  18. trait CloneSource extends Closeable

    An interface of the source table to be cloned from.

  19. abstract class CloneTableBase extends LogicalPlan with LeafCommand with CloneTableBaseUtils with SQLConfHelper
  20. trait CloneTableBaseUtils extends DeltaLogging
  21. case class CloneTableCommand(sourceTable: CloneSource, targetIdent: TableIdentifier, tablePropertyOverrides: Map[String, String], targetPath: Path) extends CloneTableBase with Product with Serializable

    Clones a Delta table to a new location with a new table id. The clone can be performed as a shallow clone (i.e. shallow = true), where we do not copy the files, but just point to them. If a table exists at the given targetPath, that table will be replaced.

    sourceTable

    is the table to be cloned

    targetIdent

    destination table identifier to clone to

    tablePropertyOverrides

    user-defined table properties that should override any properties with the same key from the source table

    targetPath

    the actual destination

  22. case class ClusteringStrategy(sparkSession: SparkSession, clusteringColumns: Seq[String]) extends OptimizeTableStrategy with Product with Serializable

    Implements clustering strategy for clustered tables

  23. case class CompactionStrategy(sparkSession: SparkSession, optimizeContext: DeltaOptimizeContext) extends OptimizeTableStrategy with Product with Serializable

    Implements compaction strategy

  24. case class ConvertToDeltaCommand(tableIdentifier: TableIdentifier, partitionSchema: Option[StructType], collectStats: Boolean, deltaPath: Option[String]) extends ConvertToDeltaCommandBase with Product with Serializable
  25. abstract class ConvertToDeltaCommandBase extends LogicalPlan with LeafRunnableCommand with DeltaCommand

    Convert an existing parquet table to a delta table by creating delta logs based on existing files. Here are the main components:

    • File Listing: Launch a spark job to list files from a given directory in parallel.
    • Schema Inference: Given an iterator on the file list result, we group the iterator into sequential batches and launch a spark job to infer schema for each batch, and finally merge schemas from all batches.
    • Stats collection: Again, we group the iterator on file list results into sequential batches and launch a spark job to collect stats for each batch.
    • Commit the files: We take the iterator of files with stats and write out a delta log file as the first commit. This bypasses the transaction protocol, but it's ok as this would be the very first commit.
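    As a usage sketch, an existing partitioned parquet table can be converted in place; the path and partition column below are placeholders:

    ```sql
    CONVERT TO DELTA parquet.`/data/events`
    PARTITIONED BY (date DATE)
    ```
    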
  26. case class CreateDeltaTableCommand(table: CatalogTable, existingTableOpt: Option[CatalogTable], mode: SaveMode, query: Option[LogicalPlan], operation: CreationMode = TableCreationModes.Create, tableByPath: Boolean = false, output: Seq[Attribute] = Nil, protocol: Option[Protocol] = None) extends LogicalPlan with LeafRunnableCommand with DeltaCommand with DeltaLogging with Product with Serializable

    Single entry point for all write or declaration operations for Delta tables accessed through the table name.

    table

    The table identifier for the Delta table

    existingTableOpt

    The existing table for the same identifier if exists

    mode

    The save mode when writing data. Relevant when the query is empty or set to Ignore with CREATE TABLE IF NOT EXISTS.

    query

    The query to commit into the Delta table if it exists. This can come from

    • CTAS
    • saveAsTable
    protocol

    This is used to create a table with a specific protocol version.

  27. case class DeleteCommand(deltaLog: DeltaLog, catalogTable: Option[CatalogTable], target: LogicalPlan, condition: Option[Expression]) extends LogicalPlan with LeafRunnableCommand with DeltaCommand with DeleteCommandMetrics with Product with Serializable

    Performs a Delete based on the search condition

    Algorithm:

    1) Scan all the files and determine which files have rows that need to be deleted.
    2) Traverse the affected files and rebuild the touched files.
    3) Use the Delta protocol to atomically write the remaining rows to new files and remove the affected files identified in step 1.
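    These steps can be sketched with a toy in-memory model; DataFile and the predicate are hypothetical stand-ins for Delta's file and expression types, not the real API:

    ```scala
    object DeleteSketch {
      case class DataFile(name: String, rows: Seq[Int])

      // Returns the surviving files and the names of the removed originals.
      def delete(files: Seq[DataFile], predicate: Int => Boolean): (Seq[DataFile], Seq[String]) = {
        // Step 1: find the files that contain rows matching the delete predicate.
        val touched = files.filter(_.rows.exists(predicate))
        // Step 2: rebuild each touched file without the matching rows.
        val rewritten = touched.map(f => DataFile(f.name + "-rewritten", f.rows.filterNot(predicate)))
        // Step 3: the "commit": keep untouched files, add the rewritten ones,
        // and report the removed originals.
        val untouched = files.filterNot(f => touched.exists(_.name == f.name))
        (untouched ++ rewritten, touched.map(_.name))
      }
    }
    ```

    Untouched files are never rewritten, which is why the scan in step 1 pays off on selective deletes.
    
    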

  28. trait DeleteCommandMetrics extends AnyRef
  29. case class DeleteMetric(condition: String, numFilesTotal: Long, numTouchedFiles: Long, numRewrittenFiles: Long, numRemovedFiles: Long, numAddedFiles: Long, numAddedChangeFiles: Long, numFilesBeforeSkipping: Long, numBytesBeforeSkipping: Long, numFilesAfterSkipping: Long, numBytesAfterSkipping: Long, numPartitionsAfterSkipping: Option[Long], numPartitionsAddedTo: Option[Long], numPartitionsRemovedFrom: Option[Long], numCopiedRows: Option[Long], numDeletedRows: Option[Long], numBytesAdded: Long, numBytesRemoved: Long, changeFileBytes: Long, scanTimeMs: Long, rewriteTimeMs: Long, numDeletionVectorsAdded: Long, numDeletionVectorsRemoved: Long, numDeletionVectorsUpdated: Long) extends Product with Serializable

    Used to report details about delete.

    Note

    All the time units are milliseconds.

  30. case class DeletionVectorData(filePath: String, deletionVectorId: Option[String], deletedRowIndexSet: Array[Byte], deletedRowIndexCount: Long) extends Sizing with Product with Serializable

    Row containing the file path and its new deletion vector bitmap in memory

    filePath

    Absolute path of the data file this DV result is generated for.

    deletionVectorId

    Existing DeletionVectorDescriptor serialized in JSON format. This is used to load the existing DV so it can be merged with the new DV.

    deletedRowIndexSet

    In-memory deletion vector bitmap containing the newly deleted row indexes from the data file.

    deletedRowIndexCount

    Count of rows marked as deleted using the deletedRowIndexSet.

  31. case class DeletionVectorResult(filePath: String, deletionVector: DeletionVectorDescriptor, matchedRowCount: Long) extends Product with Serializable

    Final output for each file containing the file path, DeletionVectorDescriptor, and how many rows are marked as deleted in this file as part of this operation (not including rows that were already marked as deleted).

    filePath

    Absolute path of the data file this DV result is generated for.

    deletionVector

    Deletion vector generated containing the newly deleted row indices from data file.

    matchedRowCount

    Number of rows marked as deleted using the deletionVector.

  32. trait DeletionVectorUtils extends AnyRef
  33. trait DeltaCommand extends DeltaLogging

    Helper trait for all delta commands.

  34. case class DeltaGenerateCommand(modeName: String, tableId: TableIdentifier, options: Map[String, String]) extends LogicalPlan with LeafRunnableCommand with Product with Serializable
  35. case class DeltaOptimizeContext(reorg: Option[DeltaReorgOperation] = None, minFileSize: Option[Long] = None, maxFileSize: Option[Long] = None, maxDeletedRowsRatio: Option[Double] = None) extends Product with Serializable

    Stores all runtime context information that can control the execution of optimize.

    reorg

    The REORG operation that triggered the rewriting task, if any.

    minFileSize

    Files which are smaller than this threshold will be selected for compaction. If not specified, DeltaSQLConf.DELTA_OPTIMIZE_MIN_FILE_SIZE will be used. This parameter must be set to 0 when reorg is set.

    maxDeletedRowsRatio

    Files with a ratio of soft-deleted rows to the total rows larger than this threshold will be rewritten by the OPTIMIZE command. If not specified, DeltaSQLConf.DELTA_OPTIMIZE_MAX_DELETED_ROWS_RATIO will be used. This parameter must be set to 0 when reorg is set.

  36. class DeltaPurgeOperation extends DeltaReorgOperation

    Reorg operation to purge files with soft deleted rows.

  37. sealed trait DeltaReorgOperation extends AnyRef

    Defines a Reorg operation to be applied during optimize.

  38. case class DeltaReorgTable(target: LogicalPlan, reorgTableSpec: DeltaReorgTableSpec = ...)(predicates: Seq[String]) extends LogicalPlan with UnaryCommand with Product with Serializable
  39. case class DeltaReorgTableCommand(target: LogicalPlan, reorgTableSpec: DeltaReorgTableSpec = ...)(predicates: Seq[String]) extends OptimizeTableCommandBase with ReorgTableForUpgradeUniformHelper with LeafCommand with IgnoreCachedData with Product with Serializable

    The REORG TABLE command.

  40. case class DeltaReorgTableSpec(reorgTableMode: DeltaReorgTableMode.Value, icebergCompatVersionOpt: Option[Int]) extends Product with Serializable
  41. class DeltaRewriteTypeWideningOperation extends DeltaReorgOperation

    Internal reorg operation to rewrite files to conform to the current table schema when dropping the type widening table feature.

  42. class DeltaUpgradeUniformOperation extends DeltaReorgOperation

    Reorg operation to upgrade the iceberg compatibility version of a table.

  43. case class DeltaVacuumStats(isDryRun: Boolean, specifiedRetentionMillis: Option[Long], defaultRetentionMillis: Long, minRetainedTimestamp: Long, dirsPresentBeforeDelete: Long, filesAndDirsPresentBeforeDelete: Long, objectsDeleted: Long, sizeOfDataToDelete: Long, timeTakenToIdentifyEligibleFiles: Long, timeTakenForDelete: Long, vacuumStartTime: Long, vacuumEndTime: Long, numPartitionColumns: Long) extends Product with Serializable
  44. case class DescribeDeltaDetailCommand(child: LogicalPlan, hadoopConf: Map[String, String]) extends LogicalPlan with RunnableCommand with UnaryNode with DeltaLogging with DeltaCommand with Product with Serializable

    A command for describing the details of a table such as the format, name, and size.

  45. case class DescribeDeltaHistory(child: LogicalPlan, limit: Option[Int], output: Seq[Attribute] = ...) extends LogicalPlan with UnaryNode with MultiInstanceRelation with DeltaCommand with Product with Serializable

    A logical placeholder for describing a Delta table's history, so that the history can be leveraged in subqueries. Replaced with DescribeDeltaHistoryCommand during planning.

  46. case class DescribeDeltaHistoryCommand(table: DeltaTableV2, limit: Option[Int], output: Seq[Attribute] = ...) extends LogicalPlan with LeafRunnableCommand with MultiInstanceRelation with DeltaLogging with Product with Serializable

    A command for describing the history of a Delta table.

  47. case class FileToDvDescriptor(path: String, deletionVectorId: Option[String]) extends Product with Serializable

    Holds a mapping from a file path (url-encoded) to an (optional) serialized Deletion Vector descriptor.

  48. case class MergeIntoCommand(source: LogicalPlan, target: LogicalPlan, catalogTable: Option[CatalogTable], targetFileIndex: TahoeFileIndex, condition: Expression, matchedClauses: Seq[DeltaMergeIntoMatchedClause], notMatchedClauses: Seq[DeltaMergeIntoNotMatchedClause], notMatchedBySourceClauses: Seq[DeltaMergeIntoNotMatchedBySourceClause], migratedSchema: Option[StructType], schemaEvolutionEnabled: Boolean = false) extends LogicalPlan with MergeIntoCommandBase with InsertOnlyMergeExecutor with ClassicMergeExecutor with Product with Serializable

    Performs a merge of a source query/table into a Delta table.

    Issues an error message when the ON search_condition of the MERGE statement can match a single row from the target table with multiple rows of the source table-reference.

    Algorithm:

    Phase 1: Find the input files in target that are touched by the rows that satisfy the condition and verify that no two source rows match with the same target row. This is implemented as an inner-join using the given condition. See ClassicMergeExecutor for more details.

    Phase 2: Read the touched files again and write new files with updated and/or inserted rows.

    Phase 3: Use the Delta protocol to atomically remove the touched files and add the new files.
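    As a usage sketch, a MERGE statement exercising all three clause groups might look like the following; the table and column names are placeholders:

    ```sql
    MERGE INTO target t
    USING source s
    ON t.id = s.id                                -- condition
    WHEN MATCHED THEN
      UPDATE SET t.value = s.value                -- matchedClauses
    WHEN NOT MATCHED THEN
      INSERT (id, value) VALUES (s.id, s.value)   -- notMatchedClauses
    WHEN NOT MATCHED BY SOURCE THEN
      DELETE                                      -- notMatchedBySourceClauses
    ```
    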

    source

    Source data to merge from

    target

    Target table to merge into

    targetFileIndex

    TahoeFileIndex of the target table

    condition

    Condition for a source row to match with a target row

    matchedClauses

    All info related to matched clauses.

    notMatchedClauses

    All info related to not matched clauses.

    notMatchedBySourceClauses

    All info related to not matched by source clauses.

    migratedSchema

    The final schema of the target - may be changed by schema evolution.

  49. trait MergeIntoCommandBase extends LogicalPlan with LeafRunnableCommand with DeltaCommand with DeltaLogging with PredicateHelper with ImplicitMetadataOperation with MergeIntoMaterializeSource with UpdateExpressionsSupport
  50. class OptimizeExecutor extends DeltaCommand with SQLMetricsReporting with Serializable

    Optimize job which compacts small files into larger files to reduce the number of files and potentially allow more efficient reads.

  51. case class OptimizeTableCommand(child: LogicalPlan, userPartitionPredicates: Seq[String], optimizeContext: DeltaOptimizeContext)(zOrderBy: Seq[UnresolvedAttribute]) extends OptimizeTableCommandBase with RunnableCommand with UnaryNode with Product with Serializable

    The optimize command implementation for Spark SQL. Example SQL:

    OPTIMIZE ('/path/to/dir' | delta.table) [WHERE part = 25];
  52. abstract class OptimizeTableCommandBase extends LogicalPlan with RunnableCommand with DeltaCommand

    Base class defining abstract optimize command

  53. trait OptimizeTableStrategy extends AnyRef

    Defines set of utilities used in OptimizeTableCommand. The behavior of these utilities will change based on the OptimizeTableMode: COMPACTION, ZORDER and CLUSTERING.

  54. trait ReorgTableForUpgradeUniformHelper extends DeltaLogging

    Helper trait for ReorgTableCommand to rewrite the table to be Iceberg compatible.

  55. case class RestoreTableCommand(sourceTable: DeltaTableV2) extends LogicalPlan with LeafRunnableCommand with DeltaCommand with RestoreTableCommandBase with Product with Serializable

    Perform restore of delta table to a specified version or timestamp

    Algorithm:

    1) Read the latest snapshot of the table.
    2) Read the snapshot for the version or timestamp to restore.
    3) Compute the files available in the snapshot to restore (files removed by some later commit) but missing from the latest snapshot. Add these files to the commit as AddFile actions.
    4) Compute the files available in the latest snapshot (files added after the version to restore) but missing from the snapshot to restore. Add these files to the commit as RemoveFile actions.
    5) If the SQLConf.IGNORE_MISSING_FILES option is false (the default), check the availability of the AddFile entries in the file system.
    6) Commit the Metadata, Protocol, and all RemoveFile and AddFile actions into the delta log using commitLarge (the commit will fail in the case of a parallel transaction).
    7) If the table was modified in parallel, ignore the restore and raise an exception.
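    Steps 3 and 4 are essentially set differences over the two snapshots' file lists; a minimal sketch, with file paths standing in for Delta's file actions:

    ```scala
    object RestoreSketch {
      // Given the data-file sets of the latest snapshot and of the snapshot to
      // restore, compute which files to re-add and which to remove.
      def diff(latest: Set[String], toRestore: Set[String]): (Set[String], Set[String]) = {
        val addFiles    = toRestore -- latest   // removed after the restore point
        val removeFiles = latest -- toRestore   // added after the restore point
        (addFiles, removeFiles)
      }
    }
    ```
    
    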

  56. trait RestoreTableCommandBase extends AnyRef

    Base trait class for RESTORE. Defines command output schema and metrics.

  57. case class ShowDeltaTableColumnsCommand(child: LogicalPlan) extends LogicalPlan with RunnableCommand with UnaryNode with DeltaCommand with Product with Serializable

    A command for listing all column names of a Delta table.

    child

    The resolved Delta table

  58. case class SnapshotOverwriteOperationMetrics(sourceSnapshotSizeInBytes: Long, sourceSnapshotFileCount: Long, destSnapshotAddedFileCount: Long, destSnapshotAddedFilesSizeInBytes: Long) extends Product with Serializable

    Metrics of snapshot overwrite operation.

    sourceSnapshotSizeInBytes

    Total size of the data in the source snapshot.

    sourceSnapshotFileCount

    Number of data files in the source snapshot.

    destSnapshotAddedFileCount

    Number of new data files added to the destination snapshot as part of the execution.

    destSnapshotAddedFilesSizeInBytes

    Total size (in bytes) of the data files that were added to the destination snapshot.

  59. case class TableColumns(col_name: String) extends Product with Serializable

    The column format of the result returned by the SHOW COLUMNS command.

  60. case class TableDetail(format: String, id: String, name: String, description: String, location: String, createdAt: Timestamp, lastModified: Timestamp, partitionColumns: Seq[String], clusteringColumns: Seq[String], numFiles: Long, sizeInBytes: Long, properties: Map[String, String], minReaderVersion: Integer, minWriterVersion: Integer, tableFeatures: Seq[String]) extends Product with Serializable

    The result returned by the describe detail command.

  61. case class TouchedFileWithDV(inputFilePath: String, fileLogEntry: AddFile, newDeletionVector: DeletionVectorDescriptor, deletedRows: Long) extends Product with Serializable
  62. case class UpdateCommand(tahoeFileIndex: TahoeFileIndex, catalogTable: Option[CatalogTable], target: LogicalPlan, updateExpressions: Seq[Expression], condition: Option[Expression]) extends LogicalPlan with LeafRunnableCommand with DeltaCommand with Product with Serializable

    Performs an Update using updateExpression on the rows that match condition

    Algorithm:

    1) Identify the affected files, i.e., the files that may have rows to be updated.
    2) Scan the affected files, apply the updates, and generate a new DataFrame with the updated rows.
    3) Use the Delta protocol to atomically write the new DataFrame as new files and remove the affected files identified in step 1.
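    The same copy-on-write pattern as delete, sketched with a toy in-memory model; DataFile, cond and expr are hypothetical stand-ins for Delta's file, condition and update-expression types:

    ```scala
    object UpdateSketch {
      case class DataFile(name: String, rows: Seq[Int])

      def update(files: Seq[DataFile], cond: Int => Boolean, expr: Int => Int): Seq[DataFile] = {
        // Step 1: identify the affected files.
        val (touched, untouched) = files.partition(_.rows.exists(cond))
        // Steps 2-3: rewrite each affected file with updated rows; keep the rest.
        untouched ++ touched.map { f =>
          DataFile(f.name + "-rewritten", f.rows.map(r => if (cond(r)) expr(r) else r))
        }
      }
    }
    ```
    
    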

  63. case class UpdateMetric(condition: String, numFilesTotal: Long, numTouchedFiles: Long, numRewrittenFiles: Long, numAddedChangeFiles: Long, changeFileBytes: Long, scanTimeMs: Long, rewriteTimeMs: Long, numDeletionVectorsAdded: Long, numDeletionVectorsRemoved: Long, numDeletionVectorsUpdated: Long) extends Product with Serializable

    Used to report details about update.

    Note

    All the time units are milliseconds.

  64. trait VacuumCommandImpl extends DeltaCommand
  65. case class WriteIntoDelta(deltaLog: DeltaLog, mode: SaveMode, options: DeltaOptions, partitionColumns: Seq[String], configuration: Map[String, String], data: DataFrame, catalogTableOpt: Option[CatalogTable] = None, schemaInCatalog: Option[StructType] = None) extends LogicalPlan with LeafRunnableCommand with ImplicitMetadataOperation with DeltaCommand with WriteIntoDeltaLike with Product with Serializable

    Used to write a DataFrame into a delta table.

    New Table Semantics

    • The schema of the DataFrame is used to initialize the table.
    • The partition columns will be used to partition the table.

    Existing Table Semantics

    • The save mode will control how existing data is handled (i.e. overwrite, append, etc)
    • The schema of the DataFrame will be checked, and if there are new columns present they will be added to the table's schema. Conflicting columns (i.e. an INT and a STRING) will result in an exception.
    • The partition columns, if present, are validated against the existing metadata. If not present, the partitioning of the table is respected.

    In combination with Overwrite, a replaceWhere option can be used to transactionally replace data that matches a predicate.

    In combination with Overwrite dynamic partition overwrite mode (option partitionOverwriteMode set to dynamic, or in spark conf spark.sql.sources.partitionOverwriteMode set to dynamic) is also supported.

    Dynamic partition overwrite mode conflicts with replaceWhere:

    • If a replaceWhere option is provided, and dynamic partition overwrite mode is enabled in the DataFrameWriter options, an error will be thrown.
    • If a replaceWhere option is provided, and dynamic partition overwrite mode is enabled in the spark conf, data will be overwritten according to the replaceWhere expression
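    A usage sketch of the replaceWhere option, assuming an existing DataFrame df whose rows all satisfy the predicate; the path and predicate are placeholders:

    ```scala
    // Transactionally replace only the data matching the predicate.
    df.write
      .format("delta")
      .mode("overwrite")
      .option("replaceWhere", "date >= '2024-01-01' AND date < '2024-02-01'")
      .save("/data/events")
    ```
    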
    catalogTableOpt

    Should explicitly be set when table is accessed from catalog

    schemaInCatalog

    The schema created in Catalog. We will use this schema to update metadata when it is set (in CTAS code path), and otherwise use schema from data.

  66. trait WriteIntoDeltaLike extends AnyRef

    An interface for writing data into Delta tables.

  67. case class ZOrderStrategy(sparkSession: SparkSession, zOrderColumns: Seq[String]) extends OptimizeTableStrategy with Product with Serializable

    Implements ZOrder strategy

Value Members

  1. object CloneSourceFormat
  2. object CloneTableBase extends Logging
  3. object CloneTableCommand extends Serializable
  4. object ConvertToDeltaCommand extends DeltaLogging with Serializable
  5. object DMLUtils
  6. object DMLWithDeletionVectorsHelper extends DeltaCommand

    Contains utility classes and methods for performing DML operations with Deletion Vectors.

  7. object DeleteCommand extends Serializable
  8. object DeletionVectorBitmapGenerator
  9. object DeletionVectorData extends Serializable
  10. object DeletionVectorResult extends Serializable
  11. object DeletionVectorUtils extends DeletionVectorUtils
  12. object DeletionVectorWriter extends DeltaLogging

    Utility methods to write the deletion vector to storage. If a particular file already has an existing DV, it will be merged with the new deletion vector and written to storage.

  13. object DeltaGenerateCommand extends Serializable
  14. object DeltaReorgTableMode extends Enumeration
  15. object DescribeDeltaDetailCommand extends Serializable
  16. object DescribeDeltaHistory extends Serializable
  17. object FileToDvDescriptor extends Serializable
  18. object MergeIntoCommandBase
  19. object OptimizeTableCommand extends Serializable
  20. object OptimizeTableMode extends Enumeration
  21. object OptimizeTableStrategy
  22. object TableCreationModes
  23. object TableDetail extends Serializable
  24. object UpdateCommand extends Serializable
  25. object VacuumCommand extends VacuumCommandImpl with Serializable

    Vacuums the table by clearing all untracked files and folders within this table. First lists all the files and directories in the table, and gets the relative paths with respect to the base of the table. Then it gets the list of all tracked files for this table, which may or may not be within the table base path, and gets the relative paths of all the tracked files with respect to the base of the table. Files outside of the table path will be ignored. Then we take a diff of the files and delete directories that were already empty, and all files that are within the table that are no longer tracked.
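    The core of the diff can be sketched with plain string paths; the path shapes are hypothetical simplifications of what VacuumCommand actually handles:

    ```scala
    object VacuumSketch {
      // `base` is the table root. Returns the relative paths of files inside
      // the table that the log no longer tracks (deletion candidates).
      def untracked(allListed: Set[String], tracked: Set[String], base: String): Set[String] = {
        val inTable = allListed.collect {
          // Files outside the table path are ignored.
          case p if p.startsWith(base + "/") => p.stripPrefix(base + "/")
        }
        inTable -- tracked
      }
    }
    ```
    
    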
