Packages

package commands


Type Members

  1. trait AlterDeltaTableCommand extends DeltaCommand

    A super trait for alter table commands that modify Delta tables.

  2. case class AlterTableAddColumnsDeltaCommand(table: DeltaTableV2, colsToAddWithPosition: Seq[QualifiedColType]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that adds columns to a Delta table. The syntax of using this command in SQL is:

    ALTER TABLE table_identifier
    ADD COLUMNS (col_name data_type [COMMENT col_comment], ...);
  3. case class AlterTableAddConstraintDeltaCommand(table: DeltaTableV2, name: String, exprText: String) extends LogicalPlan with AlterTableConstraintDeltaCommand with Product with Serializable

    Command to add a constraint to a Delta table. Currently only CHECK constraints are supported.

    Adding a constraint will scan all data in the table to verify the constraint currently holds.

    table

    The table to which the constraint should be added.

    name

    The name of the new constraint.

    exprText

    The contents of the new CHECK constraint, to be parsed and evaluated.
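
    For illustration, the command corresponds to SQL of roughly the following form (the constraint name and expression are placeholders):

    ALTER TABLE table_identifier
    ADD CONSTRAINT constraint_name CHECK (expression);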

  4. case class AlterTableChangeColumnDeltaCommand(table: DeltaTableV2, columnPath: Seq[String], columnName: String, newColumn: StructField, colPosition: Option[ColumnPosition], syncIdentity: Boolean) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command to change a column of a Delta table, supporting changing the comment of a column and reordering columns.

    The syntax of using this command in SQL is:

    ALTER TABLE table_identifier
    CHANGE [COLUMN] column_old_name column_new_name column_dataType [COMMENT column_comment]
    [FIRST | AFTER column_name];
  5. case class AlterTableClusterByDeltaCommand(table: DeltaTableV2, clusteringColumns: Seq[Seq[String]]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    Command for altering clustering columns for clustered tables:

    • ALTER TABLE .. CLUSTER BY (col1, col2, ...)
    • ALTER TABLE .. CLUSTER BY NONE

    Note that the given clusteringColumns are empty when CLUSTER BY NONE is specified. Also, clusteringColumns are validated (e.g., duplication / existence check) in DeltaCatalog.alterTable().

  6. trait AlterTableConstraintDeltaCommand extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData
  7. case class AlterTableDropColumnsDeltaCommand(table: DeltaTableV2, columnsToDrop: Seq[Seq[String]]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that drops columns from a Delta table. The syntax of using this command in SQL is:

    ALTER TABLE table_identifier
    DROP COLUMN(S) (col_name_1, col_name_2, ...);
  8. case class AlterTableDropConstraintDeltaCommand(table: DeltaTableV2, name: String, ifExists: Boolean) extends LogicalPlan with AlterTableConstraintDeltaCommand with Product with Serializable

    Command to drop a constraint from a Delta table. No-op if a constraint with the given name doesn't exist.

    Currently only CHECK constraints are supported.

    table

    The table from which the constraint should be dropped

    name

    The name of the constraint to drop
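
    For illustration, the command corresponds to SQL of roughly the following form (the constraint name is a placeholder):

    ALTER TABLE table_identifier
    DROP CONSTRAINT [IF EXISTS] constraint_name;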

  9. case class AlterTableDropFeatureDeltaCommand(table: DeltaTableV2, featureName: String, truncateHistory: Boolean = false) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that removes an existing feature from the table. The feature needs to implement the RemovableFeature trait.

    The syntax of the command is:

    ALTER TABLE t DROP FEATURE f [TRUNCATE HISTORY]

    When dropping a feature, we remove the feature's traces from the latest version. However, the table history still contains feature traces. This creates two problems:

    1) Reconstructing the state of the latest version may require replaying log records prior to feature removal. Log replay is based on checkpoints, which clients use as a starting point for replaying history. Any actions before the checkpoint do not need to be replayed. However, checkpoints may be deleted at any time, which can then expose readers to older log records.
    2) Clients could create checkpoints in past versions. These could lead to incorrect behavior if the client that created the checkpoint did not support all features.

    To address these issues, we currently provide two implementations:

    1) DropFeatureWithHistoryTruncation. We truncate history at the boundary of the version that dropped the feature (when required). This requires two executions of the drop feature command with a waiting time in between.
    2) executeDropFeatureWithCheckpointProtection, i.e. fast drop feature. We create barrier checkpoints to protect against log replay and checkpoint creation. The behavior is enforced with the aid of CheckpointProtectionTableFeature.

    The config tableFeatures.fastDropFeature.enabled can be used to control which implementation is used. Furthermore, note that the [TRUNCATE HISTORY] option in the SQL syntax is only relevant for DropFeatureWithHistoryTruncation; when used, we always fall back to that implementation.

    At a high level, dropping a feature consists of two stages (see RemovableFeature):

    1) preDowngradeCommand. This command is responsible for removing any data and metadata related to the feature.
    2) Protocol downgrade. Removes the feature from the current version's protocol. During this stage we also validate whether all traces of the feature-to-be-removed are gone.

    For removing features with requiresHistoryProtection=false the two steps above are sufficient. For features that require history protection, we follow a different approach for each of the implementations listed above. Please see the corresponding functions for more details.

    Note, legacy features can be removed as well. When removing a legacy feature from a legacy protocol, if the result cannot be represented with a legacy representation, we use the table features representation. For example, removing Invariants from (1, 3) results in (1, 7, None, [AppendOnly, CheckConstraints]). Adding Invariants back to the protocol is normalized back to (1, 3). This allows consistent transitions back and forth between legacy protocols and table feature protocols.

  10. case class AlterTableReplaceColumnsDeltaCommand(table: DeltaTableV2, columns: Seq[StructField]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command to replace columns for a Delta table, supporting changing the comment of a column, reordering columns, and loosening nullabilities.

    The syntax of using this command in SQL is:

    ALTER TABLE table_identifier REPLACE COLUMNS (col_spec[, col_spec ...]);
  11. case class AlterTableSetLocationDeltaCommand(table: DeltaTableV2, location: String) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command to change the location of a Delta table. Effectively, this only changes the symlink in the Hive MetaStore from one Delta table to another.

    This command errors out if the new location is not a Delta table. By default, the new Delta table must have the same schema as the old table, but we have a SQL conf that allows users to bypass this schema check.

    The syntax of using this command in SQL is:

    ALTER TABLE table_identifier SET LOCATION 'path/to/new/delta/table';
  12. case class AlterTableSetPropertiesDeltaCommand(table: DeltaTableV2, configuration: Map[String, String]) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that sets Delta table configuration.

    The syntax of this command is:

    ALTER TABLE table1 SET TBLPROPERTIES ('key1' = 'val1', 'key2' = 'val2', ...);
  13. case class AlterTableUnsetPropertiesDeltaCommand(table: DeltaTableV2, propKeys: Seq[String], ifExists: Boolean, fromDropFeatureCommand: Boolean = false) extends LogicalPlan with LeafRunnableCommand with AlterDeltaTableCommand with IgnoreCachedData with Product with Serializable

    A command that unsets Delta table configuration. If ifExists is false, each key is individually checked for existence (a one-by-one operation, not an all-or-nothing check); otherwise, non-existent keys are ignored.

    The syntax of this command is:

    ALTER TABLE table1 UNSET TBLPROPERTIES [IF EXISTS] ('key1', 'key2', ...);
  14. case class Batch(bins: Seq[Bin]) extends Product with Serializable

    A batch represents all the bins that will be processed and committed in a single transaction.

    bins

    The set of bins to process in this transaction

  15. case class Bin(partitionValues: Map[String, String], files: Seq[AddFile]) extends Product with Serializable

    A bin represents a single set of files that are being re-written in a single Spark job. For compaction, this represents a single file being written. For clustering, this is an entire partition for Z-ordering, or an entire ZCube for liquid clustering.

    partitionValues

    The partition this set of files is in

    files

    The list of files being re-written

  16. abstract class CloneConvertedSource extends CloneSource

    A convertible non-delta table source to be cloned from

  17. class CloneDeltaSource extends CloneSource

    A delta table source to be cloned from

  18. case class CloneIcebergSource(tableIdentifier: TableIdentifier, sparkTable: Option[Table], tableSchema: Option[StructType], spark: SparkSession) extends CloneConvertedSource with Product with Serializable

    An Iceberg table source to be cloned from

  19. case class CloneParquetSource(tableIdentifier: TableIdentifier, catalogTable: Option[CatalogTable], spark: SparkSession) extends CloneConvertedSource with Product with Serializable

    A parquet table source to be cloned from

  20. trait CloneSource extends Closeable

    An interface of the source table to be cloned from.

  21. abstract class CloneTableBase extends LogicalPlan with LeafCommand with CloneTableBaseUtils with SQLConfHelper
  22. trait CloneTableBaseUtils extends DeltaLogging
  23. case class CloneTableCommand(sourceTable: CloneSource, targetIdent: TableIdentifier, tablePropertyOverrides: Map[String, String], targetPath: Path) extends CloneTableBase with Product with Serializable

    Clones a Delta table to a new location with a new table id. The clone can be performed as a shallow clone (i.e. shallow = true), where we do not copy the files, but just point to them. If a table exists at the given targetPath, that table will be replaced.

    sourceTable

    is the table to be cloned

    targetIdent

    destination table identifier to clone to

    tablePropertyOverrides

    user-defined table properties that should override any properties with the same key from the source table

    targetPath

    the actual destination
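
    For illustration, a shallow clone can be expressed in SQL roughly as follows (table names are placeholders):

    CREATE [OR REPLACE] TABLE target_table
    SHALLOW CLONE source_table [VERSION AS OF n];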

  24. case class ClusteringStrategy(sparkSession: SparkSession, clusteringColumns: Seq[String], optimizeContext: DeltaOptimizeContext) extends OptimizeTableStrategy with Product with Serializable

    Implements clustering strategy for clustered tables

  25. case class CompactionStrategy(sparkSession: SparkSession, optimizeContext: DeltaOptimizeContext) extends OptimizeTableStrategy with Product with Serializable

    Implements compaction strategy

  26. case class ConvertToDeltaCommand(tableIdentifier: TableIdentifier, partitionSchema: Option[StructType], collectStats: Boolean, deltaPath: Option[String]) extends ConvertToDeltaCommandBase with Product with Serializable
  27. abstract class ConvertToDeltaCommandBase extends LogicalPlan with LeafRunnableCommand with DeltaCommand

    Convert an existing parquet table to a delta table by creating delta logs based on existing files. Here are the main components:

    • File Listing: Launch a spark job to list files from a given directory in parallel.
    • Schema Inference: Given an iterator on the file list result, we group the iterator into sequential batches and launch a spark job to infer schema for each batch, and finally merge schemas from all batches.
    • Stats collection: Again, we group the iterator on file list results into sequential batches and launch a spark job to collect stats for each batch.
    • Commit the files: We take the iterator of files with stats and write out a delta log file as the first commit. This bypasses the transaction protocol, but it's ok as this would be the very first commit.
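
    For illustration, the command is typically invoked in SQL roughly as follows (the path and partition spec are placeholders):

    CONVERT TO DELTA parquet.`/path/to/table`
    [PARTITIONED BY (col_name data_type, ...)];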
  28. case class CreateDeltaTableCommand(table: CatalogTable, existingTableOpt: Option[CatalogTable], mode: SaveMode, query: Option[LogicalPlan], operation: CreationMode = TableCreationModes.Create, tableByPath: Boolean = false, output: Seq[Attribute] = Nil, protocol: Option[Protocol] = None, createTableFunc: Option[(CatalogTable) ⇒ Unit] = None) extends LogicalPlan with LeafRunnableCommand with DeltaCommand with DeltaLogging with Product with Serializable

    Single entry point for all write or declaration operations for Delta tables accessed through the table name.

    Single entry point for all write or declaration operations for Delta tables accessed through the table name.

    table

    The table identifier for the Delta table

    existingTableOpt

    The existing table for the same identifier if exists

    mode

    The save mode when writing data. Relevant when the query is empty or set to Ignore with CREATE TABLE IF NOT EXISTS.

    query

    The query to commit into the Delta table if it exists. This can come from

    • CTAS
    • saveAsTable
    protocol

    This is used to create a table with specific protocol version

    createTableFunc

    If specified, call this function to create the table, instead of Spark SessionCatalog#createTable which is backed by Hive Metastore.

  29. case class DeleteCommand(deltaLog: DeltaLog, catalogTable: Option[CatalogTable], target: LogicalPlan, condition: Option[Expression]) extends LogicalPlan with LeafRunnableCommand with DeltaCommand with DeleteCommandMetrics with Product with Serializable

    Performs a Delete based on the search condition

    Algorithm:
    1) Scan all the files and determine which files have the rows that need to be deleted.
    2) Traverse the affected files and rebuild the touched files.
    3) Use the Delta protocol to atomically write the remaining rows to new files and remove the affected files that are identified in step 1.
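
    For illustration, the command corresponds to SQL of the form:

    DELETE FROM table_identifier [WHERE predicate];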

  30. trait DeleteCommandMetrics extends AnyRef
  31. case class DeleteMetric(condition: String, numFilesTotal: Long, numTouchedFiles: Long, numRewrittenFiles: Long, numRemovedFiles: Long, numAddedFiles: Long, numAddedChangeFiles: Long, numFilesBeforeSkipping: Long, numBytesBeforeSkipping: Long, numFilesAfterSkipping: Long, numBytesAfterSkipping: Long, numPartitionsAfterSkipping: Option[Long], numPartitionsAddedTo: Option[Long], numPartitionsRemovedFrom: Option[Long], numCopiedRows: Option[Long], numDeletedRows: Option[Long], numBytesAdded: Long, numBytesRemoved: Long, changeFileBytes: Long, scanTimeMs: Long, rewriteTimeMs: Long, numDeletionVectorsAdded: Long, numDeletionVectorsRemoved: Long, numDeletionVectorsUpdated: Long, commitVersion: Option[Long] = None, isWriteCommand: Boolean = false, numLogicalRecordsAdded: Option[Long] = None, numLogicalRecordsRemoved: Option[Long] = None) extends Product with Serializable

    Used to report details about delete.

    Note

    All the time units are milliseconds.

  32. case class DeletionVectorData(filePath: String, deletionVectorId: Option[String], deletedRowIndexSet: Array[Byte], deletedRowIndexCount: Long) extends Sizing with Product with Serializable

    Row containing the file path and its new deletion vector bitmap in memory

    filePath

    Absolute path of the data file this DV result is generated for.

    deletionVectorId

    Existing DeletionVectorDescriptor serialized in JSON format. This info is used to load the existing DV with the new DV.

    deletedRowIndexSet

    In-memory deletion vector bitmap containing the newly deleted row indexes from the data file.

    deletedRowIndexCount

    Count of rows marked as deleted using the deletedRowIndexSet.

  33. case class DeletionVectorResult(filePath: String, deletionVector: DeletionVectorDescriptor, matchedRowCount: Long) extends Product with Serializable

    Final output for each file, containing the file path, the DeletionVectorDescriptor, and how many rows are marked as deleted in this file as part of this operation (doesn't include rows that were already marked as deleted).

    filePath

    Absolute path of the data file this DV result is generated for.

    deletionVector

    Deletion vector generated containing the newly deleted row indices from the data file.

    matchedRowCount

    Number of rows marked as deleted using the deletionVector.

  34. trait DeletionVectorUtils extends DeltaLogging
  35. trait DeltaCommand extends DeltaLogging

    Helper trait for all delta commands.

  36. case class DeltaGenerateCommand(child: LogicalPlan, modeName: String) extends LogicalPlan with RunnableCommand with UnaryNode with DeltaCommand with Product with Serializable
  37. case class DeltaOptimizeContext(reorg: Option[DeltaReorgOperation] = None, minFileSize: Option[Long] = None, maxFileSize: Option[Long] = None, maxDeletedRowsRatio: Option[Double] = None, isFull: Boolean = false) extends Product with Serializable

    Stores all runtime context information that can control the execution of optimize.

    reorg

    The REORG operation that triggered the rewriting task, if any.

    minFileSize

    Files which are smaller than this threshold will be selected for compaction. If not specified, DeltaSQLConf.DELTA_OPTIMIZE_MIN_FILE_SIZE will be used. This parameter must be set to 0 when reorg is set.

    maxDeletedRowsRatio

    Files with a ratio of soft-deleted rows to the total rows larger than this threshold will be rewritten by the OPTIMIZE command. If not specified, DeltaSQLConf.DELTA_OPTIMIZE_MAX_DELETED_ROWS_RATIO will be used. This parameter must be set to 0 when reorg is set.

    isFull

    whether OPTIMIZE FULL is run. This is only for clustered tables.

  38. class DeltaPurgeOperation extends DeltaReorgOperation with ReorgTableHelper

    Reorg operation to purge files with soft-deleted rows. This operation will also try to find and remove dropped columns from parquet files, if any such column exists in the files but is not present in the current table schema.

  39. sealed trait DeltaReorgOperation extends AnyRef

    Defines a Reorg operation to be applied during optimize.

  40. case class DeltaReorgTable(target: LogicalPlan, reorgTableSpec: DeltaReorgTableSpec = ...)(predicates: Seq[String]) extends LogicalPlan with UnaryCommand with Product with Serializable
  41. case class DeltaReorgTableCommand(target: LogicalPlan, reorgTableSpec: DeltaReorgTableSpec = ...)(predicates: Seq[String]) extends OptimizeTableCommandBase with ReorgTableForUpgradeUniformHelper with LeafCommand with IgnoreCachedData with Product with Serializable

    The REORG TABLE command.

  42. case class DeltaReorgTableSpec(reorgTableMode: DeltaReorgTableMode.Value, icebergCompatVersionOpt: Option[Int]) extends Product with Serializable
  43. class DeltaRewriteTypeWideningOperation extends DeltaReorgOperation with ReorgTableHelper

    Internal reorg operation to rewrite files to conform to the current table schema when dropping the type widening table feature.

  44. class DeltaUpgradeUniformOperation extends DeltaReorgOperation

    Reorg operation to upgrade the iceberg compatibility version of a table.

  45. case class DeltaVacuumStats(isDryRun: Boolean, specifiedRetentionMillis: Option[Long], defaultRetentionMillis: Long, minRetainedTimestamp: Long, dirsPresentBeforeDelete: Long, filesAndDirsPresentBeforeDelete: Long, objectsDeleted: Long, sizeOfDataToDelete: Long, timeTakenToIdentifyEligibleFiles: Long, timeTakenForDelete: Long, vacuumStartTime: Long, vacuumEndTime: Long, numPartitionColumns: Long, latestCommitVersion: Long, eligibleStartCommitVersion: Option[Long], eligibleEndCommitVersion: Option[Long], typeOfVacuum: String) extends Product with Serializable
  46. case class DescribeDeltaDetailCommand(child: LogicalPlan, hadoopConf: Map[String, String]) extends LogicalPlan with RunnableCommand with UnaryNode with DeltaLogging with DeltaCommand with Product with Serializable

    A command for describing the details of a table such as the format, name, and size.
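
    For illustration, the command corresponds to SQL of the form:

    DESCRIBE DETAIL ('/path/to/dir' | table_identifier);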

  47. case class DescribeDeltaHistory(child: LogicalPlan, limit: Option[Int], output: Seq[Attribute] = ...) extends LogicalPlan with UnaryNode with MultiInstanceRelation with DeltaCommand with Product with Serializable

    A logical placeholder for describing a Delta table's history, so that the history can be leveraged in subqueries. Replaced with DescribeDeltaHistoryCommand during planning.

  48. case class DescribeDeltaHistoryCommand(table: DeltaTableV2, limit: Option[Int], output: Seq[Attribute] = ...) extends LogicalPlan with LeafRunnableCommand with MultiInstanceRelation with DeltaLogging with Product with Serializable

    A command for describing the history of a Delta table.
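
    For illustration, the command corresponds to SQL of the form:

    DESCRIBE HISTORY ('/path/to/dir' | table_identifier) [LIMIT n];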

  49. case class FileToDvDescriptor(path: String, deletionVectorId: Option[String]) extends Product with Serializable

    Holds a mapping from a file path (url-encoded) to an (optional) serialized Deletion Vector descriptor.

  50. case class LastVacuumInfo(latestCommitVersionOutsideOfRetentionWindow: Option[Long] = None) extends Product with Serializable
  51. case class MergeIntoCommand(source: LogicalPlan, target: LogicalPlan, catalogTable: Option[CatalogTable], targetFileIndex: TahoeFileIndex, condition: Expression, matchedClauses: Seq[DeltaMergeIntoMatchedClause], notMatchedClauses: Seq[DeltaMergeIntoNotMatchedClause], notMatchedBySourceClauses: Seq[DeltaMergeIntoNotMatchedBySourceClause], migratedSchema: Option[StructType], trackHighWaterMarks: Set[String] = Set.empty, schemaEvolutionEnabled: Boolean = false) extends LogicalPlan with MergeIntoCommandBase with InsertOnlyMergeExecutor with ClassicMergeExecutor with Product with Serializable

    Performs a merge of a source query/table into a Delta table.

    Issues an error message when the ON search_condition of the MERGE statement can match a single row from the target table with multiple rows of the source table-reference.

    Algorithm:

    Phase 1: Find the input files in target that are touched by the rows that satisfy the condition and verify that no two source rows match with the same target row. This is implemented as an inner-join using the given condition. See ClassicMergeExecutor for more details.

    Phase 2: Read the touched files again and write new files with updated and/or inserted rows.

    Phase 3: Use the Delta protocol to atomically remove the touched files and add the new files.
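
    For illustration, the command corresponds to SQL of roughly the following form (identifiers are placeholders and the clause bodies are elided):

    MERGE INTO target_table t
    USING source_table s
    ON merge_condition
    WHEN MATCHED [AND condition] THEN UPDATE SET ...
    WHEN NOT MATCHED [AND condition] THEN INSERT ...
    WHEN NOT MATCHED BY SOURCE [AND condition] THEN DELETE;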

    source

    Source data to merge from

    target

    Target table to merge into

    targetFileIndex

    TahoeFileIndex of the target table

    condition

    Condition for a source row to match with a target row

    matchedClauses

    All info related to matched clauses.

    notMatchedClauses

    All info related to not matched clauses.

    notMatchedBySourceClauses

    All info related to not matched by source clauses.

    migratedSchema

    The final schema of the target - may be changed by schema evolution.

    trackHighWaterMarks

    The column names for which we will track IDENTITY high water marks.

  52. trait MergeIntoCommandBase extends LogicalPlan with LeafRunnableCommand with DeltaCommand with DeltaLogging with PredicateHelper with ImplicitMetadataOperation with MergeIntoMaterializeSource with UpdateExpressionsSupport with SupportsNonDeterministicExpression
  53. class OptimizeExecutor extends DeltaCommand with SQLMetricsReporting with Serializable

    Optimize job which compacts small files into larger files to reduce the number of files and potentially allow more efficient reads.

  54. case class OptimizeTableCommand(child: LogicalPlan, userPartitionPredicates: Seq[String], optimizeContext: DeltaOptimizeContext)(zOrderBy: Seq[UnresolvedAttribute]) extends OptimizeTableCommandBase with UnaryNode with Product with Serializable

    The optimize command implementation for Spark SQL. Example SQL:

    OPTIMIZE ('/path/to/dir' | delta.table) [WHERE part = 25] [FULL];

    Note that the FULL and WHERE clauses are mutually exclusive.

  55. abstract class OptimizeTableCommandBase extends LogicalPlan with RunnableCommand with DeltaCommand

    Base class defining abstract optimize command

  56. trait OptimizeTableStrategy extends AnyRef

    Defines a set of utilities used in OptimizeTableCommand. The behavior of these utilities changes based on the OptimizeTableMode: COMPACTION, ZORDER, and CLUSTERING.

  57. trait ReorgTableForUpgradeUniformHelper extends DeltaLogging

    Helper trait for ReorgTableCommand to rewrite the table to be Iceberg compatible.

  58. trait ReorgTableHelper extends Serializable
  59. case class RestoreTableCommand(sourceTable: DeltaTableV2) extends LogicalPlan with LeafRunnableCommand with DeltaCommand with RestoreTableCommandBase with Product with Serializable

    Performs a restore of a Delta table to a specified version or timestamp.

    Algorithm:
    1) Read the latest snapshot of the table.
    2) Read the snapshot for the version or timestamp to restore.
    3) Compute files available in the snapshot to restore (files that were removed by some commit) but missing from the latest snapshot. Add these files into the commit as AddFile actions.
    4) Compute files available in the latest snapshot (files that were added after the version to restore) but missing from the snapshot to restore. Add these files into the commit as RemoveFile actions.
    5) If the SQLConf.IGNORE_MISSING_FILES option is false (the default), check the availability of each AddFile in the file system.
    6) Commit the metadata, Protocol, and all RemoveFile and AddFile actions into the delta log using commitLarge (the commit will fail in case of a parallel transaction).
    7) If the table was modified in parallel, ignore the restore and raise an exception.
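
    For illustration, the command corresponds to SQL of the form:

    RESTORE TABLE table_identifier TO VERSION AS OF n;
    RESTORE TABLE table_identifier TO TIMESTAMP AS OF timestamp_expression;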

  60. trait RestoreTableCommandBase extends AnyRef

    Base trait class for RESTORE. Defines command output schema and metrics.

  61. case class ShowDeltaTableColumnsCommand(child: LogicalPlan) extends LogicalPlan with RunnableCommand with UnaryNode with DeltaCommand with Product with Serializable

    A command for listing all column names of a Delta table.

    child

    The resolved Delta table
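
    For illustration, the command corresponds to SQL of the form:

    SHOW COLUMNS IN table_identifier;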

  62. case class SnapshotOverwriteOperationMetrics(sourceSnapshotSizeInBytes: Long, sourceSnapshotFileCount: Long, destSnapshotAddedFileCount: Long, destSnapshotAddedFilesSizeInBytes: Long) extends Product with Serializable

    Metrics of snapshot overwrite operation.

    sourceSnapshotSizeInBytes

    Total size of the data in the source snapshot.

    sourceSnapshotFileCount

    Number of data files in the source snapshot.

    destSnapshotAddedFileCount

    Number of new data files added to the destination snapshot as part of the execution.

    destSnapshotAddedFilesSizeInBytes

    Total size (in bytes) of the data files that were added to the destination snapshot.

  63. case class TableColumns(col_name: String) extends Product with Serializable

    The column format of the result returned by the SHOW COLUMNS command.

  64. case class TableDetail(format: String, id: String, name: String, description: String, location: String, createdAt: Timestamp, lastModified: Timestamp, partitionColumns: Seq[String], clusteringColumns: Seq[String], numFiles: Long, sizeInBytes: Long, properties: Map[String, String], minReaderVersion: Integer, minWriterVersion: Integer, tableFeatures: Seq[String]) extends Product with Serializable

    The result returned by the describe detail command.

  65. case class TouchedFileWithDV(inputFilePath: String, fileLogEntry: AddFile, newDeletionVector: DeletionVectorDescriptor, deletedRows: Long) extends Product with Serializable
  66. case class UpdateCommand(tahoeFileIndex: TahoeFileIndex, catalogTable: Option[CatalogTable], target: LogicalPlan, updateExpressions: Seq[Expression], condition: Option[Expression]) extends LogicalPlan with LeafRunnableCommand with DeltaCommand with Product with Serializable

    Performs an Update using updateExpression on the rows that match condition

    Algorithm:
    1) Identify the affected files, i.e., the files that may have the rows to be updated.
    2) Scan the affected files, apply the updates, and generate a new DF with the updated rows.
    3) Use the Delta protocol to atomically write the new DF as new files and remove the affected files that are identified in step 1.
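
    For illustration, the command corresponds to SQL of the form:

    UPDATE table_identifier SET col1 = expr1 [, col2 = expr2, ...] [WHERE predicate];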

  67. case class UpdateMetric(condition: String, numFilesTotal: Long, numTouchedFiles: Long, numRewrittenFiles: Long, numAddedChangeFiles: Long, changeFileBytes: Long, scanTimeMs: Long, rewriteTimeMs: Long, numDeletionVectorsAdded: Long, numDeletionVectorsRemoved: Long, numDeletionVectorsUpdated: Long, commitVersion: Option[Long] = None, numLogicalRecordsAdded: Option[Long] = None, numLogicalRecordsRemoved: Option[Long] = None) extends Product with Serializable

    Used to report details about update.

    Note

    All the time units are milliseconds.

  68. trait VacuumCommandImpl extends DeltaCommand
  69. case class WriteIntoDelta(deltaLog: DeltaLog, mode: SaveMode, options: DeltaOptions, partitionColumns: Seq[String], configuration: Map[String, String], data: DataFrame, catalogTableOpt: Option[CatalogTable] = None, schemaInCatalog: Option[StructType] = None) extends LogicalPlan with LeafRunnableCommand with ImplicitMetadataOperation with DeltaCommand with WriteIntoDeltaLike with Product with Serializable

    Used to write a DataFrame into a Delta table.

    New Table Semantics

    • The schema of the DataFrame is used to initialize the table.
    • The partition columns will be used to partition the table.

    Existing Table Semantics

    • The save mode controls how existing data is handled (e.g. overwrite, append, etc.).
    • The schema of the DataFrame is checked; any new columns are added to the table's schema. Conflicting columns (i.e. an INT and a STRING) result in an exception.
    • The partition columns, if present, are validated against the existing metadata. If not present, the partitioning of the table is respected.

    In combination with Overwrite, a replaceWhere option can be used to transactionally replace data that matches a predicate.

    In combination with Overwrite, dynamic partition overwrite mode (the option partitionOverwriteMode set to dynamic, or the Spark conf spark.sql.sources.partitionOverwriteMode set to dynamic) is also supported.

    Dynamic partition overwrite mode conflicts with replaceWhere:

    • If a replaceWhere option is provided and dynamic partition overwrite mode is enabled in the DataFrameWriter options, an error is thrown.
    • If a replaceWhere option is provided and dynamic partition overwrite mode is enabled only in the Spark conf, data is overwritten according to the replaceWhere expression.
    catalogTableOpt

    Should be set explicitly when the table is accessed from the catalog.

    schemaInCatalog

    The schema created in the catalog. When set (in the CTAS code path), this schema is used to update the metadata; otherwise the schema of the data is used.
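    A minimal sketch of the replaceWhere interaction described above (the path and predicate are illustrative):

    ```scala
    // Transactionally replace only the rows matching the predicate;
    // data outside the predicate is left untouched.
    df.write
      .format("delta")
      .mode("overwrite")
      .option("replaceWhere", "date >= '2024-01-01' AND date < '2024-02-01'")
      .save("/tmp/delta/events")
    ```

    Alternatively, dynamic partition overwrite can be requested with .option("partitionOverwriteMode", "dynamic") instead of replaceWhere; per the rules above, supplying both in the writer options raises an error.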

  70. trait WriteIntoDeltaLike extends AnyRef

    An interface for writing data into Delta tables.

  71. case class ZOrderStrategy(sparkSession: SparkSession, zOrderColumns: Seq[String]) extends OptimizeTableStrategy with Product with Serializable

    Implements the Z-order clustering strategy.

Value Members

  1. object CloneSourceFormat
  2. object CloneTableBase extends Logging
  3. object CloneTableCommand extends Serializable
  4. object ConvertToDeltaCommand extends DeltaLogging with Serializable
  5. object DMLUtils
  6. object DMLWithDeletionVectorsHelper extends DeltaCommand

    Contains utility classes and methods for performing DML operations with Deletion Vectors.

  7. object DeleteCommand extends Serializable
  8. object DeletionVectorBitmapGenerator
  9. object DeletionVectorData extends Serializable
  10. object DeletionVectorResult extends Serializable
  11. object DeletionVectorUtils extends DeletionVectorUtils
  12. object DeletionVectorWriter extends DeltaLogging

    Utility methods to write the deletion vector to storage. If a particular file already has an existing DV, it will be merged with the new deletion vector and written to storage.

  13. object DeltaGenerateCommand extends Serializable
  14. object DeltaReorgTableMode extends Enumeration
  15. object DescribeDeltaDetailCommand extends Serializable
  16. object DescribeDeltaHistory extends Serializable
  17. object FileToDvDescriptor extends Serializable
  18. object LastVacuumInfo extends DeltaCommand with Serializable
  19. object MergeIntoCommandBase
  20. object OptimizeTableCommand extends Serializable
  21. object OptimizeTableMode extends Enumeration
  22. object OptimizeTableStrategy
  23. object TableCreationModes
  24. object TableDetail extends Serializable
  25. object UpdateCommand extends Serializable
  26. object VacuumCommand extends VacuumCommandImpl with Serializable

    Vacuums the table by clearing all untracked files and folders within this table. First lists all the files and directories in the table, and gets the relative paths with respect to the base of the table. Then it gets the list of all tracked files for this table, which may or may not be within the table base path, and gets the relative paths of all the tracked files with respect to the base of the table. Files outside of the table path will be ignored. Then we take a diff of the files and delete directories that were already empty, and all files that are within the table that are no longer tracked.
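    A minimal usage sketch (the path is illustrative; 168 hours corresponds to the default 7-day retention):

    ```scala
    import io.delta.tables.DeltaTable

    // Delete untracked files older than the retention period, in hours.
    // Retention periods shorter than the default are rejected unless the
    // retention duration check is disabled.
    DeltaTable.forPath(spark, "/tmp/delta/events").vacuum(168)
    ```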

Ungrouped