package delta
Type Members
- class BatchedDeltaMergeActionResolver extends DeltaMergeActionResolverBase
- case class CDCNameBased(functionArgs: Seq[Expression]) extends LogicalPlan with CDCStatementBase with Product with Serializable
  Plan for the "table_changes" function.
- case class CDCPathBased(functionArgs: Seq[Expression]) extends LogicalPlan with CDCStatementBase with Product with Serializable
  Plan for the "table_changes_by_path" function.
- trait CDCStatementBase extends LogicalPlan with DeltaTableValueFunction
  Base trait for analyzing table_changes and table_changes_by_path. The resolution works as follows:
  1. The TVF logical plan is resolved using the TableFunctionRegistry in the Analyzer. This uses reflection to create one of CDCNameBased or CDCPathBased by passing all the arguments.
  2. DeltaAnalysis turns the plans into a TableChanges node to resolve the DeltaTable. This can be resolved by the DeltaCatalog for tables or by DeltaAnalysis for the path-based use.
  3. TableChanges then turns into a LogicalRelation that returns the CDC relation.
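The two table-valued functions resolved through this trait are exposed in Spark SQL once the Delta extensions are configured. A minimal sketch, assuming a SparkSession `spark` and a CDF-enabled table named `events` (both hypothetical):

```scala
// Assumes delta.enableChangeDataFeed = true on the table "events".
// Name-based form: resolved via CDCNameBased.
val byName = spark.sql("SELECT * FROM table_changes('events', 0, 5)")

// Path-based form: resolved via CDCPathBased.
val byPath = spark.sql("SELECT * FROM table_changes_by_path('/data/events', 0)")

byName.select("_change_type", "_commit_version", "_commit_timestamp").show()
```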
- case class CapturedSnapshot(snapshot: Snapshot, updateTimestamp: Long) extends Product with Serializable
  Wraps the most recently updated snapshot along with the timestamp at which the update was started. Defined outside the class since it's used in tests.
- trait ChainableExecutionObserver[O] extends AnyRef
- case class CheckConstraintsPreDowngradeTableFeatureCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with Product with Serializable
- case class CheckOverflowInTableWrite(child: Expression, columnName: String) extends UnaryExpression with Product with Serializable
- class CheckUnresolvedRelationTimeTravel extends (LogicalPlan) ⇒ Unit
  Custom check rule that compensates for [SPARK-45383]. It checks the (unresolved) child relation of each RelationTimeTravel in the plan, in order to trigger a helpful table-not-found AnalysisException instead of the internal Spark error that would otherwise result.
- case class CheckpointInstance(version: Long, format: Format, fileName: Option[String] = None, numParts: Option[Int] = None) extends Ordered[CheckpointInstance] with Product with Serializable
  A class to help with comparing checkpoints with each other, where we may have had concurrent writers that checkpoint with a different number of parts. The numParts field will be present only for multipart checkpoints (represented by Format.WITH_PARTS). The fileName field is present only for V2 checkpoints (represented by Format.V2). These additional fields are used as a tie breaker when comparing multiple checkpoint instances of the same Format for the same version.
- case class CheckpointProtectionPreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with Product with Serializable
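The tie-breaking idea behind CheckpointInstance can be sketched in plain Scala. This is an illustration only, not Delta's actual implementation: the version is compared first, and the number of parts breaks ties among checkpoints of the same version.

```scala
// Simplified, hypothetical model of the ordering; real CheckpointInstance
// also considers the format and the fileName of V2 checkpoints.
case class SimpleCheckpointInstance(version: Long, numParts: Option[Int])
    extends Ordered[SimpleCheckpointInstance] {
  override def compare(that: SimpleCheckpointInstance): Int =
    if (version != that.version) version.compare(that.version)
    // Tie breaker: a missing numParts is treated as a single-part checkpoint.
    else numParts.getOrElse(1).compare(that.numParts.getOrElse(1))
}

val single    = SimpleCheckpointInstance(10, None)    // classic single-file checkpoint
val multipart = SimpleCheckpointInstance(10, Some(4)) // multipart, same version
assert(single < multipart) // the multipart instance wins the tie
```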
- trait CheckpointProvider extends UninitializedCheckpointProvider
  A trait which provides information about a checkpoint to the Snapshot.
- trait Checkpoints extends DeltaLogging
- case class ColumnMappingException(msg: String, mode: DeltaColumnMappingMode) extends AnalysisException with Product with Serializable
- case class ColumnMappingPreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with DeltaLogging with Product with Serializable
- class ColumnMappingUnsupportedException extends UnsupportedOperationException
  Errors thrown around column mapping.
- class CommitCoordinatorGetCommitsFailedException extends Exception
  Exception thrown when TableCommitCoordinatorClient.getCommits fails for any reason.
- case class CommitStats(startVersion: Long, commitVersion: Long, readVersion: Long, txnDurationMs: Long, commitDurationMs: Long, fsWriteDurationMs: Long, stateReconstructionDurationMs: Long, numAdd: Int, numRemove: Int, numSetTransaction: Int, bytesNew: Long, numFilesTotal: Long, sizeInBytesTotal: Long, numCdcFiles: Long, cdcBytesNew: Long, protocol: Protocol, commitSizeBytes: Long, checkpointSizeBytes: Long, totalCommitsSizeSinceLastCheckpoint: Long, checkpointAttempt: Boolean, info: CommitInfo, newMetadata: Option[Metadata], numAbsolutePathsInAdd: Int, numDistinctPartitionsInAdd: Int, numPartitionColumnsInTable: Int, isolationLevel: String, coordinatedCommitsInfo: CoordinatedCommitsStats, fileSizeHistogram: Option[FileSizeHistogram] = None, addFilesHistogram: Option[FileSizeHistogram] = None, removeFilesHistogram: Option[FileSizeHistogram] = None, numOfDomainMetadatas: Long = 0, txnId: Option[String] = None) extends Product with Serializable
  Records metrics about a successful commit.
- class ConcurrentAppendException extends io.delta.exceptions.DeltaConcurrentModificationException
  This class is kept for backward compatibility. Use io.delta.exceptions.ConcurrentAppendException instead.
- class ConcurrentDeleteDeleteException extends io.delta.exceptions.DeltaConcurrentModificationException
  This class is kept for backward compatibility. Use io.delta.exceptions.ConcurrentDeleteDeleteException instead.
- class ConcurrentDeleteReadException extends io.delta.exceptions.DeltaConcurrentModificationException
  This class is kept for backward compatibility. Use io.delta.exceptions.ConcurrentDeleteReadException instead.
- class ConcurrentTransactionException extends io.delta.exceptions.DeltaConcurrentModificationException
  This class is kept for backward compatibility. Use io.delta.exceptions.ConcurrentTransactionException instead.
- class ConcurrentWriteException extends io.delta.exceptions.DeltaConcurrentModificationException
  This class is kept for backward compatibility. Use io.delta.exceptions.ConcurrentWriteException instead.
- case class CoordinatedCommitsPreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with DeltaLogging with Product with Serializable
- case class CoordinatedCommitsStats(coordinatedCommitsType: String, commitCoordinatorName: String, commitCoordinatorConf: Map[String, String]) extends Product with Serializable
- case class DateFormatPartitionExpr(partitionColumn: String, format: String) extends OptimizablePartitionExpression with Product with Serializable
  The rules for the generation expression DATE_FORMAT(col, format), such as: DATE_FORMAT(timestamp, 'yyyy-MM'), DATE_FORMAT(timestamp, 'yyyy-MM-dd-HH').
  - partitionColumn
    the partition column name using DATE_FORMAT in its generation expression.
  - format
    the format parameter of DATE_FORMAT in the generation expression. The result of unix_timestamp depends on the time parser policy:

      policy    | unix_timestamp('12345-12', 'yyyy-MM') | unix_timestamp('+12345-12', 'yyyy-MM')
      EXCEPTION | fail                                  | 327432240000
      CORRECTED | null                                  | 327432240000
      LEGACY    | 327432240000                          | null
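A generated partition column of this shape can be declared through the DeltaTable builder API. A hedged sketch, with hypothetical table and column names and a SparkSession `spark` in scope:

```scala
import io.delta.tables.DeltaTable

// Hypothetical names; requires Spark with the Delta Lake extensions configured.
DeltaTable.create(spark)
  .tableName("events")
  .addColumn("eventTime", "TIMESTAMP")
  .addColumn(
    DeltaTable.columnBuilder("eventMonth")
      .dataType("STRING")
      .generatedAlwaysAs("DATE_FORMAT(eventTime, 'yyyy-MM')")
      .build())
  .partitionedBy("eventMonth")
  .execute()
// With this expression, a data filter on eventTime can be converted by
// DateFormatPartitionExpr into a partition filter on eventMonth.
```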
- case class DatePartitionExpr(partitionColumn: String) extends OptimizablePartitionExpression with Product with Serializable
  The rules for the generation expression CAST(col AS DATE).
- case class DayPartitionExpr(dayPart: String) extends OptimizablePartitionExpression with Product with Serializable
  This is a placeholder to catch day(col) so that we can merge YearPartitionExpr, MonthPartitionExpr and DayPartitionExpr into YearMonthDayPartitionExpr.
  - dayPart
    the day partition column name.
- class DeltaAnalysis extends Rule[LogicalPlan] with AnalysisHelper with DeltaLogging
  Analysis rules for Delta. Currently, these rules enable schema enforcement / evolution with INSERT INTO.
- class DeltaAnalysisException extends AnalysisException with DeltaThrowable
- class DeltaArithmeticException extends ArithmeticException with DeltaThrowable
-
sealed
trait
DeltaBatchCDFSchemaMode extends AnyRef
Definitions for the batch read schema mode for CDF
- class DeltaChecksumException extends ChecksumException with DeltaThrowable
- trait DeltaColumnMappingBase extends DeltaLogging
-
sealed
trait
DeltaColumnMappingMode extends AnyRef
A trait for Delta column mapping modes.
- class DeltaColumnMappingUnsupportedException extends ColumnMappingUnsupportedException with DeltaThrowable
- class DeltaCommandUnsupportedWithDeletionVectorsException extends UnsupportedOperationException with DeltaThrowable
-
sealed
trait
DeltaCommitTag extends AnyRef
Marker trait for a commit tag used by delta.
-
abstract
class
DeltaConcurrentModificationException extends ConcurrentModificationException
The basic class for all Tahoe commit conflict exceptions.
- case class DeltaConfig[T](key: String, defaultValue: String, fromString: (String) ⇒ T, validationFunction: (T) ⇒ Boolean, helpMessage: String, editable: Boolean = true, alternateKeys: Seq[String] = Seq.empty) extends Product with Serializable
-
trait
DeltaConfigsBase extends DeltaLogging
Contains list of reservoir configs and validation checks.
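These configs surface as delta.-prefixed table properties. A sketch of setting two well-known retention properties (the table name is hypothetical; `spark` is an assumed SparkSession):

```scala
// Both property keys are documented Delta table properties.
spark.sql("""
  ALTER TABLE events SET TBLPROPERTIES (
    'delta.logRetentionDuration' = 'interval 30 days',
    'delta.deletedFileRetentionDuration' = 'interval 7 days'
  )
""")
```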
- case class DeltaDynamicPartitionOverwriteCommand(table: NamedRelation, deltaTable: DeltaTableV2, query: LogicalPlan, writeOptions: Map[String, String], isByName: Boolean, analyzedQuery: Option[LogicalPlan] = None) extends LogicalPlan with RunnableCommand with V2WriteCommand with Product with Serializable
  A RunnableCommand that will execute dynamic partition overwrite using WriteIntoDelta.
  This is a workaround for Spark not supporting V1 fallback for dynamic partition overwrite. Note the following details:
  - Extends V2WriteCommand so that Spark can transform this plan in the same way as other commands like AppendData.
  - Exposes the query as a child so that the Spark optimizer can optimize it.
- trait DeltaErrorsBase extends DocsPath with DeltaLogging with QueryErrorsBase
  A holder object for Delta errors.
  IMPORTANT: Any time you add a test that references the docs, add to the Seq defined in DeltaErrorsSuite so that the doc links that are generated can be verified to work in docs.delta.io.
- class DeltaFileAlreadyExistsException extends FileAlreadyExistsException with DeltaThrowable
- trait DeltaFileFormat extends AnyRef
- class DeltaFileNotFoundException extends FileNotFoundException with DeltaThrowable
- case class DeltaHistory(version: Option[Long], timestamp: Timestamp, userId: Option[String], userName: Option[String], operation: String, operationParameters: Map[String, String], job: Option[JobInfo], notebook: Option[NotebookInfo], clusterId: Option[String], readVersion: Option[Long], isolationLevel: Option[String], isBlindAppend: Option[Boolean], operationMetrics: Option[Map[String, String]], userMetadata: Option[String], engineInfo: Option[String]) extends CommitMarker with Product with Serializable
  A class describing the output schema of org.apache.spark.sql.delta.commands.DescribeDeltaHistoryCommand.
- class DeltaHistoryManager extends DeltaLogging
  This class keeps track of the versions of commits and their timestamps for a Delta table, to help with operations like describing the history of a table.
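The history this class manages is exposed through the public DeltaTable API. A sketch with a hypothetical path and an assumed SparkSession `spark`:

```scala
import io.delta.tables.DeltaTable

// Returns the last 10 commits as a DataFrame with the DeltaHistory schema.
val history = DeltaTable.forPath(spark, "/data/events").history(10)
history.select("version", "timestamp", "operation").show()
```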
- class DeltaIOException extends IOException with DeltaThrowable
- class DeltaIdentityColumnStatsTracker extends DeltaJobStatisticsTracker
  Stats tracker for IDENTITY column high water marks. The only difference between this class and DeltaJobStatisticsTracker is how the stats are aggregated on the driver.
- class DeltaIllegalArgumentException extends IllegalArgumentException with DeltaThrowable
- class DeltaIllegalStateException extends IllegalStateException with DeltaThrowable
- class DeltaIndexOutOfBoundsException extends IndexOutOfBoundsException with DeltaThrowable
- class DeltaLog extends Checkpoints with MetadataCleanup with LogStoreProvider with SnapshotManagement with DeltaFileFormat with ProvidesUniFormConverters with ReadChecksum
  Used to query the current state of the log as well as modify it by adding new atomic collections of actions.
  Internally, this class implements an optimistic concurrency control algorithm to handle multiple readers or writers. Any single read is guaranteed to see a consistent snapshot of the table.
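DeltaLog is an internal API, but a minimal sketch of obtaining the log and refreshing its state looks like this (the path is hypothetical; `spark` is an assumed SparkSession):

```scala
import org.apache.spark.sql.delta.DeltaLog

val deltaLog = DeltaLog.forTable(spark, "/data/events") // hypothetical path
val snapshot = deltaLog.update() // brings the cached log state up to date
println(s"version=${snapshot.version}, files=${snapshot.numOfFiles}")
```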
- case class DeltaLogFileIndex extends FileIndex with LoggingShims with Product with Serializable
  A specialized file index for files found in the _delta_log directory. By using this file index, we avoid any additional file listing, partitioning inference, and file existence checks when computing the state of a Delta table.
- trait DeltaMergeActionResolverBase extends AnyRef
  Base trait with helpers for resolving DeltaMergeAction.
- class DeltaNoSuchTableException extends AnalysisException with DeltaThrowable
- trait DeltaOptionParser extends AnyRef
- class DeltaOptions extends DeltaWriteOptions with DeltaReadOptions with Serializable
  Options for the Delta data source.
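A few commonly used read and write options, sketched with a hypothetical path and an assumed DataFrame `df` and SparkSession `spark`:

```scala
// Write option (DeltaWriteOptions): allow schema evolution on append.
df.write
  .format("delta")
  .option("mergeSchema", "true")
  .mode("append")
  .save("/data/events")

// Read option (DeltaReadOptions): limit files per streaming micro-batch.
val stream = spark.readStream
  .format("delta")
  .option("maxFilesPerTrigger", "100")
  .load("/data/events")
```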
- case class DeltaParquetFileFormat(protocol: Protocol, metadata: Metadata, nullableRowTrackingFields: Boolean = false, optimizationsEnabled: Boolean = true, tablePath: Option[String] = None, isCDCRead: Boolean = false) extends ParquetFileFormat with LoggingShims with Product with Serializable
  A thin wrapper over the Parquet file format to support:
  - column names without restrictions.
  - populating a column from the deletion vector of this file (if one exists) to indicate whether the row is deleted or not according to the deletion vector. Consumers of this scan can use the column values to filter out the deleted rows.
- class DeltaParquetWriteSupport extends ParquetWriteSupport
- class DeltaParseException extends ParseException with DeltaThrowable
- trait DeltaReadOptions extends DeltaOptionParser
- class DeltaRuntimeException extends RuntimeException with DeltaThrowable
- class DeltaSparkException extends SparkException with DeltaThrowable
-
sealed
trait
DeltaStartingVersion extends AnyRef
Definitions for the starting version of a Delta stream.
- class DeltaStreamingColumnMappingSchemaIncompatibleException extends DeltaUnsupportedOperationException
  Errors thrown when an operation is not supported with column mapping schema changes (rename / drop column).
  To remain compatible with existing behavior for those who have accidentally already used this operation, users should always be able to use escapeConfigName to fall back at their own risk.
- class DeltaTableFeatureException extends DeltaRuntimeException
- case class DeltaTableIdentifier(path: Option[String] = None, table: Option[TableIdentifier] = None) extends Product with Serializable
  An identifier for a Delta table, containing either the path or the table identifier.
- class DeltaTablePropertyValidationFailedException extends RuntimeException with DeltaThrowable
- sealed trait DeltaTablePropertyValidationFailedSubClass extends AnyRef
- trait DeltaTableValueFunction extends LogicalPlan with UnresolvedLeafNode
  Represents an unresolved Delta table-valued function.
- trait DeltaThrowable extends SparkThrowable
  The trait for all exceptions in the Delta code path.
- case class DeltaTimeTravelSpec(timestamp: Option[Expression], version: Option[Long], creationSource: Option[String]) extends DeltaLogging with Product with Serializable
  The specification to time travel a Delta table to the given timestamp or version.
  - timestamp
    An expression that can be evaluated into a timestamp. The expression cannot be a subquery.
  - version
    The version of the table to time travel to. Must be >= 0.
  - creationSource
    The API used to perform time travel, e.g. atSyntax, dfReader or SQL.
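The creation sources correspond to the public time-travel APIs. A sketch, with hypothetical path/table names and an assumed SparkSession `spark`:

```scala
// DataFrameReader form (creationSource: dfReader).
val byVersion = spark.read.format("delta")
  .option("versionAsOf", 0)
  .load("/data/events")

val byTimestamp = spark.read.format("delta")
  .option("timestampAsOf", "2024-06-01")
  .load("/data/events")

// SQL form (creationSource: SQL).
spark.sql("SELECT * FROM events VERSION AS OF 0")
```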
- class DeltaUnsupportedOperationException extends UnsupportedOperationException with DeltaThrowable
- case class DeltaUnsupportedOperationsCheck(spark: SparkSession) extends (LogicalPlan) ⇒ Unit with DeltaLogging with Product with Serializable
  A rule to add helpful error messages when Delta is being used with unsupported Hive operations, or if an unsupported operation is being made, e.g. a DML operation like INSERT/UPDATE/DELETE/MERGE when a table doesn't exist.
- case class DeltaUnsupportedTableFeatureException(errorClass: String, tableNameOrPath: String, unsupported: Iterable[String]) extends DeltaTableFeatureException with Product with Serializable
- trait DeltaWriteOptions extends DeltaWriteOptionsImpl with DeltaOptionParser
- trait DeltaWriteOptionsImpl extends DeltaOptionParser
- trait DocsPath extends AnyRef
-
trait
DomainMetadataUtilsBase extends DeltaLogging
Domain metadata utility functions.
-
class
DummySnapshot extends Snapshot
A dummy snapshot with only metadata and protocol specified.
A dummy snapshot with only metadata and protocol specified. It is used for a targeted table version that does not exist yet before commiting a change. This can be used to create a DataFrame, or to derive the stats schema from an existing Parquet table when converting it to Delta or cloning it to a Delta table prior to the actual snapshot being available after a commit.
Note that the snapshot state reconstruction contains only the protocol and metadata - it does not include add/remove actions, appids, or metadata domains, even if the actual table currently has or will have them in the future.
-
sealed
trait
FeatureAutomaticallyEnabledByMetadata extends AnyRef
A trait indicating this feature can be automatically enabled via a change in a table's metadata, e.g., through setting particular values of certain feature-specific table properties.
A trait indicating this feature can be automatically enabled via a change in a table's metadata, e.g., through setting particular values of certain feature-specific table properties.
When the feature's metadata requirements are satisfied for new tables, or for existing tables when [[automaticallyUpdateProtocolOfExistingTables]] set to `true`, the client will silently add the feature to the protocol's
readerFeaturesand/orwriterFeatures. Otherwise, a proper protocol version bump must be present in the same transaction. -
case class
FileMetadataMaterializationMetrics(filesMaterializedCount: Long = 0L, overAllocWaitCount: Long = 0L, overAllocWaitTimeMs: Long = 0L, overAllocFilesMaterializedCount: Long = 0L) extends Product with Serializable
Instance of this class is used for recording metrics of the FileMetadataMaterializationTracker
-
class
FileMetadataMaterializationTracker extends LoggingShims
An instance of this class tracks and controls the materialization usage of a single command query (e.g.
An instance of this class tracks and controls the materialization usage of a single command query (e.g. Backfill) with respect to the driver limits. Each query must use one instance of the FileMaterializationTracker.
tasks - tasks are the basic unit of computation. For example, in Backfill, each task bins multiple files into batches to be executed.
A task has to be materialized in its entirety, so in the case where we are unable to acquire permits to materialize a task we acquire an over allocation lock that will allow tasks to complete materializing. Over allocation is only allowed for one thread at once in the driver. This allows us to restrict the amount of file metadata being materialized at once on the driver.
Accessed by the thread materializing files and by the thread releasing resources after execution.
-
case class
GenerateIdentityValues(generator: PartitionIdentityValueGenerator) extends LeafExpression with Nondeterministic with Product with Serializable
Returns the next generated IDENTITY column value based on the underlying PartitionIdentityValueGenerator.
-
case class
HourPartitionExpr(hourPart: String) extends OptimizablePartitionExpression with Product with Serializable
This is a placeholder to catch
hour(col)so that we can merge YearPartitionExpr, MonthPartitionExpr, DayPartitionExpr and HourPartitionExpr to YearMonthDayHourPartitionExpr. -
case class
IcebergCompat(version: Integer, config: DeltaConfig[Option[Boolean]], requiredTableFeatures: Seq[TableFeature], requiredTableProperties: Seq[RequiredDeltaTableProperty[_]], checks: Seq[IcebergCompatCheck]) extends DeltaLogging with Product with Serializable
All IcebergCompatVx should extend from this base class
All IcebergCompatVx should extend from this base class
- version
the compat version number
- config
the DeltaConfig for this IcebergCompat version
- requiredTableFeatures
a list of table features it relies on
- requiredTableProperties
a list of table properties it relies on. See RequiredDeltaTableProperty
- checks
a list of checks this IcebergCompatVx will perform.
- See also
- trait IcebergCompatCheck extends (IcebergCompatContext) ⇒ Unit
- case class IcebergCompatContext(prevSnapshot: Snapshot, newestProtocol: Protocol, newestMetadata: Metadata, operation: Option[Operation], actions: Seq[Action], tableId: String, version: Integer) extends Product with Serializable
-
case class
IdentityPartitionExpr(partitionColumn: String) extends OptimizablePartitionExpression with Product with Serializable
The rules for the generation of identity expressions, used for partitioning on a nested column.
The rules for the generation of identity expressions, used for partitioning on a nested column. Note: - Writing an empty string to a partition column would become
null(SPARK-24438) so generated partition filters always pick up thenullpartition for safety.- partitionColumn
the partition column name used in the generation expression.
- case class InCommitTimestampsPreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with DeltaLogging with Product with Serializable
- class IndividualDeltaMergeActionResolver extends DeltaMergeActionResolverBase
-
case class
InvalidProtocolVersionException(tableNameOrPath: String, readerRequiredVersion: Int, writerRequiredVersion: Int, supportedReaderVersions: Seq[Int], supportedWriterVersions: Seq[Int]) extends RuntimeException with DeltaThrowable with Product with Serializable
Thrown when the protocol version of a table is greater than supported by this client.
-
sealed
trait
IsolationLevel extends AnyRef
Trait that defines the level consistency guarantee is going to be provided by
OptimisticTransaction.commit().Trait that defines the level consistency guarantee is going to be provided by
OptimisticTransaction.commit(). Serializable is the most strict level and SnapshotIsolation is the least strict one.- See also
IsolationLevel.allLevelsInDescOrder for all the levels in the descending order of strictness and IsolationLevel.DEFAULT for the default table isolation level.
-
trait
JsonMetadataDomain[T] extends AnyRef
A trait for capturing metadata domain of type T.
- abstract class JsonMetadataDomainUtils[T] extends AnyRef
-
case class
LastCheckpointInfo(version: Long, size: Long, parts: Option[Int], sizeInBytes: Option[Long], numOfAddFiles: Option[Long], checkpointSchema: Option[StructType], v2Checkpoint: Option[LastCheckpointV2] = None, checksum: Option[String] = None) extends Product with Serializable
Records information about a checkpoint.
Records information about a checkpoint.
This class provides the checksum validation logic, needed to ensure that content of LAST_CHECKPOINT file points to a valid json. The readers might read some part from old file and some part from the new file (if the file is read across multiple requests). In some rare scenarios, the split read might produce a valid json and readers will be able to parse it and convert it into a LastCheckpointInfo object that contains invalid data. In order to prevent using it, we do a checksum match on the read json to validate that it is consistent.
For old Delta versions, which do not have checksum logic, we want to make sure that the old fields (i.e. version, size, parts) are together in the beginning of last_checkpoint json. All these fields together are less than 50 bytes, so even in split read scenario, we want to make sure that old delta readers which do not do have checksum validation logic, gets all 3 fields from one read request. For this reason, we use
JsonPropertyOrderto force them in the beginning together.- version
the version of this checkpoint
- size
the number of actions in the checkpoint, -1 if the information is unavailable.
- parts
the number of parts when the checkpoint has multiple parts. None if this is a singular checkpoint
- sizeInBytes
the number of bytes of the checkpoint
- numOfAddFiles
the number of AddFile actions in the checkpoint
- checkpointSchema
the schema of the underlying checkpoint files
- checksum
the checksum of the LastCheckpointInfo.
- Annotations
- @JsonPropertyOrder()
-
case class
LastCheckpointV2(path: String, sizeInBytes: Long, modificationTime: Long, nonFileActions: Option[Seq[SingleAction]], sidecarFiles: Option[Seq[SidecarFile]]) extends Product with Serializable
Information about the V2 Checkpoint in the LAST_CHECKPOINT file
Information about the V2 Checkpoint in the LAST_CHECKPOINT file
- path
file name corresponding to the uuid-named v2 checkpoint
- sizeInBytes
size in bytes for the uuid-named v2 checkpoint
- modificationTime
modification time for the uuid-named v2 checkpoint
- nonFileActions
all non file actions for the v2 checkpoint. This info may or may not be available. A None value means that info is missing. If it is not None, then it should have all the non-FileAction corresponding to the checkpoint.
- sidecarFiles
sidecar files corresponding to the v2 checkpoint. This info may or may not be available. A None value means that this info is missing. An empty list denotes that the v2 checkpoint has no sidecars.
-
abstract
class
LazyCompleteCheckpointProvider extends CheckpointProvider
A wrapper implementation of CheckpointProvider which wraps
underlyingCheckpointProviderFutureanduninitializedCheckpointProviderfor implementing all the UninitializedCheckpointProvider and CheckpointProvider APIs. -
sealed
trait
LegacyFeatureType extends AnyRef
A trait to indicate a feature is legacy, i.e., released before Table Features.
-
sealed abstract
class
LegacyReaderWriterFeature extends LegacyWriterFeature with ReaderWriterFeatureType
A base class for all legacy writer-only table features.
-
sealed abstract
class
LegacyWriterFeature extends TableFeature with LegacyFeatureType
A base class for all table legacy writer-only features.
-
case class
LogSegment(logPath: Path, version: Long, deltas: Seq[FileStatus], checkpointProvider: UninitializedCheckpointProvider, lastCommitFileModificationTimestamp: Long) extends Product with Serializable
Provides information around which files in the transaction log need to be read to create the given version of the log.
Provides information around which files in the transaction log need to be read to create the given version of the log.
- logPath
The path to the _delta_log directory
- version
The Snapshot version to generate
- deltas
The delta commit files (.json) to read
- checkpointProvider
provider to give information about Checkpoint files.
- lastCommitFileModificationTimestamp
The "unadjusted" file modification timestamp of the last commit within this segment. By unadjusted, we mean that the commit timestamps may not necessarily be monotonically increasing for the commits within this segment.
-
abstract
class
MaterializedRowTrackingColumn extends AnyRef
Represents a materialized row tracking column.
Represents a materialized row tracking column. Concrete implementations are MaterializedRowId and MaterializedRowCommitVersion.
-
class
MetadataChangedException extends io.delta.exceptions.DeltaConcurrentModificationException
This class is kept for backward compatibility.
This class is kept for backward compatibility. Use io.delta.exceptions.MetadataChangedException instead.
-
trait
MetadataCleanup extends DeltaLogging
Cleans up expired Delta table metadata.
-
class
MetadataMismatchErrorBuilder extends AnyRef
A helper class in building a helpful error message in case of metadata mismatches.
-
case class
MonthPartitionExpr(monthPart: String) extends OptimizablePartitionExpression with Product with Serializable
This is a placeholder to catch
month(col)so that we can merge YearPartitionExpr and MonthPartitionExprto YearMonthDayPartitionExpr.This is a placeholder to catch
month(col)so that we can merge YearPartitionExpr and MonthPartitionExprto YearMonthDayPartitionExpr.- monthPart
the month partition column name.
-
case class
NumRecordsStats(numLogicalRecordsAddedPartial: Long, numLogicalRecordsRemovedPartial: Long, numDeletionVectorRecordsAdded: Long, numDeletionVectorRecordsRemoved: Long, numFilesAddedWithoutNumRecords: Long, numFilesRemovedWithoutNumRecords: Long) extends Product with Serializable
Container class for statistics related to number of records in a Delta commit.
-
class
OptimisticTransaction extends OptimisticTransactionImpl with DeltaLogging
Used to perform a set of reads in a transaction and then commit a set of updates to the state of the log.
Used to perform a set of reads in a transaction and then commit a set of updates to the state of the log. All reads from the DeltaLog, MUST go through this instance rather than directly to the DeltaLog otherwise they will not be check for logical conflicts with concurrent updates.
This class is not thread-safe.
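The read-then-commit pattern this class enforces can be sketched with the internal API. This is a hedged illustration (the path is hypothetical, and the calls assume Delta's internal helpers as of recent versions), not a recommended public entry point:

```scala
import org.apache.spark.sql.delta.{DeltaLog, DeltaOperations}

val deltaLog = DeltaLog.forTable(spark, "/data/events") // hypothetical path
deltaLog.withNewTransaction { txn =>
  // Reads go through the transaction so they are recorded for conflict checks.
  val files = txn.filterFiles()
  // Example update: remove every current file from the table.
  val actions = files.map(_.remove)
  // Conflict detection against concurrent commits happens at commit time.
  txn.commit(actions, DeltaOperations.ManualUpdate)
}
```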
-
trait
OptimisticTransactionImpl extends TransactionalWrite with SQLMetricsReporting with DeltaScanGenerator with RecordChecksum with DeltaLogging
Used to perform a set of reads in a transaction and then commit a set of updates to the state of the log.
Used to perform a set of reads in a transaction and then commit a set of updates to the state of the log. All reads from the DeltaLog, MUST go through this instance rather than directly to the DeltaLog otherwise they will not be check for logical conflicts with concurrent updates.
This trait is not thread-safe.
-
sealed
trait
OptimizablePartitionExpression extends AnyRef
Defines rules to convert a data filter to a partition filter for a special generation expression of a partition column.
Defines rules to convert a data filter to a partition filter for a special generation expression of a partition column.
Note: - This may be shared cross multiple
SparkSessions, implementations should not store any state (such as expressions) referring to a specificSparkSession. - Partition columns may have different behaviors than data columns. For example, writing an empty string to a partition column would becomenull(SPARK-24438). We need to pay attention to these slight behavior differences and make sure applying the auto generated partition filters would still return the same result as if they were not applied. -
case class
PartitionIdentityValueGenerator(start: Long, step: Long, highWaterMarkOpt: Option[Long]) extends Product with Serializable
Generator of IDENTITY value for one partition.
Generator of IDENTITY value for one partition.
- start
The configured start value for the identity column.
- step
IDENTITY value increment.
- highWaterMarkOpt
The optional high watermark for the identity value generation. If this is None, it means that no identity values have been generated in the past and we should start the identity value generation from the configured start.
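As a minimal sketch of what per-partition IDENTITY value generation entails (names such as IdentityGen and firstValue are illustrative and not the actual Delta internals):

```scala
// Hypothetical model of PartitionIdentityValueGenerator's behavior:
// resume one step past the high watermark if values were generated
// before, otherwise begin from the configured start value.
final case class IdentityGen(start: Long, step: Long, highWaterMarkOpt: Option[Long]) {
  // The first value this generator hands out.
  def firstValue: Long = highWaterMarkOpt.map(_ + step).getOrElse(start)

  // Generate n consecutive identity values for one partition.
  def values(n: Int): Seq[Long] = (0 until n).map(i => firstValue + i * step)
}

// Fresh table: no watermark, so values start at `start`.
val fresh = IdentityGen(start = 1L, step = 2L, highWaterMarkOpt = None).values(3)

// Resumed generation: watermark 5 means the next value is 5 + step.
val resumed = IdentityGen(start = 1L, step = 2L, highWaterMarkOpt = Some(5L)).values(2)
```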
-
case class
PostHocResolveUpCast(spark: SparkSession) extends Rule[LogicalPlan] with Product with Serializable
Post-hoc resolution rules PreprocessTableMerge and PreprocessTableUpdate may introduce new unresolved UpCast expressions that won't be resolved by ResolveUpCast, which ran in the previous resolution phase. This rule ensures these UpCast expressions get resolved in the post-hoc resolution phase.
Note: we can't inject ResolveUpCast directly because we need an initialized analyzer instance for that which is not available at the time Delta rules are injected. PostHocResolveUpCast is delaying the access to the analyzer until after it's initialized.
-
sealed abstract
class
PreDowngradeTableFeatureCommand extends AnyRef
A base class for implementing a preparation command for removing table features. Must implement a run method. Note, the run method must be implemented in a way that when it finishes, the table does not use the feature that is being removed, and nobody is allowed to start using it again implicitly. One way to achieve this is by disabling the feature on the table before proceeding to the actual removal. See RemovableFeature.preDowngradeCommand.
-
case class
PreloadedCheckpointProvider(topLevelFiles: Seq[FileStatus], lastCheckpointInfoOpt: Option[LastCheckpointInfo]) extends CheckpointProvider with DeltaLogging with Product with Serializable
An implementation of CheckpointProvider where the information about checkpoint files (i.e. Seq[FileStatus]) is already known in advance.
- topLevelFiles
- file statuses that describe the checkpoint
- lastCheckpointInfoOpt
- optional LastCheckpointInfo corresponding to this checkpoint. This comes from _last_checkpoint file
-
case class
PreprocessTableDelete(sqlConf: SQLConf) extends Rule[LogicalPlan] with Product with Serializable
Preprocess the DeltaDelete plan to convert to DeleteCommand.
- case class PreprocessTableMerge(conf: SQLConf) extends Rule[LogicalPlan] with UpdateExpressionsSupport with Product with Serializable
-
case class
PreprocessTableUpdate(sqlConf: SQLConf) extends Rule[LogicalPlan] with UpdateExpressionsSupport with Product with Serializable
Preprocesses the DeltaUpdateTable logical plan before converting it to UpdateCommand. - Adjusts the column order, which could be out of order, based on the destination table - Generates expressions to compute the value of all target columns in the Delta table, while taking into account that the specified SET clause may only update some columns or nested fields of columns.
-
trait
PreprocessTableWithDVs extends SubqueryTransformerHelper
Plan transformer to inject a filter that removes the rows marked as deleted according to deletion vectors. For tables with no deletion vectors, this transformation has no effect.
It modifies the plan for tables with deletion vectors as follows:
Before rule: <Parent Node> -> Delta Scan (key, value)
- Here we are reading the key and value columns from the Delta table.
After rule: <Parent Node> -> Project(key, value) -> Filter (skip_row == 0) -> Delta Scan (key, value, skip_row)
- Here we insert a new column skip_row in the Delta scan. This value is populated by the Parquet reader using the DV corresponding to the Parquet file read (see DeltaParquetFileFormat), and it contains 0 if we want to keep the row.
- The created Filter keeps only rows with skip_row equal to 0, i.e. it filters out rows marked as deleted.
- At the end we have a Project to keep the plan node output the same as before the rule is applied.
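The row-level effect of this rewrite can be illustrated with a plain-Scala sketch (not the actual Catalyst rule; ScannedRow and applyDvFilter are hypothetical names):

```scala
// Model of the scan output after the rule: each row carries an extra
// skip_row column (0 = keep, 1 = deleted per the deletion vector).
final case class ScannedRow(key: String, value: Int, skipRow: Int)

// Filter (skip_row == 0) followed by Project(key, value): downstream
// operators see the original (key, value) schema with deleted rows gone.
def applyDvFilter(scan: Seq[ScannedRow]): Seq[(String, Int)] =
  scan.filter(_.skipRow == 0)      // the injected Filter
      .map(r => (r.key, r.value))  // the final Project drops skip_row

val scanned = Seq(
  ScannedRow("a", 1, 0), // live row
  ScannedRow("b", 2, 1), // marked deleted by the DV
  ScannedRow("c", 3, 0)
)
val visible = applyDvFilter(scanned)
```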
-
case class
PreprocessTableWithDVsStrategy(session: SparkSession) extends Strategy with PreprocessTableWithDVs with Product with Serializable
Strategy to process tables with DVs and add the skip row column and filters.
This strategy will apply all transformations needed to tables with DVs and delegate to FileSourceStrategy to create the final plan. The DV filter will be the bottom-most filter in the plan and so it will be pushed down to the FileSourceScanExec at the beginning of the filter list.
-
case class
PreprocessTimeTravel(sparkSession: SparkSession) extends Rule[LogicalPlan] with Product with Serializable
Resolves the UnresolvedRelation in a command's child TimeTravel. Currently Delta depends on Spark 3.2, which does not resolve the UnresolvedRelation in TimeTravel. Once Delta upgrades to Spark 3.3, this code can be removed.
TODO: refactoring this analysis using Spark's native TimeTravelRelation logical plan
-
class
ProtocolChangedException extends io.delta.exceptions.DeltaConcurrentModificationException
This class is kept for backward compatibility. Use io.delta.exceptions.ProtocolChangedException instead.
- class ProtocolDowngradeException extends RuntimeException with DeltaThrowable
- trait ProvidesUniFormConverters extends AnyRef
-
trait
ReadChecksum extends DeltaLogging
Read checksum files.
-
sealed abstract
class
ReaderWriterFeature extends WriterFeature with ReaderWriterFeatureType
A base class for all reader-writer table features that can only be explicitly supported.
-
sealed
trait
ReaderWriterFeatureType extends AnyRef
A trait to indicate a feature applies to readers and writers.
-
trait
RecordChecksum extends DeltaLogging
Record the state of the table as a checksum file along with a commit.
-
sealed
trait
RemovableFeature extends AnyRef
A trait indicating a feature can be removed. Classes that extend the trait need to implement the following four functions:
a) preDowngradeCommand. This is where all required actions for removing the feature are implemented. For example, to remove the DVs feature we need to remove the metadata config and purge all DVs from the table. This action takes place before the protocol downgrade in separate commit(s). Note, the command needs to be implemented in a way that concurrent transactions do not nullify the effect. For example, disabling DVs on a table before purging will stop concurrent transactions from adding DVs. During the protocol downgrade we perform a validation in validateRemoval to make sure all invariants still hold.
b) validateRemoval. Add any feature-specific checks before proceeding to the protocol downgrade. This function is guaranteed to be called at the latest version before the protocol downgrade is committed to the table. When the protocol downgrade txn conflicts, the validation is repeated against the winning txn snapshot. As soon as the protocol downgrade succeeds, all subsequent interleaved txns are aborted. The implementation should return true if there are no feature traces in the latest version. False otherwise.
c) requiresHistoryProtection. It indicates whether the feature leaves traces in the table history that may result in incorrect behaviour if the table is read/written by a client that does not support the feature. This is by default true for all reader+writer features and false for writer features. WARNING: Disabling requiresHistoryProtection for relevant features could result in incorrect snapshot reconstruction.
d) actionUsesFeature. For features that require history truncation we verify whether past versions contain any traces of the removed feature. This is achieved by calling actionUsesFeature for every action of every reachable commit version in the log. Note, a feature may leave traces in both data and metadata. Depending on the feature, we need to check several types of actions such as Metadata, AddFile, RemoveFile etc.
WARNING: actionUsesFeature should not check Protocol actions for the feature being removed, because at the time actionUsesFeature is invoked the protocol downgrade did not happen yet. Thus, the feature-to-remove is still active. As a result, any unrelated operations that produce a protocol action (while we are waiting for the retention period to expire) will "carry" the feature-to-remove. Checking protocol for that feature would result in an unnecessary failure during the history validation of the next DROP FEATURE call. Note, while the feature-to-remove is supported in the protocol we cannot generate a legit protocol action that adds support for that feature since it is already supported.
-
case class
RequiredDeltaTableProperty[T](deltaConfig: DeltaConfig[T], validator: (T) ⇒ Boolean, autoSetValue: String) extends Product with Serializable
Wrapper class for table property validation.
- deltaConfig
DeltaConfig we are checking
- validator
A generic method to validate the given value
- autoSetValue
The value to set if we can auto-set this value (e.g. during table creation)
-
case class
ResolveDeltaPathTable(sparkSession: SparkSession) extends Rule[LogicalPlan] with Product with Serializable
Replaces UnresolvedTables if the plan is for a direct query on files.
-
case class
ResolvedPathBasedNonDeltaTable(path: String, options: Map[String, String], commandName: String) extends LogicalPlan with LeafNode with Product with Serializable
This operator is a placeholder that identifies a non-Delta path-based table. Given that some Delta commands (e.g. DescribeDeltaDetail) support non-Delta tables, we introduced ResolvedPathBasedNonDeltaTable as the resolved placeholder after analysis of a non-Delta path from UnresolvedPathBasedTable.
-
trait
RowIndexFilter extends AnyRef
Provides filtering information for each row index within a given range. Specific filters are implemented in subclasses.
-
sealed abstract final
class
RowIndexFilterType extends Enum[RowIndexFilterType]
Filter types corresponding to the row index filter implementations.
-
case class
SerializableFileStatus(path: String, length: Long, isDir: Boolean, modificationTime: Long) extends Product with Serializable
A serializable variant of HDFS's FileStatus.
-
class
Snapshot extends SnapshotDescriptor with SnapshotStateManager with StateCache with StatisticsCollection with DataSkippingReader with ValidateChecksum with DeltaLogging
An immutable snapshot of the state of the log at some delta version. Internally this class manages the replay of actions stored in checkpoint or delta files.
After resolving any new actions, it caches the result and collects the following basic information to the driver:
- Protocol Version
- Metadata
- Transaction state
-
trait
SnapshotDescriptor extends AnyRef
A description of a Delta Snapshot, including basic information such as its DeltaLog, metadata, protocol, and version.
-
trait
SnapshotManagement extends AnyRef
Manages the creation, computation, and access of Snapshots for Delta tables. Responsibilities include:
- Figuring out the set of files that are required to compute a specific version of a table
- Updating and exposing the latest snapshot of the Delta table in a thread-safe manner
-
case class
SnapshotState(sizeInBytes: Long, numOfSetTransactions: Long, numOfFiles: Long, numOfRemoves: Long, numDeletedRecordsOpt: Option[Long], numDeletionVectorsOpt: Option[Long], numOfMetadata: Long, numOfProtocol: Long, setTransactions: Seq[SetTransaction], domainMetadata: Seq[DomainMetadata], metadata: Metadata, protocol: Protocol, fileSizeHistogram: Option[FileSizeHistogram] = None, deletedRecordCountsHistogramOpt: Option[DeletedRecordCountsHistogram] = None) extends Product with Serializable
Metrics and metadata computed around the Delta table.
- sizeInBytes
The total size of the table (of active files, not including tombstones).
- numOfSetTransactions
Number of streams writing to this table.
- numOfFiles
The number of files in this table.
- numOfRemoves
The number of tombstones in the state.
- numDeletedRecordsOpt
The total number of records deleted with Deletion Vectors.
- numDeletionVectorsOpt
The number of Deletion Vectors present in the table.
- numOfMetadata
The number of metadata actions in the state. Should be 1.
- numOfProtocol
The number of protocol actions in the state. Should be 1.
- setTransactions
The streaming queries writing to this table.
- metadata
The metadata of the table.
- protocol
The protocol version of the Delta table.
- fileSizeHistogram
A Histogram class tracking the file counts and total bytes in different size ranges.
- deletedRecordCountsHistogramOpt
A histogram of deletion records counts distribution for all files.
-
trait
SnapshotStateManager extends DeltaLogging
A helper class that manages the SnapshotState for a given snapshot. Will generate it only when necessary.
- case class StartingVersion(version: Long) extends DeltaStartingVersion with Product with Serializable
-
trait
SubqueryTransformerHelper extends AnyRef
Trait to allow processing a special transformation of SubqueryExpression instances in a query plan.
-
case class
SubstringPartitionExpr(partitionColumn: String, substringPos: Int, substringLen: Int) extends OptimizablePartitionExpression with Product with Serializable
The rules for the generation expression SUBSTRING(col, pos, len). Note: - Writing an empty string to a partition column would become null (SPARK-24438), so generated partition filters always pick up the null partition for safety. - When pos is 0, we also support optimizations for comparison operators. When pos is not 0, we only support optimizations for EqualTo.
- partitionColumn
the partition column name using SUBSTRING in its generation expression.
- substringPos
the pos parameter of SUBSTRING in the generation expression.
- substringLen
the len parameter of SUBSTRING in the generation expression.
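A minimal sketch of the EqualTo case described above (not the Delta implementation; substringSql and matchingPartitions are hypothetical helpers): a data filter col = literal can only match partitions whose value equals SUBSTRING(literal, pos, len), plus the null partition for safety.

```scala
// SQL SUBSTRING uses 1-based positions; pos 0 behaves like pos 1 here.
def substringSql(s: String, pos: Int, len: Int): String = {
  val start = math.max(pos - 1, 0) // 1-based SQL pos -> 0-based index
  s.slice(start, start + len)
}

// Partition values (None = the null partition) that may hold rows
// matching the data filter `col = literal`.
def matchingPartitions(literal: String, pos: Int, len: Int,
                       partitions: Seq[Option[String]]): Seq[Option[String]] = {
  val expected = substringSql(literal, pos, len)
  partitions.filter {
    case None    => true          // always keep the null partition (SPARK-24438)
    case Some(p) => p == expected // EqualTo on the substring value
  }
}

val parts = Seq(Some("ab"), Some("cd"), None)
val hit = matchingPartitions("abXYZ", 1, 2, parts)
```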
- case class TableChanges(child: LogicalPlan, fnName: String, cdcAttr: Seq[Attribute] = CDCReader.cdcAttributes) extends LogicalPlan with UnaryNode with Product with Serializable
-
sealed abstract
class
TableFeature extends Serializable
A base class for all table features.
A feature can be explicitly supported by a table's protocol when the protocol contains the feature's name. Writers (for writer-only features) or readers and writers (for reader-writer features) must recognize supported features and must handle them appropriately.
A table feature that was released before Delta Table Features (reader version 3 and writer version 7) is considered a legacy feature. Legacy features are implicitly supported when (a) the protocol does not support table features, i.e., has reader version less than 3 or writer version less than 7, and (b) the feature's minimum reader/writer version is less than or equal to the current protocol's reader/writer version.
Separately, a feature can be automatically supported by a table's metadata when certain feature-specific table properties are set. For example, changeDataFeed is automatically supported when there's a table property delta.enableChangeDataFeed=true. This is independent of the table's enabled features. When a feature is supported (explicitly or implicitly) by the table protocol but its metadata requirements are not satisfied, clients still have to understand the feature (at least to the extent that they can read and preserve the existing data in the table that uses the feature). See the documentation of FeatureAutomaticallyEnabledByMetadata for more information.
- case class TargetTableResolutionResult(unresolvedAttribute: UnresolvedAttribute, expr: Expression) extends Product with Serializable
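The implicit ("legacy") support rule described for TableFeature above can be sketched as follows (Protocol and isLegacyFeatureImplicitlySupported are illustrative names, not the real Delta types):

```scala
// Simplified protocol: only the minimum reader/writer versions.
final case class Protocol(minReaderVersion: Int, minWriterVersion: Int)

// A legacy feature is implicitly supported when (a) the protocol predates
// table features (reader < 3 or writer < 7) and (b) the feature's minimum
// reader/writer versions fit within the protocol's versions.
def isLegacyFeatureImplicitlySupported(
    p: Protocol, featureMinReader: Int, featureMinWriter: Int): Boolean = {
  val noTableFeatures = p.minReaderVersion < 3 || p.minWriterVersion < 7
  noTableFeatures &&
    featureMinReader <= p.minReaderVersion &&
    featureMinWriter <= p.minWriterVersion
}

// e.g. a legacy writer feature with minimums (1, 2) on a (1, 2) protocol.
val supported = isLegacyFeatureImplicitlySupported(Protocol(1, 2), 1, 2)
```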
- case class TestLegacyReaderWriterFeaturePreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with Product with Serializable
- case class TestLegacyWriterFeaturePreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with Product with Serializable
- case class TestReaderWriterFeaturePreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with DeltaLogging with Product with Serializable
- case class TestWriterFeaturePreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with DeltaLogging with Product with Serializable
- case class TestWriterWithHistoryValidationFeaturePreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with DeltaLogging with Product with Serializable
- trait ThreadStorageExecutionObserver[T <: ChainableExecutionObserver[T]] extends AnyRef
-
case class
TimestampTruncPartitionExpr(format: String, partitionColumn: String) extends OptimizablePartitionExpression with Product with Serializable
The rules for the generation expression date_trunc(field, col).
-
trait
TransactionExecutionObserver extends ChainableExecutionObserver[TransactionExecutionObserver]
Track different stages of the execution of a transaction.
This is mostly meant for test instrumentation.
The default is a no-op implementation.
-
case class
TruncDatePartitionExpr(partitionColumn: String, format: String) extends OptimizablePartitionExpression with Product with Serializable
The rules for generation expressions that use the function trunc(col, format), such as trunc(timestamp, 'year'), trunc(date, 'week') and trunc(timestampStr, 'hour').
- partitionColumn
partition column using trunc function in the generation expression
- format
the format that specifies the unit of truncation applied to the partitionColumn
- case class TypeWideningPreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with DeltaLogging with Product with Serializable
-
abstract
class
TypeWideningTableFeatureBase extends ReaderWriterFeature with RemovableFeature
Common base shared by the preview and stable type widening table features.
-
trait
UninitializedCheckpointProvider extends AnyRef
Represents basic information about a checkpoint. This is the info we can always know about a checkpoint without doing any additional I/O.
-
case class
UninitializedV1OrV2ParquetCheckpointProvider(version: Long, fileStatus: FileStatus, logPath: Path, lastCheckpointInfoOpt: Option[LastCheckpointInfo]) extends UninitializedV2LikeCheckpointProvider with Product with Serializable
An implementation of UninitializedCheckpointProvider to represent a parquet checkpoint which could be either a v1 checkpoint or v2 checkpoint. This needs to be resolved into a PreloadedCheckpointProvider or a V2CheckpointProvider depending on whether the CheckpointMetadata action is present or not in the underlying parquet file.
-
case class
UninitializedV2CheckpointProvider(version: Long, fileStatus: FileStatus, logPath: Path, hadoopConf: Configuration, deltaLogOptions: Map[String, String], logStore: LogStore, lastCheckpointInfoOpt: Option[LastCheckpointInfo]) extends UninitializedV2LikeCheckpointProvider with Product with Serializable
An implementation of UninitializedCheckpointProvider for v2 checkpoints. This needs to be resolved into a V2CheckpointProvider. This class starts an I/O to fetch the V2 actions (CheckpointMetadata, SidecarFile) as soon as the class is initialized so that the extra overhead can be parallelized with other operations like reading the CRC.
-
trait
UninitializedV2LikeCheckpointProvider extends UninitializedCheckpointProvider
A trait representing a v2 UninitializedCheckpointProvider
-
abstract
class
UniversalFormatConverter extends AnyRef
Class to facilitate the conversion of Delta into other table formats.
-
case class
UnresolvedPathBasedDeltaTable(path: String, options: Map[String, String], commandName: String) extends UnresolvedPathBasedDeltaTableBase with Product with Serializable
Resolves to a ResolvedTable if the DeltaTable exists
- sealed abstract class UnresolvedPathBasedDeltaTableBase extends LogicalPlan with UnresolvedLeafNode
-
case class
UnresolvedPathBasedDeltaTableRelation(path: String, options: CaseInsensitiveStringMap) extends UnresolvedPathBasedDeltaTableBase with Product with Serializable
Resolves to a DataSourceV2Relation if the DeltaTable exists
-
case class
UnresolvedPathBasedTable(path: String, options: Map[String, String], commandName: String) extends LogicalPlan with LeafNode with Product with Serializable
This operator represents path-based tables in general, including both Delta and non-Delta tables. It resolves to a ResolvedTable if the path is for a Delta table, or a ResolvedPathBasedNonDeltaTable if the path is for a non-Delta table.
-
trait
UpdateExpressionsSupport extends SQLConfHelper with AnalysisHelper with DeltaLogging
Trait with helper functions to generate expressions to update target columns, even if they are nested fields.
- case class V2CheckpointPreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with DeltaLogging with Product with Serializable
-
case class
V2CheckpointProvider(version: Long, v2CheckpointFile: FileStatus, v2CheckpointFormat: Format, checkpointMetadata: CheckpointMetadata, sidecarFiles: Seq[SidecarFile], lastCheckpointInfoOpt: Option[LastCheckpointInfo], logPath: Path) extends CheckpointProvider with DeltaLogging with Product with Serializable
CheckpointProvider implementation for Json/Parquet V2 checkpoints.
- version
checkpoint version for the underlying checkpoint
- v2CheckpointFile
FileStatus for the json/parquet v2 checkpoint file
- v2CheckpointFormat
format (json/parquet) for the v2 checkpoint
- checkpointMetadata
CheckpointMetadata for the v2 checkpoint
- sidecarFiles
seq of SidecarFile for the v2 checkpoint
- lastCheckpointInfoOpt
optional last checkpoint info for the v2 checkpoint
- logPath
delta log path for the underlying delta table
- case class VacuumProtocolCheckPreDowngradeCommand(table: DeltaTableV2) extends PreDowngradeTableFeatureCommand with DeltaLogging with Product with Serializable
-
trait
ValidateChecksum extends DeltaLogging
Verify the state of the table using the checksum information.
-
case class
VersionChecksum(txnId: Option[String], tableSizeBytes: Long, numFiles: Long, numDeletedRecordsOpt: Option[Long], numDeletionVectorsOpt: Option[Long], numMetadata: Long, numProtocol: Long, inCommitTimestampOpt: Option[Long], setTransactions: Option[Seq[SetTransaction]], domainMetadata: Option[Seq[DomainMetadata]], metadata: Metadata, protocol: Protocol, histogramOpt: Option[FileSizeHistogram], deletedRecordCountsHistogramOpt: Option[DeletedRecordCountsHistogram], allFiles: Option[Seq[AddFile]]) extends Product with Serializable
Stats calculated within a snapshot, which we store alongside individual transactions for verification.
- txnId
Optional transaction identifier
- tableSizeBytes
The size of the table in bytes
- numFiles
Number of AddFile actions in the snapshot
- numDeletedRecordsOpt
The number of deleted records with Deletion Vectors.
- numDeletionVectorsOpt
The number of Deletion Vectors present in the snapshot.
- numMetadata
Number of Metadata actions in the snapshot
- numProtocol
Number of Protocol actions in the snapshot
- histogramOpt
Optional file size histogram
- deletedRecordCountsHistogramOpt
A histogram of the deleted records count distribution for all the files in the snapshot.
-
case class
VersionNotFoundException(userVersion: Long, earliest: Long, latest: Long) extends AnalysisException with Product with Serializable
Thrown when time travelling to a version that does not exist in the Delta Log.
- userVersion
- the version time travelling to
- earliest
- earliest version available in the Delta Log
- latest
- The latest version available in the Delta Log
-
sealed abstract
class
WriterFeature extends TableFeature
A base class for all writer-only table features that can only be explicitly supported.
-
case class
YearMonthDayHourPartitionExpr(yearPart: String, monthPart: String, dayPart: String, hourPart: String) extends OptimizablePartitionExpression with Product with Serializable
Optimize the case where four partition columns use YEAR, MONTH, DAY and HOUR on the same column, such as YEAR(eventTime), MONTH(eventTime), DAY(eventTime), HOUR(eventTime).
- yearPart
the year partition column name
- monthPart
the month partition column name
- dayPart
the day partition column name
- hourPart
the hour partition column name
-
case class
YearMonthDayPartitionExpr(yearPart: String, monthPart: String, dayPart: String) extends OptimizablePartitionExpression with Product with Serializable
Optimize the case where three partition columns use YEAR, MONTH and DAY on the same column, such as YEAR(eventTime), MONTH(eventTime) and DAY(eventTime).
- yearPart
the year partition column name
- monthPart
the month partition column name
- dayPart
the day partition column name
-
case class
YearMonthPartitionExpr(yearPart: String, monthPart: String) extends OptimizablePartitionExpression with Product with Serializable
Optimize the case where two partition columns use YEAR and MONTH on the same column, such as YEAR(eventTime) and MONTH(eventTime).
- yearPart
the year partition column name
- monthPart
the month partition column name
-
case class
YearPartitionExpr(yearPart: String) extends OptimizablePartitionExpression with Product with Serializable
The rules for the generation expression YEAR(col).
- yearPart
the year partition column name.
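The idea behind these YEAR/MONTH/DAY/HOUR optimizations can be sketched in plain Scala (yearPartitionLowerBound and partitionSatisfies are hypothetical names, not the Delta rules): a data filter such as eventDate >= d implies the partition filter yearPart >= YEAR(d), with the null partition kept for safety.

```scala
import java.time.LocalDate

// Lower bound on the year partition implied by `eventDate >= d`.
def yearPartitionLowerBound(d: LocalDate): Int = d.getYear

// Whether a year partition (None = the null partition) can contain
// rows satisfying `eventDate >= d`.
def partitionSatisfies(yearPartOpt: Option[Int], d: LocalDate): Boolean =
  yearPartOpt match {
    case None       => true // keep the null partition (SPARK-24438)
    case Some(year) => year >= yearPartitionLowerBound(d)
  }

val cutoff = LocalDate.of(2023, 6, 15)
val keep2024 = partitionSatisfies(Some(2024), cutoff)
val drop2022 = partitionSatisfies(Some(2022), cutoff)
```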
Value Members
-
object
AllowColumnDefaultsTableFeature extends WriterFeature
This table feature represents support for column DEFAULT values for Delta Lake. With this feature, it is possible to assign default values to columns either at table creation time or later by using commands of the form: ALTER TABLE t ALTER COLUMN c SET DEFAULT v. Thereafter, queries from the table will return the specified default value instead of NULL when the corresponding field is not present in storage.
We create this as a writer-only feature rather than a reader/writer feature in order to simplify the query execution implementation for scanning Delta tables. This means that commands of the following form are not allowed: ALTER TABLE t ADD COLUMN c DEFAULT v. The reason is that when commands of that form execute (such as for other data sources like CSV or JSON), then the data source scan implementation must take responsibility to return the supplied default value for all rows, including those previously present in the table before the command executed. We choose to avoid this complexity for Delta table scans, so we make this a writer-only feature instead. Therefore, the analyzer can take care of the entire job when processing commands that introduce new rows into the table by injecting the column default value (if present) into the corresponding query plan. This comes at the expense of preventing ourselves from easily adding a default value to an existing non-empty table, because all data files would need to be rewritten to include the new column value in an expensive backfill.
- object AppendDelta
- object AppendOnlyTableFeature extends LegacyWriterFeature with FeatureAutomaticallyEnabledByMetadata
-
object
BatchCDFSchemaEndVersion extends DeltaBatchCDFSchemaMode with Product with Serializable
endVersion batch CDF schema mode specifies that the schema of the query range's end version should be used for serving the CDF batch. This is the current default for column mapping enabled tables, so we can read using the exact schema at the versions being queried and reduce schema read-compatibility mismatches.
-
object
BatchCDFSchemaLatest extends DeltaBatchCDFSchemaMode with Product with Serializable
latest batch CDF schema mode specifies that the latest schema should be used when serving the CDF batch.
-
object
BatchCDFSchemaLegacy extends DeltaBatchCDFSchemaMode with Product with Serializable
legacy batch CDF schema mode specifies that neither the latest nor the end version's schema is strictly used for serving the CDF batch, e.g. when the user uses time travel with batch CDF and wants to respect the time-travelled schema. This is the current default for non-column-mapping tables.
- object ChangeDataFeedTableFeature extends LegacyWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object CheckAddFileHasStats extends IcebergCompatCheck
- object CheckConstraintsTableFeature extends LegacyWriterFeature with FeatureAutomaticallyEnabledByMetadata with RemovableFeature
-
object
CheckDeletionVectorDisabled extends IcebergCompatCheck
Checks whether deletion vectors have been disabled, based on either the previous snapshot or the newest metadata and protocol, depending on whether the operation is REORG UPGRADE UNIFORM.
- object CheckNoListMapNullType extends IcebergCompatCheck
- object CheckNoPartitionEvolution extends IcebergCompatCheck
-
object
CheckOnlySingleVersionEnabled extends IcebergCompatCheck
A check ensuring that no more than one IcebergCompatVx is enabled.
- object CheckPartitionDataTypeInV2AllowList extends IcebergCompatCheck
- object CheckTypeInV2AllowList extends IcebergCompatCheck
- object CheckpointInstance extends Serializable
- object CheckpointPolicy
-
object
CheckpointProtectionTableFeature extends WriterFeature with RemovableFeature
Writer feature that enforces writers to cleanup metadata iff metadata can be cleaned up to requireCheckpointProtectionBeforeVersion in one go.
Writer feature that enforces writers to cleanup metadata iff metadata can be cleaned up to requireCheckpointProtectionBeforeVersion in one go. This means that a single cleanup operation should truncate up to requireCheckpointProtectionBeforeVersion as opposed to several cleanup operations truncating in chunks.
The are two exceptions to this rule. If any of the two holds, the rule above can be ignored:
a) The writer verifies it supports all protocols between [start, min(requireCheckpointProtectionBeforeVersion, targetCleanupVersion)] versions it intends to truncate.
b) The writer does not create any checkpoints during history cleanup and does not erase any checkpoints after the truncation version.
The CheckpointProtectionTableFeature can only be removed if history is truncated up to at least requireCheckpointProtectionBeforeVersion.
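The rule and its two exceptions can be sketched as a small predicate. This is an illustrative model with made-up names, not the actual Delta implementation:

```python
# Hypothetical sketch of the CheckpointProtection cleanup rule described above.
def cleanup_allowed(
    target_cleanup_version: int,
    require_checkpoint_protection_before_version: int,
    supports_all_protocols_to_truncate: bool,  # exception (a)
    touches_checkpoints_during_cleanup: bool,  # negation of exception (b)
) -> bool:
    # The base rule: truncate up to requireCheckpointProtectionBeforeVersion in one go.
    truncates_in_one_go = (
        target_cleanup_version >= require_checkpoint_protection_before_version
    )
    return (
        truncates_in_one_go
        or supports_all_protocols_to_truncate
        or not touches_checkpoints_during_cleanup
    )
```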
- object CheckpointProvider extends DeltaLogging
- object Checkpoints extends DeltaLogging
-
object
ClusteringTableFeature extends WriterFeature
Clustering table feature is enabled when a table is created with CLUSTER BY clause.
- object ColumnMappingTableFeature extends LegacyReaderWriterFeature with RemovableFeature with FeatureAutomaticallyEnabledByMetadata
-
object
ColumnWithDefaultExprUtils extends DeltaLogging
Provide utilities to handle columns with default expressions. Currently we support three types of such columns: (1) GENERATED columns. (2) IDENTITY columns. (3) Columns with user-specified default value expression.
- object CommitConflictFailure
- object ConcurrencyHelpers
- object CoordinatedCommitType extends Enumeration
-
object
CoordinatedCommitsTableFeature extends WriterFeature with FeatureAutomaticallyEnabledByMetadata with RemovableFeature
Table feature to represent tables whose commits are managed by separate commit-coordinator
- object DefaultRowCommitVersion
- object DeletionVectorsTableFeature extends ReaderWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object DeltaBatchCDFSchemaMode
- object DeltaColumnMapping extends DeltaColumnMappingBase
- object DeltaColumnMappingMode
- object DeltaCommitTag
- object DeltaConfigs extends DeltaConfigsBase
- object DeltaErrors extends DeltaErrorsBase
- object DeltaFileProviderUtils
-
object
DeltaFullTable
Extractor Object for pulling out the full table scan of a Delta table.
- object DeltaHistory extends Serializable
-
object
DeltaHistoryManager extends DeltaLogging
Contains many utility methods that can also be executed on Spark executors.
- object DeltaLog extends DeltaLogging
- object DeltaLogFileIndex extends Serializable
-
object
DeltaOperations
Exhaustive list of operations that can be performed on a Delta table. These operations are tracked as the first line in delta logs, and power DESCRIBE HISTORY for Delta tables.
- object DeltaOptions extends DeltaLogging with Serializable
- object DeltaParquetFileFormat extends Serializable
-
object
DeltaRelation extends DeltaLogging
Matchers for dealing with a Delta table.
-
object
DeltaTable
Extractor Object for pulling out the table scan of a Delta table. It could be a full scan or a partial scan.
-
object
DeltaTableIdentifier extends DeltaLogging with Serializable
Utilities for DeltaTableIdentifier. TODO(burak): Get rid of these utilities. DeltaCatalog should be the skinny-waist for figuring these things out.
- object DeltaTablePropertyValidationFailedSubClass
- object DeltaTableUtils extends PredicateHelper with DeltaLogging
-
object
DeltaTableValueFunctions
Resolve Delta specific table-value functions.
- object DeltaTableValueFunctionsShims
-
object
DeltaThrowableHelper
The helper object for Delta code base to pick error class template and compile the exception message.
- object DeltaThrowableHelperShims
- object DeltaTimeTravelSpec extends Serializable
- object DeltaTimeTravelSpecShims
-
object
DeltaUDF
Define a few templates for UDFs used by Delta. Use these templates to create SparkUserDefinedFunction to avoid creating new Encoders. This saves us from touching ScalaReflection and reduces lock contention in concurrent queries.
- object DeltaViewHelper
- object DomainMetadataTableFeature extends WriterFeature
- object DomainMetadataUtils extends DomainMetadataUtilsBase
- object DynamicPartitionOverwriteDelta
-
object
EmptyCheckpointProvider extends CheckpointProvider
An implementation of CheckpointProvider used to represent the scenario where no checkpoint exists. This helps us simplify the code by making LogSegment.checkpointProvider non-optional.
The CheckpointProvider.isEmpty method returns true for EmptyCheckpointProvider, and the version is returned as -1. For a real checkpoint, isEmpty returns false and the version is >= 0.
-
object
ExtractBaseColumn
Finds the full dot-separated path to a field and the data type of the field. This unifies handling of nested and non-nested fields, and allows pattern matching on the data type.
- object FileMetadataMaterializationTracker extends DeltaLogging
- object GenerateIdentityValues extends Serializable
-
object
GenerateRowIDs extends Rule[LogicalPlan]
This rule adds a Project on top of Delta tables that support the Row tracking table feature to provide a default generated Row ID and row commit version for rows that don't have them materialized in the data file.
-
object
GeneratedColumn extends DeltaLogging with AnalysisHelper
Provide utility methods to implement Generated Columns for Delta. Users can use the following SQL syntax to create a table with generated columns.

CREATE TABLE table_identifier(
  column_name column_type,
  column_name column_type GENERATED ALWAYS AS ( generation_expr ),
  ...
)
USING delta
[ PARTITIONED BY (partition_column_name, ...) ]

This is an example:

CREATE TABLE foo(
  id bigint,
  type string,
  subType string GENERATED ALWAYS AS ( SUBSTRING(type FROM 0 FOR 4) ),
  data string,
  eventTime timestamp,
  day date GENERATED ALWAYS AS ( days(eventTime) )
)
USING delta
PARTITIONED BY (type, day)

When writing to a table, for these generated columns:
- If the output is missing a generated column, we will add an expression to generate it.
- If a generated column exists in the output, in other words, the user provides the value directly, we will add a constraint to ensure the given value doesn't violate the generation expression.
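The write-path rule above can be modeled as a tiny decision function. This is a toy sketch with invented names, not Delta's actual code:

```python
# Toy model of the write-path rule above (illustrative, not Delta's implementation).
def plan_generated_column(name, generation_expr, output_columns):
    """Generate a missing generated column, or constrain a user-supplied one."""
    if name in output_columns:
        # User provided the value: enforce the generation expression as a constraint.
        return ("add_constraint", name, generation_expr)
    # Column missing from output: compute it from the generation expression.
    return ("generate", name, generation_expr)
```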
- object GeneratedColumnsTableFeature extends LegacyWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object HudiConstants
-
object
IcebergCompat extends DeltaLogging with Serializable
Util methods to manage between IcebergCompat versions
-
object
IcebergCompatV1 extends IcebergCompat
Utils to validate the IcebergCompatV1 table feature, which is responsible for keeping Delta tables in valid states (see the Delta spec for full invariants, dependencies, and requirements) so that they are capable of having Delta to Iceberg metadata conversion applied to them. The IcebergCompatV1 table feature does not implement, specify, or control the actual metadata conversion; that is handled by the Delta UniForm feature.
Note that UniForm (Iceberg) depends on IcebergCompatV1, but IcebergCompatV1 does not depend on or require UniForm (Iceberg). It is perfectly valid for a Delta table to have IcebergCompatV1 enabled but UniForm (Iceberg) not enabled.
- object IcebergCompatV1TableFeature extends WriterFeature with FeatureAutomaticallyEnabledByMetadata
- object IcebergCompatV2 extends IcebergCompat
- object IcebergCompatV2TableFeature extends WriterFeature with FeatureAutomaticallyEnabledByMetadata
- object IcebergConstants
-
object
IdMapping extends DeltaColumnMappingMode with Product with Serializable
Id Mapping uses column ID as the true identifier of a column. Column IDs are stored as StructField metadata in the schema and will be used when reading and writing Parquet files. The Parquet files in this mode will also have corresponding field Ids for each column in their file schema.
This mode is used for tables converted from Iceberg.
-
object
IdentityColumn extends DeltaLogging
Provide utility methods related to IDENTITY column support for Delta.
-
object
IdentityColumnHighWaterMarkUpdateInfo
This object holds String constants used in the field debugInfo when logging IdentityColumn.opTypeHighWaterMarkUpdate. Each string represents an unexpected or notable event while calculating the high water mark.
- object IdentityColumnsTableFeature extends LegacyWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object IdentityOverflowLogger extends DeltaLogging
-
object
InCommitTimestampTableFeature extends WriterFeature with FeatureAutomaticallyEnabledByMetadata with RemovableFeature
inCommitTimestamp table feature is a writer feature that makes every writer write a monotonically increasing timestamp inside the commit file.
- object InCommitTimestampUtils
- object InvariantsTableFeature extends LegacyWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object IsolationLevel
- object LastCheckpointInfo extends Serializable
- object LastCheckpointV2 extends Serializable
- object LogSegment extends Serializable
- object MaterializedRowCommitVersion extends MaterializedRowTrackingColumn
- object MaterializedRowId extends MaterializedRowTrackingColumn
-
object
NameMapping extends DeltaColumnMappingMode with Product with Serializable
Name Mapping uses the physical column name as the true identifier of a column. The physical name is stored as part of StructField metadata in the schema and will be used when reading and writing Parquet files. Even if id mapping can be used for reading the physical files, name mapping is used for reading statistics and partition values in the DeltaLog.
-
object
NoMapping extends DeltaColumnMappingMode with Product with Serializable
No mapping mode uses a column's display name as its true identifier to read and write data.
This is the default mode and is the same mode as Delta always has been.
-
object
NoOpTransactionExecutionObserver extends TransactionExecutionObserver
Default observer does nothing.
- object NumRecordsStats extends Serializable
- object OptimisticTransaction
- object OptimizablePartitionExpression
- object OverwriteDelta
- object RecordChecksum
- object RedirectReaderWriterFeature extends ReaderWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object RedirectWriterOnlyFeature extends WriterFeature with FeatureAutomaticallyEnabledByMetadata
-
object
RelationFileIndex
Extractor Object for pulling out the file index of a logical relation.
- object RequireColumnMapping extends RequiredDeltaTableProperty[DeltaColumnMappingMode]
-
object
ResolveDeltaMergeInto
Implements logic to resolve conditions and actions in MERGE clauses and handles schema evolution.
- object ResolveDeltaPathTable extends Serializable
- object RowCommitVersion
-
object
RowId
Collection of helpers to handle Row IDs.
This file includes the following Row ID features: - Enabling Row IDs using table feature and table property. - Assigning fresh Row IDs. - Reading back Row IDs. - Preserving stable Row IDs.
-
object
RowTracking
Utility functions for Row Tracking that are shared between Row IDs and Row Commit Versions.
- object RowTrackingFeature extends WriterFeature with FeatureAutomaticallyEnabledByMetadata
- object ScanWithDeletionVectors
-
object
Serializable extends IsolationLevel with Product with Serializable
This isolation level will ensure serializability between all read and write operations. Specifically, for write operations, this mode will ensure that the result of the table will be perfectly consistent with the visible history of operations, that is, as if all the operations were executed sequentially one by one.
- object SerializableFileStatus extends Serializable
- object Snapshot extends DeltaLogging
-
object
SnapshotIsolation extends IsolationLevel with Product with Serializable
This isolation level will ensure that all reads will see a consistent snapshot of the table and any transactional write will successfully commit only if the values updated by the transaction have not been changed externally since the snapshot was read by the transaction.
This provides a lower consistency guarantee than WriteSerializable but a higher availability than that. For example, unlike WriteSerializable, this level allows two concurrent UPDATE operations reading the same data to be committed successfully as long as they don't modify the same data.
Note that for operations that do not modify data in the table, Snapshot isolation is the same as Serializability. Hence such operations can be safely committed with the Snapshot isolation level.
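The commit rule above can be sketched as a toy model: a transaction reads a snapshot at some version, and may commit only if no later commit changed a value it updates. This is illustrative only, not Delta's conflict-detection implementation:

```python
# Toy model of the snapshot-isolation commit rule described above.
def can_commit(read_snapshot_version, updated_keys, committed):
    """committed: list of (version, keys_changed) pairs already in the log.
    A transaction may commit iff no commit after its read snapshot
    changed a key that this transaction updates."""
    return all(
        not (keys & updated_keys)
        for version, keys in committed
        if version > read_snapshot_version
    )
```

Two concurrent UPDATEs touching disjoint keys both pass this check; two touching the same key conflict, matching the SnapshotIsolation description.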
- object SnapshotManagement
- object StartingVersionLatest extends DeltaStartingVersion with Product with Serializable
-
object
SupportedGenerationExpressions
This class defines the list of expressions that can be used in a generated column.
- object TableFeature extends Serializable
- object TestFeatureWithDependency extends ReaderWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object TestFeatureWithTransitiveDependency extends ReaderWriterFeature
- object TestLegacyReaderWriterFeature extends LegacyReaderWriterFeature
-
object
TestLegacyWriterFeature extends LegacyWriterFeature
Features below are for testing only, and are being registered to the system only in the testing environment. See TableFeature.allSupportedFeaturesMap for the registration.
- object TestReaderWriterFeature extends ReaderWriterFeature
- object TestReaderWriterMetadataAutoUpdateFeature extends ReaderWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object TestReaderWriterMetadataNoAutoUpdateFeature extends ReaderWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object TestRemovableLegacyReaderWriterFeature extends LegacyReaderWriterFeature with FeatureAutomaticallyEnabledByMetadata with RemovableFeature
- object TestRemovableLegacyWriterFeature extends LegacyWriterFeature with FeatureAutomaticallyEnabledByMetadata with RemovableFeature
- object TestRemovableWriterWithHistoryTruncationFeature extends WriterFeature with FeatureAutomaticallyEnabledByMetadata with RemovableFeature
- object TestWriterFeature extends WriterFeature
- object TestWriterFeatureWithTransitiveDependency extends WriterFeature
- object TestWriterMetadataNoAutoUpdateFeature extends WriterFeature with FeatureAutomaticallyEnabledByMetadata
- object TimestampNTZTableFeature extends ReaderWriterFeature with FeatureAutomaticallyEnabledByMetadata
- object TransactionExecutionObserver extends ThreadStorageExecutionObserver[TransactionExecutionObserver]
- object TypeWidening
-
object
TypeWideningPreviewTableFeature extends TypeWideningTableFeatureBase with FeatureAutomaticallyEnabledByMetadata
Feature used for the preview phase of type widening. Tables that enabled this feature during the preview will keep being supported after the preview.
-
object
TypeWideningShims
Type widening only supports a limited set of type changes with Spark 3.5 due to the parquet readers lacking the corresponding conversions that were added in Spark 4.0. This shim is for Delta on Spark 3.5, which supports:
- byte -> short -> int
-
object
TypeWideningTableFeature extends TypeWideningTableFeatureBase
Stable feature for type widening. The stable feature isn't enabled automatically yet when setting the type widening table property as the feature is still in preview in this version. The feature spec is finalized though and by supporting the stable feature here we guarantee that this version can already read any table created in the future.
Note: Users can manually add both the preview and stable features to a table using ADD FEATURE, although that's undocumented for type widening. This is allowed: the two feature specifications are compatible and supported.
-
object
UniversalFormat extends DeltaLogging
Utils to validate the Universal Format (UniForm) Delta feature (NOT a table feature).
The UniForm Delta feature governs and implements the actual conversion of Delta metadata into other formats.
UniForm supports both Iceberg and Hudi. When delta.universalFormat.enabledFormats contains "iceberg", we say that Universal Format (Iceberg) is enabled. When it contains "hudi", we say that Universal Format (Hudi) is enabled.
enforceInvariantsAndDependencies ensures that all of UniForm's requirements for the specified format are met (e.g. for 'iceberg' that IcebergCompatV1 or V2 is enabled). It doesn't verify that its nested requirements are met (e.g. IcebergCompat's requirements, like Column Mapping). That is the responsibility of format-specific validations such as IcebergCompatV1.enforceInvariantsAndDependencies and IcebergCompatV2.enforceInvariantsAndDependencies.
Note that UniForm (Iceberg) depends on IcebergCompat, but IcebergCompat does not depend on or require UniForm (Iceberg). It is perfectly valid for a Delta table to have IcebergCompatV1 or V2 enabled but UniForm (Iceberg) not enabled.
-
object
UnresolvedDeltaPathOrIdentifier
A helper object with an apply method to transform a path or table identifier into a LogicalPlan. If the path is set, it will be resolved to an UnresolvedPathBasedDeltaTable, whereas if the tableIdentifier is set, the LogicalPlan will be an UnresolvedTable. If neither or both of the two options are set, apply will throw an exception.
-
object
UnresolvedPathOrIdentifier
A helper object with an apply method to transform a path or table identifier into a LogicalPlan. This is required by Delta commands that can also run against non-Delta tables, e.g. the DESC DETAIL and VACUUM commands. If the tableIdentifier is set, the LogicalPlan will be an UnresolvedTable. If the tableIdentifier is not set but the path is set, it will be resolved to an UnresolvedPathBasedTable, since we cannot tell whether the path is for a Delta table or a non-Delta table at this stage. If neither of the two is set, apply throws an exception.
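The dispatch described above can be sketched as follows. The returned tags stand in for the real Spark/Delta logical plans; this is a model of the documented behavior, not the actual implementation:

```python
# Illustrative dispatch: the identifier wins when set, otherwise fall back to the path.
def resolve(path=None, table_identifier=None):
    if table_identifier is not None:
        return ("UnresolvedTable", table_identifier)
    if path is not None:
        # Cannot yet tell whether the path points at a Delta table or not.
        return ("UnresolvedPathBasedTable", path)
    raise ValueError("Either path or tableIdentifier must be set")
```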
- object V2Checkpoint
- object V2CheckpointProvider extends Serializable
-
object
V2CheckpointTableFeature extends ReaderWriterFeature with RemovableFeature with FeatureAutomaticallyEnabledByMetadata
V2 Checkpoint table feature is for checkpoints with sidecars and the new format and file naming scheme.
-
object
VacuumProtocolCheckTableFeature extends ReaderWriterFeature with RemovableFeature
A ReaderWriter table feature for VACUUM.
A ReaderWriter table feature for VACUUM. If this feature is enabled, a writer should follow one of the following:
1. Non-support for VACUUM: writers can explicitly state that they do not support VACUUM for any table, regardless of whether the Vacuum Protocol Check table feature exists.
2. Implement a writer protocol check: ensure that the VACUUM implementation includes a writer protocol check before any file deletions occur.
Readers don't need to understand or change anything new; they just need to acknowledge that the feature exists.
- object VariantTypeTableFeature extends ReaderWriterFeature with FeatureAutomaticallyEnabledByMetadata
-
object
WriteSerializable extends IsolationLevel with Product with Serializable
This isolation level will ensure snapshot isolation consistency guarantee between write operations only. In other words, if only the write operations are considered, then there exists a serializable sequence between them that would produce the same result as seen in the table. However, if both read and write operations are considered, then there may not exist a serializable sequence that would explain all the observed reads.
This provides a lower consistency guarantee than Serializable but a higher availability than that. For example, unlike Serializable, this level allows an UPDATE operation to be committed even if there was a concurrent INSERT operation that has already added data that should have been read by the UPDATE. It will be as if the UPDATE was executed before the INSERT even if the former was committed after the latter. As a side effect, the visible history of operations may not be consistent with the result expected if these operations were executed sequentially one by one.