org.apache.spark.sql.delta

OptimisticTransactionImpl

trait OptimisticTransactionImpl extends TransactionalWrite with SQLMetricsReporting with DeltaScanGenerator with RecordChecksum with DeltaLogging

Used to perform a set of reads in a transaction and then commit a set of updates to the state of the log. All reads from the DeltaLog MUST go through this instance rather than directly to the DeltaLog; otherwise they will not be checked for logical conflicts with concurrent updates.

This trait is not thread-safe.
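The intended read-then-commit lifecycle can be sketched as follows. This is a minimal sketch, assuming an existing Delta table (the path is a placeholder); it uses DeltaLog.withNewTransaction to open the transaction and DeltaOperations.ManualUpdate as the operation:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.delta.{DeltaLog, DeltaOperations}

val spark = SparkSession.builder().appName("txn-sketch").getOrCreate()
// Placeholder path to an existing Delta table.
val deltaLog = DeltaLog.forTable(spark, "/tmp/my-delta-table")

// withNewTransaction pins a snapshot of the log and makes the
// transaction active for the duration of the body.
deltaLog.withNewTransaction { txn =>
  // All reads go through the transaction so they are recorded
  // for conflict detection against concurrent writers.
  val files = txn.filterFiles()

  // Derive the actions to commit; removing every scanned file
  // is purely illustrative.
  val actions = files.map(_.remove)

  // A single commit ends the transaction; a logical conflict with a
  // concurrent writer surfaces here as an exception.
  txn.commit(actions, DeltaOperations.ManualUpdate)
}
```

Reads performed outside the transaction, directly against the DeltaLog, would bypass the conflict checks described above.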

Inherited
  1. OptimisticTransactionImpl
  2. RecordChecksum
  3. DeltaScanGenerator
  4. SQLMetricsReporting
  5. TransactionalWrite
  6. DeltaLogging
  7. DatabricksLogging
  8. DeltaProgressReporter
  9. LoggingShims
  10. Logging
  11. AnyRef
  12. Any

Type Members

  1. implicit class LogStringContext extends AnyRef
    Definition Classes
    LoggingShims
  2. class DisabledAutoCompactPartitionStatsCollector extends AutoCompactPartitionStatsCollector

    A subclass of AutoCompactPartitionStatsCollector used when the config to collect auto compaction stats is turned off. This subclass intentionally does nothing.

  3. class FileSystemBasedCommitCoordinatorClient extends CommitCoordinatorClient

Abstract Value Members

  1. abstract val catalogTable: Option[CatalogTable]
  2. abstract val deltaLog: DeltaLog
  3. abstract val snapshot: Snapshot

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. val actions: ArrayBuffer[Action]

    Tracks actions within the transaction; these will be committed along with the passed-in actions in the commit function.

    Attributes
    protected
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def assertMetadata(metadata: Metadata): Unit
    Attributes
    protected
  7. def canDowngradeToSnapshotIsolation(preparedActions: Seq[Action], op: Operation): Boolean
    Attributes
    protected
  8. def canUpdateMetadata: Boolean

    Can this transaction still update the metadata? This is allowed only once per transaction.

  9. def checkForConflicts(checkVersion: Long, currentTransactionInfo: CurrentTransactionInfo, attemptNumber: Int, commitIsolationLevel: IsolationLevel): (Long, CurrentTransactionInfo)

    Looks at actions that have happened since the txn started and checks for logical conflicts with the reads/writes. Resolves conflicts and returns a tuple representing the commit version to attempt next and the commit summary which we need to commit.

    Attributes
    protected
  10. def checkForConflictsAgainstVersion(currentTransactionInfo: CurrentTransactionInfo, otherCommitFileStatus: FileStatus, commitIsolationLevel: IsolationLevel): CurrentTransactionInfo
    Attributes
    protected
  11. def checkForSetTransactionConflictAndDedup(actions: Seq[Action]): Seq[Action]

    Checks whether the passed-in actions have internal SetTransaction conflicts and throws an exception if they do. This function also removes duplicated SetTransactions.

    Attributes
    protected
  12. def checkNoColumnDefaults(op: Operation): Unit

    If the operation assigns or modifies column default values, this method checks that the corresponding table feature is enabled and throws an error if not.

    Attributes
    protected
  13. def checkPartitionColumns(partitionSchema: StructType, output: Seq[Attribute], colsDropped: Boolean): Unit
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  14. val checkUnsupportedDataType: Boolean

    Whether to check for unsupported data types when updating the table schema.

    Attributes
    protected
  15. def clock: Clock
  16. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  17. def collectAutoOptimizeStats(numAdd: Long, numRemove: Long, actions: Iterator[Action]): Unit
  18. def commit(actions: Seq[Action], op: Operation, tags: Map[String, String]): Long

    Modifies the state of the log by adding a new commit that is based on a read at readVersion. In the case of a conflict with a concurrent writer this method will throw an exception.

    actions

    Set of actions to commit

    op

    Details of operation that is performing this transactional commit

    tags

    Extra tags to set to the CommitInfo action
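    As an illustration, inside an already-open transaction txn with prepared actions, the tags overload attaches custom key-value pairs to the resulting CommitInfo action (the tag name below is made up):

    ```scala
    // Inside deltaLog.withNewTransaction { txn => ... }, with `actions` prepared:
    val version: Long = txn.commit(
      actions,
      DeltaOperations.ManualUpdate,
      tags = Map("pipeline" -> "nightly-compaction") // illustrative tag
    )
    ```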

  19. def commit(actions: Seq[Action], op: Operation): Long

    Modifies the state of the log by adding a new commit that is based on a read at readVersion. In the case of a conflict with a concurrent writer this method will throw an exception.

    actions

    Set of actions to commit

    op

    Details of operation that is performing this transactional commit

  20. val commitAttemptStartTimeMillis: Long

    Tracks the start time since we started trying to write a particular commit. Used for logging duration of retried transactions.

    Attributes
    protected
  21. val commitEndNano: Long

    The transaction commit end time.

    Attributes
    protected
  22. def commitIfNeeded(actions: Seq[Action], op: Operation, tags: Map[String, String] = Map.empty): Option[Long]

    Modifies the state of the log by adding a new commit that is based on a read at readVersion. In the case of a conflict with a concurrent writer this method will throw an exception.

    Also skips creating the commit if the configured IsolationLevel doesn't need us to record the commit from correctness perspective.

    Returns the new version the transaction committed or None if the commit was skipped.
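    A sketch of how this differs from commit, assuming an open transaction txn and prepared actions: the returned Option is empty when the commit was skipped.

    ```scala
    txn.commitIfNeeded(actions, DeltaOperations.ManualUpdate) match {
      case Some(version) => println(s"Committed as version $version")
      case None          => println("Commit skipped; nothing needed recording")
    }
    ```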

  23. def commitImpl(actions: Seq[Action], op: Operation, canSkipEmptyCommits: Boolean, tags: Map[String, String]): Option[Long]
    Attributes
    protected
    Annotations
    @throws( ... )
  24. val commitInfo: CommitInfo
    Attributes
    protected
  25. def commitLarge(spark: SparkSession, nonProtocolMetadataActions: Iterator[Action], newProtocolOpt: Option[Protocol], op: Operation, context: Map[String, String], metrics: Map[String, String]): (Long, Snapshot)

    Create a large commit on the Delta log by directly writing an iterator of FileActions to the LogStore. This function only commits the next possible version and will not check whether the commit is retryable. If the next version has already been committed, then this function will fail. This bypasses all optimistic concurrency checks. We assume that transaction conflicts should be rare because this method is typically used to create new tables (e.g. CONVERT TO DELTA) or to apply commands that rarely encounter other transactions (e.g. CLONE/RESTORE). In addition, the expectation is that the list of actions performed by the transaction remains an iterator and is never materialized, given that a large commit may touch many files. The nonProtocolMetadataActions parameter must contain only non-protocol, non-metadata actions. If the protocol of the table needs to be updated, it should be passed via the newProtocolOpt parameter.

  26. val commitStartNano: Long

    The transaction commit start time.

    Attributes
    protected
  27. val committed: Boolean

    Tracks if this transaction has already committed.

    Attributes
    protected
  28. def containsPostCommitHook(hook: PostCommitHook): Boolean
  29. def convertEmptyToNullIfNeeded(plan: SparkPlan, partCols: Seq[Attribute], constraints: Seq[Constraint]): SparkPlan

    If there is any string partition column and there are constraints defined, add a projection to convert empty strings to null for that column. The empty strings would be converted to null eventually even without this conversion, but we want to do it earlier, before check constraints, so that empty strings are correctly rejected. Note that this should not cause the downstream logic in FileFormatWriter to add duplicate conversions, because the logic there checks the partition column using the original plan's output. When the plan is modified with additional projections, the partition column check won't match, so no further conversion is added.

    plan

    The original SparkPlan.

    partCols

    The partition columns.

    constraints

    The defined constraints.

    returns

    A SparkPlan potentially modified with an additional projection on top of plan

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  30. def createAutoCompactStatsCollector(): AutoCompactPartitionStatsCollector
  31. def createCoordinatedCommitsStats(): CoordinatedCommitsStats
  32. def deltaAssert(check: ⇒ Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit

    Helper method to check invariants in Delta code. Fails when running in tests, records a delta assertion event and logs a warning otherwise.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  33. def doCommit(attemptVersion: Long, currentTransactionInfo: CurrentTransactionInfo, attemptNumber: Int, isolationLevel: IsolationLevel): Snapshot

    Commit actions using attemptVersion version number. Throws a FileAlreadyExistsException if any conflicts are detected.

    returns

    the post-commit snapshot of the deltaLog

    Attributes
    protected
  34. def doCommitRetryIteratively(attemptVersion: Long, currentTransactionInfo: CurrentTransactionInfo, isolationLevel: IsolationLevel): (Long, Snapshot, CurrentTransactionInfo)

    Commit the txn represented by currentTransactionInfo using attemptVersion version number. If any conflicts are found, we will retry a fixed number of times.

    returns

    the real version that was committed, the postCommitSnapshot, and the txn info. NOTE: the postCommitSnapshot may not correspond to the committed version if racing commits were written while we updated the snapshot.

    Attributes
    protected
  35. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  36. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  37. val executionObserver: TransactionExecutionObserver

    Contains the execution instrumentation set via thread-local. No-op by default.

    Attributes
    protected[delta]
  38. def filesForScan(limit: Long, partitionFilters: Seq[Expression]): DeltaScan

    Returns a DeltaScan based on the given partition filters, projections and limits.

    Definition Classes
    OptimisticTransactionImpl → DeltaScanGenerator
  39. def filesForScan(filters: Seq[Expression], keepNumRecords: Boolean = false): DeltaScan

    Returns a DeltaScan based on the given filters.

    Definition Classes
    OptimisticTransactionImpl → DeltaScanGenerator
  40. def filesWithStatsForScan(partitionFilters: Seq[Expression]): DataFrame

    Returns a DataFrame for the given partition filters. The schema of the returned DataFrame is nearly the same as AddFile, except that the stats field is parsed to a struct from a JSON string.

    Definition Classes
    OptimisticTransactionImpl → DeltaScanGenerator
  41. def filterFiles(partitions: Set[Map[String, String]]): Seq[AddFile]

    Returns files within the given partitions.

    partitions is a set of the partitionValues stored in AddFiles. This means they refer to the physical column names, and values are stored as strings.
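    Since the maps hold physical partition column names with string-encoded values, a call against a table partitioned by date and country might look like this (column names and values are illustrative, and txn is an open transaction):

    ```scala
    val partitions = Set(
      Map("date" -> "2024-01-01", "country" -> "US"),
      Map("date" -> "2024-01-01", "country" -> "DE")
    )
    val addedFiles = txn.filterFiles(partitions)
    ```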

  42. def filterFiles(filters: Seq[Expression], keepNumRecords: Boolean = false): Seq[AddFile]

    Returns files matching the given predicates.

  43. def filterFiles(): Seq[AddFile]

    Returns files matching the given predicates.

  44. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  45. def generateInCommitTimestampForFirstCommitAttempt(currentTimestamp: Long): Option[Long]

    Generates a timestamp which is greater than the commit timestamp of the last snapshot. Note that this is only needed when the feature inCommitTimestamps is enabled.

    Attributes
    protected[delta]
  46. def getAssertDeletionVectorWellFormedFunc(spark: SparkSession, op: Operation): (Action) ⇒ Unit

    Must make sure that deletion vectors are never added to a table where that isn't allowed. Note, statistics recomputation is still allowed even though DVs might be currently disabled.

    This method returns a function that can be used to validate a single Action.

    Attributes
    protected
  47. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  48. def getCommitter(outputPath: Path): DelayedCommitProtocol
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  49. def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
    Definition Classes
    DeltaLogging
  50. def getConflictingVersions(previousAttemptVersion: Long): Seq[FileStatus]

    Returns the conflicting commit information.

    Attributes
    protected
  51. def getDeltaScanGenerator(index: TahoeLogFileIndex): DeltaScanGenerator

    Returns the DeltaScanGenerator for the given log, which will be used to generate DeltaScans. Every time this method is called on a log, the returned generator will read a snapshot that is pinned on the first access for that log.

    Internally, if the given log is the same as the log associated with this transaction, then it returns this transaction; otherwise it returns a snapshot of the given log.

  52. def getErrorData(e: Throwable): Map[String, Any]
    Definition Classes
    DeltaLogging
  53. def getIsolationLevelToUse(preparedActions: Seq[Action], op: Operation): IsolationLevel
    Attributes
    protected
  54. def getMetric(name: String): Option[SQLMetric]

    Returns the metric with the given name registered for the transaction, if it exists.

    Definition Classes
    SQLMetricsReporting
  55. def getMetricsForOperation(operation: Operation): Map[String, String]

    Get the metrics for an operation based on the collected SQL metrics, filtering them according to the metric parameters for that operation.

    Definition Classes
    SQLMetricsReporting
  56. def getOperationMetrics(op: Operation): Option[Map[String, String]]

    Return the operation metrics for the operation if operation metrics are enabled.

  57. def getOptionalStatsTrackerAndStatsCollection(output: Seq[Attribute], outputPath: Path, partitionSchema: StructType, data: DataFrame): (Option[DeltaJobStatisticsTracker], Option[StatisticsCollection])

    Return the pair of optional stats tracker and stats collection class.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  58. def getPartitioningColumns(partitionSchema: StructType, output: Seq[Attribute]): Seq[Attribute]
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  59. def getStatsColExpr(statsDataSchema: Seq[Attribute], statsCollection: StatisticsCollection): Expression
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  60. def getStatsSchema(dataFrameOutput: Seq[Attribute], partitionSchema: StructType): (Seq[Attribute], Seq[Attribute])

    Return a tuple of (outputStatsCollectionSchema, statsCollectionSchema). outputStatsCollectionSchema is the data source schema from the DataFrame used for stats collection; it contains the columns in the DataFrame output, excluding the partition columns. statsCollectionSchema is the schema to collect stats for; it contains the columns in the table schema, excluding the partition columns. Note: we only collect NULL_COUNT stats (as the number of rows) for the columns in statsCollectionSchema that are missing from outputStatsCollectionSchema.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  61. def getUserMetadata(op: Operation): Option[String]

    Return the user-defined metadata for the operation.

  62. val hasWritten: Boolean
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  63. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  64. val incrementalCommitEnabled: Boolean
    Attributes
    protected
  65. def incrementallyDeriveChecksum(attemptVersion: Long, currentTransactionInfo: CurrentTransactionInfo): Option[VersionChecksum]

    Given an attemptVersion, obtains the checksum for the previous snapshot version (i.e., attemptVersion - 1) and incrementally derives a new checksum from the actions of the current transaction.

    attemptVersion

    that the current transaction is committing

    currentTransactionInfo

    containing actions of the current transaction

    Attributes
    protected
  66. def incrementallyDeriveChecksum(spark: SparkSession, deltaLog: DeltaLog, versionToCompute: Long, actions: Seq[Action], metadata: Metadata, protocol: Protocol, operationName: String, txnIdOpt: Option[String], previousVersionState: Either[Snapshot, VersionChecksum], includeAddFilesInCrc: Boolean): Either[String, VersionChecksum]

    Incrementally derive checksum for the just-committed or about-to-be committed snapshot.

    spark

    The SparkSession

    deltaLog

    The DeltaLog

    versionToCompute

    The version for which we want to compute the checksum

    actions

    The actions corresponding to the version versionToCompute

    metadata

    The metadata corresponding to the version versionToCompute

    protocol

    The protocol corresponding to the version versionToCompute

    operationName

    The operation name corresponding to the version versionToCompute

    txnIdOpt

    The transaction identifier for the version versionToCompute

    previousVersionState

    Contains either the versionChecksum corresponding to versionToCompute - 1 or a snapshot. Note that the snapshot may belong to any version and this method will only use the snapshot if it corresponds to versionToCompute - 1.

    includeAddFilesInCrc

    True if the new checksum should include AddFiles.

    returns

    Either the new checksum or an error code string if the checksum could not be computed.

    Definition Classes
    RecordChecksum
  67. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  68. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  69. val isBlindAppend: Boolean

    True if this transaction is a blind append. This is only valid after commit.

    Attributes
    protected[delta]
  70. def isIdentityOnlyMetadataUpdate(): Boolean
  71. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  72. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  73. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  74. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  75. def logDebug(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  76. def logDebug(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  77. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  78. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  79. def logError(msg: MessageWithContext, throwable: Throwable): Unit
  80. def logError(msg: MessageWithContext): Unit
  81. def logError(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  82. def logError(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  83. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  84. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  85. def logInfo(msg: MessageWithContext): Unit
  86. def logInfo(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  87. def logInfo(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  88. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  89. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  90. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  91. lazy val logPrefix: MessageWithContext
    Attributes
    protected
  92. def logTrace(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  93. def logTrace(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  94. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  95. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  96. def logWarning(msg: MessageWithContext, throwable: Throwable): Unit
  97. def logWarning(msg: MessageWithContext): Unit
  98. def logWarning(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  99. def logWarning(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  100. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  101. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  102. def makeOutputNullable(output: Seq[Attribute]): Seq[Attribute]

    Makes the output attributes nullable, so that we don't write unreadable parquet files.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  103. def mapColumnAttributes(output: Seq[Attribute], mappingMode: DeltaColumnMappingMode): Seq[Attribute]

    Replace the output attributes with the physical mapping information.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  104. def metadata: Metadata

    Returns the metadata for this transaction. The metadata refers to the metadata of the snapshot at the transaction's read version unless updated during the transaction.

    Definition Classes
    OptimisticTransactionImpl → TransactionalWrite
  105. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  106. val newMetadata: Option[Metadata]

    Stores the updated metadata (if any) that will result from this txn.

    This is just one way to change metadata. New metadata can also be added during commit from actions. But metadata should *not* be updated via both paths.

    Attributes
    protected
  107. val newProtocol: Option[Protocol]

    Stores the updated protocol (if any) that will result from this txn.

    Attributes
    protected
  108. def normalizeData(deltaLog: DeltaLog, options: Option[DeltaOptions], data: Dataset[_]): (QueryExecution, Seq[Attribute], Seq[Constraint], Set[String])

    Normalize the schema of the query, and return the QueryExecution to execute. If the table has generated columns and users provide these columns in the output, we will also return constraints that should be respected. If any constraints are returned, the caller should apply these constraints when writing data.

    Note: The output attributes of the QueryExecution may not match the attributes we return as the output schema. This is because streaming queries create IncrementalExecution, which cannot be further modified. We can however have the Parquet writer use the physical plan from IncrementalExecution and the output schema provided through the attributes.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  109. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  110. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  111. val partitionsAddedToOpt: Option[HashSet[Map[String, String]]]

    The set of distinct partitions that contain files added by the current transaction.

    Attributes
    protected[delta]
  112. def performCDCPartition(inputData: Dataset[_]): (DataFrame, StructType)

    Returns a tuple of (data, partition schema). For CDC writes, an is_cdc column is added to the data, and is_cdc=true/false is added to the front of the partition schema.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  113. def performCdcMetadataCheck(): Unit

    Checks if the new schema contains any CDC columns (which is invalid) and throws the appropriate error.

    Attributes
    protected
  114. def performRedirectCheck(op: Operation): Unit

    This method determines whether op is valid when the table redirect feature is set on the current table. 1. If the redirect table feature is in the in-progress state, no DML/DDL is allowed to execute. 2. If the user tries to access a redirect source table, only the operations listed inside no-redirect-rules are valid.

    Attributes
    protected
  115. val postCommitHooks: ArrayBuffer[PostCommitHook]
    Attributes
    protected
  116. def precommitUpdateSchemaWithIdentityHighWaterMarks(): Unit
  117. def prepareCommit(actions: Seq[Action], op: Operation): Seq[Action]

    Prepare for a commit by doing all necessary pre-commit checks and modifications to the actions.

    returns

    The finalized set of actions.

    Attributes
    protected
  118. def protocol: Protocol

    The protocol of the snapshot that this transaction is reading from.

    Definition Classes
    OptimisticTransactionImpl → TransactionalWrite
  119. val readFiles: HashSet[AddFile]

    Tracks specific files that have been seen by this transaction.

    Attributes
    protected
  120. val readPredicates: ConcurrentLinkedQueue[DeltaTableReadPredicate]

    Tracks the data that could have been seen by recording the partition predicates by which files have been queried by this transaction.

    Attributes
    protected
  121. val readSnapshots: ConcurrentHashMap[(String, Path), Snapshot]

    Tracks the first-access snapshots of other Delta logs read by this transaction. The snapshots are keyed by the log's unique id.

    Attributes
    protected
  122. val readTheWholeTable: Boolean

    Whether the whole table was read during the transaction.

    Attributes
    protected
  123. val readTxn: ArrayBuffer[String]

    Tracks the appIds that have been seen by this transaction.

    Attributes
    protected
  124. def readVersion: Long

    The version that this transaction is reading from.

  125. def readWholeTable(): Unit

    Mark the entire table as tainted by this transaction.

  126. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit

    Used to record the occurrence of a single event or to report detailed, operation-specific statistics.

    path

    Used to log the path of the delta table when deltaLog is null.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  127. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a deltaLog.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  128. def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  129. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  130. def recordFrameProfile[T](group: String, name: String)(thunk: ⇒ T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  131. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: ⇒ S): S
    Definition Classes
    DatabricksLogging
  132. def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  133. def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  134. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  135. def registerPostCommitHook(hook: PostCommitHook): Unit

    Register a hook that will be executed once a commit is successful.
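    The registration pattern can be sketched with a simplified stand-in for PostCommitHook: hooks are collected while the transaction runs and fire once, after a successful commit, with the committed version. The trait and runner names here are illustrative, not the real Delta classes.

```scala
// Simplified model of the post-commit hook pattern (names are illustrative).
trait SimplePostCommitHook {
  def name: String
  def run(committedVersion: Long): Unit
}

class HookRunner {
  private val hooks = scala.collection.mutable.ArrayBuffer.empty[SimplePostCommitHook]
  private val fired = scala.collection.mutable.ArrayBuffer.empty[String]

  // Mirrors registerPostCommitHook: collect hooks during the transaction.
  def registerPostCommitHook(hook: SimplePostCommitHook): Unit = hooks += hook

  // Mirrors runPostCommitHooks: each hook sees the committed version.
  def runPostCommitHooks(version: Long): Seq[String] = {
    hooks.foreach { h => h.run(version); fired += h.name }
    fired.toSeq
  }
}

val runner = new HookRunner
runner.registerPostCommitHook(new SimplePostCommitHook {
  val name = "checkpoint"
  def run(v: Long): Unit = () // e.g. trigger a checkpoint at version v
})
val executed = runner.runPostCommitHooks(12L)
```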

  136. def registerSQLMetrics(spark: SparkSession, metrics: Map[String, SQLMetric]): Unit

    Register SQL metrics for an operation by appending the supplied metrics map to the operationSQLMetrics map.

    Definition Classes
    SQLMetricsReporting
  137. def registerTableForCoordinatedCommitsIfNeeded(finalMetadata: Metadata, finalProtocol: Protocol): Option[Map[String, String]]

    This method registers the table with the commit-coordinator via the CommitCoordinatorClient if the table is transitioning from a file-system-based table to a coordinated-commits table.

    finalMetadata

    the effective Metadata of the table. Note that this refers to the new metadata if this commit is updating the table Metadata.

    finalProtocol

    the effective Protocol of the table. Note that this refers to the new protocol if this commit is updating the table Protocol.

    returns

    The new coordinated-commits table metadata if the table is transitioning from a file-system-based table to a coordinated-commits table; otherwise, None. This metadata should be added to Metadata.configuration before doing the commit.

    Attributes
    protected
  138. def reportAutoCompactStatsError(e: Throwable): Unit
  139. def runPostCommitHook(hook: PostCommitHook, version: Long, postCommitSnapshot: Snapshot, committedActions: Seq[Action]): Unit
    Attributes
    protected
  140. def runPostCommitHooks(version: Long, postCommitSnapshot: Snapshot, committedActions: Seq[Action]): Unit

    Executes the registered post commit hooks.

    Attributes
    protected
  141. def setNeedsCheckpoint(committedVersion: Long, postCommitSnapshot: Snapshot): Unit

    Sets needsCheckpoint if we should checkpoint the version that has just been committed.

    Attributes
    protected
  142. def setSyncIdentity(): Unit
  143. def setTrackHighWaterMarks(track: Set[String]): Unit
  144. val shouldVerifyIncrementalCommit: Boolean
    Attributes
    protected
  145. def skipRecordingEmptyCommitAllowed(isolationLevelToUse: IsolationLevel): Boolean

    Whether to skip recording the commit in DeltaLog

    Attributes
    protected
  146. val snapshotToScan: Snapshot

    The snapshot that the scan is being generated on.

    Definition Classes
    OptimisticTransactionImplDeltaScanGenerator
  147. def spark: SparkSession
    Attributes
    protected
    Definition Classes
    OptimisticTransactionImplRecordChecksum
  148. def split(readFilesSubset: Seq[AddFile]): OptimisticTransaction

    Splits a transaction into smaller child transactions that operate on disjoint sets of the files read by the parent transaction.

    Splits a transaction into smaller child transactions that operate on disjoint sets of the files read by the parent transaction. This function is typically used when you want to break a large operation into smaller pieces that can be committed separately / incrementally.

    readFilesSubset

    The subset of files read by the current transaction that will be handled by the new transaction.
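    The disjointness requirement can be illustrated without Spark: the files read by a large transaction are partitioned into batches, each of which could then be handed to split(batch) and committed incrementally. AddFile is stood in for by a plain path string here, and the batching helper is hypothetical.

```scala
// Illustrative only: partition the files read by a parent transaction into
// disjoint batches, each a candidate argument for txn.split(batch).
def disjointBatches(readFiles: Seq[String], batchSize: Int): Seq[Seq[String]] =
  readFiles.grouped(batchSize).toSeq

val files = (1 to 10).map(i => s"part-$i.parquet")
val batches = disjointBatches(files, 4)

// Every file lands in exactly one batch, so the child transactions
// operate on disjoint sets, as split() requires.
val sizes = batches.map(_.size)
```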

  149. def statsCollector: Column

    Gets the stats collector for the table at the snapshot this transaction has.

  150. val syncIdentity: Boolean
    Attributes
    protected
  151. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  152. def toString(): String
    Definition Classes
    AnyRef → Any
  153. def trackFilesRead(files: Seq[AddFile]): Unit

    Mark the given files as read within this transaction.

  154. val trackHighWaterMarks: Option[Set[String]]
    Attributes
    protected
  155. def trackReadPredicates(filters: Seq[Expression], partitionOnly: Boolean = false, shouldRewriteFilter: Boolean = true): Unit

    Mark the predicates that have been queried by this transaction.
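    A toy model of this predicate tracking, with predicates reduced to (column, value) equality pairs: the transaction records what it scanned with, and a concurrently committed file conflicts only if its partition values match a tracked predicate. The class and method names below are illustrative, not Delta's real conflict-detection code.

```scala
// Toy model of read-predicate tracking for conflict detection.
final class ReadPredicateTracker {
  private val predicates = scala.collection.mutable.ArrayBuffer.empty[(String, String)]

  // Record a partition predicate this transaction scanned with.
  def trackReadPredicate(column: String, value: String): Unit =
    predicates += ((column, value))

  // A concurrently committed file conflicts if its partition values
  // satisfy any tracked predicate.
  def conflictsWith(partitionValues: Map[String, String]): Boolean =
    predicates.exists { case (c, v) => partitionValues.get(c).contains(v) }
}

val tracker = new ReadPredicateTracker
tracker.trackReadPredicate("date", "2024-01-01")
val hit  = tracker.conflictsWith(Map("date" -> "2024-01-01")) // partition we read
val miss = tracker.conflictsWith(Map("date" -> "2024-02-02")) // untouched partition
```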

  156. def txnExecutionTimeMs: Option[Long]

    The end to end execution time of this transaction.

  157. val txnId: String

    Unique identifier for the transaction

  158. val txnStartNano: Long

    The transaction start time.

    Attributes
    protected
  159. def txnStartTimeNs: Long

    Start time of txn in nanoseconds

  160. def txnVersion(id: String): Long

    Returns the latest version that has committed for the idempotent transaction with the given id.

  161. def updateAndCheckpoint(spark: SparkSession, deltaLog: DeltaLog, commitSize: Int, attemptVersion: Long, commit: Commit, txnId: String): Snapshot

    Update the table now that the commit has been made, and write a checkpoint.

    Attributes
    protected
  162. def updateMetadata(proposedNewMetadata: Metadata, ignoreDefaultProperties: Boolean = false): Unit

    Records an update to the metadata that should be committed with this transaction.

    Records an update to the metadata that should be committed with this transaction. Note that this must be done before writing out any files so that file writing and checks happen with the final metadata for the table.

    IMPORTANT: It is the responsibility of the caller to ensure that files currently present in the table are still valid under the new metadata.
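    The ordering rule above (metadata first, then files) can be modelled with a small guard object; the class below is a sketch of the invariant, not the real transaction, and its names are illustrative.

```scala
// Minimal model of the rule that metadata updates must land before any
// files are written, so writes are validated against the final metadata.
final class TxnOrderGuard {
  private var filesWritten = false
  private var metadata: Option[String] = None

  def updateMetadata(m: String): Unit = {
    require(!filesWritten, "cannot change metadata after files have been written")
    metadata = Some(m)
  }

  def writeFiles(data: Seq[Int]): Seq[Int] = {
    filesWritten = true
    data // stand-in for the FileActions a real write would return
  }
}

val txn = new TxnOrderGuard
txn.updateMetadata("schema-v2")            // allowed: no files written yet
val actions = txn.writeFiles(Seq(1, 2, 3))
val lateUpdateFails =
  try { txn.updateMetadata("schema-v3"); false }
  catch { case _: IllegalArgumentException => true }
```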

  163. def updateMetadataAndProtocolWithRequiredFeatures(metadataOpt: Option[Metadata]): Unit

    A metadata update can enable a feature that requires a protocol upgrade.

    A metadata update can enable a feature that requires a protocol upgrade. Furthermore, a feature can have dependencies on other features. This method enables the dependent features in the metadata. It then updates the protocol with the features enabled by the metadata. The global newMetadata and newProtocol are updated with the new metadata and protocol if needed.

    metadataOpt

    The new metadata that is being set.

    Attributes
    protected
  164. def updateMetadataForNewTable(metadata: Metadata): Unit

    Records an update to the metadata that should be committed with this transaction when the transaction is logically creating a new table, e.g. replacing a previous table with new metadata.

    Records an update to the metadata that should be committed with this transaction when the transaction is logically creating a new table, e.g. replacing a previous table with new metadata. Note that this must be done before writing out any files so that file writing and checks happen with the final metadata for the table. IMPORTANT: It is the responsibility of the caller to ensure that files currently present in the table are still valid under the new metadata.

  165. def updateMetadataForNewTableInReplace(metadata: Metadata): Unit

    Updates the metadata of the target table in an effective REPLACE command.

    Updates the metadata of the target table in an effective REPLACE command. Note that replacing a table is similar to dropping and recreating it; however, the backing catalog object does not change. For now, for Coordinated Commits tables, this function retains the coordinator details (and other associated Coordinated Commits properties) from the original table during a REPLACE. If the table had a coordinator, existing ICT properties are also retained; otherwise, default ICT properties are included. TODO (YumingxuanGuo): Remove this once the exact semantics of default Coordinated Commits configurations are finalized.

  166. def updateMetadataForTableOverwrite(proposedNewMetadata: Metadata): Unit

    Records an update to the metadata that should be committed with this transaction when the transaction is attempting to overwrite the data and schema using .mode('overwrite') and .option('overwriteSchema', true).

    Records an update to the metadata that should be committed with this transaction when the transaction is attempting to overwrite the data and schema using .mode('overwrite') and .option('overwriteSchema', true). Replacing the table (REPLACE) is not considered in this category, because it is logically equivalent to dropping and recreating the table.

  167. def updateMetadataInternal(proposedNewMetadata: Metadata, ignoreDefaultProperties: Boolean): Unit

    Does the actual checks and work to update the metadata and save it into the newMetadata field, which will be added to the actions to commit in prepareCommit.

    Attributes
    protected
  168. def updateMetadataWithCoordinatedCommitsConfs(): Boolean

    This method makes the necessary changes to Metadata based on coordinated-commits: If the table is being converted from file-system to coordinated commits, then it registers the table with the commit-coordinator and updates the Metadata with the necessary configuration information from the commit-coordinator.

    returns

    A boolean indicating whether the table Metadata was updated with coordinated-commits information; false if no changes were made.

    Attributes
    protected
  169. def updateMetadataWithInCommitTimestamp(commitInfo: CommitInfo): Boolean

    This method makes the necessary changes to Metadata based on ICT: If ICT is getting enabled as part of this commit, then it updates the Metadata with the ICT enablement information.

    commitInfo

    commitInfo for the commit

    returns

    true if changes were made to the Metadata, false otherwise.

    Attributes
    protected
  170. def updateProtocol(protocol: Protocol): Unit

    This updates the protocol for the table with a given protocol.

    This updates the protocol for the table with a given protocol. Note that the protocol set by this method can be overwritten by other methods, such as updateMetadata.

  171. def updateSetTransaction(appId: String, version: Long, lastUpdate: Option[Long]): Unit

    Record a SetTransaction action that will be committed as part of this transaction.
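    Together with txnVersion, this supports idempotent writes: a writer skips its work when the transaction's appId has already committed the given version, and otherwise records a new (appId, version) pair with the commit. The sketch below simulates that pattern with an in-memory map standing in for SetTransaction actions recovered from the log; the class and helper names are illustrative.

```scala
// Sketch of the idempotent-write pattern backed by txnVersion /
// updateSetTransaction. The map stands in for SetTransaction state.
final class IdempotenceLedger {
  private val latest = scala.collection.mutable.Map.empty[String, Long]

  // Mirrors txnVersion(id): latest committed version for appId, or -1.
  def txnVersion(appId: String): Long = latest.getOrElse(appId, -1L)

  // Mirrors updateSetTransaction: record (appId, version) with the commit.
  def updateSetTransaction(appId: String, version: Long): Unit =
    latest(appId) = version
}

val ledger = new IdempotenceLedger

def writeOnce(appId: String, version: Long)(write: => Unit): Boolean =
  if (ledger.txnVersion(appId) >= version) false // already committed: no-op
  else { write; ledger.updateSetTransaction(appId, version); true }

var writes = 0
val first  = writeOnce("streaming-app", 5L) { writes += 1 }
val replay = writeOnce("streaming-app", 5L) { writes += 1 } // replayed batch
```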

  172. val updatedIdentityHighWaterMarks: ArrayBuffer[(String, Long)]
    Attributes
    protected
  173. def validateCoordinatedCommitsConfInMetadata(newMetadataOpt: Option[Metadata]): Unit
    Attributes
    protected
  174. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  175. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  176. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  177. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T

    Reports a log message to indicate that some command is running.

    Definition Classes
    DeltaProgressReporter
  178. def writeChecksumFile(txnId: String, snapshot: Snapshot): Unit
    Attributes
    protected
    Definition Classes
    RecordChecksum
  179. def writeCommitFile(attemptVersion: Long, jsonActions: Iterator[String], currentTransactionInfo: CurrentTransactionInfo): (Option[VersionChecksum], Commit)

    Writes the json actions provided to the commit file corresponding to attemptVersion.

    Writes the json actions provided to the commit file corresponding to attemptVersion. If coordinated-commits are enabled, this method must return a non-empty Commit since we can't guess it from the FileSystem.

    Attributes
    protected
  180. def writeCommitFileImpl(attemptVersion: Long, jsonActions: Iterator[String], tableCommitCoordinatorClient: TableCommitCoordinatorClient, currentTransactionInfo: CurrentTransactionInfo): Commit
    Attributes
    protected
  181. def writeFiles(inputData: Dataset[_], writeOptions: Option[DeltaOptions], isOptimize: Boolean, additionalConstraints: Seq[Constraint]): Seq[FileAction]

    Writes out the dataframe after performing schema validation.

    Writes out the dataframe after performing schema validation. Returns a list of actions to append these files to the reservoir.

    inputData

    Data to write out.

    writeOptions

    Options to decide how to write out the data.

    isOptimize

    Whether the operation writing this is Optimize or not.

    additionalConstraints

    Additional constraints on the write.

    Definition Classes
    TransactionalWrite
  182. def writeFiles(data: Dataset[_], deltaOptions: Option[DeltaOptions], additionalConstraints: Seq[Constraint]): Seq[FileAction]
    Definition Classes
    TransactionalWrite
  183. def writeFiles(data: Dataset[_]): Seq[FileAction]
    Definition Classes
    TransactionalWrite
  184. def writeFiles(data: Dataset[_], writeOptions: Option[DeltaOptions]): Seq[FileAction]
    Definition Classes
    TransactionalWrite
  185. def writeFiles(data: Dataset[_], additionalConstraints: Seq[Constraint]): Seq[FileAction]
    Definition Classes
    TransactionalWrite

Inherited from RecordChecksum

Inherited from DeltaScanGenerator

Inherited from SQLMetricsReporting

Inherited from TransactionalWrite

Inherited from DeltaLogging

Inherited from DatabricksLogging

Inherited from DeltaProgressReporter

Inherited from LoggingShims

Inherited from Logging

Inherited from AnyRef

Inherited from Any

Ungrouped