
org.apache.spark.sql.delta

OptimisticTransactionImpl

trait OptimisticTransactionImpl extends TransactionalWrite with SQLMetricsReporting with DeltaScanGenerator with DeltaLogging

Used to perform a set of reads in a transaction and then commit a set of updates to the state of the log. All reads from the DeltaLog MUST go through this instance rather than directly to the DeltaLog; otherwise they will not be checked for logical conflicts with concurrent updates.

This trait is not thread-safe.
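
A hedged sketch of the intended usage (the table path is hypothetical, and the `DeltaLog.withNewTransaction` helper is assumed to be available in your Delta version): all reads and the final commit flow through the transaction so conflicts can be detected.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.delta.{DeltaLog, DeltaOperations}

val spark = SparkSession.builder().appName("txn-sketch").getOrCreate()
// Hypothetical table path, for illustration only.
val deltaLog = DeltaLog.forTable(spark, new Path("/tmp/delta/events"))

deltaLog.withNewTransaction { txn =>
  // Reads go through the transaction, so they are recorded for conflict checking.
  val files = txn.filterFiles()
  // ... compute new AddFile/RemoveFile actions from `files` ...
  txn.commit(Seq.empty, DeltaOperations.ManualUpdate)
}
```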

Inherited
  1. OptimisticTransactionImpl
  2. DeltaScanGenerator
  3. SQLMetricsReporting
  4. TransactionalWrite
  5. DeltaLogging
  6. DatabricksLogging
  7. DeltaProgressReporter
  8. Logging
  9. AnyRef
  10. Any

Type Members

  1. class DisabledAutoCompactPartitionStatsCollector extends AutoCompactPartitionStatsCollector

    A subclass of AutoCompactPartitionStatsCollector that's to be used if the config to collect auto compaction stats is turned off. This subclass intentionally does nothing.

  2. class FileSystemBasedCommitOwnerClient extends CommitOwnerClient

Abstract Value Members

  1. abstract val catalogTable: Option[CatalogTable]
  2. abstract val deltaLog: DeltaLog
  3. abstract val snapshot: Snapshot

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. val actions: ArrayBuffer[Action]

    Tracks actions within the transaction; these will be committed along with the passed-in actions in the commit function.

    Attributes
    protected
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def assertMetadata(metadata: Metadata): Unit
    Attributes
    protected
  7. def canDowngradeToSnapshotIsolation(preparedActions: Seq[Action], op: Operation): Boolean
    Attributes
    protected
  8. def canUpdateMetadata: Boolean

    Can this transaction still update the metadata? This is allowed only once per transaction.

  9. val checkDeletionVectorFilesHaveWideBounds: Boolean
    Attributes
    protected
  10. def checkForConflicts(checkVersion: Long, currentTransactionInfo: CurrentTransactionInfo, attemptNumber: Int, commitIsolationLevel: IsolationLevel): (Long, CurrentTransactionInfo)

    Looks at actions that have happened since the txn started and checks for logical conflicts with the reads/writes. Resolves conflicts and returns a tuple representing the commit version to attempt next and the commit summary which we need to commit.

    Attributes
    protected
  11. def checkForConflictsAgainstVersion(currentTransactionInfo: CurrentTransactionInfo, otherCommitFileStatus: FileStatus, commitIsolationLevel: IsolationLevel): CurrentTransactionInfo
    Attributes
    protected
  12. def checkForSetTransactionConflictAndDedup(actions: Seq[Action]): Seq[Action]

    Checks if the passed-in actions have internal SetTransaction conflicts, and will throw exceptions in case of conflicts. This function will also remove duplicated SetTransactions.

    Attributes
    protected
  13. def checkNoColumnDefaults(op: Operation): Unit

    If the operation assigns or modifies column default values, this method checks that the corresponding table feature is enabled and throws an error if not.

    Attributes
    protected
  14. def checkPartitionColumns(partitionSchema: StructType, output: Seq[Attribute], colsDropped: Boolean): Unit
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  15. val checkUnsupportedDataType: Boolean

    Whether to check for unsupported data types when updating the table schema.

    Attributes
    protected
  16. def clock: Clock
  17. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  18. def collectAutoOptimizeStats(numAdd: Long, numRemove: Long, actions: Iterator[Action]): Unit
  19. def commit(actions: Seq[Action], op: Operation, tags: Map[String, String]): Long

    Modifies the state of the log by adding a new commit that is based on a read at readVersion. In the case of a conflict with a concurrent writer this method will throw an exception.

    actions

    Set of actions to commit

    op

    Details of operation that is performing this transactional commit

    tags

    Extra tags to set to the CommitInfo action
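
    A hedged sketch of a tagged commit with basic conflict handling (`txn` is assumed to be an open transaction on the table, and `newAddFiles` an illustrative set of actions computed earlier in the transaction):

```scala
import java.util.ConcurrentModificationException
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.delta.DeltaOperations

try {
  val version = txn.commit(
    newAddFiles,                           // actions computed within this txn
    DeltaOperations.Write(SaveMode.Append),
    Map("pipeline" -> "nightly-load"))     // extra tags stored on CommitInfo
  println(s"Committed as version $version")
} catch {
  // A logical conflict with a concurrent writer surfaces as a concurrent
  // modification exception; the whole transaction must be retried.
  case e: ConcurrentModificationException =>
    // retry from a fresh snapshot
}
```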

  20. def commit(actions: Seq[Action], op: Operation): Long

    Modifies the state of the log by adding a new commit that is based on a read at readVersion. In the case of a conflict with a concurrent writer this method will throw an exception.

    actions

    Set of actions to commit

    op

    Details of operation that is performing this transactional commit

  21. val commitAttemptStartTimeMillis: Long

    Tracks the start time since we started trying to write a particular commit. Used for logging duration of retried transactions.

    Attributes
    protected
  22. val commitEndNano: Long

    The transaction commit end time.

    Attributes
    protected
  23. def commitIfNeeded(actions: Seq[Action], op: Operation, tags: Map[String, String] = Map.empty): Unit

    Modifies the state of the log by adding a new commit that is based on a read at readVersion. In the case of a conflict with a concurrent writer this method will throw an exception.

    Also skips creating the commit if the configured IsolationLevel doesn't need us to record the commit from a correctness perspective.

  24. def commitImpl(actions: Seq[Action], op: Operation, canSkipEmptyCommits: Boolean, tags: Map[String, String]): Option[Long]
    Attributes
    protected
    Annotations
    @throws(classOf[ConcurrentModificationException])
  25. val commitInfo: CommitInfo
    Attributes
    protected
  26. def commitLarge(spark: SparkSession, nonProtocolMetadataActions: Iterator[Action], newProtocolOpt: Option[Protocol], op: Operation, context: Map[String, String], metrics: Map[String, String]): (Long, Snapshot)

    Create a large commit on the Delta log by directly writing an iterator of FileActions to the LogStore. This function only commits the next possible version and will not check whether the commit is retry-able. If the next version has already been committed, then this function will fail. This bypasses all optimistic concurrency checks. We assume that transaction conflicts should be rare because this method is typically used to create new tables (e.g. CONVERT TO DELTA) or apply some commands which rarely receive other transactions (e.g. CLONE/RESTORE). In addition, the expectation is that the list of actions performed by the transaction remains an iterator and is never materialized, given the nature of a large commit potentially touching many files. The nonProtocolMetadataActions parameter should contain only non-{protocol, metadata} actions. If the protocol of the table needs to be updated, it should be passed in the newProtocolOpt parameter.

  27. val commitStartNano: Long

    The transaction commit start time.

    Attributes
    protected
  28. val committed: Boolean

    Tracks if this transaction has already committed.

    Attributes
    protected
  29. def containsPostCommitHook(hook: PostCommitHook): Boolean
  30. def convertEmptyToNullIfNeeded(plan: SparkPlan, partCols: Seq[Attribute], constraints: Seq[Constraint]): SparkPlan

    If there is any string partition column and there are constraints defined, add a projection to convert empty string to null for that column. The empty strings will be converted to null eventually even without this convert, but we want to do this earlier before check constraints so that empty strings are correctly rejected. Note that this should not cause the downstream logic in FileFormatWriter to add duplicate conversions because the logic there checks the partition column using the original plan's output. When the plan is modified with additional projections, the partition column check won't match and will not add more conversion.

    plan

    The original SparkPlan.

    partCols

    The partition columns.

    constraints

    The defined constraints.

    returns

    A SparkPlan potentially modified with an additional projection on top of plan

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  31. def createAutoCompactStatsCollector(): AutoCompactPartitionStatsCollector
  32. def deltaAssert(check: => Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit

    Helper method to check invariants in Delta code. Fails when running in tests, records a delta assertion event and logs a warning otherwise.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  33. def disableDeletionVectorFilesHaveWideBoundsCheck(): Unit

    Disable the check that ensures that all files with DVs added have tightBounds set to false.

    This is necessary when recomputing the stats on a table with DVs.

  34. def doCommit(attemptVersion: Long, currentTransactionInfo: CurrentTransactionInfo, attemptNumber: Int, isolationLevel: IsolationLevel): Snapshot

    Commit actions using attemptVersion version number. Throws a FileAlreadyExistsException if any conflicts are detected.

    returns

    the post-commit snapshot of the deltaLog

    Attributes
    protected
  35. def doCommitRetryIteratively(attemptVersion: Long, currentTransactionInfo: CurrentTransactionInfo, isolationLevel: IsolationLevel): (Long, Snapshot, CurrentTransactionInfo)

    Commit the txn represented by currentTransactionInfo using attemptVersion version number. If there are any conflicts that are found, we will retry a fixed number of times.

    returns

    the real version that was committed, the postCommitSnapshot, and the txn info. NOTE: The postCommitSnapshot may not be the same as the version committed if racing commits were written while we updated the snapshot.

    Attributes
    protected
  36. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  37. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  38. val executionObserver: TransactionExecutionObserver

    Contains the execution instrumentation set via thread-local. No-op by default.

    Attributes
    protected[delta]
  39. def filesForScan(limit: Long, partitionFilters: Seq[Expression]): DeltaScan

    Returns a DeltaScan based on the given partition filters, projections and limits.

    Definition Classes
    OptimisticTransactionImpl → DeltaScanGenerator
  40. def filesForScan(filters: Seq[Expression], keepNumRecords: Boolean = false): DeltaScan

    Returns a DeltaScan based on the given filters.

    Definition Classes
    OptimisticTransactionImpl → DeltaScanGenerator
  41. def filesWithStatsForScan(partitionFilters: Seq[Expression]): DataFrame

    Returns a DataFrame for the given partition filters. The schema of the returned DataFrame is nearly the same as AddFile, except that the stats field is parsed to a struct from a json string.

    Definition Classes
    OptimisticTransactionImpl → DeltaScanGenerator
  42. def filterFiles(partitions: Set[Map[String, String]]): Seq[AddFile]

    Returns files within the given partitions.

    partitions is a set of the partitionValues stored in AddFiles. This means they refer to the physical column names, and values are stored as strings.
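
    A small sketch (the partition column name and values are hypothetical); the map keys are physical column names and the values are string-encoded partition values, matching the partitionValues field of AddFile:

```scala
// Select files belonging to two specific date partitions.
val partitions: Set[Map[String, String]] =
  Set(Map("date" -> "2024-01-01"), Map("date" -> "2024-01-02"))
val matchingFiles = txn.filterFiles(partitions)
```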

  43. def filterFiles(filters: Seq[Expression], keepNumRecords: Boolean = false): Seq[AddFile]

    Returns files matching the given predicates.

  44. def filterFiles(): Seq[AddFile]

    Returns all files in the table.

  45. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable])
  46. def generateInCommitTimestampForFirstCommitAttempt(currentTimestamp: Long): Option[Long]

    Generates a timestamp which is greater than the commit timestamp of the last snapshot. Note that this is only needed when the feature inCommitTimestamps is enabled.

    Attributes
    protected[delta]
  47. def getAssertDeletionVectorWellFormedFunc(spark: SparkSession, op: Operation): (Action) => Unit

    Must make sure that deletion vectors are never added to a table where that isn't allowed. Note, statistics recomputation is still allowed even though DVs might be currently disabled.

    This method returns a function that can be used to validate a single Action.

    Attributes
    protected
  48. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  49. def getCommitter(outputPath: Path): DelayedCommitProtocol
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  50. def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
    Definition Classes
    DeltaLogging
  51. def getConflictingVersions(previousAttemptVersion: Long): Seq[FileStatus]

    Returns the conflicting commit information.

    Attributes
    protected
  52. def getDeltaScanGenerator(index: TahoeLogFileIndex): DeltaScanGenerator

    Returns the DeltaScanGenerator for the given log, which will be used to generate DeltaScans. Every time this method is called on a log, the returned generator will read a snapshot that is pinned on the first access for that log.

    Internally, if the given log is the same as the log associated with this transaction, then it returns this transaction; otherwise it will return a snapshot of the given log.

  53. def getErrorData(e: Throwable): Map[String, Any]
    Definition Classes
    DeltaLogging
  54. def getIsolationLevelToUse(preparedActions: Seq[Action], op: Operation): IsolationLevel
    Attributes
    protected
  55. def getMetric(name: String): Option[SQLMetric]

    Returns the metric with name registered for the given transaction if it exists.

    Definition Classes
    SQLMetricsReporting
  56. def getMetricsForOperation(operation: Operation): Map[String, String]

    Get the metrics for an operation based on collected SQL Metrics and filtering out the ones based on the metric parameters for that operation.

    Definition Classes
    SQLMetricsReporting
  57. def getOperationMetrics(op: Operation): Option[Map[String, String]]

    Return the operation metrics for the operation if metrics reporting is enabled.

  58. def getOptionalStatsTrackerAndStatsCollection(output: Seq[Attribute], outputPath: Path, partitionSchema: StructType, data: DataFrame): (Option[DeltaJobStatisticsTracker], Option[StatisticsCollection])

    Return the pair of optional stats tracker and stats collection class.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  59. def getPartitioningColumns(partitionSchema: StructType, output: Seq[Attribute]): Seq[Attribute]
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  60. def getStatsColExpr(statsDataSchema: Seq[Attribute], statsCollection: StatisticsCollection): Expression
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  61. def getStatsSchema(dataFrameOutput: Seq[Attribute], partitionSchema: StructType): (Seq[Attribute], Seq[Attribute])

    Return a tuple of (outputStatsCollectionSchema, statsCollectionSchema). outputStatsCollectionSchema is the data source schema from the DataFrame used for stats collection. It contains the columns in the DataFrame output, excluding the partition columns. statsCollectionSchema is the schema to collect stats for. It contains the columns in the table schema, excluding the partition columns. Note: we only collect NULL_COUNT stats (as the number of rows) for the columns that are in statsCollectionSchema but missing from outputStatsCollectionSchema.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  62. def getUserMetadata(op: Operation): Option[String]

    Return the user-defined metadata for the operation.

  63. val hasWritten: Boolean
    Attributes
    protected
    Definition Classes
    TransactionalWrite
  64. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  65. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  66. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  67. val isBlindAppend: Boolean

    True if this transaction is a blind append. This is only valid after commit.

    Attributes
    protected[delta]
  68. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  69. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  70. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  71. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  72. def logDebug(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  73. def logDebug(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  74. def logError(msg: => String, throwable: Throwable): Unit
    Definition Classes
    OptimisticTransactionImpl → Logging
  75. def logError(msg: => String): Unit
    Definition Classes
    OptimisticTransactionImpl → Logging
  76. def logInfo(msg: => String): Unit
    Definition Classes
    OptimisticTransactionImpl → Logging
  77. def logInfo(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  78. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  79. lazy val logPrefix: String
    Attributes
    protected
  80. def logTrace(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  81. def logTrace(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  82. def logWarning(msg: => String, throwable: Throwable): Unit
    Definition Classes
    OptimisticTransactionImpl → Logging
  83. def logWarning(msg: => String): Unit
    Definition Classes
    OptimisticTransactionImpl → Logging
  84. def makeOutputNullable(output: Seq[Attribute]): Seq[Attribute]

    Makes the output attributes nullable, so that we don't write unreadable parquet files.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  85. def mapColumnAttributes(output: Seq[Attribute], mappingMode: DeltaColumnMappingMode): Seq[Attribute]

    Replace the output attributes with the physical mapping information.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  86. def metadata: Metadata

    Returns the metadata for this transaction. The metadata refers to the metadata of the snapshot at the transaction's read version unless updated during the transaction.

    Definition Classes
    OptimisticTransactionImpl → TransactionalWrite
  87. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  88. val newMetadata: Option[Metadata]

    Stores the updated metadata (if any) that will result from this txn.

    This is just one way to change metadata. New metadata can also be added during commit from actions. But metadata should *not* be updated via both paths.

    Attributes
    protected
  89. val newProtocol: Option[Protocol]

    Stores the updated protocol (if any) that will result from this txn.

    Attributes
    protected
  90. def normalizeData(deltaLog: DeltaLog, options: Option[DeltaOptions], data: Dataset[_]): (QueryExecution, Seq[Attribute], Seq[Constraint], Set[String])

    Normalize the schema of the query, and return the QueryExecution to execute. If the table has generated columns and users provide these columns in the output, we will also return constraints that should be respected. If any constraints are returned, the caller should apply these constraints when writing data.

    Note: The output attributes of the QueryExecution may not match the attributes we return as the output schema. This is because streaming queries create IncrementalExecution, which cannot be further modified. We can however have the Parquet writer use the physical plan from IncrementalExecution and the output schema provided through the attributes.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  91. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  92. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  93. val partitionsAddedToOpt: Option[HashSet[Map[String, String]]]

    The set of distinct partitions that contain files added by the current transaction.

    Attributes
    protected[delta]
  94. def performCDCPartition(inputData: Dataset[_]): (DataFrame, StructType)

    Returns a tuple of (data, partition schema). For CDC writes, an is_cdc column is added to the data and is_cdc=true/false is added to the front of the partition schema.

    Attributes
    protected
    Definition Classes
    TransactionalWrite
  95. def performCdcMetadataCheck(): Unit

    Checks if the new schema contains any CDC columns (which is invalid) and throws the appropriate error.

    Attributes
    protected
  96. val postCommitHooks: ArrayBuffer[PostCommitHook]
    Attributes
    protected
  97. def prepareCommit(actions: Seq[Action], op: Operation): Seq[Action]

    Prepare for a commit by doing all necessary pre-commit checks and modifications to the actions.

    returns

    The finalized set of actions.

    Attributes
    protected
  98. def protocol: Protocol

    The protocol of the snapshot that this transaction is reading at.

    Definition Classes
    OptimisticTransactionImpl → TransactionalWrite
  99. val readFiles: HashSet[AddFile]

    Tracks specific files that have been seen by this transaction.

    Attributes
    protected
  100. val readPredicates: ArrayBuffer[DeltaTableReadPredicate]

    Tracks the data that could have been seen by recording the partition predicates by which files have been queried by this transaction.

    Attributes
    protected
  101. val readSnapshots: ConcurrentHashMap[(String, Path), Snapshot]

    Tracks the first-access snapshots of other Delta logs read by this transaction. The snapshots are keyed by the log's unique id.

    Attributes
    protected
  102. val readTheWholeTable: Boolean

    Whether the whole table was read during the transaction.

    Attributes
    protected
  103. val readTxn: ArrayBuffer[String]

    Tracks the appIds that have been seen by this transaction.

    Attributes
    protected
  104. def readVersion: Long

    The version that this transaction is reading from.

  105. def readWholeTable(): Unit

    Mark the entire table as tainted by this transaction.

  106. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit

    Used to record the occurrence of a single event or report detailed, operation-specific statistics.

    path

    Used to log the path of the delta table when deltaLog is null.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  107. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A

    Used to report the duration as well as the success or failure of an operation on a deltaLog.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  108. def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A

    Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  109. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  110. def recordFrameProfile[T](group: String, name: String)(thunk: => T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  111. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: => S): S
    Definition Classes
    DatabricksLogging
  112. def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  113. def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  114. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  115. def registerPostCommitHook(hook: PostCommitHook): Unit

    Register a hook that will be executed once a commit is successful.
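
    A hedged sketch of a custom hook (the run signature below mirrors the runPostCommitHook member of this trait, but may differ between Delta versions):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.delta.{OptimisticTransactionImpl, Snapshot}
import org.apache.spark.sql.delta.actions.Action
import org.apache.spark.sql.delta.hooks.PostCommitHook

// A minimal hook that logs the committed version after a successful commit.
object AuditHook extends PostCommitHook {
  override val name: String = "auditHook"
  override def run(
      spark: SparkSession,
      txn: OptimisticTransactionImpl,
      committedVersion: Long,
      postCommitSnapshot: Snapshot,
      committedActions: Seq[Action]): Unit = {
    println(s"Table committed at version $committedVersion")
  }
}

txn.registerPostCommitHook(AuditHook)  // runs only if the commit succeeds
```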

  116. def registerSQLMetrics(spark: SparkSession, metrics: Map[String, SQLMetric]): Unit

    Register SQL metrics for an operation by appending the supplied metrics map to the operationSQLMetrics map.

    Definition Classes
    SQLMetricsReporting
  117. def registerTableForManagedCommitsIfNeeded(finalMetadata: Metadata, finalProtocol: Protocol): Option[Map[String, String]]

    This method registers the table with the commit-owner via the CommitOwnerClient if the table is transitioning from file-system based table to managed-commit table.

    finalMetadata

    the effective Metadata of the table. Note that this refers to the new metadata if this commit is updating the table Metadata.

    finalProtocol

    the effective Protocol of the table. Note that this refers to the new protocol if this commit is updating the table Protocol.

    returns

    The new managed-commit table metadata if the table is transitioning from file-system based table to managed-commit table. Otherwise, None. This metadata should be added to the Metadata.configuration before doing the commit.

    Attributes
    protected
  118. def reportAutoCompactStatsError(e: Throwable): Unit
  119. def runPostCommitHook(hook: PostCommitHook, version: Long, postCommitSnapshot: Snapshot, committedActions: Seq[Action]): Unit
    Attributes
    protected
  120. def runPostCommitHooks(version: Long, postCommitSnapshot: Snapshot, committedActions: Seq[Action]): Unit

    Executes the registered post commit hooks.

    Attributes
    protected
  121. def setNeedsCheckpoint(committedVersion: Long, postCommitSnapshot: Snapshot): Unit

    Sets needsCheckpoint if we should checkpoint the version that has just been committed.

    Attributes
    protected
  122. def skipRecordingEmptyCommitAllowed(isolationLevelToUse: IsolationLevel): Boolean

    Whether to skip recording the commit in the DeltaLog.

    Attributes
    protected
  123. val snapshotToScan: Snapshot

    The snapshot that the scan is being generated on.

    Definition Classes
    OptimisticTransactionImpl → DeltaScanGenerator
  124. def spark: SparkSession
    Attributes
    protected
  125. def statsCollector: Column

    Gets the stats collector for the table at the snapshot this transaction has.

  126. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  127. def toString(): String
    Definition Classes
    AnyRef → Any
  128. def trackFilesRead(files: Seq[AddFile]): Unit

    Mark the given files as read within this transaction.

  129. def trackReadPredicates(filters: Seq[Expression], partitionOnly: Boolean = false, shouldRewriteFilter: Boolean = true): Unit

    Mark the predicates that have been queried by this transaction.

  130. def txnExecutionTimeMs: Option[Long]

    The end-to-end execution time of this transaction.

  131. val txnId: String

    Unique identifier for the transaction.

  132. val txnStartNano: Long

    The transaction start time.

    Attributes
    protected
  133. def txnStartTimeNs: Long

    Start time of the transaction in nanoseconds.

  134. def txnVersion(id: String): Long

    Returns the latest version that has committed for the idempotent transaction with given id.

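    txnVersion is the read side of idempotent writes: together with updateSetTransaction (entry 143 below), it lets a writer skip a batch it has already committed for its application id. A stand-alone model of that contract, with hypothetical names (`IdempotentLog`) and the usual convention of returning -1 for an unseen id, might look like:

    ```scala
    // Illustrative model of idempotent writes via txnVersion /
    // SetTransaction. This is a stand-alone sketch, not the real
    // implementation: a writer skips a batch whose version it has
    // already committed for its appId.
    class IdempotentLog {
      private val latest = scala.collection.mutable.Map.empty[String, Long]

      // Analogous to txnVersion(id): latest committed version, or -1 if none.
      def txnVersion(appId: String): Long = latest.getOrElse(appId, -1L)

      // Analogous to updateSetTransaction followed by a commit.
      def commit(appId: String, version: Long): Boolean =
        if (version <= txnVersion(appId)) false // already applied, skip
        else { latest(appId) = version; true }
    }

    val log = new IdempotentLog
    val first = log.commit("streaming-query-42", 7L)  // applied
    val replay = log.commit("streaming-query-42", 7L) // duplicate, skipped
    ```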
  135. def updateAndCheckpoint(spark: SparkSession, deltaLog: DeltaLog, commitSize: Int, attemptVersion: Long, commit: Commit, txnId: String): Snapshot

    Update the table now that the commit has been made, and write a checkpoint.

    Attributes
    protected
  136. def updateMetadata(proposedNewMetadata: Metadata, ignoreDefaultProperties: Boolean = false): Unit

    Records an update to the metadata that should be committed with this transaction. Note that this must be done before writing out any files so that file writing and checks happen with the final metadata for the table.

    IMPORTANT: It is the responsibility of the caller to ensure that files currently present in the table are still valid under the new metadata.

  137. def updateMetadataForNewTable(metadata: Metadata): Unit

    Records an update to the metadata that should be committed with this transaction when this transaction is logically creating a new table, e.g. replacing a previous table with new metadata. Note that this must be done before writing out any files so that file writing and checks happen with the final metadata for the table.

    IMPORTANT: It is the responsibility of the caller to ensure that files currently present in the table are still valid under the new metadata.

  138. def updateMetadataForTableOverwrite(proposedNewMetadata: Metadata): Unit

    Records an update to the metadata that should be committed with this transaction when this transaction is attempting to overwrite the data and schema using .mode('overwrite') and .option('overwriteSchema', true). A REPLACE of the table is not considered part of this category, because that is logically equivalent to a DROP and RECREATE of the table.

  139. def updateMetadataInternal(proposedNewMetadata: Metadata, ignoreDefaultProperties: Boolean): Unit

    Does the actual checks and work to update the metadata and save it into the newMetadata field, which will be added to the actions to commit in prepareCommit.

    Attributes
    protected
  140. def updateMetadataWithInCommitTimestamp(commitInfo: CommitInfo): Boolean

    This method makes the necessary changes to Metadata based on ICT: if ICT is getting enabled as part of this commit, it updates the Metadata with the ICT enablement information.

    commitInfo

    commitInfo for the commit

    returns

    true if changes were made to Metadata, false otherwise.

    Attributes
    protected
  141. def updateMetadataWithManagedCommitConfs(): Boolean

    This method makes the necessary changes to Metadata based on managed-commit: if the table is being converted from file-system to managed commits, it registers the table with the commit-owner and updates the Metadata with the necessary configuration information from the commit-owner.

    returns

    A boolean indicating whether the table Metadata was updated with managed-commit information. If no changes were made, returns false.

    Attributes
    protected
  142. def updateProtocol(protocol: Protocol): Unit

    This updates the protocol for the table with a given protocol. Note that the protocol set by this method can be overwritten by other methods, such as updateMetadata.

  143. def updateSetTransaction(appId: String, version: Long, lastUpdate: Option[Long]): Unit

    Record a SetTransaction action that will be committed as part of this transaction.

  144. def validateManagedCommitConfInMetadata(newMetadataOpt: Option[Metadata]): Unit
    Attributes
    protected
  145. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  146. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  147. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  148. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: => T): T

    Report a log to indicate some command is running.

    Definition Classes
    DeltaProgressReporter
  149. def writeCommitFile(attemptVersion: Long, jsonActions: Iterator[String], currentTransactionInfo: CurrentTransactionInfo): (Option[VersionChecksum], Commit)

    Writes the json actions provided to the commit file corresponding to attemptVersion. If managed-commits are enabled, this method must return a non-empty Commit since we can't guess it from the FileSystem.

    Attributes
    protected
  150. def writeCommitFileImpl(attemptVersion: Long, jsonActions: Iterator[String], tableCommitOwnerClient: TableCommitOwnerClient, currentTransactionInfo: CurrentTransactionInfo): Commit
    Attributes
    protected
  151. def writeFiles(inputData: Dataset[_], writeOptions: Option[DeltaOptions], isOptimize: Boolean, additionalConstraints: Seq[Constraint]): Seq[FileAction]

    Writes out the dataframe after performing schema validation. Returns a list of actions to append these files to the reservoir.

    inputData

    Data to write out.

    writeOptions

    Options to decide how to write out the data.

    isOptimize

    Whether the operation writing this is Optimize or not.

    additionalConstraints

    Additional constraints on the write.

    Definition Classes
    TransactionalWrite
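    The validate-then-return-actions shape of writeFiles can be sketched without Spark. Everything below (`ColumnSpec`, `AddFileAction`, `writeFilesSketch`, the one-file-per-1000-rows rule) is a simplified, hypothetical stand-in for the real TransactionalWrite machinery: the input schema is checked against the table schema before any file actions are produced.

    ```scala
    // Conceptual sketch of the schema validation step in writeFiles:
    // before producing file actions, the incoming data's columns are
    // checked against the table schema. All types here are stand-ins.
    final case class ColumnSpec(name: String, dataType: String)
    final case class AddFileAction(path: String)

    def writeFilesSketch(
        tableSchema: Seq[ColumnSpec],
        inputSchema: Seq[ColumnSpec],
        rows: Int): Seq[AddFileAction] = {
      val expected = tableSchema.map(c => c.name -> c.dataType).toMap
      inputSchema.foreach { c =>
        require(expected.get(c.name).contains(c.dataType),
          s"Column ${c.name}:${c.dataType} does not match the table schema")
      }
      // Pretend we wrote one file per 1000 rows; return the actions to commit.
      (0 until math.max(1, rows / 1000)).map(i => AddFileAction(s"part-$i.parquet"))
    }

    val schema = Seq(ColumnSpec("id", "long"), ColumnSpec("value", "string"))
    val actions = writeFilesSketch(schema, schema, rows = 2500)
    ```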
  152. def writeFiles(data: Dataset[_], deltaOptions: Option[DeltaOptions], additionalConstraints: Seq[Constraint]): Seq[FileAction]
    Definition Classes
    TransactionalWrite
  153. def writeFiles(data: Dataset[_]): Seq[FileAction]
    Definition Classes
    TransactionalWrite
  154. def writeFiles(data: Dataset[_], writeOptions: Option[DeltaOptions]): Seq[FileAction]
    Definition Classes
    TransactionalWrite
  155. def writeFiles(data: Dataset[_], additionalConstraints: Seq[Constraint]): Seq[FileAction]
    Definition Classes
    TransactionalWrite

Inherited from DeltaScanGenerator

Inherited from SQLMetricsReporting

Inherited from TransactionalWrite

Inherited from DeltaLogging

Inherited from DatabricksLogging

Inherited from DeltaProgressReporter

Inherited from Logging

Inherited from AnyRef

Inherited from Any

Ungrouped