trait OptimisticTransactionImpl extends TransactionalWrite with SQLMetricsReporting with DeltaScanGenerator with DeltaLogging
Used to perform a set of reads in a transaction and then commit a set of updates to the state of the log. All reads from the DeltaLog MUST go through this instance rather than directly to the DeltaLog; otherwise they will not be checked for logical conflicts with concurrent updates.
This trait is not thread-safe.
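The read-then-commit cycle above follows the classic optimistic concurrency pattern: read at a pinned version, attempt to write the next version, and retry on conflict. The following is a minimal stdlib-only sketch of that pattern; `ToyLog`, `ToyTxn`, and `tryCommit` are illustrative stand-ins, not Delta's actual API.

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical stand-in for the Delta log's version counter. A real
// transaction reads a Snapshot at readVersion and later tries to write
// version readVersion + 1 to the log.
object ToyLog {
  val latestVersion = new AtomicLong(0L)

  // Attempt to commit `attemptVersion`; succeeds only if no concurrent
  // writer has claimed it first (the analogue of the
  // FileAlreadyExistsException check on the commit file).
  def tryCommit(attemptVersion: Long): Boolean =
    latestVersion.compareAndSet(attemptVersion - 1, attemptVersion)
}

object ToyTxn {
  // Retry with a bumped version on each conflict, loosely mirroring
  // doCommitRetryIteratively.
  def commit(readVersion: Long, maxAttempts: Int = 10): Long = {
    var attempt = readVersion + 1
    var tries = 0
    while (!ToyLog.tryCommit(attempt)) {
      tries += 1
      if (tries >= maxAttempts) sys.error("too many concurrent commits")
      // In Delta, the conflict checker would inspect the versions written
      // since `attempt` here before retrying.
      attempt = ToyLog.latestVersion.get() + 1
    }
    attempt
  }
}
```

Note that the retry is only legal after the logical conflict check passes; this sketch elides that step entirely.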
- OptimisticTransactionImpl
- DeltaScanGenerator
- SQLMetricsReporting
- TransactionalWrite
- DeltaLogging
- DatabricksLogging
- DeltaProgressReporter
- Logging
- AnyRef
- Any
Type Members
- class DisabledAutoCompactPartitionStatsCollector extends AutoCompactPartitionStatsCollector
A subclass of AutoCompactPartitionStatsCollector that's to be used if the config to collect auto compaction stats is turned off. This subclass intentionally does nothing.
- class FileSystemBasedCommitOwnerClient extends CommitOwnerClient
Abstract Value Members
- abstract val catalogTable: Option[CatalogTable]
- abstract val deltaLog: DeltaLog
- Definition Classes
- OptimisticTransactionImpl → TransactionalWrite
- abstract val snapshot: Snapshot
- Definition Classes
- OptimisticTransactionImpl → TransactionalWrite
Concrete Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- val actions: ArrayBuffer[Action]
Tracks actions within the transaction, will commit along with the passed-in actions in the commit function.
- Attributes
- protected
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def assertMetadata(metadata: Metadata): Unit
- Attributes
- protected
- def canDowngradeToSnapshotIsolation(preparedActions: Seq[Action], op: Operation): Boolean
- Attributes
- protected
- def canUpdateMetadata: Boolean
Can this transaction still update the metadata? This is allowed only once per transaction.
- val checkDeletionVectorFilesHaveWideBounds: Boolean
- Attributes
- protected
- def checkForConflicts(checkVersion: Long, currentTransactionInfo: CurrentTransactionInfo, attemptNumber: Int, commitIsolationLevel: IsolationLevel): (Long, CurrentTransactionInfo)
Looks at actions that have happened since the txn started and checks for logical conflicts with the reads/writes. Resolves conflicts and returns a tuple representing the commit version to attempt next and the commit summary which we need to commit.
- Attributes
- protected
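The core of the conflict check is deciding whether files added by concurrent commits overlap with what this transaction read. A minimal stdlib-only sketch, with predicates simplified to exact partition-value matches (`ToyConflictCheck` and `ReadPredicate` are illustrative names, not Delta's API):

```scala
// Toy model of the file-level conflict check: a transaction that read files
// matching some partition predicate conflicts with any concurrent commit
// that added files into a partition matched by that predicate.
object ToyConflictCheck {
  type PartitionValues = Map[String, String]

  // A read predicate, simplified here to an exact partition-value match.
  final case class ReadPredicate(column: String, value: String) {
    def matches(pv: PartitionValues): Boolean = pv.get(column).contains(value)
  }

  // Returns the concurrently added partitions that overlap with our reads.
  def conflictingAdds(
      readPredicates: Seq[ReadPredicate],
      concurrentlyAdded: Seq[PartitionValues]): Seq[PartitionValues] =
    concurrentlyAdded.filter(pv => readPredicates.exists(_.matches(pv)))
}
```

Real predicates are arbitrary Catalyst expressions, so the actual check evaluates them against each added file's partition values rather than comparing strings.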
- def checkForConflictsAgainstVersion(currentTransactionInfo: CurrentTransactionInfo, otherCommitFileStatus: FileStatus, commitIsolationLevel: IsolationLevel): CurrentTransactionInfo
- Attributes
- protected
- def checkForSetTransactionConflictAndDedup(actions: Seq[Action]): Seq[Action]
Checks if the passed-in actions have internal SetTransaction conflicts, will throw exceptions in case of conflicts. This function will also remove duplicated SetTransactions.
- Attributes
- protected
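The dedup-or-fail behavior can be sketched in a few lines: identical (appId, version) pairs are collapsed, while two different versions for the same appId are a conflict. `ToySetTxn` is an illustrative name, not Delta's API.

```scala
// Toy version of the SetTransaction check: duplicate (appId, version) pairs
// are dropped, while two different versions for the same appId conflict.
object ToySetTxn {
  final case class SetTransaction(appId: String, version: Long)

  def dedupOrFail(txns: Seq[SetTransaction]): Seq[SetTransaction] = {
    val distinct = txns.distinct
    val conflicting = distinct.groupBy(_.appId).filter(_._2.size > 1)
    require(
      conflicting.isEmpty,
      s"conflicting SetTransactions for appIds: ${conflicting.keys}")
    distinct
  }
}
```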
- def checkNoColumnDefaults(op: Operation): Unit
If the operation assigns or modifies column default values, this method checks that the corresponding table feature is enabled and throws an error if not.
- Attributes
- protected
- def checkPartitionColumns(partitionSchema: StructType, output: Seq[Attribute], colsDropped: Boolean): Unit
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- val checkUnsupportedDataType: Boolean
Whether to check for unsupported data types when updating the table schema.
- Attributes
- protected
- def clock: Clock
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- def collectAutoOptimizeStats(numAdd: Long, numRemove: Long, actions: Iterator[Action]): Unit
- def commit(actions: Seq[Action], op: Operation, tags: Map[String, String]): Long
Modifies the state of the log by adding a new commit that is based on a read at readVersion. In the case of a conflict with a concurrent writer this method will throw an exception.
- actions
Set of actions to commit
- op
Details of operation that is performing this transactional commit
- tags
Extra tags to set to the CommitInfo action
- def commit(actions: Seq[Action], op: Operation): Long
Modifies the state of the log by adding a new commit that is based on a read at readVersion. In the case of a conflict with a concurrent writer this method will throw an exception.
- actions
Set of actions to commit
- op
Details of operation that is performing this transactional commit
- val commitAttemptStartTimeMillis: Long
Tracks the start time since we started trying to write a particular commit. Used for logging duration of retried transactions.
- Attributes
- protected
- val commitEndNano: Long
The transaction commit end time.
- Attributes
- protected
- def commitIfNeeded(actions: Seq[Action], op: Operation, tags: Map[String, String] = Map.empty): Unit
Modifies the state of the log by adding a new commit that is based on a read at readVersion. In the case of a conflict with a concurrent writer this method will throw an exception.
Also skips creating the commit if the configured IsolationLevel doesn't need us to record the commit from a correctness perspective.
- def commitImpl(actions: Seq[Action], op: Operation, canSkipEmptyCommits: Boolean, tags: Map[String, String]): Option[Long]
- Attributes
- protected
- Annotations
- @throws(classOf[ConcurrentModificationException])
- val commitInfo: CommitInfo
- Attributes
- protected
- def commitLarge(spark: SparkSession, nonProtocolMetadataActions: Iterator[Action], newProtocolOpt: Option[Protocol], op: Operation, context: Map[String, String], metrics: Map[String, String]): (Long, Snapshot)
Create a large commit on the Delta log by directly writing an iterator of FileActions to the LogStore. This function only commits the next possible version and will not check whether the commit is retryable. If the next version has already been committed, then this function will fail. This bypasses all optimistic concurrency checks. We assume that transaction conflicts should be rare because this method is typically used to create new tables (e.g. CONVERT TO DELTA) or to apply commands which rarely receive other transactions (e.g. CLONE/RESTORE). In addition, the expectation is that the list of actions performed by the transaction remains an iterator and is never materialized, given that a large commit may touch many files. The `nonProtocolMetadataActions` parameter should contain only non-{protocol, metadata} actions. If the protocol of the table needs to be updated, it should be passed in the `newProtocolOpt` parameter.
- val commitStartNano: Long
The transaction commit start time.
- Attributes
- protected
- val committed: Boolean
Tracks if this transaction has already committed.
- Attributes
- protected
- def containsPostCommitHook(hook: PostCommitHook): Boolean
- def convertEmptyToNullIfNeeded(plan: SparkPlan, partCols: Seq[Attribute], constraints: Seq[Constraint]): SparkPlan
If there is any string partition column and there are constraints defined, add a projection to convert empty strings to null for that column. The empty strings will be converted to null eventually even without this conversion, but we want to do this earlier, before the check constraints, so that empty strings are correctly rejected. Note that this should not cause the downstream logic in `FileFormatWriter` to add duplicate conversions, because the logic there checks the partition column using the original plan's output. When the plan is modified with additional projections, the partition column check won't match and will not add more conversions.
- plan
The original SparkPlan.
- partCols
The partition columns.
- constraints
The defined constraints.
- returns
A SparkPlan potentially modified with an additional projection on top of `plan`
- Attributes
- protected
- Definition Classes
- TransactionalWrite
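The rewrite itself is simple: only string-typed partition columns are touched, and empty strings become nulls before constraints run. A stdlib-only sketch with rows modeled as maps (`ToyEmptyToNull` is an illustrative name, not the actual projection logic):

```scala
// Sketch of the empty-string-to-null conversion: rows are modeled as maps,
// and only the listed partition columns are rewritten. In the real code this
// is a projection inserted into the SparkPlan ahead of constraint checks.
object ToyEmptyToNull {
  type Row = Map[String, Any]

  def convert(rows: Seq[Row], partCols: Set[String]): Seq[Row] =
    rows.map(_.map {
      case (col, "") if partCols(col) => col -> null
      case other                      => other
    })
}
```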
- def createAutoCompactStatsCollector(): AutoCompactPartitionStatsCollector
- def deltaAssert(check: => Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit
Helper method to check invariants in Delta code. Fails when running in tests; records a delta assertion event and logs a warning otherwise.
- Attributes
- protected
- Definition Classes
- DeltaLogging
- def disableDeletionVectorFilesHaveWideBoundsCheck(): Unit
Disable the check that ensures that all files with DVs added have tightBounds set to false.
This is necessary when recomputing the stats on a table with DVs.
- def doCommit(attemptVersion: Long, currentTransactionInfo: CurrentTransactionInfo, attemptNumber: Int, isolationLevel: IsolationLevel): Snapshot
Commit `actions` using the `attemptVersion` version number. Throws a FileAlreadyExistsException if any conflicts are detected.
- returns
the post-commit snapshot of the deltaLog
- Attributes
- protected
- def doCommitRetryIteratively(attemptVersion: Long, currentTransactionInfo: CurrentTransactionInfo, isolationLevel: IsolationLevel): (Long, Snapshot, CurrentTransactionInfo)
Commit the txn represented by `currentTransactionInfo` using the `attemptVersion` version number. If any conflicts are found, we will retry a fixed number of times.
- returns
the real version that was committed, the postCommitSnapshot, and the txn info. NOTE: The postCommitSnapshot may not be the same as the version committed if racing commits were written while we updated the snapshot.
- Attributes
- protected
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- val executionObserver: TransactionExecutionObserver
Contains the execution instrumentation set via thread-local. No-op by default.
- Attributes
- protected[delta]
- def filesForScan(limit: Long, partitionFilters: Seq[Expression]): DeltaScan
Returns a DeltaScan based on the given partition filters, projections and limits.
- Definition Classes
- OptimisticTransactionImpl → DeltaScanGenerator
- def filesForScan(filters: Seq[Expression], keepNumRecords: Boolean = false): DeltaScan
Returns a DeltaScan based on the given filters.
- Definition Classes
- OptimisticTransactionImpl → DeltaScanGenerator
- def filesWithStatsForScan(partitionFilters: Seq[Expression]): DataFrame
Returns a DataFrame for the given partition filters. The schema of the returned DataFrame is nearly the same as `AddFile`, except that the `stats` field is parsed into a struct from a JSON string.
- Definition Classes
- OptimisticTransactionImpl → DeltaScanGenerator
- def filterFiles(partitions: Set[Map[String, String]]): Seq[AddFile]
Returns files within the given partitions. `partitions` is a set of the `partitionValues` stored in AddFiles. This means they refer to the physical column names, and values are stored as strings.
- def filterFiles(filters: Seq[Expression], keepNumRecords: Boolean = false): Seq[AddFile]
Returns files matching the given predicates.
- def filterFiles(): Seq[AddFile]
Returns files matching the given predicates.
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- def generateInCommitTimestampForFirstCommitAttempt(currentTimestamp: Long): Option[Long]
Generates a timestamp which is greater than the commit timestamp of the last snapshot. Note that this is only needed when the feature `inCommitTimestamps` is enabled.
- Attributes
- protected[delta]
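The monotonicity requirement reduces to a one-liner: take the wall clock unless it has fallen behind the previous commit's timestamp, in which case advance past it. A sketch under that assumption (`ToyInCommitTimestamp` is an illustrative name):

```scala
// Sketch of the monotonicity rule: the in-commit timestamp must be strictly
// greater than the last snapshot's commit timestamp, even if the wall clock
// has drifted backwards between commits.
object ToyInCommitTimestamp {
  def generate(currentTimestampMs: Long, lastCommitTimestampMs: Long): Long =
    math.max(currentTimestampMs, lastCommitTimestampMs + 1)
}
```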
- def getAssertDeletionVectorWellFormedFunc(spark: SparkSession, op: Operation): (Action) => Unit
Must make sure that deletion vectors are never added to a table where that isn't allowed. Note, statistics recomputation is still allowed even though DVs might be currently disabled.
This method returns a function that can be used to validate a single Action.
- Attributes
- protected
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def getCommitter(outputPath: Path): DelayedCommitProtocol
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
- Definition Classes
- DeltaLogging
- def getConflictingVersions(previousAttemptVersion: Long): Seq[FileStatus]
Returns the conflicting commit information
- Attributes
- protected
- def getDeltaScanGenerator(index: TahoeLogFileIndex): DeltaScanGenerator
Returns the DeltaScanGenerator for the given log, which will be used to generate DeltaScans. Every time this method is called on a log, the returned generator will read a snapshot that is pinned on the first access for that log.
Internally, if the given log is the same as the log associated with this transaction, then it returns this transaction; otherwise it returns a snapshot of the given log.
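The pin-on-first-access behavior can be sketched with a `ConcurrentHashMap.computeIfAbsent`: the first read of a given log stores its current snapshot, and later reads reuse it so the transaction sees a consistent view. `ToySnapshotPinning`, `logId`, and `SnapshotLike` are illustrative stand-ins.

```scala
import java.util.concurrent.ConcurrentHashMap

// Sketch of snapshot pinning: one snapshot per log, fixed at first access.
object ToySnapshotPinning {
  final case class SnapshotLike(logId: String, version: Long)

  private val pinned = new ConcurrentHashMap[String, SnapshotLike]()

  // `currentVersion` is only evaluated on the first access for this log.
  def snapshotFor(logId: String, currentVersion: => Long): SnapshotLike =
    pinned.computeIfAbsent(logId, id => SnapshotLike(id, currentVersion))
}
```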
- def getErrorData(e: Throwable): Map[String, Any]
- Definition Classes
- DeltaLogging
- def getIsolationLevelToUse(preparedActions: Seq[Action], op: Operation): IsolationLevel
- Attributes
- protected
- def getMetric(name: String): Option[SQLMetric]
Returns the metric with `name` registered for the given transaction, if it exists.
- Definition Classes
- SQLMetricsReporting
- def getMetricsForOperation(operation: Operation): Map[String, String]
Get the metrics for an operation based on collected SQL Metrics and filtering out the ones based on the metric parameters for that operation.
- Definition Classes
- SQLMetricsReporting
- def getOperationMetrics(op: Operation): Option[Map[String, String]]
Return the operation metrics for the operation if it is enabled
- def getOptionalStatsTrackerAndStatsCollection(output: Seq[Attribute], outputPath: Path, partitionSchema: StructType, data: DataFrame): (Option[DeltaJobStatisticsTracker], Option[StatisticsCollection])
Return the pair of optional stats tracker and stats collection class
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- def getPartitioningColumns(partitionSchema: StructType, output: Seq[Attribute]): Seq[Attribute]
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- def getStatsColExpr(statsDataSchema: Seq[Attribute], statsCollection: StatisticsCollection): Expression
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- def getStatsSchema(dataFrameOutput: Seq[Attribute], partitionSchema: StructType): (Seq[Attribute], Seq[Attribute])
Return a tuple of (outputStatsCollectionSchema, statsCollectionSchema). outputStatsCollectionSchema is the data source schema from the DataFrame used for stats collection; it contains the columns in the DataFrame output, excluding the partition columns. statsCollectionSchema is the schema to collect stats for; it contains the columns in the table schema, excluding the partition columns. Note: we only collect NULL_COUNT stats (as the number of rows) for the columns in statsCollectionSchema that are missing from outputStatsCollectionSchema.
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- def getUserMetadata(op: Operation): Option[String]
Return the user-defined metadata for the operation.
- val hasWritten: Boolean
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
- val isBlindAppend: Boolean
True if this transaction is a blind append. This is only valid after commit.
- Attributes
- protected[delta]
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def log: Logger
- Attributes
- protected
- Definition Classes
- Logging
- def logConsole(line: String): Unit
- Definition Classes
- DatabricksLogging
- def logDebug(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String, throwable: Throwable): Unit
- Definition Classes
- OptimisticTransactionImpl → Logging
- def logError(msg: => String): Unit
- Definition Classes
- OptimisticTransactionImpl → Logging
- def logInfo(msg: => String): Unit
- Definition Classes
- OptimisticTransactionImpl → Logging
- def logInfo(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logName: String
- Attributes
- protected
- Definition Classes
- Logging
- lazy val logPrefix: String
- Attributes
- protected
- def logTrace(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String, throwable: Throwable): Unit
- Definition Classes
- OptimisticTransactionImpl → Logging
- def logWarning(msg: => String): Unit
- Definition Classes
- OptimisticTransactionImpl → Logging
- def makeOutputNullable(output: Seq[Attribute]): Seq[Attribute]
Makes the output attributes nullable, so that we don't write unreadable parquet files.
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- def mapColumnAttributes(output: Seq[Attribute], mappingMode: DeltaColumnMappingMode): Seq[Attribute]
Replace the output attributes with the physical mapping information.
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- def metadata: Metadata
Returns the metadata for this transaction. The metadata refers to the metadata of the snapshot at the transaction's read version unless updated during the transaction.
- Definition Classes
- OptimisticTransactionImpl → TransactionalWrite
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- val newMetadata: Option[Metadata]
Stores the updated metadata (if any) that will result from this txn.
This is just one way to change metadata. New metadata can also be added during commit from actions. But metadata should *not* be updated via both paths.
- Attributes
- protected
- val newProtocol: Option[Protocol]
Stores the updated protocol (if any) that will result from this txn.
- Attributes
- protected
- def normalizeData(deltaLog: DeltaLog, options: Option[DeltaOptions], data: Dataset[_]): (QueryExecution, Seq[Attribute], Seq[Constraint], Set[String])
Normalize the schema of the query, and return the QueryExecution to execute. If the table has generated columns and users provide these columns in the output, we will also return constraints that should be respected. If any constraints are returned, the caller should apply these constraints when writing data.
Note: The output attributes of the QueryExecution may not match the attributes we return as the output schema. This is because streaming queries create `IncrementalExecution`, which cannot be further modified. We can however have the Parquet writer use the physical plan from `IncrementalExecution` and the output schema provided through the attributes.
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- val partitionsAddedToOpt: Option[HashSet[Map[String, String]]]
The set of distinct partitions that contain added files by current transaction.
- Attributes
- protected[delta]
- def performCDCPartition(inputData: Dataset[_]): (DataFrame, StructType)
Returns a tuple of (data, partition schema). For CDC writes, an `is_cdc` column is added to the data, and `is_cdc=true/false` is added to the front of the partition schema.
- Attributes
- protected
- Definition Classes
- TransactionalWrite
- def performCdcMetadataCheck(): Unit
Checks if the new schema contains any CDC columns (which is invalid) and throws the appropriate error
- Attributes
- protected
- val postCommitHooks: ArrayBuffer[PostCommitHook]
- Attributes
- protected
- def prepareCommit(actions: Seq[Action], op: Operation): Seq[Action]
Prepare for a commit by doing all necessary pre-commit checks and modifications to the actions.
- returns
The finalized set of actions.
- Attributes
- protected
- def protocol: Protocol
The protocol of the snapshot that this transaction is reading at.
- Definition Classes
- OptimisticTransactionImpl → TransactionalWrite
- val readFiles: HashSet[AddFile]
Tracks specific files that have been seen by this transaction.
- Attributes
- protected
- val readPredicates: ArrayBuffer[DeltaTableReadPredicate]
Tracks the data that could have been seen by recording the partition predicates by which files have been queried by this transaction.
- Attributes
- protected
- val readSnapshots: ConcurrentHashMap[(String, Path), Snapshot]
Tracks the first-access snapshots of other Delta logs read by this transaction. The snapshots are keyed by the log's unique id.
- Attributes
- protected
- val readTheWholeTable: Boolean
Whether the whole table was read during the transaction.
- Attributes
- protected
- val readTxn: ArrayBuffer[String]
Tracks the appIds that have been seen by this transaction.
- Attributes
- protected
- def readVersion: Long
The version that this transaction is reading from.
- def readWholeTable(): Unit
Mark the entire table as tainted by this transaction.
- def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit
Used to record the occurrence of a single event or report detailed, operation-specific statistics.
- path
Used to log the path of the delta table when `deltaLog` is null.
- Attributes
- protected
- Definition Classes
- DeltaLogging
- def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A
Used to report the duration as well as the success or failure of an operation on a `deltaLog`.
- Attributes
- protected
- Definition Classes
- DeltaLogging
- def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A
Used to report the duration as well as the success or failure of an operation on a `tahoePath`.
- Attributes
- protected
- Definition Classes
- DeltaLogging
- def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
- Definition Classes
- DatabricksLogging
- def recordFrameProfile[T](group: String, name: String)(thunk: => T): T
- Attributes
- protected
- Definition Classes
- DeltaLogging
- def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: => S): S
- Definition Classes
- DatabricksLogging
- def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
- Definition Classes
- DatabricksLogging
- def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
- Definition Classes
- DatabricksLogging
- def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
- Definition Classes
- DatabricksLogging
- def registerPostCommitHook(hook: PostCommitHook): Unit
Register a hook that will be executed once a commit is successful.
- def registerSQLMetrics(spark: SparkSession, metrics: Map[String, SQLMetric]): Unit
Register SQL metrics for an operation by appending the supplied metrics map to the operationSQLMetrics map.
- Definition Classes
- SQLMetricsReporting
- def registerTableForManagedCommitsIfNeeded(finalMetadata: Metadata, finalProtocol: Protocol): Option[Map[String, String]]
This method registers the table with the commit-owner via the CommitOwnerClient if the table is transitioning from a file-system based table to a managed-commit table.
- finalMetadata
the effective Metadata of the table. Note that this refers to the new metadata if this commit is updating the table Metadata.
- finalProtocol
the effective Protocol of the table. Note that this refers to the new protocol if this commit is updating the table Protocol.
- returns
The new managed-commit table metadata if the table is transitioning from file-system based table to managed-commit table. Otherwise, None. This metadata should be added to the Metadata.configuration before doing the commit.
- Attributes
- protected
- def reportAutoCompactStatsError(e: Throwable): Unit
- def runPostCommitHook(hook: PostCommitHook, version: Long, postCommitSnapshot: Snapshot, committedActions: Seq[Action]): Unit
- Attributes
- protected
- def runPostCommitHooks(version: Long, postCommitSnapshot: Snapshot, committedActions: Seq[Action]): Unit
Executes the registered post commit hooks.
- Attributes
- protected
- def setNeedsCheckpoint(committedVersion: Long, postCommitSnapshot: Snapshot): Unit
Sets needsCheckpoint if we should checkpoint the version that has just been committed.
- Attributes
- protected
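A common policy for this kind of decision is a fixed checkpoint interval: checkpoint whenever the committed version lands on a multiple of it. The sketch below assumes such an interval; `ToyCheckpointPolicy` and the parameter names are illustrative, and the real interval comes from table configuration.

```scala
// Sketch of a checkpointing decision based on a fixed interval: checkpoint
// when the committed version is a (positive) multiple of the interval.
object ToyCheckpointPolicy {
  def needsCheckpoint(committedVersion: Long, checkpointInterval: Long): Boolean =
    committedVersion > 0 && committedVersion % checkpointInterval == 0
}
```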
- def skipRecordingEmptyCommitAllowed(isolationLevelToUse: IsolationLevel): Boolean
Whether to skip recording the commit in DeltaLog
- Attributes
- protected
- val snapshotToScan: Snapshot
The snapshot that the scan is being generated on.
- Definition Classes
- OptimisticTransactionImpl → DeltaScanGenerator
- def spark: SparkSession
- Attributes
- protected
- def statsCollector: Column
Gets the stats collector for the table at the snapshot this transaction has.
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- def trackFilesRead(files: Seq[AddFile]): Unit
Mark the given files as read within this transaction.
- def trackReadPredicates(filters: Seq[Expression], partitionOnly: Boolean = false, shouldRewriteFilter: Boolean = true): Unit
Mark the predicates that have been queried by this transaction.
- def txnExecutionTimeMs: Option[Long]
The end-to-end execution time of this transaction.
- val txnId: String
Unique identifier for the transaction
- val txnStartNano: Long
The transaction start time.
- Attributes
- protected
- def txnStartTimeNs: Long
Start time of txn in nanoseconds
- def txnVersion(id: String): Long
Returns the latest version that has committed for the idempotent transaction with the given `id`.
- def updateAndCheckpoint(spark: SparkSession, deltaLog: DeltaLog, commitSize: Int, attemptVersion: Long, commit: Commit, txnId: String): Snapshot
Update the table now that the commit has been made, and write a checkpoint.
- Attributes
- protected
- def updateMetadata(proposedNewMetadata: Metadata, ignoreDefaultProperties: Boolean = false): Unit
Records an update to the metadata that should be committed with this transaction. Note that this must be done before writing out any files so that file writing and checks happen with the final metadata for the table.
IMPORTANT: It is the responsibility of the caller to ensure that files currently present in the table are still valid under the new metadata.
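Because file writing validates against the transaction's final metadata, a metadata change must be registered before any call to writeFiles. A hedged sketch of that ordering, assuming Delta Lake's internal Scala API (the table path, added column, and DataFrame are illustrative):

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}
import org.apache.spark.sql.functions.current_timestamp
import org.apache.spark.sql.types.TimestampType
import org.apache.spark.sql.delta.{DeltaLog, DeltaOperations}

// Sketch only: this uses Delta-internal classes, not a stable public API.
val spark = SparkSession.builder().appName("txn-metadata").getOrCreate()
val deltaLog = DeltaLog.forTable(spark, "/tmp/delta/events")  // hypothetical path
val txn = deltaLog.startTransaction()

// 1. Register the metadata change first ...
val widenedSchema = txn.metadata.schema.add("ingest_ts", TimestampType)
txn.updateMetadata(txn.metadata.copy(schemaString = widenedSchema.json))

// 2. ... then write files, so they are validated against the final schema.
val df = spark.range(10).toDF("id").withColumn("ingest_ts", current_timestamp())
val newFiles = txn.writeFiles(df)
txn.commit(newFiles, DeltaOperations.Write(SaveMode.Append))
```

Doing these steps in the opposite order would let files be written and checked against stale metadata, which is exactly what the contract above forbids.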
- def updateMetadataForNewTable(metadata: Metadata): Unit
Records an update to the metadata that should be committed with this transaction and when this transaction is logically creating a new table, e.g. replacing a previous table with new metadata. Note that this must be done before writing out any files so that file writing and checks happen with the final metadata for the table. IMPORTANT: It is the responsibility of the caller to ensure that files currently present in the table are still valid under the new metadata.
- def updateMetadataForTableOverwrite(proposedNewMetadata: Metadata): Unit
Records an update to the metadata that should be committed with this transaction when this transaction is attempting to overwrite the data and schema using .mode('overwrite') and .option('overwriteSchema', true). REPLACE TABLE is not considered in this category, because it is logically equivalent to a DROP and RECREATE of the table.
- def updateMetadataInternal(proposedNewMetadata: Metadata, ignoreDefaultProperties: Boolean): Unit
Does the actual checks and work to update the metadata and save it into the newMetadata field, which will be added to the actions to commit in prepareCommit.
- Attributes
- protected
- def updateMetadataWithInCommitTimestamp(commitInfo: CommitInfo): Boolean
This method makes the necessary changes to Metadata based on ICT: If ICT is getting enabled as part of this commit, then it updates the Metadata with the ICT enablement information.
- commitInfo
commitInfo for the commit
- returns
true if changes were made to Metadata else false.
- Attributes
- protected
- def updateMetadataWithManagedCommitConfs(): Boolean
This method makes the necessary changes to Metadata based on managed-commit: If the table is being converted from file-system to managed commits, then it registers the table with the commit-owner and updates the Metadata with the necessary configuration information from the commit-owner.
- returns
A boolean indicating whether the table Metadata was updated with managed-commit information. If no changes were made, returns false.
- Attributes
- protected
- def updateProtocol(protocol: Protocol): Unit
This updates the protocol for the table with a given protocol. Note that the protocol set by this method can be overwritten by other methods, such as updateMetadata.
- def updateSetTransaction(appId: String, version: Long, lastUpdate: Option[Long]): Unit
Record a SetTransaction action that will be committed as part of this transaction.
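txnVersion and updateSetTransaction together support idempotent writes: a writer checks the last committed version for its application id and records a SetTransaction only when the batch has not yet landed. A sketch under the assumption that txnVersion returns a value below any real batch version (e.g. -1) when no SetTransaction exists for the id; the appId, batch version, and DataFrame are illustrative:

```scala
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.delta.DeltaOperations

// Sketch only: idempotent write driven by an application-managed version.
val appId = "ingest-job-42"   // hypothetical application id
val batchVersion = 7L         // hypothetical monotonically increasing batch id

val txn = deltaLog.startTransaction()
if (txn.txnVersion(appId) < batchVersion) {
  // This batch has not been committed yet: record the SetTransaction action
  // alongside the data so a retry of the same batch becomes a no-op.
  txn.updateSetTransaction(appId, batchVersion, Some(System.currentTimeMillis()))
  val files = txn.writeFiles(batchDf)  // batchDf: the batch's DataFrame
  txn.commit(files, DeltaOperations.Write(SaveMode.Append))
} else {
  // A previous attempt already committed this batch; skip to stay idempotent.
}
```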
- def validateManagedCommitConfInMetadata(newMetadataOpt: Option[Metadata]): Unit
- Attributes
- protected
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: => T): T
Report a log to indicate some command is running.
- Definition Classes
- DeltaProgressReporter
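withStatusCode, inherited from DeltaProgressReporter, brackets a block of work with progress logging. A sketch (the status code, message, and wrapped helper are illustrative):

```scala
// Sketch: attach a status message to an expensive step inside the transaction.
val result = txn.withStatusCode(
    "DELTA", "Checking for conflicts with concurrent commits") {
  expensiveConflictCheck()  // hypothetical helper whose result is returned
}
```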
- def writeCommitFile(attemptVersion: Long, jsonActions: Iterator[String], currentTransactionInfo: CurrentTransactionInfo): (Option[VersionChecksum], Commit)
Writes the json actions provided to the commit file corresponding to attemptVersion. If managed-commits are enabled, this method must return a non-empty Commit since we can't guess it from the FileSystem.
- Attributes
- protected
- def writeCommitFileImpl(attemptVersion: Long, jsonActions: Iterator[String], tableCommitOwnerClient: TableCommitOwnerClient, currentTransactionInfo: CurrentTransactionInfo): Commit
- Attributes
- protected
- def writeFiles(inputData: Dataset[_], writeOptions: Option[DeltaOptions], isOptimize: Boolean, additionalConstraints: Seq[Constraint]): Seq[FileAction]
Writes out the dataframe after performing schema validation. Returns a list of actions to append these files to the reservoir.
- inputData
Data to write out.
- writeOptions
Options to decide how to write out the data.
- isOptimize
Whether the operation writing this is Optimize or not.
- additionalConstraints
Additional constraints on the write.
- Definition Classes
- TransactionalWrite
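The four-argument overload lets callers enforce extra constraints during the write itself. A sketch, assuming Delta's internal Constraints helper (the constrained column and DataFrame are illustrative):

```scala
import org.apache.spark.sql.SaveMode
import org.apache.spark.sql.delta.DeltaOperations
import org.apache.spark.sql.delta.constraints.Constraints

// Sketch only: write files while enforcing an additional NOT NULL
// constraint, then commit the resulting actions.
val txn = deltaLog.startTransaction()
val notNullId = Constraints.NotNull(Seq("id"))  // hypothetical column
val actions = txn.writeFiles(df, None, isOptimize = false, Seq(notNullId))
txn.commit(actions, DeltaOperations.Write(SaveMode.Append))
```

A row violating the constraint fails the write, so no files are ever added to the log for that attempt.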
- def writeFiles(data: Dataset[_], deltaOptions: Option[DeltaOptions], additionalConstraints: Seq[Constraint]): Seq[FileAction]
- Definition Classes
- TransactionalWrite
- def writeFiles(data: Dataset[_]): Seq[FileAction]
- Definition Classes
- TransactionalWrite
- def writeFiles(data: Dataset[_], writeOptions: Option[DeltaOptions]): Seq[FileAction]
- Definition Classes
- TransactionalWrite
- def writeFiles(data: Dataset[_], additionalConstraints: Seq[Constraint]): Seq[FileAction]
- Definition Classes
- TransactionalWrite