class DeltaLog extends Checkpoints with MetadataCleanup with LogStoreProvider with SnapshotManagement with ReadChecksum
Used to query the current state of the log as well as modify it by adding new atomic collections of actions.
Internally, this class implements an optimistic concurrency control algorithm to handle multiple readers or writers. Any single read is guaranteed to see a consistent snapshot of the table.
Linear Supertypes
- ReadChecksum
- SnapshotManagement
- LogStoreProvider
- MetadataCleanup
- Checkpoints
- DeltaLogging
- DatabricksLogging
- DeltaProgressReporter
- Logging
- AnyRef
- Any
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
val
LAST_CHECKPOINT: Path
The path to the file that holds metadata about the most recent checkpoint.
- Definition Classes
- Checkpoints
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
assertRemovable(): Unit
Checks whether this table only accepts appends. If so it will throw an error in operations that can remove data such as DELETE/UPDATE/MERGE.
-
def
checkpoint(snapshotToCheckpoint: Snapshot): CheckpointMetaData
- Attributes
- protected
- Definition Classes
- Checkpoints
-
def
checkpoint(): Unit
Creates a checkpoint at the current log version.
- Definition Classes
- Checkpoints
-
def
checkpointInterval: Int
Returns the checkpoint interval for this log. Not transactional.
-
val
clock: Clock
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
createDataFrame(snapshot: Snapshot, addFiles: Seq[AddFile], isStreaming: Boolean = false, actionTypeOpt: Option[String] = None): DataFrame
Returns an org.apache.spark.sql.DataFrame containing the new files within the specified version range.
-
def
createLogStore(sparkConf: SparkConf, hadoopConf: Configuration): LogStore
- Definition Classes
- LogStoreProvider
-
def
createLogStore(spark: SparkSession): LogStore
- Definition Classes
- LogStoreProvider
-
def
createRelation(partitionFilters: Seq[Expression] = Nil, timeTravel: Option[DeltaTimeTravelSpec] = None): BaseRelation
Returns a BaseRelation that contains all of the data present in the table. This relation will be continually updated as files are added or removed from the table. However, a new BaseRelation must be requested in order to see changes to the schema.
-
def
createSnapshot(segment: LogSegment, minFileRetentionTimestamp: Long, timestamp: Long): Snapshot
- Attributes
- protected
- Definition Classes
- SnapshotManagement
-
val
currentSnapshot: Snapshot
- Attributes
- protected
- Definition Classes
- SnapshotManagement
- Annotations
- @volatile()
-
val
dataPath: Path
- Definition Classes
- DeltaLog → Checkpoints
-
val
defaultLogStoreClass: String
- Definition Classes
- LogStoreProvider
-
val
deltaLogLock: ReentrantLock
Use ReentrantLock to allow us to call lockInterruptibly.
- Attributes
- protected
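The locking pattern described for deltaLogLock and lockInterruptibly can be sketched with nothing but java.util.concurrent.locks.ReentrantLock. This is a minimal illustration, not the DeltaLog source: the object name LockSketch and the helper heldByCurrentThread are hypothetical, and the real implementation may differ.

```scala
import java.util.concurrent.locks.ReentrantLock

object LockSketch {
  // Hypothetical stand-in for DeltaLog.deltaLogLock.
  private val lock = new ReentrantLock()

  // Runs `body` while holding the lock. Unlike lock(), lockInterruptibly()
  // throws InterruptedException if the thread is interrupted while waiting.
  def lockInterruptibly[T](body: => T): T = {
    lock.lockInterruptibly()
    try body
    finally lock.unlock() // always release, even if `body` throws
  }

  // True only while the calling thread holds the lock.
  def heldByCurrentThread: Boolean = lock.isHeldByCurrentThread
}
```

Because the lock is reentrant, a thread that already holds it can call the helper again without deadlocking itself.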
-
def
deltaRetentionMillis: Long
Returns the duration in millis for how long to keep around obsolete logs. We may keep logs beyond this duration until the next calendar day to avoid constantly creating checkpoints.
- Definition Classes
- MetadataCleanup
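One plausible reading of the "until the next calendar day" behaviour is that the expiry cutoff is rounded down to a day boundary, so files are kept slightly longer than the configured duration. The sketch below assumes that reading and assumes UTC day boundaries; RetentionSketch, cutoffMillis, and isExpired are hypothetical names, not part of MetadataCleanup.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

object RetentionSketch {
  // Assumed policy: compute (now - retention), then round it down to the
  // start of that UTC day. Anything modified before the result is expired.
  def cutoffMillis(nowMillis: Long, retentionMillis: Long): Long =
    Instant.ofEpochMilli(nowMillis - retentionMillis)
      .truncatedTo(ChronoUnit.DAYS)
      .toEpochMilli

  // A log file is eligible for cleanup only if it is older than the cutoff.
  def isExpired(fileModifiedMillis: Long, nowMillis: Long, retentionMillis: Long): Boolean =
    fileModifiedMillis < cutoffMillis(nowMillis, retentionMillis)
}
```

Under this assumption a file that is a few hours past the retention duration is still kept until the cutoff crosses into the next day.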
-
def
doLogCleanup(): Unit
- Definition Classes
- MetadataCleanup
-
def
enableExpiredLogCleanup: Boolean
Whether to clean up expired log files and checkpoints.
- Definition Classes
- MetadataCleanup
-
def
ensureLogDirectoryExist(): Unit
Creates the log directory if it does not exist.
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
findLastCompleteCheckpoint(cv: CheckpointInstance): Option[CheckpointInstance]
Finds the first verified, complete checkpoint before the given version.
- cv
The CheckpointVersion to compare against
- Attributes
- protected
- Definition Classes
- Checkpoints
-
def
getChanges(startVersion: Long): Iterator[(Long, Seq[Action])]
Get all actions starting from "startVersion" (inclusive). If startVersion doesn't exist, return an empty Iterator.
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getLatestCompleteCheckpointFromList(instances: Array[CheckpointInstance], notLaterThan: CheckpointInstance): Option[CheckpointInstance]
Given a list of checkpoint files, pick the latest complete checkpoint instance which is not later than notLaterThan.
- Attributes
- protected
- Definition Classes
- Checkpoints
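The selection described for getLatestCompleteCheckpointFromList can be illustrated with a simplified model. The sketch assumes that each checkpoint file yields one instance, and that a multi-part checkpoint is complete only when all of its parts are listed. Ckpt is a hypothetical stand-in for CheckpointInstance, and the bound is passed as a plain version number rather than an instance.

```scala
object CheckpointSketch {
  // Hypothetical simplified checkpoint instance: one value per checkpoint
  // file; numParts is None for a single-file checkpoint.
  final case class Ckpt(version: Long, numParts: Option[Int])

  // Picks the highest-versioned checkpoint at or below the bound for which
  // every expected part file is present.
  def latestComplete(files: Seq[Ckpt], notLaterThanVersion: Long): Option[Ckpt] =
    files
      .filter(_.version <= notLaterThanVersion)
      .groupBy(identity)
      .collect {
        // Complete when the number of listed files equals the expected parts.
        case (ckpt, parts) if parts.size == ckpt.numParts.getOrElse(1) => ckpt
      }
      .toSeq
      .sortBy(_.version)
      .lastOption
}
```

In this model a checkpoint with two of its three parts listed is skipped in favour of an older complete one.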
-
def
getLogSegmentForVersion(startCheckpoint: Option[Long], versionToLoad: Option[Long] = None): LogSegment
Get a list of files that can be used to compute a Snapshot at version versionToLoad. If versionToLoad is not provided, will generate the list of files that are needed to load the latest version of the Delta table. This method also performs checks to ensure that the delta files are contiguous.
- startCheckpoint
A potential start version to perform the listing of the DeltaLog, typically that of a known checkpoint. If this version is not provided, we will start listing from version 0.
- versionToLoad
A specific version to load. Typically used with time travel and the Delta streaming source. If not provided, we will try to load the latest version of the table.
- returns
Some LogSegment to build a Snapshot if files do exist after the given startCheckpoint. None, if there are no new files after startCheckpoint.
- Attributes
- protected
- Definition Classes
- SnapshotManagement
-
def
getLogSegmentFrom(startingCheckpoint: Option[CheckpointMetaData]): LogSegment
Get the LogSegment that will help in computing the Snapshot of the table at DeltaLog initialization.
- startingCheckpoint
A checkpoint that we can start our listing from
- Attributes
- protected
- Definition Classes
- SnapshotManagement
-
def
getSnapshotAt(version: Long, commitTimestamp: Option[Long] = None, lastCheckpointHint: Option[CheckpointInstance] = None): Snapshot
Get the snapshot at version.
- Definition Classes
- SnapshotManagement
-
def
getSnapshotAtInit: Snapshot
Load the Snapshot for this Delta table at initialization. This method uses the lastCheckpoint file as a hint on where to start listing the transaction log directory. If the _delta_log directory doesn't exist, this method will return an InitialSnapshot.
- Attributes
- protected
- Definition Classes
- SnapshotManagement
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
lazy val
history: DeltaHistoryManager
Delta History Manager containing version and commit history.
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isSameLogAs(otherLog: DeltaLog): Boolean
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
isValid(): Boolean
-
val
lastUpdateTimestamp: Long
The timestamp when the last successful update action is finished.
- Attributes
- protected
- Definition Classes
- SnapshotManagement
- Annotations
- @volatile()
-
def
lockInterruptibly[T](body: ⇒ T): T
Run body inside the deltaLogLock lock using lockInterruptibly so that the thread can be interrupted when waiting for the lock.
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logConsole(line: String): Unit
- Definition Classes
- DatabricksLogging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
val
logPath: Path
- Definition Classes
- DeltaLog → ReadChecksum → Checkpoints
-
val
logStoreClassConfKey: String
- Definition Classes
- LogStoreProvider
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
manuallyLoadCheckpoint(cv: CheckpointInstance): CheckpointMetaData
Loads the given checkpoint manually to come up with the CheckpointMetaData.
- Attributes
- protected
- Definition Classes
- Checkpoints
-
def
maxSnapshotLineageLength: Int
The maximum lineage length of a Snapshot before Delta forces a Snapshot to be built from scratch. Delta builds a Snapshot on top of the previous one if it doesn't see a checkpoint. However, there is a race condition: when two writers are writing at the same time, a writer may fail to pick up checkpoints written by the other, so the lineage keeps growing and eventually causes a StackOverflowError. Hence a Snapshot must be built from scratch once the lineage length becomes too large.
-
def
metadata: Metadata
- Attributes
- protected
- Definition Classes
- DeltaLog → Checkpoints
-
def
minFileRetentionTimestamp: Long
Tombstones before this timestamp will be dropped from the state and the files can be garbage collected.
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
protocolRead(protocol: Protocol): Unit
Asserts that the client is up to date with the protocol and allowed to read the table that is using the given protocol.
-
def
protocolWrite(protocol: Protocol, logUpgradeMessage: Boolean = true): Unit
Asserts that the client is up to date with the protocol and allowed to write to the table that is using the given protocol.
-
def
readChecksum(version: Long): Option[VersionChecksum]
- Attributes
- protected
- Definition Classes
- ReadChecksum
-
def
recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null): Unit
Used to record the occurrence of a single event or report detailed, operation-specific statistics.
- Attributes
- protected
- Definition Classes
- DeltaLogging
-
def
recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A
Used to report the duration as well as the success or failure of an operation.
- Attributes
- protected
- Definition Classes
- DeltaLogging
-
def
recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
- Definition Classes
- DatabricksLogging
-
def
recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = null, silent: Boolean = true)(thunk: ⇒ S): S
- Definition Classes
- DatabricksLogging
-
def
recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
- Definition Classes
- DatabricksLogging
-
def
snapshot: Snapshot
Returns the current snapshot. Note this does not automatically update().
- Definition Classes
- SnapshotManagement
-
def
spark: SparkSession
- Attributes
- protected
-
def
startTransaction(): OptimisticTransaction
Returns a new OptimisticTransaction that can be used to read the current state of the log and then commit updates. The reads and updates will be checked for logical conflicts with any concurrent writes to the log.
Note that all reads in a transaction must go through the returned transaction object, and not directly to the DeltaLog; otherwise they will not be checked for conflicts.
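The general idea of optimistic concurrency control can be shown with a toy model: a transaction remembers the version it read, and its commit succeeds only if nobody else has committed since. This is deliberately simplified and is not Delta's conflict detection, which checks for logical conflicts rather than rejecting every concurrent commit. ToyLog, ToyTxn, and commitAt are hypothetical names.

```scala
import java.util.concurrent.atomic.AtomicLong

object OccSketch {
  // Toy log that tracks only the latest committed version (-1 means empty).
  final class ToyLog {
    private val latest = new AtomicLong(-1L)

    def version: Long = latest.get()

    // A transaction pins the version that was current when it started.
    def startTransaction(): ToyTxn = new ToyTxn(this, version)

    // Atomically advances the log, but only if no other commit has landed
    // since `readVersion` was observed.
    def commitAt(readVersion: Long): Boolean =
      latest.compareAndSet(readVersion, readVersion + 1)
  }

  final class ToyTxn(log: ToyLog, val readVersion: Long) {
    // Returns false when a concurrent writer committed first.
    def commit(): Boolean = log.commitAt(readVersion)
  }
}
```

When two toy transactions start from the same version, the first commit wins and the second reports a conflict; a caller would then start a fresh transaction and try again.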
-
lazy val
store: LogStore
Used to read and write physical log files and checkpoints.
- Definition Classes
- DeltaLog → ReadChecksum → Checkpoints
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
tableId: String
The unique identifier for this table.
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
def
update(stalenessAcceptable: Boolean = false): Snapshot
Update ActionLog by applying the new delta files if any.
- stalenessAcceptable
Whether we can accept working with a stale version of the table. If the table has surpassed our staleness tolerance, we will update to the latest state of the table synchronously. If staleness is acceptable, and the table hasn't passed the staleness tolerance, we will kick off a job in the background to update the table state, and can return a stale snapshot in the meantime.
- Definition Classes
- SnapshotManagement
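The decision described for stalenessAcceptable can be written out as a small pure function. This is a sketch of the documented behaviour only: the parameter toleranceMillis and the Decision type are hypothetical, and how the real staleness tolerance is configured is not stated here.

```scala
object StalenessSketch {
  sealed trait Decision
  case object UpdateSynchronously extends Decision
  case object ReturnStaleAndRefreshInBackground extends Decision

  // A stale snapshot may be returned only when the caller accepts staleness
  // AND the last successful update is still within the tolerance window.
  def decide(
      stalenessAcceptable: Boolean,
      lastUpdateMillis: Long,
      nowMillis: Long,
      toleranceMillis: Long): Decision = {
    val withinTolerance = nowMillis - lastUpdateMillis < toleranceMillis
    if (stalenessAcceptable && withinTolerance) ReturnStaleAndRefreshInBackground
    else UpdateSynchronously
  }
}
```

With the default stalenessAcceptable = false, the function always chooses the synchronous update.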
-
def
updateInternal(isAsync: Boolean): Snapshot
Queries the store for new delta files and applies them to the current state. Note: the caller should hold deltaLogLock before calling this method.
- Attributes
- protected
- Definition Classes
- SnapshotManagement
-
def
upgradeProtocol(newVersion: Protocol = Protocol()): Unit
Upgrade the table's protocol version, by default to the maximum recognized reader and writer versions in this DBR release.
-
def
verifyDeltaVersions(versions: Array[Long]): Unit
Verify the versions are contiguous.
- Attributes
- protected
- Definition Classes
- SnapshotManagement
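A contiguity check of the kind verifyDeltaVersions describes could look like the following. It is a minimal sketch that assumes "contiguous" means each version is exactly one greater than the previous; ContiguitySketch and its methods are hypothetical names, and the exception type is an arbitrary choice for illustration.

```scala
object ContiguitySketch {
  // True when every version is exactly one greater than its predecessor.
  // Empty and single-element arrays are trivially contiguous.
  def isContiguous(versions: Array[Long]): Boolean =
    versions.isEmpty ||
      versions.zip(versions.tail).forall { case (prev, next) => next == prev + 1 }

  // Throws if a gap is found, mirroring a verify-style method returning Unit.
  def verify(versions: Array[Long]): Unit =
    if (!isContiguous(versions)) {
      throw new IllegalStateException(
        s"Versions are not contiguous: ${versions.mkString(", ")}")
    }
}
```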
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
withNewTransaction[T](thunk: (OptimisticTransaction) ⇒ T): T
Execute a piece of code within a new OptimisticTransaction. Read/write sets will be recorded for this table, and all other tables will be read at a snapshot that is pinned on the first access.
- Note
This uses a thread-local variable to make the active transaction visible, so do not use multi-threaded code in the provided thunk.
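The reason for the multi-threading warning can be demonstrated with a plain java.lang.ThreadLocal. The sketch below is an illustration of thread-local visibility in general, not the DeltaLog implementation: the transaction is represented by a String, and all names in ThreadLocalSketch are hypothetical.

```scala
object ThreadLocalSketch {
  // Hypothetical holder for the "active transaction" of the current thread.
  private val active = new ThreadLocal[Option[String]] {
    override def initialValue(): Option[String] = None
  }

  // Makes `name` the active transaction for the duration of `thunk`,
  // on the calling thread only.
  def withNewTransaction[T](name: String)(thunk: => T): T = {
    active.set(Some(name))
    try thunk
    finally active.remove()
  }

  def activeTransaction: Option[String] = active.get()

  // Reads the active transaction from a freshly started thread.
  def activeSeenFromOtherThread(): Option[String] = {
    var seen: Option[String] = Some("unset")
    val t = new Thread(new Runnable {
      def run(): Unit = { seen = activeTransaction }
    })
    t.start()
    t.join() // join makes the write to `seen` visible to this thread
    seen
  }
}
```

Code running on another thread inside the thunk sees no active transaction, so work done there would bypass it.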
-
def
withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T
Report a log to indicate some command is running.
- Definition Classes
- DeltaProgressReporter