class DeltaLog extends Checkpoints with MetadataCleanup with LogStoreProvider with SnapshotManagement with ReadChecksum

Used to query the current state of the log as well as modify it by adding new atomic collections of actions.

Internally, this class implements an optimistic concurrency control algorithm to handle multiple readers or writers. Any single read is guaranteed to see a consistent snapshot of the table.
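
The optimistic approach described above can be illustrated with a minimal, self-contained sketch. It is not DeltaLog's implementation: `OccSketch`, `InMemoryLog`, `LogState`, and `tryCommit` are hypothetical names, and the log is modelled as an in-memory value rather than files in a log directory. A writer records the version it read and its commit is accepted only if no other commit has landed since.

```scala
import java.util.concurrent.atomic.AtomicReference

// Hypothetical in-memory stand-in for a transaction log; not part of Delta Lake.
object OccSketch {
  // An immutable view of the log: a version number plus all committed actions.
  final case class LogState(version: Long, actions: Vector[String])

  final class InMemoryLog {
    // Version -1 means that nothing has been committed yet.
    private val state = new AtomicReference(LogState(-1L, Vector.empty))

    // Readers always get one immutable, internally consistent state.
    def snapshot: LogState = state.get()

    // The commit succeeds only if nobody else committed since `readVersion`.
    def tryCommit(readVersion: Long, newActions: Seq[String]): Boolean = {
      val current = state.get()
      current.version == readVersion &&
        state.compareAndSet(current, LogState(readVersion + 1, current.actions ++ newActions))
    }
  }
}
```

A writer whose `tryCommit` returns false would typically re-read the snapshot, check whether the concurrent changes conflict with its own, and retry.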

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. val LAST_CHECKPOINT: Path

    The path to the file that holds metadata about the most recent checkpoint.

    Definition Classes
    Checkpoints
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def assertRemovable(): Unit

    Checks whether this table only accepts appends. If so, it throws an error in operations that can remove data, such as DELETE, UPDATE, or MERGE.

  7. def checkpoint(snapshotToCheckpoint: Snapshot): CheckpointMetaData
    Attributes
    protected
    Definition Classes
    Checkpoints
  8. def checkpoint(): Unit

    Creates a checkpoint at the current log version.

    Definition Classes
    Checkpoints
  9. def checkpointInterval: Int

    Returns the checkpoint interval for this log. Not transactional.

  10. val clock: Clock
  11. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  12. def createDataFrame(snapshot: Snapshot, addFiles: Seq[AddFile], isStreaming: Boolean = false, actionTypeOpt: Option[String] = None): DataFrame

    Returns an org.apache.spark.sql.DataFrame containing the new files within the specified version range.

  13. def createLogStore(sparkConf: SparkConf, hadoopConf: Configuration): LogStore
    Definition Classes
    LogStoreProvider
  14. def createLogStore(spark: SparkSession): LogStore
    Definition Classes
    LogStoreProvider
  15. def createRelation(partitionFilters: Seq[Expression] = Nil, timeTravel: Option[DeltaTimeTravelSpec] = None): BaseRelation

    Returns a BaseRelation that contains all of the data present in the table. This relation will be continually updated as files are added or removed from the table. However, a new BaseRelation must be requested in order to see changes to the schema.

  16. def createSnapshot(segment: LogSegment, minFileRetentionTimestamp: Long, timestamp: Long): Snapshot
    Attributes
    protected
    Definition Classes
    SnapshotManagement
  17. val currentSnapshot: Snapshot
    Attributes
    protected
    Definition Classes
    SnapshotManagement
    Annotations
    @volatile()
  18. val dataPath: Path
    Definition Classes
    DeltaLog → Checkpoints
  19. val defaultLogStoreClass: String
    Definition Classes
    LogStoreProvider
  20. val deltaLogLock: ReentrantLock

    Uses a ReentrantLock so that we can call lockInterruptibly.

    Attributes
    protected
  21. def deltaRetentionMillis: Long

    Returns the duration in millis for how long to keep around obsolete logs. We may keep logs beyond this duration until the next calendar day to avoid constantly creating checkpoints.

    Definition Classes
    MetadataCleanup
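
One way to read the "until the next calendar day" remark is that the cleanup cutoff is rounded down to the start of a day. The sketch below shows that arithmetic only. `RetentionSketch` and `logCutoff` are hypothetical names, and the choice of UTC day boundaries is an assumption of this sketch rather than something stated in this documentation.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

// Hypothetical helper; not part of Delta Lake.
object RetentionSketch {
  // Logs strictly older than the returned instant are treated as expired.
  // Rounding down to the start of the UTC day (an assumption) means that a log
  // can outlive the retention duration by up to one day.
  def logCutoff(nowMillis: Long, retentionMillis: Long): Instant =
    Instant.ofEpochMilli(nowMillis - retentionMillis).truncatedTo(ChronoUnit.DAYS)
}
```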
  22. def doLogCleanup(): Unit
    Definition Classes
    MetadataCleanup
  23. def enableExpiredLogCleanup: Boolean

    Whether to clean up expired log files and checkpoints.

    Definition Classes
    MetadataCleanup
  24. def ensureLogDirectoryExist(): Unit

    Creates the log directory if it does not exist.

  25. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  26. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  27. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  28. def findLastCompleteCheckpoint(cv: CheckpointInstance): Option[CheckpointInstance]

    Finds the first verified, complete checkpoint before the given version.

    cv

    The CheckpointVersion to compare against

    Attributes
    protected
    Definition Classes
    Checkpoints
  29. def getChanges(startVersion: Long): Iterator[(Long, Seq[Action])]

    Get all actions starting from "startVersion" (inclusive). If startVersion doesn't exist, return an empty Iterator.
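
The contract above (inclusive start, empty result when the start version is missing) can be sketched without Spark. `GetChangesSketch` and the `commits` map are hypothetical: the map stands in for the delta files in the log directory, and actions are modelled as plain strings. The sketch also assumes that versions are contiguous, so iteration stops at the first missing version.

```scala
// Hypothetical model; not part of Delta Lake.
object GetChangesSketch {
  // `commits` maps a version number to the actions committed at that version.
  def getChanges(
      commits: Map[Long, Seq[String]],
      startVersion: Long): Iterator[(Long, Seq[String])] =
    Iterator.iterate(startVersion)(_ + 1)
      .takeWhile(commits.contains) // stops immediately if startVersion is absent
      .map(v => v -> commits(v))
}
```

Because the result is a lazy Iterator, later versions are looked up only as the caller consumes them.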

  30. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  31. def getLatestCompleteCheckpointFromList(instances: Array[CheckpointInstance], notLaterThan: CheckpointInstance): Option[CheckpointInstance]

    Given a list of checkpoint files, pick the latest complete checkpoint instance which is not later than notLaterThan.

    Attributes
    protected
    Definition Classes
    Checkpoints
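
A minimal sketch of this selection, under simplifying assumptions: `CheckpointPickSketch` and `CheckpointFile` are hypothetical, each listed file is modelled as one `CheckpointFile`, a multi-part checkpoint counts as complete only when all of its parts are present, and `notLaterThan` is reduced to a plain version number.

```scala
// Hypothetical model; not part of Delta Lake.
object CheckpointPickSketch {
  // One instance per checkpoint file; numParts is None for a single-file checkpoint.
  final case class CheckpointFile(version: Long, numParts: Option[Int])

  def latestComplete(files: Seq[CheckpointFile], notLaterThan: Long): Option[CheckpointFile] =
    files
      .filter(_.version <= notLaterThan)
      .groupBy(identity) // all files that belong to the same checkpoint
      .toSeq
      .collect { case (ckpt, parts) if parts.size == ckpt.numParts.getOrElse(1) => ckpt }
      .sortBy(_.version)
      .lastOption
}
```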
  32. def getLogSegmentForVersion(startCheckpoint: Option[Long], versionToLoad: Option[Long] = None): LogSegment

    Get a list of files that can be used to compute a Snapshot at version versionToLoad. If versionToLoad is not provided, this generates the list of files needed to load the latest version of the Delta table. This method also checks that the delta files are contiguous.

    startCheckpoint

    A potential start version to perform the listing of the DeltaLog, typically that of a known checkpoint. If this version is not provided, we will start listing from version 0.

    versionToLoad

    A specific version to load. Typically used with time travel and the Delta streaming source. If not provided, we will try to load the latest version of the table.

    returns

    Some LogSegment to build a Snapshot if files do exist after the given startCheckpoint. None, if there are no new files after startCheckpoint.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  33. def getLogSegmentFrom(startingCheckpoint: Option[CheckpointMetaData]): LogSegment

    Get the LogSegment that will help in computing the Snapshot of the table at DeltaLog initialization.

    startingCheckpoint

    A checkpoint that we can start our listing from

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  34. def getSnapshotAt(version: Long, commitTimestamp: Option[Long] = None, lastCheckpointHint: Option[CheckpointInstance] = None): Snapshot

    Get the snapshot at version.

    Definition Classes
    SnapshotManagement
  35. def getSnapshotAtInit: Snapshot

    Load the Snapshot for this Delta table at initialization. This method uses the lastCheckpoint file as a hint on where to start listing the transaction log directory. If the _delta_log directory doesn't exist, this method will return an InitialSnapshot.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  36. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  37. lazy val history: DeltaHistoryManager

    Delta History Manager containing version and commit history.

  38. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  39. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  40. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  41. def isSameLogAs(otherLog: DeltaLog): Boolean
  42. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  43. def isValid(): Boolean
  44. val lastUpdateTimestamp: Long

    The timestamp when the last successful update action is finished.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
    Annotations
    @volatile()
  45. def lockInterruptibly[T](body: ⇒ T): T

    Run body inside deltaLogLock lock using lockInterruptibly so that the thread can be interrupted when waiting for the lock.
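
The pattern can be sketched with the standard java.util.concurrent API. This is an illustration rather than the DeltaLog source; `LockSketch` and its private `lock` field are hypothetical stand-ins for the class and its deltaLogLock.

```scala
import java.util.concurrent.locks.ReentrantLock

// Hypothetical holder object; not part of Delta Lake.
object LockSketch {
  private val lock = new ReentrantLock()

  // Waits for the lock, but throws InterruptedException if the waiting thread
  // is interrupted. The lock is released even if `body` throws.
  def lockInterruptibly[T](body: => T): T = {
    lock.lockInterruptibly()
    try body finally lock.unlock()
  }
}
```

Because the lock is reentrant, a thread that already holds it can call `lockInterruptibly` again without deadlocking.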

  46. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  47. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  48. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  49. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  50. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  51. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  52. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  53. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  54. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  55. val logPath: Path
    Definition Classes
    DeltaLog → ReadChecksum → Checkpoints
  56. val logStoreClassConfKey: String
    Definition Classes
    LogStoreProvider
  57. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  58. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  59. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  60. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  61. def manuallyLoadCheckpoint(cv: CheckpointInstance): CheckpointMetaData

    Loads the given checkpoint manually to come up with the CheckpointMetaData

    Attributes
    protected
    Definition Classes
    Checkpoints
  62. def maxSnapshotLineageLength: Int

    The maximum lineage length of a Snapshot before Delta forces a Snapshot to be built from scratch. Delta builds a Snapshot on top of the previous one if it doesn't see a checkpoint. However, there is a race condition: when two writers are writing at the same time, a writer may fail to pick up checkpoints written by the other, so the lineage keeps growing and eventually causes a StackOverflowError. Hence a Snapshot must be built from scratch once the lineage length becomes too large.

  63. def metadata: Metadata
    Attributes
    protected
    Definition Classes
    DeltaLog → Checkpoints
  64. def minFileRetentionTimestamp: Long

    Tombstones before this timestamp will be dropped from the state and the files can be garbage collected.
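
As a rough illustration of how such a threshold is applied, the sketch below filters a list of tombstones. `TombstoneSketch`, `Tombstone`, and `retained` are hypothetical names, and treating a tombstone exactly at the threshold as retained is an assumption of this sketch.

```scala
// Hypothetical model; not part of Delta Lake.
object TombstoneSketch {
  // A record that the file at `path` was logically removed at `deletionTimestamp`.
  final case class Tombstone(path: String, deletionTimestamp: Long)

  // Keeps only tombstones that are not older than the threshold; the files of
  // the dropped ones become candidates for garbage collection.
  def retained(tombstones: Seq[Tombstone], minFileRetentionTimestamp: Long): Seq[Tombstone] =
    tombstones.filter(_.deletionTimestamp >= minFileRetentionTimestamp)
}
```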

  65. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  66. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  67. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  68. def protocolRead(protocol: Protocol): Unit

    Asserts that the client is up to date with the protocol and allowed to read the table that is using the given protocol.

  69. def protocolWrite(protocol: Protocol, logUpgradeMessage: Boolean = true): Unit

    Asserts that the client is up to date with the protocol and allowed to write to the table that is using the given protocol.

  70. def readChecksum(version: Long): Option[VersionChecksum]
    Attributes
    protected
    Definition Classes
    ReadChecksum
  71. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null): Unit

    Used to record the occurrence of a single event or report detailed, operation specific statistics.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  72. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  73. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  74. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = null, silent: Boolean = true)(thunk: ⇒ S): S
    Definition Classes
    DatabricksLogging
  75. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  76. def snapshot: Snapshot

    Returns the current snapshot. Note this does not automatically update().

    Definition Classes
    SnapshotManagement
  77. def spark: SparkSession
    Attributes
    protected
  78. def startTransaction(): OptimisticTransaction

    Returns a new OptimisticTransaction that can be used to read the current state of the log and then commit updates. The reads and updates will be checked for logical conflicts with any concurrent writes to the log.

    Note that all reads in a transaction must go through the returned transaction object, and not directly to the DeltaLog; otherwise they will not be checked for conflicts.

  79. lazy val store: LogStore

    Used to read and write physical log files and checkpoints.

    Definition Classes
    DeltaLog → ReadChecksum → Checkpoints
  80. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  81. def tableId: String

    The unique identifier for this table.

  82. def toString(): String
    Definition Classes
    AnyRef → Any
  83. def update(stalenessAcceptable: Boolean = false): Snapshot

    Update ActionLog by applying the new delta files if any.

    stalenessAcceptable

    Whether we can accept working with a stale version of the table. If the table has surpassed our staleness tolerance, we will update to the latest state of the table synchronously. If staleness is acceptable, and the table hasn't passed the staleness tolerance, we will kick off a job in the background to update the table state, and can return a stale snapshot in the meantime.

    Definition Classes
    SnapshotManagement
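
The decision described for stalenessAcceptable can be written as a small pure function. This is a sketch of the rule as stated above, not the actual implementation: `StalenessSketch`, `UpdateMode`, and `chooseMode` are hypothetical, and the timestamps and tolerance are passed in explicitly instead of being read from a clock or configuration.

```scala
// Hypothetical model; not part of Delta Lake.
object StalenessSketch {
  sealed trait UpdateMode
  // Block the caller until the latest table state has been loaded.
  case object SyncUpdate extends UpdateMode
  // Return the current (possibly stale) snapshot and refresh in the background.
  case object AsyncUpdateReturnStale extends UpdateMode

  def chooseMode(
      stalenessAcceptable: Boolean,
      lastUpdateMs: Long,
      nowMs: Long,
      toleranceMs: Long): UpdateMode =
    if (stalenessAcceptable && nowMs - lastUpdateMs < toleranceMs) AsyncUpdateReturnStale
    else SyncUpdate
}
```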
  84. def updateInternal(isAsync: Boolean): Snapshot

    Queries the store for new delta files and applies them to the current state. Note: the caller should hold deltaLogLock before calling this method.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  85. def upgradeProtocol(newVersion: Protocol = Protocol()): Unit

    Upgrade the table's protocol version, by default to the maximum recognized reader and writer versions in this DBR release.

  86. def verifyDeltaVersions(versions: Array[Long]): Unit

    Verify the versions are contiguous.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
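
A contiguity check of this kind can be sketched in a few lines. `ContiguitySketch` is a hypothetical name, and the exception type and message are assumptions of this sketch. The input is assumed to be sorted in ascending order.

```scala
// Hypothetical model; not part of Delta Lake.
object ContiguitySketch {
  // Throws if any two neighbouring versions differ by anything other than 1.
  def verifyDeltaVersions(versions: Array[Long]): Unit =
    versions.sliding(2).foreach {
      case Array(prev, next) if next != prev + 1 =>
        throw new IllegalStateException(
          s"Versions are not contiguous: $prev is followed by $next")
      case _ => // a single element or a correctly ordered pair
    }
}
```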
  87. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  88. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  89. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  90. def withNewTransaction[T](thunk: (OptimisticTransaction) ⇒ T): T

    Execute a piece of code within a new OptimisticTransaction. Read/write sets will be recorded for this table, and all other tables will be read at a snapshot that is pinned on the first access.

    Note

    This uses a thread-local variable to make the active transaction visible, so do not use multi-threaded code in the provided thunk.
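
The reason for this note can be seen in a small sketch of the thread-local pattern. It is not the DeltaLog source: `ActiveTxnSketch`, `Txn`, and `getActive` are hypothetical names. The active transaction is stored per thread, so code that the thunk runs on another thread does not see it.

```scala
// Hypothetical model; not part of Delta Lake.
object ActiveTxnSketch {
  final class Txn(val id: Int)

  // Each thread has its own slot, which is initially empty.
  private val active = new ThreadLocal[Option[Txn]] {
    override def initialValue(): Option[Txn] = None
  }

  def getActive: Option[Txn] = active.get()

  // Makes a new transaction visible to the calling thread while `thunk` runs,
  // and clears it afterwards even if `thunk` throws.
  def withNewTransaction[T](thunk: Txn => T): T = {
    val txn = new Txn(1)
    active.set(Some(txn))
    try thunk(txn) finally active.set(None)
  }
}
```

Inside the thunk, `getActive` returns the transaction only on the calling thread. A thread started from within the thunk gets an empty slot.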

  91. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T

    Report a log to indicate some command is running.

    Definition Classes
    DeltaProgressReporter

Inherited from ReadChecksum

Inherited from SnapshotManagement

Inherited from LogStoreProvider

Inherited from MetadataCleanup

Inherited from Checkpoints

Inherited from DeltaLogging

Inherited from DatabricksLogging

Inherited from DeltaProgressReporter

Inherited from Logging

Inherited from AnyRef

Inherited from Any
