class DeltaLog extends Checkpoints with MetadataCleanup with LogStoreProvider with SnapshotManagement with DeltaFileFormat with ProvidesUniFormConverters with ReadChecksum

Used to query the current state of the log as well as modify it by adding new atomic collections of actions.

Internally, this class implements an optimistic concurrency control algorithm to handle multiple readers or writers. Any single read is guaranteed to see a consistent snapshot of the table.
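The optimistic scheme described above can be illustrated with a toy in-memory log. This is only a sketch under stated assumptions: `ToyLog`, `tryCommit`, and `commitWithRetry` are hypothetical names, and the real DeltaLog additionally checks for logical conflicts between transactions rather than simply retrying.

```scala
import java.util.concurrent.ConcurrentHashMap

// Toy in-memory log mapping version -> committed actions.
// Hypothetical illustration only; this is NOT Delta's implementation.
final class ToyLog {
  private val commits = new ConcurrentHashMap[Long, Seq[String]]()

  // Latest committed version, or -1 if the log is empty.
  def latestVersion: Long = {
    var v = -1L
    while (commits.containsKey(v + 1)) v += 1
    v
  }

  // Atomically claim `version`; returns false if another writer got there first.
  def tryCommit(version: Long, actions: Seq[String]): Boolean =
    commits.putIfAbsent(version, actions) == null

  // Optimistic loop: read the latest version, attempt version + 1, retry on conflict.
  def commitWithRetry(actions: Seq[String], maxAttempts: Int = 10): Option[Long] = {
    var attempt = 0
    while (attempt < maxAttempts) {
      val target = latestVersion + 1
      if (tryCommit(target, actions)) return Some(target)
      attempt += 1
    }
    None
  }
}
```

Because each version can be claimed only once, a reader that loads versions 0 through N always sees a consistent prefix of the log.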

Type Members

  1. class SidecarDeletionMetrics extends AnyRef

    Class to track metrics related to V2 Checkpoint Sidecars deletion.

    Attributes
    protected
    Definition Classes
    MetadataCleanup
  2. class V2CompatCheckpointMetrics extends AnyRef

    Class to track metrics related to V2 Compatibility checkpoint creation.

    Attributes
    protected[delta]
    Definition Classes
    MetadataCleanup

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. val LAST_CHECKPOINT: Path

    The path to the file that holds metadata about the most recent checkpoint.

    Definition Classes
    Checkpoints
  5. lazy val _hudiConverter: UniversalFormatConverter
    Attributes
    protected
    Definition Classes
    ProvidesUniFormConverters
  6. lazy val _icebergConverter: UniversalFormatConverter

    Helper trait to instantiate the icebergConverter member variable of the DeltaLog. We do this through reflection so that delta-spark doesn't have a compile-time dependency on the shaded iceberg module.

    Attributes
    protected
    Definition Classes
    ProvidesUniFormConverters
  7. val allOptions: Map[String, String]
  8. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  9. def assertTableFeaturesMatchMetadata(targetProtocol: Protocol, targetMetadata: Metadata): Unit

    Asserts that the table's protocol enabled all features that are active in the metadata.

    A mismatch shouldn't happen when the table has gone through a proper write process because we require all active features during writes. However, other clients may void this guarantee.

  10. def buildHadoopFsRelationWithFileIndex(snapshot: SnapshotDescriptor, fileIndex: TahoeFileIndex, bucketSpec: Option[BucketSpec]): HadoopFsRelation
  11. def checkLogStoreConfConflicts(sparkConf: SparkConf): Unit
    Definition Classes
    LogStoreProvider
  12. def checkRequiredConfigurations(): Unit

    Verifies the required Spark configuration for Delta. Throws a DeltaErrors.configureSparkSessionWithExtensionAndCatalog exception if the spark.sql.catalog.spark_catalog config is missing. We do not check for spark.sql.extensions because DeltaSparkSessionExtension can alternatively be activated using the .withExtension() API. This check can be disabled by setting DELTA_CHECK_REQUIRED_SPARK_CONF to false.

    Attributes
    protected
  13. def checkpoint(snapshotToCheckpoint: Snapshot): Unit

    Creates a checkpoint using snapshotToCheckpoint. By default it uses the current log version. Note that this function captures and logs all exceptions, since the checkpoint shouldn't fail the overall commit operation.

    Definition Classes
    Checkpoints
  14. def checkpointAndCleanUpDeltaLog(snapshotToCheckpoint: Snapshot): Unit
    Definition Classes
    Checkpoints
  15. def checkpointInterval(metadata: Metadata): Int

    Returns the checkpoint interval for this log. Not transactional.

    Definition Classes
    Checkpoints
  16. val clock: Clock
  17. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  18. def createCheckpointAtVersion(version: Long): Unit

    Creates a checkpoint at given version. Does not invoke metadata cleanup as part of it.

    version

    - version at which we want to create a checkpoint.

    Definition Classes
    Checkpoints
  19. def createDataFrame(snapshot: SnapshotDescriptor, addFiles: Seq[AddFile], isStreaming: Boolean = false, actionTypeOpt: Option[String] = None): DataFrame

    Returns an org.apache.spark.sql.DataFrame containing the new files within the specified version range.

  20. def createLogDirectory(): Unit

    Create the log directory. Unlike ensureLogDirectoryExist, this method doesn't check whether the log directory exists and it will ignore the return value of mkdirs.

  21. def createLogSegment(versionToLoad: Option[Long] = None, oldCheckpointProviderOpt: Option[UninitializedCheckpointProvider] = None, tableCommitOwnerClientOpt: Option[TableCommitOwnerClient] = None, lastCheckpointInfo: Option[LastCheckpointInfo] = None): Option[LogSegment]

    Gets a list of files that can be used to compute a Snapshot at version versionToLoad. If versionToLoad is not provided, generates the list of files needed to load the latest version of the Delta table. This method also checks that the delta files are contiguous.

    versionToLoad

    A specific version to load. Typically used with time travel and the Delta streaming source. If not provided, we will try to load the latest version of the table.

    oldCheckpointProviderOpt

    The CheckpointProvider from the previous snapshot. This is used as a start version for the listing when startCheckpoint is unavailable. This is also used to initialize the LogSegment.

    lastCheckpointInfo

    LastCheckpointInfo from the _last_checkpoint. This could be used to initialize the Snapshot's LogSegment.

    returns

    Some LogSegment to build a Snapshot if files do exist after the given startCheckpoint. None, if the directory was missing or empty.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
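    The contiguity check mentioned for createLogSegment can be sketched on plain version numbers. This is a simplified model, not the actual implementation: `LogSegmentSketch` and `selectDeltas` are hypothetical names, and real log segments carry FileStatus entries and checkpoint providers rather than bare Longs.

```scala
object LogSegmentSketch {
  // Returns the delta versions needed on top of `checkpointVersion` (-1 if there
  // is no checkpoint) to reach `versionToLoad` (or the latest listed version),
  // or an error message if those versions are not contiguous.
  def selectDeltas(
      listed: Seq[Long],
      checkpointVersion: Long,
      versionToLoad: Option[Long]): Either[String, Seq[Long]] = {
    val upper = versionToLoad.getOrElse(Long.MaxValue)
    val deltas = listed.filter(v => v > checkpointVersion && v <= upper).sorted
    // A contiguous run must be exactly checkpointVersion + 1, + 2, ...
    val expected = deltas.indices.map(i => checkpointVersion + 1 + i)
    if (deltas == expected) Right(deltas)
    else Left(s"Versions are not contiguous: ${deltas.mkString(", ")}")
  }
}
```

    A gap such as versions 4 and 6 after a checkpoint at 3 yields a Left, mirroring the idea that a snapshot cannot be reconstructed when a commit file is missing.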
  22. def createLogStore(sparkConf: SparkConf, hadoopConf: Configuration): LogStore
    Definition Classes
    LogStoreProvider
  23. def createLogStore(spark: SparkSession): LogStore
    Definition Classes
    LogStoreProvider
  24. def createRelation(partitionFilters: Seq[Expression] = Nil, snapshotToUseOpt: Option[Snapshot] = None, catalogTableOpt: Option[CatalogTable] = None, isTimeTravelQuery: Boolean = false): BaseRelation

    Returns a BaseRelation that contains all of the data present in the table. This relation will be continually updated as files are added or removed from the table. However, a new BaseRelation must be requested in order to see changes to the schema.

  25. def createSinglePartCheckpointForBackwardCompat(snapshotToCleanup: Snapshot, metrics: V2CompatCheckpointMetrics): Unit

    Helper method to create a compatibility classic single-file checkpoint for this table. This is needed so that legacy readers that do not understand V2CheckpointTableFeature can read the classic checkpoint file and fail gracefully with a protocol requirement failure.

    Attributes
    protected[delta]
    Definition Classes
    MetadataCleanup
  26. def createSnapshot(initSegment: LogSegment, tableCommitOwnerClientOpt: Option[TableCommitOwnerClient], checksumOpt: Option[VersionChecksum]): Snapshot
    Attributes
    protected
    Definition Classes
    SnapshotManagement
  27. def createSnapshotAfterCommit(initSegment: LogSegment, newChecksumOpt: Option[VersionChecksum], tableCommitOwnerClientOpt: Option[TableCommitOwnerClient], committedVersion: Long): Snapshot

    Creates a snapshot for a new delta commit.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  28. def createSnapshotFromGivenOrEquivalentLogSegment(initSegment: LogSegment, tableCommitOwnerClientOpt: Option[TableCommitOwnerClient])(snapshotCreator: (LogSegment) => Snapshot): Snapshot

    Creates a Snapshot from the given LogSegment. If snapshot creation fails, we search for an equivalent LogSegment using a different checkpoint and retry up to DeltaSQLConf.DELTA_SNAPSHOT_LOADING_MAX_RETRIES times.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  29. val currentSnapshot: CapturedSnapshot
    Attributes
    protected
    Definition Classes
    SnapshotManagement
    Annotations
    @volatile()
  30. val dataPath: Path
    Definition Classes
    DeltaLog → Checkpoints
  31. val defaultLogStoreClass: String
    Definition Classes
    LogStoreProvider
  32. def deltaAssert(check: => Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit

    Helper method to check invariants in Delta code. Fails when running in tests, records a delta assertion event and logs a warning otherwise.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  33. def deltaRetentionMillis(metadata: Metadata): Long

    Returns the duration in millis for how long to keep around obsolete logs. We may keep logs beyond this duration until the next calendar day to avoid constantly creating checkpoints.

    Definition Classes
    MetadataCleanup
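    One way to read "until the next calendar day" is that the expiry cutoff is truncated to a day boundary. The sketch below assumes UTC day truncation; `RetentionSketch`, `cutoffMillis`, and `isExpired` are hypothetical names, and the exact granularity used by Delta is not stated in this page.

```scala
import java.time.Instant
import java.time.temporal.ChronoUnit

object RetentionSketch {
  // Cutoff = (now - retention), rounded DOWN to the start of its UTC day.
  // Assumption: day-level truncation; the real granularity may differ.
  def cutoffMillis(nowMillis: Long, retentionMillis: Long): Long =
    Instant.ofEpochMilli(nowMillis - retentionMillis)
      .truncatedTo(ChronoUnit.DAYS)
      .toEpochMilli

  // A log file is considered expired only if it is older than the truncated cutoff.
  def isExpired(fileModificationMillis: Long, nowMillis: Long, retentionMillis: Long): Boolean =
    fileModificationMillis < cutoffMillis(nowMillis, retentionMillis)
}
```

    With this rounding, a file slightly older than the retention duration survives until the cutoff crosses into the next day, so cleanup work is batched rather than triggered continuously.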
  34. def doLogCleanup(snapshotToCleanup: Snapshot): Unit
    Definition Classes
    MetadataCleanup
  35. def enableExpiredLogCleanup(metadata: Metadata): Boolean

    Whether to clean up expired log files and checkpoints.

    Definition Classes
    MetadataCleanup
  36. def ensureLogDirectoryExist(): Unit

    Creates the log directory if it does not exist.

  37. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  38. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  39. def fileFormat(protocol: Protocol, metadata: Metadata): FileFormat

    Build the underlying Spark FileFormat of the Delta table with specified metadata.

    With column mapping, some properties of the underlying file format might change during transaction, so if possible, we should always pass in the latest transaction's metadata instead of one from a past snapshot.

    Definition Classes
    DeltaFileFormat
  40. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable])
  41. def findEarliestReliableCheckpoint: Option[Long]

    Finds a checkpoint such that we are able to construct a table snapshot for all versions at or greater than the checkpoint version returned.

    Definition Classes
    MetadataCleanup
  42. def findLastCompleteCheckpointBefore(checkpointInstance: Option[CheckpointInstance] = None): Option[CheckpointInstance]

    Finds the first verified, complete checkpoint before the given CheckpointInstance. If checkpointInstance is passed as None, then we return the last complete checkpoint in the deltalog directory.

    checkpointInstance

    The checkpoint instance to compare against

    Attributes
    protected
    Definition Classes
    Checkpoints
  43. def findLastCompleteCheckpointBefore(version: Long): Option[CheckpointInstance]

    Finds the first verified, complete checkpoint before the given version. Note that the returned checkpoint will always be < version.

    version

    The checkpoint version to compare against

    Attributes
    protected
    Definition Classes
    Checkpoints
  44. def getChangeLogFiles(startVersion: Long, failOnDataLoss: Boolean = false): Iterator[(Long, FileStatus)]

    Get access to all actions starting from "startVersion" (inclusive) via FileStatus. If startVersion doesn't exist, return an empty Iterator. Callers are encouraged to use the other override which takes the endVersion if available to avoid I/O and improve performance of this method.

  45. def getChanges(startVersion: Long, failOnDataLoss: Boolean = false): Iterator[(Long, Seq[Action])]

    Get all actions starting from "startVersion" (inclusive). If startVersion doesn't exist, return an empty Iterator. Callers are encouraged to use the other override which takes the endVersion if available to avoid I/O and improve performance of this method.

  46. def getCheckpointVersion(lastCheckpointInfoOpt: Option[LastCheckpointInfo], oldCheckpointProviderOpt: Option[UninitializedCheckpointProvider]): Long

    Returns the last known checkpoint version based on LastCheckpointInfo or CheckpointProvider. Returns -1 if neither is available.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  47. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  48. def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
    Definition Classes
    DeltaLogging
  49. def getErrorData(e: Throwable): Map[String, Any]
    Definition Classes
    DeltaLogging
  50. def getLatestCompleteCheckpointFromList(instances: Array[CheckpointInstance], notLaterThanVersion: Option[Long] = None): Option[CheckpointInstance]

    Given a list of checkpoint files, pick the latest complete checkpoint instance which is not later than notLaterThan.

    Attributes
    protected[delta]
    Definition Classes
    Checkpoints
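    Picking the latest complete checkpoint can be sketched as follows. This is a simplified model under assumptions: `Part` is a hypothetical stand-in for CheckpointInstance, a multi-part checkpoint counts as complete only when all of its parts are present, and V2 checkpoints are not modeled.

```scala
object CheckpointPickSketch {
  // Hypothetical simplified checkpoint file descriptor: `numParts` is 1 for a
  // single-file checkpoint; `part` is the 1-based index of this file.
  final case class Part(version: Long, part: Int, numParts: Int)

  // Latest version whose checkpoint has every part present and which is not
  // later than `notLaterThan` (no upper bound when None).
  def latestComplete(files: Seq[Part], notLaterThan: Option[Long] = None): Option[Long] = {
    val limit = notLaterThan.getOrElse(Long.MaxValue)
    val completeVersions = files
      .filter(_.version <= limit)
      .groupBy(f => (f.version, f.numParts))
      .toSeq
      .collect {
        case ((version, numParts), parts)
            if parts.map(_.part).toSet == (1 to numParts).toSet => version
      }
    if (completeVersions.isEmpty) None else Some(completeVersions.max)
  }
}
```

    An incomplete multi-part checkpoint at a higher version is skipped in favour of an older complete one, which is why a reader never starts from a half-written checkpoint.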
  51. def getLogSegmentAfterCommit(tableCommitOwnerClientOpt: Option[TableCommitOwnerClient], oldCheckpointProvider: UninitializedCheckpointProvider): LogSegment
    Attributes
    protected[delta]
    Definition Classes
    SnapshotManagement
  52. def getLogSegmentAfterCommit(committedVersion: Long, newChecksumOpt: Option[VersionChecksum], preCommitLogSegment: LogSegment, commit: Commit, tableCommitOwnerClientOpt: Option[TableCommitOwnerClient], oldCheckpointProvider: CheckpointProvider): LogSegment

    Used to compute the LogSegment after a commit, by adding the delta file with the specified version to the preCommitLogSegment (which must match the immediately preceding version).

    Attributes
    protected[delta]
    Definition Classes
    SnapshotManagement
  53. def getLogSegmentForVersion(versionToLoad: Option[Long], files: Option[Array[FileStatus]], validateLogSegmentWithoutCompactedDeltas: Boolean, tableCommitOwnerClientOpt: Option[TableCommitOwnerClient], oldCheckpointProviderOpt: Option[UninitializedCheckpointProvider], lastCheckpointInfo: Option[LastCheckpointInfo]): Option[LogSegment]

    Helper function for the getLogSegmentForVersion above. Called with a provided files list, and will then try to construct a new LogSegment using that. *Note*: If table is a managed-commit table, the commit-owner MUST be passed to correctly list the commits.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  54. def getLogStoreConfValue(key: String, sparkConf: SparkConf): Option[String]

    We accept keys both with and without the spark. prefix to maintain compatibility across the Delta ecosystem

    key

    the spark-prefixed key to access

    Definition Classes
    LogStoreProvider
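    The prefix-tolerant lookup described for getLogStoreConfValue can be sketched with a plain Map standing in for SparkConf. `ConfLookupSketch` and `confValue` are hypothetical names, and the lookup order (prefixed key first) is an assumption of this sketch.

```scala
object ConfLookupSketch {
  // Try the spark.-prefixed key first, then the same key without the prefix.
  // `conf` is a plain Map standing in for SparkConf (assumption of this sketch).
  def confValue(sparkPrefixedKey: String, conf: Map[String, String]): Option[String] =
    conf.get(sparkPrefixedKey)
      .orElse(conf.get(sparkPrefixedKey.stripPrefix("spark.")))
}
```

    For example, looking up spark.delta.logStore.class succeeds whether the configuration was set as spark.delta.logStore.class or as delta.logStore.class.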
  55. def getSnapshotAt(version: Long, lastCheckpointHint: Option[CheckpointInstance] = None): Snapshot

    Get the snapshot at version.

    Definition Classes
    SnapshotManagement
  56. def getSnapshotAtInit: CapturedSnapshot

    Load the Snapshot for this Delta table at initialization. This method uses the lastCheckpoint file as a hint on where to start listing the transaction log directory. If the _delta_log directory doesn't exist, this method will return an InitialSnapshot.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  57. def getSnapshotForLogSegmentInternal(previousSnapshotOpt: Option[Snapshot], segmentOpt: Option[LogSegment], tableCommitOwnerClientOpt: Option[TableCommitOwnerClient], isAsync: Boolean): Snapshot

    Creates a Snapshot for the given segmentOpt

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  58. def getUpdatedLogSegment(oldLogSegment: LogSegment, tableCommitOwnerClientOpt: Option[TableCommitOwnerClient]): (LogSegment, Seq[FileStatus])

    Get the newest logSegment, using the previous logSegment as a hint. This is faster than doing a full update, but it won't work if the table's log directory was replaced.

    Definition Classes
    SnapshotManagement
  59. def getUpdatedSnapshot(oldSnapshotOpt: Option[Snapshot], initialSegmentForNewSnapshot: Option[LogSegment], initialTableCommitOwnerClient: Option[TableCommitOwnerClient], isAsync: Boolean): Snapshot

    Updates and installs a new snapshot in the currentSnapshot. This method takes care of recursively creating new snapshots if the commit-owner has changed.

    oldSnapshotOpt

    The previous snapshot, if any.

    initialSegmentForNewSnapshot

    the log segment constructed for the new snapshot

    initialTableCommitOwnerClient

    the commit-owner used for constructing the initialSegmentForNewSnapshot

    isAsync

    Whether the update is async.

    returns

    The new snapshot.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  60. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  61. lazy val history: DeltaHistoryManager

    Delta History Manager containing version and commit history.

  62. def hudiConverter: UniversalFormatConverter
    Definition Classes
    ProvidesUniFormConverters
  63. def icebergConverter: UniversalFormatConverter
    Definition Classes
    ProvidesUniFormConverters
  64. def identifyAndDeleteUnreferencedSidecarFiles(snapshotToCleanup: Snapshot, checkpointRetention: Long, metrics: SidecarDeletionMetrics): Unit

    Deletes any unreferenced files from the sidecar directory _delta_log/_sidecar

    Attributes
    protected
    Definition Classes
    MetadataCleanup
  65. def indexToRelation(index: DeltaLogFileIndex, schema: StructType = Action.logSchema): LogicalRelation

    Creates a LogicalRelation for a given DeltaLogFileIndex, with all necessary file source options taken from the Delta Log. All reads of Delta metadata files should use this method.

  66. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  67. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  68. def installSnapshot(newSnapshot: Snapshot, updateTimestamp: Long): Snapshot

    Installs the given newSnapshot as the currentSnapshot

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  69. def isCurrentlyStale: (Long) => Boolean

    Checks if the given timestamp is outside the current staleness window

    Attributes
    protected
    Definition Classes
    SnapshotManagement
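    Since isCurrentlyStale returns a function from a timestamp to a Boolean, a staleness window check can be sketched as a closure over a precomputed cutoff. This is a sketch under assumptions: the explicit `nowMillis` and `limitMillis` parameters are hypothetical (the real member takes none), and treating a non-positive limit as "always stale" is an assumption.

```scala
object StalenessSketch {
  // Builds a predicate that tells whether a snapshot's update timestamp falls
  // outside the staleness window ending at `nowMillis`.
  def isCurrentlyStale(nowMillis: Long, limitMillis: Long): Long => Boolean = {
    // Assumption: a non-positive limit means no staleness is tolerated,
    // so every timestamp counts as stale.
    val cutoffOpt: Option[Long] =
      if (limitMillis > 0) Some(math.max(0L, nowMillis - limitMillis)) else None
    timestamp => cutoffOpt.forall(timestamp < _)
  }
}
```

    Computing the cutoff once and returning a predicate lets a caller test several timestamps against the same window without re-reading the clock.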
  70. def isDeltaCommitOrCheckpointFile(path: Path): Boolean

    Returns true if the path is a Delta log file. A Delta log file is either a delta commit file (e.g., 000000000.json) or a checkpoint file (e.g., 000000001.checkpoint.00001.00003.parquet).

    path

    Path of a file

    returns

    Whether the file is a Delta log file.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
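    The file-name classification can be sketched with two regular expressions modeled on the examples in the description. The patterns are assumptions of this sketch (they do not cover V2 checkpoint naming, for instance), and the sketch takes a file name String rather than a Hadoop Path.

```scala
object LogFileNameSketch {
  // Patterns modeled on the examples in the description above; assumptions,
  // not the exact expressions used by Delta.
  private val DeltaFile = """\d+\.json""".r
  private val CheckpointFile = """\d+\.checkpoint(\.\d+\.\d+)?\.parquet""".r

  // True if the whole file name matches a commit file or a checkpoint file.
  def isDeltaCommitOrCheckpointFile(fileName: String): Boolean =
    DeltaFile.pattern.matcher(fileName).matches() ||
      CheckpointFile.pattern.matcher(fileName).matches()
}
```

    Under these patterns, names such as _last_checkpoint or a .crc checksum file are rejected, while both single-file and multi-part checkpoint names are accepted.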
  71. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  72. def isSameLogAs(otherLog: DeltaLog): Boolean
  73. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  74. val lastSeenChecksumFileStatusOpt: Option[FileStatus]

    Cached fileStatus for the latest CRC file seen in the deltaLog.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
    Annotations
    @volatile()
  75. final def listDeltaCompactedDeltaAndCheckpointFiles(startVersion: Long, tableCommitOwnerClientOpt: Option[TableCommitOwnerClient], versionToLoad: Option[Long], includeMinorCompactions: Boolean): Option[Array[FileStatus]]

    This method is designed to efficiently and reliably list delta, compacted delta, and checkpoint files associated with a Delta Lake table. It makes parallel calls to both the file-system and a commit-owner (if available), reconciles the results to account for asynchronous backfill operations, and ensures a comprehensive list of file statuses without missing any concurrently backfilled files. *Note*: If table is a managed-commit table, the commit-owner client MUST be passed to correctly list the commits.

    startVersion

    the version to start. Inclusive.

    tableCommitOwnerClientOpt

    the optional commit-owner client to use for fetching un-backfilled commits.

    versionToLoad

    the optional parameter to set the max version we should return. Inclusive.

    includeMinorCompactions

    Whether to include minor compaction files in the result

    returns

    Some array of files found (possibly empty, if no usable commit files are present), or None if the listing returned no files at all.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  76. def listFromOrNone(startVersion: Long): Option[Iterator[FileStatus]]

    Returns an iterator containing a list of files found from the provided path

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  77. def loadIndex(index: DeltaLogFileIndex, schema: StructType = Action.logSchema): DataFrame

    Load the data using the FileIndex. This allows us to skip many checks that add overhead, e.g. file existence checks, partitioning schema inference.

  78. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  79. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  80. def logDebug(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  81. def logDebug(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  82. def logError(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  83. def logError(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  84. def logInfo(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  85. def logInfo(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  86. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  87. val logPath: Path
    Definition Classes
    DeltaLog → ReadChecksum → Checkpoints
  88. val logStoreClassConfKey: String
    Definition Classes
    LogStoreProvider
  89. def logStoreSchemeConfKey(scheme: String): String
    Definition Classes
    LogStoreProvider
  90. def logTrace(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  91. def logTrace(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  92. def logWarning(msg: => String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  93. def logWarning(msg: => String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  94. def manuallyLoadCheckpoint(cv: CheckpointInstance): LastCheckpointInfo

    Loads the given checkpoint manually to come up with the LastCheckpointInfo

    Attributes
    protected
    Definition Classes
    Checkpoints
  95. def maxSnapshotLineageLength: Int

    The max lineage length of a Snapshot before Delta forces a Snapshot to be built from scratch. Delta will build a Snapshot on top of the previous one if it doesn't see a checkpoint. However, there is a race condition: when two writers are writing at the same time, a writer may fail to pick up checkpoints written by the other, so the lineage grows and eventually causes a StackOverflowError. Hence Delta forces a Snapshot to be built from scratch when the lineage length is too large.
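    The lineage cap can be illustrated with a toy snapshot chain. `LineageSketch`, `ToySnapshot`, and `next` are hypothetical names; real snapshots hold log segments and state rather than a bare version number.

```scala
object LineageSketch {
  // Toy snapshot that optionally builds on a previous snapshot.
  final case class ToySnapshot(version: Long, previous: Option[ToySnapshot]) {
    // Number of snapshots in the chain, counting this one.
    def lineageLength: Int = {
      var n = 1
      var cur = previous
      while (cur.isDefined) { n += 1; cur = cur.get.previous }
      n
    }
  }

  // Build the next snapshot on top of `prev` unless the chain is already at
  // `maxLineage`, in which case start a fresh chain ("from scratch").
  def next(prev: ToySnapshot, maxLineage: Int): ToySnapshot =
    if (prev.lineageLength >= maxLineage) ToySnapshot(prev.version + 1, None)
    else ToySnapshot(prev.version + 1, Some(prev))
}
```

    Capping the chain keeps any recursive traversal of the lineage at a bounded depth, which is the concern behind the StackOverflowError mentioned above.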

  96. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  97. final def newDeltaHadoopConf(): Configuration

    Returns the Hadoop Configuration object which can be used to access the file system. All Delta code should use this method to create the Hadoop Configuration object, so that the hadoop file system configurations specified in DataFrame options will come into effect.

  98. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  99. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  100. val options: Map[String, String]
  101. def protocolRead(protocol: Protocol): Unit

    Asserts that the client is up to date with the protocol and allowed to read the table that is using the given protocol.

  102. def protocolWrite(protocol: Protocol): Unit

    Asserts that the client is up to date with the protocol and allowed to write to the table that is using the given protocol.

  103. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit

    Used to record the occurrence of a single event or report detailed, operation specific statistics.

    path

    Used to log the path of the delta table when deltaLog is null.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  104. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A

    Used to report the duration as well as the success or failure of an operation on a deltaLog.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  105. def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A

    Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  106. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  107. def recordFrameProfile[T](group: String, name: String)(thunk: => T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  108. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: => S): S
    Definition Classes
    DatabricksLogging
  109. def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  110. def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  111. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  112. lazy val sidecarDirPath: Path

    Path to sidecar directory. This is intentionally kept lazy val as otherwise any other constructor codepaths in DeltaLog (e.g. SnapshotManagement etc) will see it as null as they are executed before this line is called.

  113. val snapshotLock: ReentrantLock

    Use ReentrantLock to allow us to call lockInterruptibly

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  114. def spark: SparkSession

    Return the current Spark session used.

    Attributes
    protected
    Definition Classes
    DeltaLog → DeltaFileFormat
  115. def startTransaction(catalogTableOpt: Option[CatalogTable], snapshotOpt: Option[Snapshot] = None): OptimisticTransaction

    Returns a new OptimisticTransaction that can be used to read the current state of the log and then commit updates. The reads and updates will be checked for logical conflicts with any concurrent writes to the log, and post-commit hooks can be used to notify the table's catalog of schema changes, etc.

    Note that all reads in a transaction must go through the returned transaction object, and not directly to the DeltaLog; otherwise they will not be checked for conflicts.

    catalogTableOpt

    The CatalogTable for the table this transaction updates. Passing None asserts this is a path-based table with no catalog entry.

    snapshotOpt

    The Snapshot this transaction should use, if not the latest.

  116. lazy val store: LogStore

    Used to read and write physical log files and checkpoints.

    Definition Classes
    DeltaLog → ReadChecksum → Checkpoints
  117. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  118. def tableExists: Boolean

    Whether a Delta table exists at this directory. Using the cached volatile snapshot is acceptable here: in the worst case, the table was created recently and the cached snapshot has not picked that up yet. In that case, any subsequent command that updates the table will see the right value.

  119. def tableId: String

    The unique identifier for this table.

  120. def throwNonExistentVersionError(versionToLoad: Long): Unit
    Definition Classes
    SnapshotManagement
  121. def toString(): String
    Definition Classes
    AnyRef → Any
  122. def unsafeVolatileSnapshot: Snapshot

    Returns the current snapshot. This does not automatically update().

    WARNING: This is not guaranteed to give you the latest snapshot of the log, nor to stay consistent across multiple accesses. If you need the latest snapshot, fetch it using deltaLog.update() and save the returned snapshot so that it does not unexpectedly change from under you. See how OptimisticTransaction and DeltaScan use the snapshot as examples for the write and read paths, respectively. This API should only be used in scenarios where any recent snapshot will suffice and an update is undesired, or by internal code that holds the DeltaLog lock to prevent races.

    Definition Classes
    SnapshotManagement
  123. def update(stalenessAcceptable: Boolean = false, checkIfUpdatedSinceTs: Option[Long] = None): Snapshot

    Update ActionLog by applying the new delta files if any.

    stalenessAcceptable

    Whether we can accept working with a stale version of the table. If the table has surpassed our staleness tolerance, we will update to the latest state of the table synchronously. If staleness is acceptable, and the table hasn't passed the staleness tolerance, we will kick off a job in the background to update the table state, and can return a stale snapshot in the meantime.

    checkIfUpdatedSinceTs

    Skip the update if we've already updated the snapshot since the specified timestamp.

    Definition Classes
    SnapshotManagement
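
    The stalenessAcceptable rule described above can be read as a small decision function. The following is a minimal sketch, not the SnapshotManagement implementation: StalenessPolicy, decide, snapshotAgeMs and toleranceMs are hypothetical names, the inclusive boundary is an assumption, and checkIfUpdatedSinceTs is not modelled.

```scala
// Hypothetical model of the staleness rule; none of these names exist in Delta.
object StalenessPolicy {
  sealed trait Decision
  // Block the caller until the latest table state has been loaded.
  case object UpdateSynchronously extends Decision
  // Hand back the cached snapshot and refresh it in a background job.
  case object ReturnStaleAndRefreshInBackground extends Decision

  def decide(
      stalenessAcceptable: Boolean,
      snapshotAgeMs: Long,
      toleranceMs: Long): Decision = {
    // Assumption: an age exactly equal to the tolerance still counts as acceptable.
    if (stalenessAcceptable && snapshotAgeMs <= toleranceMs) {
      ReturnStaleAndRefreshInBackground
    } else {
      UpdateSynchronously
    }
  }
}
```

    With the default stalenessAcceptable = false, this sketch always chooses the synchronous path, which matches the description of the parameter.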
  124. def updateAfterCommit(committedVersion: Long, commit: Commit, newChecksumOpt: Option[VersionChecksum], preCommitLogSegment: LogSegment): Snapshot

    Called after committing a transaction and updating the state of the table.

    committedVersion

    the version that was committed

    commit

    information about the commit file.

    newChecksumOpt

    the checksum for the new commit, if available. Usually None, since the commit would have just finished.

    preCommitLogSegment

    the log segment of the table prior to commit

    Definition Classes
    SnapshotManagement
  125. def updateInternal(isAsync: Boolean): Snapshot

    Queries the store for new delta files and applies them to the current state. Note: the caller should hold snapshotLock before calling this method.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  126. def upgradeProtocol(catalogTable: Option[CatalogTable], snapshot: Snapshot, newVersion: Protocol): Unit

    Upgrade the table's protocol version, by default to the maximum recognized reader and writer versions in this Delta release. This method only upgrades protocol version, and will fail if the new protocol version is not a superset of the original one used by the snapshot.

  127. def useCompactedDeltasForLogSegment(deltasAndCompactedDeltas: Seq[FileStatus], deltasAfterCheckpoint: Array[FileStatus], latestCommitVersion: Long, checkpointVersionToUse: Long): Array[FileStatus]

    deltasAndCompactedDeltas

    - all deltas or compacted deltas which could be used

    deltasAfterCheckpoint

    - deltas after the last checkpoint file

    latestCommitVersion

    - commit version for which we are trying to create a Snapshot

    checkpointVersionToUse

    - underlying checkpoint version to use in Snapshot, -1 if no checkpoint is used.

    returns

    Returns a list of deltas/compacted-deltas which can be used to construct the LogSegment instead of deltasAfterCheckpoint.

    Attributes
    protected
    Definition Classes
    SnapshotManagement
  128. def verifyLogStoreConfs(sparkConf: SparkConf): Unit

    Check for conflicting LogStore configs in the spark configuration.

    To maintain compatibility across the Delta ecosystem, we accept keys both with and without the "spark." prefix. This means for setting the class conf, we accept both "spark.delta.logStore.class" and "delta.logStore.class" and for scheme confs we accept both "spark.delta.logStore.${scheme}.impl" and "delta.logStore.${scheme}.impl"

    If a conf is set both with and without the spark prefix, it must be set to the same value, otherwise we throw an error.

    Definition Classes
    LogStoreProvider
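
    The prefix rule above can be illustrated with a plain Map in place of a SparkConf. This is a minimal sketch under stated assumptions, not the LogStoreProvider code: LogStoreConfCheck and resolve are hypothetical names, and the exception type is chosen only for illustration.

```scala
// Hypothetical helper: models only the rule that a key set both with and
// without the "spark." prefix must carry the same value.
object LogStoreConfCheck {
  /** Returns the effective value for `key` (given without the "spark." prefix), if set. */
  def resolve(conf: Map[String, String], key: String): Option[String] = {
    val plain = conf.get(key)
    val prefixed = conf.get(s"spark.$key")
    (plain, prefixed) match {
      case (Some(a), Some(b)) if a != b =>
        // Assumption: the real method throws a Delta-specific error instead.
        throw new IllegalArgumentException(
          s"Conflicting values for '$key' and 'spark.$key': '$a' vs '$b'")
      case _ =>
        // At most one distinct value is present, so either side can be returned.
        prefixed.orElse(plain)
    }
  }
}
```

    The same check applies to both the class conf ("delta.logStore.class") and the per-scheme confs ("delta.logStore.${scheme}.impl"), since both follow the same prefixed and unprefixed naming.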
  129. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  130. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  131. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  132. def withCheckpointExceptionHandling(deltaLog: DeltaLog, opType: String)(thunk: => Unit): Unit

    Catch non-fatal exceptions related to checkpointing, since the checkpoint is written after the commit has completed. From the perspective of the user, the commit has completed successfully. However, throw if this is in a testing environment - that way any breaking changes can be caught in unit tests.

    Attributes
    protected
    Definition Classes
    Checkpoints
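
    The behaviour described for withCheckpointExceptionHandling (swallow non-fatal errors in production, rethrow under test) can be sketched as follows. This is an illustration only: CheckpointErrorHandling, isTesting and onError are hypothetical, and the real method takes a DeltaLog and an opType instead.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for the pattern; not the Checkpoints implementation.
object CheckpointErrorHandling {
  def withExceptionHandling(isTesting: Boolean)(onError: Throwable => Unit)(thunk: => Unit): Unit = {
    try {
      thunk
    } catch {
      // NonFatal does not match fatal errors such as OutOfMemoryError,
      // so those still propagate to the caller.
      case NonFatal(e) =>
        onError(e) // e.g. log the failure; the commit itself already succeeded
        if (isTesting) throw e // surface breaking changes in unit tests
    }
  }
}
```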
  133. def withNewTransaction[T](catalogTableOpt: Option[CatalogTable], snapshotOpt: Option[Snapshot] = None)(thunk: (OptimisticTransaction) => T): T

    Execute a piece of code within a new OptimisticTransaction. Read/write sets will be recorded for this table, and all other tables will be read at a snapshot that is pinned on the first access.

    catalogTableOpt

    The CatalogTable for the table this transaction updates. Passing None asserts this is a path-based table with no catalog entry.

    snapshotOpt

    The Snapshot this transaction should use, if not the latest.

    Note

    This uses a thread-local variable to make the active transaction visible, so do not use multi-threaded code in the provided thunk.
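
    To see why multi-threaded code in the thunk is a problem, consider a minimal sketch of the thread-local pattern. ActiveTxnSketch and its members are hypothetical, and a String stands in for the OptimisticTransaction; the real bookkeeping in Delta may differ.

```scala
// Hypothetical model: the "active transaction" is stored per thread.
object ActiveTxnSketch {
  private val active = new ThreadLocal[Option[String]] {
    override def initialValue(): Option[String] = None
  }

  /** The transaction visible to the calling thread, if any. */
  def getActive: Option[String] = active.get()

  def withNewTransaction[T](name: String)(thunk: String => T): T = {
    active.set(Some(name))
    try thunk(name)
    finally active.remove() // back to "no active transaction" for this thread
  }
}
```

    In this sketch, a thread started inside the thunk calls getActive and gets None, because the value was set only for the thread that entered withNewTransaction. Work done on such a thread would therefore not be associated with the transaction.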

  134. def withSnapshotLockInterruptibly[T](body: => T): T

    Run body while holding snapshotLock, acquired with lockInterruptibly so that the thread can be interrupted while waiting for the lock.

    Definition Classes
    SnapshotManagement
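
    The locking pattern described here is the standard ReentrantLock idiom. A minimal sketch, assuming a private lock as a stand-in for snapshotLock (InterruptibleLocking and its members are hypothetical names):

```scala
import java.util.concurrent.locks.ReentrantLock

object InterruptibleLocking {
  // Stand-in for snapshotLock; the real field lives in SnapshotManagement.
  private val lock = new ReentrantLock()

  def withLockInterruptibly[T](body: => T): T = {
    // Unlike lock(), this throws InterruptedException if the thread is
    // interrupted while it is waiting to acquire the lock.
    lock.lockInterruptibly()
    try body
    finally lock.unlock() // always release, even if body throws
  }

  def isHeldByCurrentThread: Boolean = lock.isHeldByCurrentThread
}
```

    Because the lock is reentrant, a body that calls withLockInterruptibly again on the same thread does not deadlock.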
  135. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: => T): T

    Report a log to indicate some command is running.

    Definition Classes
    DeltaProgressReporter
  136. def writeCheckpointFiles(snapshotToCheckpoint: Snapshot): LastCheckpointInfo
    Attributes
    protected
    Definition Classes
    Checkpoints
  137. def writeLastCheckpointFile(deltaLog: DeltaLog, lastCheckpointInfo: LastCheckpointInfo, addChecksum: Boolean): Unit
    Attributes
    protected[delta]
    Definition Classes
    Checkpoints

Deprecated Value Members

  1. def checkpoint(): Unit

    Creates a checkpoint using the default snapshot.

    WARNING: This API is being deprecated, and will be removed in future versions. Please use the checkpoint(Snapshot) function below to write checkpoints to the delta log.

    Definition Classes
    Checkpoints
    Annotations
    @deprecated
    Deprecated

    (Since version 12.0) This method is deprecated and will be removed in future versions.

  2. def snapshot: Snapshot

    WARNING: This API is unsafe and deprecated. It will be removed in future versions. Use the above unsafeVolatileSnapshot to get the most recently cached snapshot on the cluster.

    Definition Classes
    SnapshotManagement
    Annotations
    @deprecated
    Deprecated

    (Since version 12.0) This method is deprecated and will be removed in future versions. Use unsafeVolatileSnapshot instead

  3. def startTransaction(): OptimisticTransaction

    Legacy/compatibility overload that does not require catalog table information. Avoid using it in production code.

    Annotations
    @deprecated
    Deprecated

    (Since version 3.0) Please use the CatalogTable overload instead

  4. def withNewTransaction[T](thunk: (OptimisticTransaction) => T): T

    Legacy/compatibility overload that does not require catalog table information. Avoid using it in production code.

    Annotations
    @deprecated
    Deprecated

    (Since version 3.0) Please use the CatalogTable overload instead
