object DeltaHistoryManager extends DeltaLogging
Contains many utility methods that can also be executed on Spark executors.
- Alphabetic
- By Inheritance
- DeltaHistoryManager
- DeltaLogging
- DatabricksLogging
- DeltaProgressReporter
- LoggingShims
- Logging
- AnyRef
- Any
- Hide All
- Show All
- Public
- All
Type Members
-
implicit
class
LogStringContext extends AnyRef
- Definition Classes
- LoggingShims
-
class
BufferingLogDeletionIterator extends Iterator[FileStatus]
An iterator that helps select old log files for deletion.
An iterator that helps select old log files for deletion. It takes the input iterator of log files from the earliest file, and returns should-be-deleted files until the given maxTimestamp or maxVersion to delete is reached. Note that this iterator may stop deleting files earlier than maxTimestamp or maxVersion if it finds that files that need to be preserved for adjusting the timestamps of subsequent files. Let's go through an example. Assume the following commit history:
+---------+-----------+--------------------+
Version
Timestamp
Adjusted Timestamp
+---------+-----------+--------------------+
0
0
0
2
10
10
3
7
11
4
8
12
5
14
14
+---------+-----------+--------------------+
As you can see from the example, we require timestamps to be monotonically increasing with respect to the version of the commit, and each commit to have a unique timestamp. If we have a commit which doesn't obey one of these two requirements, we adjust the timestamp of that commit to be one millisecond greater than the previous commit.
Given the above commit history, the behavior of this iterator will be as follows:
- For maxVersion = 1 and maxTimestamp = 9, we can delete versions 0 and 1
- Until we receive maxVersion >= 4 and maxTimestamp >= 12, we can't delete versions 2 and 3. This is because version 2 is used to adjust the timestamps of commits up to version 4.
- For maxVersion >= 5 and maxTimestamp >= 14 we can delete everything The semantics of time travel guarantee that for a given timestamp, the user will ALWAYS get the same version. Consider a user asks to get the version at timestamp 11. If all files are there, we would return version 3 (timestamp 11) for this query. If we delete versions 0-2, the original timestamp of version 3 (7) will not have an anchor to adjust on, and if the time travel query is re-executed we would return version 4. This is the motivation behind this iterator implementation.
The implementation maintains an internal "maybeDelete" buffer of files that we are unsure of deleting because they may be necessary to adjust time of future files. For each file we get from the underlying iterator, we check whether it needs time adjustment or not. If it does need time adjustment, then we cannot immediately decide whether it is safe to delete that file or not and therefore we put it in each the buffer. Then we iteratively peek ahead at the future files and accordingly decide whether to delete all the buffered files or retain them.
-
case class
Commit(version: Long, timestamp: Long) extends CommitMarker with Product with Serializable
A helper class to represent the timestamp and version of a commit.
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
deltaAssert(check: ⇒ Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit
Helper method to check invariants in Delta code.
Helper method to check invariants in Delta code. Fails when running in tests, records a delta assertion event and logs a warning otherwise.
- Attributes
- protected
- Definition Classes
- DeltaLogging
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getCommitInfoOpt(logStore: LogStore, deltaFile: Path, hadoopConf: Configuration): Option[CommitInfo]
Get the persisted commit info (if available) for the given delta file.
-
def
getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
- Definition Classes
- DeltaLogging
-
def
getEarliestDeltaFile(deltaLog: DeltaLog): Long
Get the earliest commit available for this table.
Get the earliest commit available for this table. Note that this version isn't guaranteed to exist when performing an action as a concurrent operation can delete the file during cleanup. This value must be used as a lower bound.
-
def
getErrorData(e: Throwable): Map[String, Any]
- Definition Classes
- DeltaLogging
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logConsole(line: String): Unit
- Definition Classes
- DatabricksLogging
-
def
logDebug(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logDebug(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logError(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logInfo(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logTrace(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logWarning(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit
Used to record the occurrence of a single event or report detailed, operation specific statistics.
Used to record the occurrence of a single event or report detailed, operation specific statistics.
- path
Used to log the path of the delta table when
deltaLogis null.
- Attributes
- protected
- Definition Classes
- DeltaLogging
-
def
recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A
Used to report the duration as well as the success or failure of an operation on a
deltaLog.Used to report the duration as well as the success or failure of an operation on a
deltaLog.- Attributes
- protected
- Definition Classes
- DeltaLogging
-
def
recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A
Used to report the duration as well as the success or failure of an operation on a
tahoePath.Used to report the duration as well as the success or failure of an operation on a
tahoePath.- Attributes
- protected
- Definition Classes
- DeltaLogging
-
def
recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
- Definition Classes
- DatabricksLogging
-
def
recordFrameProfile[T](group: String, name: String)(thunk: ⇒ T): T
- Attributes
- protected
- Definition Classes
- DeltaLogging
-
def
recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: ⇒ S): S
- Definition Classes
- DatabricksLogging
-
def
recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
- Definition Classes
- DatabricksLogging
-
def
recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
- Definition Classes
- DatabricksLogging
-
def
recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
- Definition Classes
- DatabricksLogging
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T
Report a log to indicate some command is running.
Report a log to indicate some command is running.
- Definition Classes
- DeltaProgressReporter