case class DeltaSource(spark: SparkSession, deltaLog: DeltaLog, options: DeltaOptions, snapshotAtSourceInit: SnapshotDescriptor, metadataPath: String, metadataTrackingLog: Option[DeltaSourceMetadataTrackingLog] = None, filters: Seq[Expression] = Nil) extends DeltaSourceBase with DeltaSourceCDCSupport with DeltaSourceMetadataEvolutionSupport with Product with Serializable

A streaming source for a Delta table.

When a new stream is started, Delta starts by constructing an org.apache.spark.sql.delta.Snapshot at the current version of the table. This snapshot is broken up into batches until all existing data has been processed. Subsequent processing is done by tailing the change log looking for new data. As a result, at any given point the streaming query returns the same answer as a batch query that had processed the entire dataset.
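As a sketch of how this source is typically driven, the DataFrame API below ends up constructing a DeltaSource under the hood. The table path and checkpoint directory are illustrative assumptions; the Spark calls are shown in comments because they need a live SparkSession with delta-spark on the classpath:

```scala
// Sketch, assuming a hypothetical table path and checkpoint directory.
// "delta" is the public format name that resolves to this source.
val sourceFormat = "delta"
val tablePath = "/tmp/delta/events"          // hypothetical table path
val checkpoint = "/tmp/checkpoints/events"   // hypothetical checkpoint dir

// Usage (requires a SparkSession with delta-spark on the classpath):
//   val stream = spark.readStream
//     .format(sourceFormat)
//     .load(tablePath)   // initial snapshot first, then the change log is tailed
//   stream.writeStream
//     .format("console")
//     .option("checkpointLocation", checkpoint)
//     .start()
```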

Linear Supertypes
Serializable, Serializable, Product, Equals, DeltaSourceMetadataEvolutionSupport, DeltaSourceCDCSupport, DeltaSourceBase, DeltaLogging, DatabricksLogging, DeltaProgressReporter, LoggingShims, Logging, SupportsTriggerAvailableNow, SupportsAdmissionControl, Source, SparkDataStream, AnyRef, Any

Instance Constructors

  1. new DeltaSource(spark: SparkSession, deltaLog: DeltaLog, options: DeltaOptions, snapshotAtSourceInit: SnapshotDescriptor, metadataPath: String, metadataTrackingLog: Option[DeltaSourceMetadataTrackingLog] = None, filters: Seq[Expression] = Nil)

Type Members

  1. implicit class LogStringContext extends AnyRef
    Definition Classes
    LoggingShims
  2. case class AdmissionLimits(maxFiles: Option[Int] = options.maxFilesPerTrigger, bytesToTake: Long = ...) extends DeltaSourceAdmissionBase with Product with Serializable

    Class that helps control how much data should be processed by a single micro-batch.

  3. trait DeltaSourceAdmissionBase extends AnyRef
  4. class IndexedChangeFileSeq extends AnyRef

    This class represents an iterator of change metadata (AddFile, RemoveFile, AddCDCFile) for a particular version.

    Definition Classes
    DeltaSourceCDCSupport
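
AdmissionLimits above is populated from the reader options. A hedged sketch of the admission-control knobs follows; the option keys are the standard DeltaOptions names, while the values and paths are illustrative:

```scala
// Rate-limiting options that feed DeltaSource.AdmissionLimits: each
// micro-batch admits files until either cap is reached.
val admissionOptions = Map(
  "maxFilesPerTrigger" -> "1000", // cap on number of files per micro-batch
  "maxBytesPerTrigger" -> "1g"    // soft cap on bytes per micro-batch
)

// Usage (requires a SparkSession with delta-spark on the classpath):
//   spark.readStream
//     .format("delta")
//     .options(admissionOptions)
//     .load("/tmp/delta/events")  // hypothetical table path
```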

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def addBeginAndEndIndexOffsetsForVersion(version: Long, iterator: Iterator[IndexedFile]): Iterator[IndexedFile]

    Adds dummy BEGIN_INDEX and END_INDEX IndexedFiles for @version before and after the contents of the iterator. The contents of the iterator must be the IndexedFiles that correspond to this version.

    Attributes
    protected
  5. lazy val allowUnsafeStreamingReadOnColumnMappingSchemaChanges: Boolean

    Flag that allows the user to force enable unsafe streaming reads on a Delta table with column mapping enabled AND drop/rename actions.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  6. lazy val allowUnsafeStreamingReadOnPartitionColumnChanges: Boolean
    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  7. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  8. def checkReadIncompatibleSchemaChangeOnStreamStartOnce(batchStartVersion: Long, batchEndVersionOpt: Option[Long] = None): Unit

    Check read-incompatible schema changes during stream (re)start so we can fail fast.

    This only needs to be called ONCE in the life cycle of a stream, either at the very first latestOffset or the very first getBatch, to make sure we have detected an incompatible schema change. Typically the verifyStreamHygiene that was called may be good enough to detect these schema changes, but there are cases it wouldn't cover; consider this sequence: 1. A user starts a new stream @ startingVersion 1. 2. latestOffset is called before getBatch() because there were no previous commits, so getBatch() won't be called as a recovery mechanism. Suppose there is a single rename/drop/nullability change S while computing the next offset; S would look exactly the same as the latest schema, so verifyStreamHygiene would not work. 3. latestOffset would return this new offset, crossing the schema boundary.

    If a schema log is already initialized, we don't have to run the initialization nor schema checks any more.

    batchStartVersion

    Start version we want to verify read compatibility against

    batchEndVersionOpt

    Optionally, if we are checking against an existing constructed batch during streaming initialization, we would also like to verify all schema changes in between as well before we can lazily initialize the schema log if needed.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  9. def checkReadIncompatibleSchemaChanges(metadata: Metadata, version: Long, batchStartVersion: Long, batchEndVersionOpt: Option[Long] = None, validatedDuringStreamStart: Boolean = false): Unit

    Narrow waist to verify a metadata action for read-incompatible schema changes, specifically: 1. any column mapping related schema changes (rename / drop columns); 2. standard read-compatibility changes, including: a) no missing columns, b) no data type changes, c) no read-incompatible nullability changes. If the check fails, we throw an exception to exit the stream. If lazy log initialization is required, we also run a one-time scan to safely initialize the metadata tracking log upon any non-additive schema change failures.

    metadata

    Metadata that contains a potential schema change

    version

    Version for the metadata action

    validatedDuringStreamStart

    Whether this check is being done during stream start.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  10. def cleanUpSnapshotResources(): Unit
    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  11. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  12. def collectMetadataActions(startVersion: Long, endVersion: Long): Seq[(Long, Metadata)]
    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  13. def collectProtocolActions(startVersion: Long, endVersion: Long): Seq[(Long, Protocol)]
    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  14. def commit(end: Offset): Unit
    Definition Classes
    DeltaSource → Source
  15. def commit(end: Offset): Unit
    Definition Classes
    Source → SparkDataStream
  16. def createDataFrame(indexedFiles: Iterator[IndexedFile]): DataFrame

    Given an iterator of file actions, create a DataFrame representing the files added to a table. Only AddFile actions will be used to create the DataFrame.

    indexedFiles

    actions iterator from which to generate the DataFrame.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  17. def createDataFrameBetweenOffsets(startVersion: Long, startIndex: Long, isInitialSnapshot: Boolean, startOffsetOption: Option[DeltaSourceOffset], endOffset: DeltaSourceOffset): DataFrame

    Return the DataFrame between start and end offset.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  18. def deltaAssert(check: ⇒ Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit

    Helper method to check invariants in Delta code. Fails when running in tests, records a delta assertion event and logs a warning otherwise.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  19. val deltaLog: DeltaLog
  20. def deserializeOffset(json: String): Offset
    Definition Classes
    Source → SparkDataStream
  21. val emptyDataFrame: DataFrame
    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  22. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  23. val excludeRegex: Option[Regex]
    Attributes
    protected
  24. val filters: Seq[Expression]
  25. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  26. lazy val forceEnableStreamingReadOnReadIncompatibleSchemaChangesDuringStreamStart: Boolean

    Flag that allows the user to disable the read-compatibility check during stream start, which protects against a corner case that verifyStreamHygiene could not detect. This is a bug fix but also a potential behavior change, so we add a flag to fall back.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  27. lazy val forceEnableUnsafeReadOnNullabilityChange: Boolean

    Flag that allows the user to fall back to the legacy behavior, in which a nullable=false schema can read nullable=true data. This is incorrect, but changing it is a behavior change regardless.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  28. def getBatch(startOffsetOption: Option[Offset], end: Offset): DataFrame
    Definition Classes
    DeltaSource → Source
  29. def getCDCFileChangesAndCreateDataFrame(startVersion: Long, startIndex: Long, isInitialSnapshot: Boolean, endOffset: DeltaSourceOffset): DataFrame

    Get the changes from startVersion, startIndex to the end for the CDC case. We need to call CDCReader to get the CDC DataFrame.

    startVersion

    - calculated starting version

    startIndex

    - calculated starting index

    isInitialSnapshot

    - whether the stream has to return the initial snapshot or not

    endOffset

    - Offset that signifies the end of the stream.

    returns

    the DataFrame containing the file changes (AddFile, RemoveFile, AddCDCFile)

    Attributes
    protected
    Definition Classes
    DeltaSourceCDCSupport
  30. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  31. def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
    Definition Classes
    DeltaLogging
  32. def getDefaultReadLimit(): ReadLimit
    Definition Classes
    DeltaSource → SupportsAdmissionControl
  33. def getErrorData(e: Throwable): Map[String, Any]
    Definition Classes
    DeltaLogging
  34. def getFileChanges(fromVersion: Long, fromIndex: Long, isInitialSnapshot: Boolean, endOffset: Option[DeltaSourceOffset] = None, verifyMetadataAction: Boolean = true): ClosableIterator[IndexedFile]

    Get the changes starting from (fromVersion, fromIndex). The start point should not be included in the result.

    endOffset

    If defined, do not return changes beyond this offset. If not defined, we must be scanning the log to find the next offset.

    verifyMetadataAction

    If true, we will break the stream when we detect any read-incompatible metadata changes.

    Attributes
    protected
  35. def getFileChangesAndCreateDataFrame(startVersion: Long, startIndex: Long, isInitialSnapshot: Boolean, endOffset: DeltaSourceOffset): DataFrame

    Get the changes from startVersion, startIndex to the end.

    startVersion

    - calculated starting version

    startIndex

    - calculated starting index

    isInitialSnapshot

    - whether the stream has to return the initial snapshot or not

    endOffset

    - Offset that signifies the end of the stream.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  36. def getFileChangesForCDC(fromVersion: Long, fromIndex: Long, isInitialSnapshot: Boolean, limits: Option[AdmissionLimits], endOffset: Option[DeltaSourceOffset], verifyMetadataAction: Boolean = true): Iterator[(Long, Iterator[IndexedFile], Option[CommitInfo])]

    Get the changes starting from (fromVersion, fromIndex). fromVersion is included. It returns an iterator of (log_version, fileActions, Optional[CommitInfo]). The commit info is needed later on so that the InCommitTimestamp of the log files can be determined.

    If verifyMetadataAction = true, we will break the stream when we detect any read-incompatible metadata changes.

    Attributes
    protected
    Definition Classes
    DeltaSourceCDCSupport
  37. def getFileChangesWithRateLimit(fromVersion: Long, fromIndex: Long, isInitialSnapshot: Boolean, limits: Option[AdmissionLimits] = Some(AdmissionLimits())): ClosableIterator[IndexedFile]
    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  38. def getMetadataOrProtocolChangeIndexedFileIterator(metadataChangeOpt: Option[Metadata], protocolChangeOpt: Option[Protocol], version: Long): ClosableIterator[IndexedFile]

    If the current stream metadata is not equal to the metadata change in metadataChangeOpt, return a metadata change barrier IndexedFile. Only returns something if trackingMetadataChange is true.

    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  39. def getNextOffsetFromPreviousOffset(previousOffset: DeltaSourceOffset, limits: Option[AdmissionLimits]): Option[DeltaSourceOffset]

    Return the next offset when a previous offset exists.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  40. def getNextOffsetFromPreviousOffsetIfPendingSchemaChange(previousOffset: DeltaSourceOffset): Option[DeltaSourceOffset]

    If the given previous Delta source offset is a schema change offset, returns the appropriate next offset. This should be called before trying any other means of determining the next offset. If this returns None, then there is no schema change, and the caller should determine the next offset in the normal way.

    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  41. def getOffset: Option[Offset]
    Definition Classes
    DeltaSource → Source
  42. def getSnapshotAt(version: Long): (Iterator[IndexedFile], Option[Long])

    This method computes the initial snapshot to read when Delta Source was initialized on a fresh stream.

    returns

    A tuple where the first element is an iterator of IndexedFiles and the second element is the in-commit timestamp of the initial snapshot if available.

    Attributes
    protected
  43. def getSnapshotFromDeltaLog(version: Long): Snapshot

    Narrow waist for generating a Snapshot from the DeltaLog within the Delta source.

    Attributes
    protected
  44. def getStartingOffsetFromSpecificDeltaVersion(fromVersion: Long, isInitialSnapshot: Boolean, limits: Option[AdmissionLimits]): Option[DeltaSourceOffset]

    Returns the offset that starts from a specific delta table version. This function is called when starting a new stream query.

    fromVersion

    The version of the delta table to calculate the offset from.

    isInitialSnapshot

    Whether the delta version is for the initial snapshot or not.

    limits

    Indicates how much data can be processed by a micro batch.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  45. lazy val getStartingVersion: Option[Long]

    Extracts whether users provided the option to time travel a relation. If a query restarts from a checkpoint and the checkpoint has recorded the offset, this method should never be called.

    Attributes
    protected
  46. val hasCheckedReadIncompatibleSchemaChangesOnStreamStart: Boolean

    A global flag to mark whether we have done a per-stream start check for column mapping schema changes (rename / drop).

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
    Annotations
    @volatile()
  47. def initForTriggerAvailableNowIfNeeded(startOffsetOpt: Option[DeltaSourceOffset]): Unit

    Initialize the internal states for AvailableNow if this method is called for the first time after prepareForTriggerAvailableNow.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  48. def initLastOffsetForTriggerAvailableNow(startOffsetOpt: Option[DeltaSourceOffset]): Unit
    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  49. def initialOffset(): Offset
    Definition Classes
    Source → SparkDataStream
  50. var initialState: DeltaSourceSnapshot
    Attributes
    protected
  51. var initialStateVersion: Long
    Attributes
    protected
  52. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  53. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  54. def initializeMetadataTrackingAndExitStream(batchStartVersion: Long, batchEndVersionOpt: Option[Long] = None, alwaysFailUponLogInitialized: Boolean = false): Unit

    Initialize the schema tracking log if an empty schema tracking log is provided. This method also checks the range between batchStartVersion and batchEndVersion to ensure a safe schema is initialized in the log.

    batchStartVersion

    Start version of the batch of data to be processed; it should typically correspond to the schema that is safe for processing the incoming data.

    batchEndVersionOpt

    Optionally, if we are looking at a constructed batch with an existing end offset, we need to double-check that there are no read-incompatible schema changes within the batch range.

    alwaysFailUponLogInitialized

    Whether we should always fail with the schema evolution exception.

    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  55. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  56. val isStreamingFromColumnMappingTable: Boolean

    Whether we are streaming from a table with column mapping enabled.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  57. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  58. val lastOffsetForTriggerAvailableNow: Option[DeltaSourceOffset]

    When AvailableNow is used, this offset will be the upper bound that this run of the query will process up to. We may run multiple micro-batches, but the query will stop itself when it reaches this offset.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  59. def latestOffset(startOffset: Offset, limit: ReadLimit): Offset

    This should only be called by the engine. Call latestOffsetInternal instead if you need to get the latest offset.

    Definition Classes
    DeltaSource → SupportsAdmissionControl
  60. def latestOffsetInternal(startOffset: Option[DeltaSourceOffset], limit: ReadLimit): Option[DeltaSourceOffset]

    An internal variant of latestOffset used to get the latest offset.

    Attributes
    protected
    Definition Classes
    DeltaSource → DeltaSourceBase
  61. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  62. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  63. def logDebug(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  64. def logDebug(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  65. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  66. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  67. def logError(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  68. def logError(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  69. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  70. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  71. def logInfo(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  72. def logInfo(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  73. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  74. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  75. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  76. def logTrace(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  77. def logTrace(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  78. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  79. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  80. def logWarning(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  81. def logWarning(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  82. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  83. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  84. val metadataPath: String
  85. val metadataTrackingLog: Option[DeltaSourceMetadataTrackingLog]
  86. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  87. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  88. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  89. val options: DeltaOptions
  90. val persistedMetadataAtSourceInit: Option[PersistedMetadata]

    The persisted schema from the schema log that must be used to read data files in this Delta streaming source.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  91. def prepareForTriggerAvailableNow(): Unit
    Definition Classes
    DeltaSourceBase → SupportsTriggerAvailableNow
  92. val readConfigurationsAtSourceInit: Map[String, String]
    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  93. val readPartitionSchemaAtSourceInit: StructType
    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  94. val readProtocolAtSourceInit: Protocol
    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  95. val readSchemaAtSourceInit: StructType

    The read schema for this source during initialization, taking the schema log into account.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  96. lazy val readSnapshotDescriptor: SnapshotDescriptor

    Create a snapshot descriptor, customizing its metadata using metadata tracking if necessary.

    Attributes
    protected
    Definition Classes
    DeltaSourceBase
  97. def readyToInitializeMetadataTrackingEagerly: Boolean

    Whether a schema tracking log is provided (and is empty), so we can initialize eagerly. This should only be used for the first write to the schema log; after that, schema tracking should not rely on this state any more.

    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  98. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit

    Used to record the occurrence of a single event or report detailed, operation-specific statistics.

    path

    Used to log the path of the delta table when deltaLog is null.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  99. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a deltaLog.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  100. def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  101. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  102. def recordFrameProfile[T](group: String, name: String)(thunk: ⇒ T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  103. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: ⇒ S): S
    Definition Classes
    DatabricksLogging
  104. def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  105. def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  106. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  107. def reportLatestOffset(): Offset
    Definition Classes
    SupportsAdmissionControl
  108. val schema: StructType
    Definition Classes
    DeltaSourceBase → Source
  109. val snapshotAtSourceInit: SnapshotDescriptor
  110. val spark: SparkSession
  111. def stop(): Unit
    Definition Classes
    DeltaSource → SparkDataStream
  112. def stopIndexedFileIteratorAtSchemaChangeBarrier(fileActionScanIter: ClosableIterator[IndexedFile]): ClosableIterator[IndexedFile]

    This is called from getFileChangesWithRateLimit() during latestOffset().

    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  113. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  114. val tableId: String
    Attributes
    protected
  115. def toDeltaSourceOffset(offset: Offset): DeltaSourceOffset
  116. def toString(): String
    Definition Classes
    DeltaSource → AnyRef → Any
  117. def trackingMetadataChange: Boolean

    Whether this DeltaSource is utilizing a schema log entry as its read schema.

    If the user explicitly turns on the flag to fall back to reading with the latest schema (i.e. the legacy mode), we will ignore the schema log.

    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  118. def updateMetadataTrackingLogAndFailTheStreamIfNeeded(changedMetadataOpt: Option[Metadata], changedProtocolOpt: Option[Protocol], version: Long, replace: Boolean = false): Unit

    Write a new potentially changed metadata into the metadata tracking log. Then fail the stream to allow reanalysis if there are changes.

    changedMetadataOpt

    Potentially changed metadata action

    changedProtocolOpt

    Potentially changed protocol action

    version

    The version of change

    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  119. def updateMetadataTrackingLogAndFailTheStreamIfNeeded(end: Offset): Unit

    Update the current stream schema in the schema tracking log and fail the stream. This is called during commit(). It is OK to fail during commit() because, under streaming's semantics, the batch with the offset ending at end should already have been processed completely.

    Attributes
    protected
    Definition Classes
    DeltaSourceMetadataEvolutionSupport
  120. def validateCommitAndDecideSkipping(actions: Iterator[Action], version: Long, batchStartVersion: Long, batchEndOffsetOpt: Option[DeltaSourceOffset] = None, verifyMetadataAction: Boolean = true): (Boolean, Option[Metadata], Option[Protocol])

    Check stream for violating any constraints.

    If verifyMetadataAction = true, we will break the stream when we detect any read-incompatible metadata changes.

    returns

    (true if the commit should be skipped, a metadata action if found, a protocol action if found)

    Attributes
    protected
  121. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  122. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  123. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  124. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T

    Report a log to indicate some command is running.

    Definition Classes
    DeltaProgressReporter
  125. object AdmissionLimits extends Serializable
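
Several members documented above map directly to reader options: getStartingVersion is fed by startingVersion, the DeltaSourceCDCSupport path is enabled by readChangeFeed, and metadataTrackingLog is backed by schemaTrackingLocation. A hedged sketch with illustrative values and paths:

```scala
// Options corresponding to members documented above (values are illustrative):
//  - startingVersion feeds getStartingVersion
//  - readChangeFeed switches the source into the CDC path
//  - schemaTrackingLocation backs the metadata tracking log for schema evolution
val sourceOptions = Map(
  "startingVersion" -> "5",
  "readChangeFeed" -> "true",
  "schemaTrackingLocation" -> "/tmp/checkpoints/events/_schema_log" // hypothetical
)

// Usage (requires a SparkSession with delta-spark on the classpath):
//   spark.readStream
//     .format("delta")
//     .options(sourceOptions)
//     .load("/tmp/delta/events")  // hypothetical table path
```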
