package files

Type Members

  1. class CdcAddFileIndex extends TahoeBatchFileIndex

    A TahoeFileIndex for scanning a sequence of added files as CDC. Similar to TahoeBatchFileIndex, with a bit of special handling to attach the log version and CDC type on a per-file basis.

  2. class DelayedCommitProtocol extends FileCommitProtocol with Serializable with Logging

    Writes out the files to path and returns a list of them in addedStatuses. Includes special handling for partitioning on CDC_PARTITION_COL for compatibility between enabled and disabled CDC; partitions with a value of false in this column produce no corresponding partitioning directory.
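    The directory rule described above can be illustrated with a minimal, self-contained sketch. It is not the DelayedCommitProtocol implementation: the object name, the `partitionDir` helper, and the literal column name `__is_cdc` are assumptions made for illustration. Only the case stated above is modelled (a value of false yields no directory); a value of true is left as an ordinary `column=value` entry, which is also an assumption.

```scala
object CdcPartitionDirSketch {
  // Assumed stand-in for CDC_PARTITION_COL; the real constant is defined in Delta.
  val CdcPartitionCol: String = "__is_cdc"

  /** Builds a relative partition directory from ordered (column, value) pairs.
    * The CDC column is skipped when its value is "false", so data written with
    * CDC disabled and CDC-enabled non-change data share the same layout.
    */
  def partitionDir(partitionValues: Seq[(String, String)]): String =
    partitionValues
      .filterNot { case (col, value) => col == CdcPartitionCol && value == "false" }
      .map { case (col, value) => s"$col=$value" }
      .mkString("/")
}
```

    In this sketch, `partitionDir(Seq("__is_cdc" -> "false", "date" -> "2024-01-01"))` produces only the `date` directory, which is the layout a table without the CDC column would also produce.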

  3. case class DeltaFileListingResult(partitions: Seq[(InternalRow, Seq[AddFile])], addFiles: Seq[AddFile], sortTime: Long = 0L) extends Product with Serializable

    Similar to FileListingResult, but maintains the partitions as AddFile.

  4. class DeltaSourceSnapshot extends StateCache

    Converts a Snapshot into the initial set of files read when starting a new streaming query. The files that represent the table at the time the query starts are selected by:
      - Adding version and index to each file to enable splitting of the initial state into multiple batches.
      - Filtering out files that don't match partition predicates, while preserving the aforementioned indexing.
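    The two selection steps above can be sketched in plain Scala. This is a simplified illustration, not the DeltaSourceSnapshot code: `FileEntry`, `IndexedEntry`, `initialFiles`, and `batches` are hypothetical names, and the predicate is an ordinary function rather than a Spark expression. The point it shows is that indexes are assigned before filtering, so a file keeps its position even when its neighbours are filtered away.

```scala
// Hypothetical, simplified stand-in for a file in the snapshot.
final case class FileEntry(path: String, partitionValues: Map[String, String])

// A file tagged with the snapshot version and its position in the initial state.
final case class IndexedEntry(version: Long, index: Long, file: FileEntry)

object InitialSnapshotSketch {
  /** Tags every file with (version, index) first, then applies the partition
    * predicate, so surviving files retain their original indexes.
    */
  def initialFiles(
      version: Long,
      files: Seq[FileEntry],
      keep: FileEntry => Boolean): Seq[IndexedEntry] =
    files.zipWithIndex
      .map { case (f, i) => IndexedEntry(version, i.toLong, f) }
      .filter(e => keep(e.file))

  /** Splits the indexed initial state into batches of at most maxFiles entries. */
  def batches(entries: Seq[IndexedEntry], maxFiles: Int): Seq[Seq[IndexedEntry]] =
    entries.grouped(maxFiles).toSeq
}
```

    Because each entry carries its own (version, index) pair, a consumer of this sketch could resume after any entry without recomputing positions from the filtered list.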

  5. trait SQLMetricsReporting extends AnyRef

    This trait is used to register SQL metrics for a Delta Operation. Registering the metrics allows them to be instrumented via the CommitInfo and makes them accessible via DescribeHistory.

  6. class ShallowSnapshotDescriptor extends SnapshotDescriptor

    A lightweight SnapshotDescriptor implementation that points to an actual Snapshot.

  7. trait SupportsRowIndexFilters extends AnyRef
  8. class TahoeBatchFileIndex extends TahoeFileIndexWithSnapshotDescriptor

    A TahoeFileIndex that generates the list of files from a given list of files that are within a version range of DeltaLog.

  9. class TahoeChangeFileIndex extends TahoeFileIndexWithSnapshotDescriptor

    A TahoeFileIndex for scanning a sequence of CDC files. Similar to TahoeBatchFileIndex, which is the equivalent index for reading AddFile actions.

    Note: Please also consider other CDC-related file indexes like CdcAddFileIndex and TahoeRemoveFileIndex when modifying this file index.

  10. abstract class TahoeFileIndex extends FileIndex with SupportsRowIndexFilters with SnapshotDescriptor

    A FileIndex that generates the list of files managed by the Tahoe protocol.

  11. abstract class TahoeFileIndexWithSnapshotDescriptor extends TahoeFileIndex

    A TahoeFileIndex that works with a specific SnapshotDescriptor.

  12. case class TahoeLogFileIndex(spark: SparkSession, deltaLog: DeltaLog, path: Path, snapshotAtAnalysis: SnapshotDescriptor, partitionFilters: Seq[Expression], isTimeTravelQuery: Boolean) extends TahoeFileIndex with Product with Serializable

    A TahoeFileIndex that generates the list of files from DeltaLog with given partition filters.

    NOTE: This is NOT a TahoeFileIndexWithSnapshotDescriptor because we only use snapshotAtAnalysis for actual data skipping if this is a time travel query.

  13. class TahoeRemoveFileIndex extends TahoeFileIndexWithSnapshotDescriptor

    A TahoeFileIndex for scanning a sequence of removed files as CDC. Similar to TahoeBatchFileIndex, which is the equivalent index for reading AddFile actions.

  14. trait TransactionalWrite extends DeltaLogging

    Adds the ability to write files out as part of a transaction. Checks are performed to ensure that the data being written matches either the current metadata or the new metadata being set by this transaction.
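    The check described above can be pictured with a small sketch, assuming a much simplified model: `TableMetadata`, `effectiveMetadata`, and `validate` are hypothetical names, and columns are compared as plain (name, type) pairs. The actual trait performs its checks against Delta metadata and Spark schemas; nothing here reflects its real signatures.

```scala
// Hypothetical, simplified metadata: an ordered list of (column name, type name).
final case class TableMetadata(columns: Seq[(String, String)])

object WriteCheckSketch {
  /** The metadata the write is validated against: the metadata being set by
    * this transaction if there is one, otherwise the table's current metadata.
    */
  def effectiveMetadata(
      current: TableMetadata,
      updatedInTxn: Option[TableMetadata]): TableMetadata =
    updatedInTxn.getOrElse(current)

  /** Returns Right(()) when the data columns match the effective metadata,
    * or Left with a description of the mismatch.
    */
  def validate(
      dataColumns: Seq[(String, String)],
      current: TableMetadata,
      updatedInTxn: Option[TableMetadata]): Either[String, Unit] = {
    val expected = effectiveMetadata(current, updatedInTxn).columns
    if (dataColumns == expected) Right(())
    else Left(s"Data columns $dataColumns do not match table columns $expected")
  }
}
```

    In this model, a write whose columns differ from the current metadata is still accepted when the same transaction sets new metadata that matches the data.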

Value Members

  1. object DeltaFileFormatWriter extends LoggingShims

    A helper object for writing FileFormat data out to a location. The logic is copied from FileFormatWriter in Spark 3.5, with added functionality to write partition values to data files. The additions are at L123-126, L132, and L140, where it adds the option WRITE_PARTITION_COLUMNS.

  2. object TahoeLogFileIndex extends Serializable
  3. object TransactionalWrite