case class PreparedDeltaFileIndex(spark: SparkSession, deltaLog: DeltaLog, path: Path, preparedScan: DeltaScan, versionScanned: Option[Long]) extends TahoeFileIndexWithSnapshotDescriptor with DeltaLogging with Product with Serializable
A TahoeFileIndex that uses a prepared scan to return the list of relevant files. This is injected into a query right before query planning by PrepareDeltaScan so that CBO and metering can accurately understand how much data will be read.
- versionScanned
The version of the table that is being scanned, if a specific version was explicitly requested, e.g. by time travel.
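As a rough illustration of the idea described above, the following self-contained sketch models an index that answers from an already prepared scan instead of listing files again. `ToyFile`, `ToyScan`, and `ToyPreparedIndex` are hypothetical stand-ins for `AddFile`, `DeltaScan`, and `PreparedDeltaFileIndex`; the sketch assumes the reported size covers only the files the scan selected, and it is not the Delta implementation.

```scala
// Hypothetical stand-ins; none of these types exist in Delta Lake.
final case class ToyFile(path: String, size: Long)

// A "prepared" scan: the relevant files were selected before planning.
final case class ToyScan(version: Long, files: Seq[ToyFile])

final case class ToyPreparedIndex(preparedScan: ToyScan, versionScanned: Option[Long]) {
  // Files come straight from the prepared scan, so no new listing is needed.
  def inputFiles: Seq[String] = preparedScan.files.map(_.path)

  // Assumption: the size covers only the files the scan selected.
  def sizeInBytes: Long = preparedScan.files.map(_.size).sum
}

object ToyPreparedIndexDemo {
  def main(args: Array[String]): Unit = {
    val scan = ToyScan(
      version = 7L,
      files = Seq(ToyFile("part-0.parquet", 100L), ToyFile("part-1.parquet", 250L)))
    // No explicit version was requested, so versionScanned is None.
    val index = ToyPreparedIndex(scan, versionScanned = None)
    println(index.inputFiles.mkString(", "))
    println(index.sizeInBytes)
  }
}
```

In this toy model, a planner that asks the index for its size sees only the bytes of the selected files, which is the kind of estimate the class description refers to.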
Linear Supertypes
- Serializable
- Product
- Equals
- DeltaLogging
- DatabricksLogging
- DeltaProgressReporter
- Logging
- TahoeFileIndexWithSnapshotDescriptor
- TahoeFileIndex
- SnapshotDescriptor
- SupportsRowIndexFilters
- FileIndex
- AnyRef
- Any
Instance Constructors
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- def absolutePath(child: String): Path
- Definition Classes
- TahoeFileIndex
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- def deltaAssert(check: => Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit
Helper method to check invariants in Delta code. Fails when running in tests; otherwise records a delta assertion event and logs a warning.
- Attributes
- protected
- Definition Classes
- DeltaLogging
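The fail-in-tests, warn-otherwise behavior described for `deltaAssert` can be sketched as follows. This is a minimal model, not Delta's code: `ToyAssert.toyAssert`, the `inTests` flag, and the `warn` callback are hypothetical, and the event recording step is omitted.

```scala
object ToyAssert {
  // Hypothetical helper modeled on the description of deltaAssert.
  // `inTests` stands in for however the real code detects a test run (an assumption).
  // `warn` stands in for the logger; event recording is left out of this sketch.
  def toyAssert(check: => Boolean, name: String, msg: String, inTests: Boolean)(
      warn: String => Unit): Unit = {
    if (!check) {
      if (inTests) {
        // In tests, a violated invariant fails immediately.
        throw new AssertionError(s"$name: $msg")
      } else {
        // Outside tests, execution continues and the violation is only reported.
        warn(s"Assertion failed: $name: $msg")
      }
    }
  }
}
```

Passing `check` by name mirrors the real signature: the condition is evaluated inside the helper rather than at the call site.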
- val deltaLog: DeltaLog
- Definition Classes
- PreparedDeltaFileIndex → TahoeFileIndex → SnapshotDescriptor
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(other: Any): Boolean
- Definition Classes
- PreparedDeltaFileIndex → Equals → AnyRef → Any
- def fileStatusWithMetadataFromAddFile(addFile: AddFile): FileStatusWithMetadata
Generates a FileStatusWithMetadata using data extracted from a given AddFile.
- Definition Classes
- TahoeFileIndex
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- def getBasePath(filePath: Path): Option[Path]
Returns the path of the base directory of the given file path (i.e. its parent directory with all the partition directories stripped off).
- Definition Classes
- TahoeFileIndex
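To make "parent directory with all the partition directories stripped off" concrete, here is a small string-based sketch. It assumes partition directories follow the `name=value` naming pattern and works on plain strings rather than Hadoop `Path`; `ToyBasePath.basePath` is a hypothetical helper, and the real `getBasePath` may decide differently.

```scala
object ToyBasePath {
  // Hypothetical sketch: every trailing `name=value` directory is treated
  // as a partition directory (an assumption about the layout).
  def basePath(filePath: String): Option[String] = {
    val segments = filePath.split('/').toList
    // Drop the file name, then drop trailing partition-style directories.
    val dirs = segments.dropRight(1).reverse.dropWhile(_.contains("=")).reverse
    // In this toy model, None means no parent directory remains.
    if (dirs.isEmpty || dirs == List("")) None else Some(dirs.mkString("/"))
  }
}
```

For example, `ToyBasePath.basePath("/data/tbl/year=2024/month=01/part-0.parquet")` yields `Some("/data/tbl")` in this model.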
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
- Definition Classes
- DeltaLogging
- def getErrorData(e: Throwable): Map[String, Any]
- Definition Classes
- DeltaLogging
- def getPartitionValuesRow(partitionValues: Map[String, String]): GenericInternalRow
- Attributes
- protected
- Definition Classes
- TahoeFileIndex
- def hashCode(): Int
- Definition Classes
- PreparedDeltaFileIndex → AnyRef → Any
- def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def inputFiles: Array[String]
Returns the list of files that will be read when scanning this relation. This call may be very expensive for large tables.
- Definition Classes
- PreparedDeltaFileIndex → FileIndex
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
- def listFiles(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[PartitionDirectory]
- Definition Classes
- TahoeFileIndex → FileIndex
- def listPartitionsAsAddFiles(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): (Seq[(InternalRow, Seq[AddFile])], Seq[AddFile])
Returns (i) tuples of partition directories to their respective AddFile actions and (ii) a collection of matched AddFiles. The matched AddFiles are those that meet the criteria set by the partition and data filters. Essentially, this is a collection of all the files associated with the identified partitions.
- Definition Classes
- TahoeFileIndex
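The shape of the pair returned by `listPartitionsAsAddFiles` can be illustrated with a simplified model. `ToyAddFile` and `ToyListing.listPartitions` are hypothetical; partition values are plain maps rather than `InternalRow`, and a single predicate stands in for the partition and data filters.

```scala
// Hypothetical stand-in for AddFile, with partition values as a plain map.
final case class ToyAddFile(path: String, partitionValues: Map[String, String], size: Long)

object ToyListing {
  // Returns (i) each matched partition paired with its files and
  // (ii) the flat collection of all matched files.
  def listPartitions(
      files: Seq[ToyAddFile],
      partitionFilter: Map[String, String] => Boolean)
      : (Seq[(Map[String, String], Seq[ToyAddFile])], Seq[ToyAddFile]) = {
    // Keep only the files whose partition values satisfy the filter.
    val matched = files.filter(f => partitionFilter(f.partitionValues))
    // Group the matched files by partition; order within a group is preserved.
    val grouped = matched.groupBy(_.partitionValues).toSeq
    (grouped, matched)
  }
}
```

In this model the second element is simply the concatenation of the file lists in the first, which matches the description of the matched files as all files of the identified partitions.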
- def log: Logger
- Attributes
- protected
- Definition Classes
- Logging
- def logConsole(line: String): Unit
- Definition Classes
- DatabricksLogging
- def logDebug(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logDebug(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logError(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logInfo(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logName: String
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logTrace(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def logWarning(msg: => String): Unit
- Attributes
- protected
- Definition Classes
- Logging
- def makePartitionDirectories(partitionValuesToFiles: Seq[(InternalRow, Seq[AddFile])]): Seq[PartitionDirectory]
- Definition Classes
- TahoeFileIndex
- def matchingFiles(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[AddFile]
Returns all matching/valid files by the given partitionFilters and dataFilters.
- Definition Classes
- PreparedDeltaFileIndex → TahoeFileIndex
- def metadata: Metadata
- Definition Classes
- TahoeFileIndexWithSnapshotDescriptor → SnapshotDescriptor
- def metadataOpsTimeNs: Option[Long]
- Definition Classes
- FileIndex
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- def numOfFilesIfKnown: Option[Long]
- Attributes
- protected[delta]
- Definition Classes
- TahoeFileIndexWithSnapshotDescriptor → SnapshotDescriptor
- def partitionSchema: StructType
- Definition Classes
- TahoeFileIndex → FileIndex
- val path: Path
- Definition Classes
- PreparedDeltaFileIndex → TahoeFileIndex
- val preparedScan: DeltaScan
- def productElementNames: Iterator[String]
- Definition Classes
- Product
- def protocol: Protocol
- Definition Classes
- TahoeFileIndexWithSnapshotDescriptor → SnapshotDescriptor
- def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit
Used to record the occurrence of a single event or report detailed, operation-specific statistics.
- path
Used to log the path of the delta table when deltaLog is null.
- Attributes
- protected
- Definition Classes
- DeltaLogging
- def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A
Used to report the duration as well as the success or failure of an operation on a deltaLog.
- Attributes
- protected
- Definition Classes
- DeltaLogging
- def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: => A): A
Used to report the duration as well as the success or failure of an operation on a tahoePath.
- Attributes
- protected
- Definition Classes
- DeltaLogging
- def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
- Definition Classes
- DatabricksLogging
- def recordFrameProfile[T](group: String, name: String)(thunk: => T): T
- Attributes
- protected
- Definition Classes
- DeltaLogging
- def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: => S): S
- Definition Classes
- DatabricksLogging
- def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
- Definition Classes
- DatabricksLogging
- def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
- Definition Classes
- DatabricksLogging
- def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
- Definition Classes
- DatabricksLogging
- def refresh(): Unit
Refresh any cached file listings.
- Definition Classes
- PreparedDeltaFileIndex → FileIndex
- def rootPaths: Seq[Path]
- Definition Classes
- TahoeFileIndex → FileIndex
- def rowIndexFilters: Option[Map[String, RowIndexFilterType]]
If we know a priori which exact rows we want to read (e.g., from a previous scan), find the per-file filter here, which must be passed down to the appropriate reader.
- returns
a mapping from file names to the row index filter for that file.
- Definition Classes
- SupportsRowIndexFilters
- def schema: StructType
- Definition Classes
- SnapshotDescriptor
- def sizeInBytes: Long
Sum of table file sizes, in bytes.
- Definition Classes
- PreparedDeltaFileIndex → FileIndex
- def sizeInBytesIfKnown: Option[Long]
- Attributes
- protected[delta]
- Definition Classes
- TahoeFileIndexWithSnapshotDescriptor → SnapshotDescriptor
- val spark: SparkSession
- Definition Classes
- PreparedDeltaFileIndex → TahoeFileIndex
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- TahoeFileIndex → FileIndex → AnyRef → Any
- def version: Long
- Definition Classes
- TahoeFileIndexWithSnapshotDescriptor → SnapshotDescriptor
- val versionScanned: Option[Long]
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: => T): T
Report a log to indicate some command is running.
- Definition Classes
- DeltaProgressReporter