org.apache.spark.sql.delta.stats

PreparedDeltaFileIndex

case class PreparedDeltaFileIndex(spark: SparkSession, deltaLog: DeltaLog, path: Path, preparedScan: DeltaScan, versionScanned: Option[Long]) extends TahoeFileIndexWithSnapshotDescriptor with DeltaLogging with Product with Serializable

A TahoeFileIndex that uses a prepared scan to return the list of relevant files. PrepareDeltaScan injects it into a query right before query planning so that the cost-based optimizer (CBO) and metering can accurately estimate how much data will be read.
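
The idea of answering file-index questions from a scan that was computed ahead of planning can be illustrated with a toy model. This is a minimal sketch, not the Delta Lake implementation: `ToyFile`, `ToyScan` and `ToyPreparedIndex` are hypothetical names, and it assumes the size estimate is derived from the prepared scan's files.

```scala
// Toy model only: none of these types are Delta Lake classes.
object PreparedIndexSketch {
  final case class ToyFile(path: String, sizeBytes: Long)

  // Result of a scan that was computed once, before query planning.
  final case class ToyScan(files: Seq[ToyFile])

  // The index answers every question from the prepared scan and never re-lists files.
  final case class ToyPreparedIndex(preparedScan: ToyScan) {
    def inputFiles: Array[String] = preparedScan.files.map(_.path).toArray
    def sizeInBytes: Long = preparedScan.files.map(_.sizeBytes).sum
  }

  def main(args: Array[String]): Unit = {
    val scan = ToyScan(Seq(ToyFile("part-0.parquet", 100L), ToyFile("part-1.parquet", 250L)))
    val index = ToyPreparedIndex(scan)
    println(index.inputFiles.mkString(", "))
    println(index.sizeInBytes)
  }
}
```

Because the size is computed from the files that survived the scan rather than from the whole table, an optimizer reading `sizeInBytes` in this model sees the amount of data that will actually be read.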

versionScanned

The version of the table being scanned, if a specific version was explicitly requested, e.g. by time travel.
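
Since `versionScanned` is an `Option[Long]`, callers have to handle both the case where a version was requested and the case where it was not. The sketch below only illustrates that distinction; `describe` is a hypothetical helper, not part of this class.

```scala
object VersionScannedSketch {
  // Hypothetical helper: reports whether a scan targets an explicitly requested version.
  def describe(versionScanned: Option[Long]): String = versionScanned match {
    case Some(v) => s"explicitly requested version $v"
    case None    => "no specific version requested"
  }

  def main(args: Array[String]): Unit = {
    println(describe(Some(3L)))
    println(describe(None))
  }
}
```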

Linear Supertypes
  1. Serializable
  2. Serializable
  3. Product
  4. Equals
  5. DeltaLogging
  6. DatabricksLogging
  7. DeltaProgressReporter
  8. LoggingShims
  9. Logging
  10. TahoeFileIndexWithSnapshotDescriptor
  11. TahoeFileIndex
  12. SnapshotDescriptor
  13. SupportsRowIndexFilters
  14. FileIndex
  15. AnyRef
  16. Any

Instance Constructors

  1. new PreparedDeltaFileIndex(spark: SparkSession, deltaLog: DeltaLog, path: Path, preparedScan: DeltaScan, versionScanned: Option[Long])

    versionScanned

    The version of the table being scanned, if a specific version was explicitly requested, e.g. by time travel.

Type Members

  1. implicit class LogStringContext extends AnyRef
    Definition Classes
    LoggingShims

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def absolutePath(child: String): Path
    Definition Classes
    TahoeFileIndex
  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  7. def deltaAssert(check: ⇒ Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit

    Helper method to check invariants in Delta code. Fails when running in tests; otherwise records a delta assertion event and logs a warning.

    Attributes
    protected
    Definition Classes
    DeltaLogging
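
    The "fail in tests, warn otherwise" pattern described for `deltaAssert` can be sketched in plain Scala. This is an illustration under assumptions, not the Delta code: `checkInvariant` and its `isTesting` flag are hypothetical, and the warning is returned as a value instead of being logged so that the behaviour is easy to inspect.

    ```scala
    object InvariantSketch {
      // `check` is by-name, mirroring the `check: => Boolean` parameter shape.
      // `isTesting` is a hypothetical stand-in for "are we running in tests?".
      def checkInvariant(check: => Boolean, name: String, msg: String, isTesting: Boolean): Option[String] = {
        if (check) {
          None // invariant holds, nothing to report
        } else if (isTesting) {
          throw new IllegalStateException(s"$name: $msg") // fail fast in tests
        } else {
          Some(s"WARN invariant violated [$name]: $msg") // degrade to a warning elsewhere
        }
      }

      def main(args: Array[String]): Unit = {
        println(checkInvariant(1 + 1 == 2, "toy.arithmetic", "math is broken", isTesting = true))
        println(checkInvariant(false, "toy.alwaysFails", "example violation", isTesting = false))
      }
    }
    ```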
  8. val deltaLog: DeltaLog
  9. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  10. def equals(other: Any): Boolean
    Definition Classes
    PreparedDeltaFileIndex → Equals → AnyRef → Any
  11. def fileStatusWithMetadataFromAddFile(addFile: AddFile): FileStatusWithMetadata

    Generates a FileStatusWithMetadata using data extracted from a given AddFile.

    Definition Classes
    TahoeFileIndex
  12. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  13. def getBasePath(filePath: Path): Option[Path]

    Returns the path of the base directory of the given file path (i.e. its parent directory with all the partition directories stripped off).

    Definition Classes
    TahoeFileIndex
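
    Stripping partition directories from a file's parent directory can be sketched on plain strings. The sketch assumes Hive-style `key=value` partition directory names and uses a hypothetical `basePath` helper; the real method works on Hadoop `Path` values and may decide what counts as a partition directory differently.

    ```scala
    object BasePathSketch {
      // Assumption: a directory named like `key=value` is a partition directory.
      private def isPartitionDir(name: String): Boolean = name.indexOf('=') > 0

      // Returns the parent directory with trailing partition directories removed,
      // or None when no named directory is left.
      def basePath(filePath: String): Option[String] = {
        val dirs = filePath.split('/').dropRight(1) // drop the file name
        val base = dirs.reverse.dropWhile(isPartitionDir).reverse
        if (base.forall(_.isEmpty)) None else Some(base.mkString("/"))
      }

      def main(args: Array[String]): Unit = {
        println(basePath("/data/tbl/year=2024/month=01/part-0.parquet"))
        println(basePath("/data/tbl/part-0.parquet"))
      }
    }
    ```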
  14. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  15. def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
    Definition Classes
    DeltaLogging
  16. def getErrorData(e: Throwable): Map[String, Any]
    Definition Classes
    DeltaLogging
  17. def getPartitionValuesRow(partitionValues: Map[String, String]): GenericInternalRow
    Attributes
    protected
    Definition Classes
    TahoeFileIndex
  18. def hashCode(): Int
    Definition Classes
    PreparedDeltaFileIndex → AnyRef → Any
  19. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  20. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  21. def inputFiles: Array[String]

    Returns the list of files that will be read when scanning this relation. This call may be very expensive for large tables.

    Definition Classes
    PreparedDeltaFileIndex → FileIndex
  22. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  23. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  24. def listFiles(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[PartitionDirectory]
    Definition Classes
    TahoeFileIndex → FileIndex
  25. def listPartitionsAsAddFiles(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): (Seq[(InternalRow, Seq[AddFile])], Seq[AddFile])

    Returns (i) tuples of partition directories to their respective AddFile actions and (ii) a collection of matched AddFiles. The matched AddFiles are those that meet the criteria set by the partition and data filters. Essentially, this is a collection of all the files associated with the identified partitions.

    Definition Classes
    TahoeFileIndex
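
    The two-part return value described above (files grouped per partition, plus the flat list of matched files) can be modelled with ordinary collections. This is a simplified sketch: `ToyAddFile` and `listPartitions` are hypothetical, partitions are keyed by a plain `Map[String, String]` rather than an `InternalRow`, and filters are plain predicates rather than Catalyst expressions.

    ```scala
    object PartitionListingSketch {
      final case class ToyAddFile(path: String, partitionValues: Map[String, String], size: Long)

      def listPartitions(
          files: Seq[ToyAddFile],
          partitionFilter: Map[String, String] => Boolean,
          dataFilter: ToyAddFile => Boolean)
          : (Seq[(Map[String, String], Seq[ToyAddFile])], Seq[ToyAddFile]) = {
        // (ii) files that satisfy both the partition filter and the data filter
        val matched = files.filter(f => partitionFilter(f.partitionValues) && dataFilter(f))
        // (i) the same files, grouped by their partition values
        val grouped = matched.groupBy(_.partitionValues).toSeq
        (grouped, matched)
      }

      def main(args: Array[String]): Unit = {
        val files = Seq(
          ToyAddFile("year=2023/a.parquet", Map("year" -> "2023"), 10L),
          ToyAddFile("year=2024/b.parquet", Map("year" -> "2024"), 20L),
          ToyAddFile("year=2024/c.parquet", Map("year" -> "2024"), 30L))
        val (grouped, matched) = listPartitions(files, _.get("year").contains("2024"), _ => true)
        println(grouped.size)
        println(matched.map(_.path).mkString(", "))
      }
    }
    ```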
  26. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  27. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  28. def logDebug(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  29. def logDebug(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  30. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  31. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. def logError(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  33. def logError(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  34. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  35. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  36. def logInfo(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  37. def logInfo(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  38. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  39. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  40. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  41. def logTrace(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  42. def logTrace(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  43. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  44. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  45. def logWarning(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  46. def logWarning(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  47. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  48. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  49. def makePartitionDirectories(partitionValuesToFiles: Seq[(InternalRow, Seq[AddFile])]): Seq[PartitionDirectory]
    Definition Classes
    TahoeFileIndex
  50. def matchingFiles(partitionFilters: Seq[Expression], dataFilters: Seq[Expression]): Seq[AddFile]

    Returns all matching/valid files for the given partitionFilters and dataFilters.

    Definition Classes
    PreparedDeltaFileIndex → TahoeFileIndex
  51. def metadata: Metadata
  52. def metadataOpsTimeNs: Option[Long]
    Definition Classes
    FileIndex
  53. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  54. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  55. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  56. def numOfFilesIfKnown: Option[Long]
    Attributes
    protected[delta]
    Definition Classes
    TahoeFileIndexWithSnapshotDescriptor → SnapshotDescriptor
  57. def partitionSchema: StructType
    Definition Classes
    TahoeFileIndex → FileIndex
  58. val path: Path
    Definition Classes
    PreparedDeltaFileIndex → TahoeFileIndex
  59. val preparedScan: DeltaScan
  60. def protocol: Protocol
  61. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit

    Used to record the occurrence of a single event or report detailed, operation-specific statistics.

    path

    Used to log the path of the delta table when deltaLog is null.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  62. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a deltaLog.

    Attributes
    protected
    Definition Classes
    DeltaLogging
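
    Wrapping a by-name `thunk` to report its duration and outcome is a common pattern, and a minimal version looks like the following. This is a sketch under assumptions, not the Delta implementation: `OpReport`, the `report` callback and this `recordOperation` helper are hypothetical, and no `DeltaLog` or tags are involved.

    ```scala
    object OperationTimingSketch {
      final case class OpReport(opType: String, durationNs: Long, succeeded: Boolean)

      // Runs `thunk`, then hands an OpReport to the hypothetical `report` sink,
      // whether the thunk returned normally or threw.
      def recordOperation[A](opType: String, report: OpReport => Unit)(thunk: => A): A = {
        val start = System.nanoTime()
        var succeeded = false
        try {
          val result = thunk
          succeeded = true
          result
        } finally {
          report(OpReport(opType, System.nanoTime() - start, succeeded))
        }
      }

      def main(args: Array[String]): Unit = {
        val answer = recordOperation("toy.compute", r => println(s"${r.opType} ok=${r.succeeded}")) {
          21 * 2
        }
        println(answer)
      }
    }
    ```

    The second parameter list lets the measured body be written as a block, which is the same call shape as `recordDeltaOperation(deltaLog, opType, tags)(thunk)`.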
  63. def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  64. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  65. def recordFrameProfile[T](group: String, name: String)(thunk: ⇒ T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  66. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: ⇒ S): S
    Definition Classes
    DatabricksLogging
  67. def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  68. def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  69. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  70. def refresh(): Unit

    Refreshes any cached file listings.

    Definition Classes
    PreparedDeltaFileIndex → FileIndex
  71. def rootPaths: Seq[Path]
    Definition Classes
    TahoeFileIndex → FileIndex
  72. def rowIndexFilters: Option[Map[String, RowIndexFilterType]]

    If we know a priori which exact rows we want to read (e.g., from a previous scan), find the per-file filter here, which must be passed down to the appropriate reader.

    returns

    a mapping from file names to the row index filter for that file.

    Definition Classes
    SupportsRowIndexFilters
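
    How a reader might consult an optional per-file filter map can be sketched as follows. Everything here is hypothetical: `ToyFilter`, `KeepOnly`, `DropOnly` and `rowsToRead` are illustration names, and the toy filter carries the row set itself, which the real `RowIndexFilterType` may not do.

    ```scala
    object RowIndexFilterSketch {
      // Hypothetical per-file filter over row indexes.
      sealed trait ToyFilter { def keep(rowIndex: Long): Boolean }
      final case class KeepOnly(rows: Set[Long]) extends ToyFilter {
        def keep(rowIndex: Long): Boolean = rows.contains(rowIndex)
      }
      final case class DropOnly(rows: Set[Long]) extends ToyFilter {
        def keep(rowIndex: Long): Boolean = !rows.contains(rowIndex)
      }

      // Files without an entry (or a missing map) are read in full.
      def rowsToRead(fileName: String, rowCount: Long, filters: Option[Map[String, ToyFilter]]): List[Long] = {
        val all = (0L until rowCount).toList
        filters.flatMap(_.get(fileName)) match {
          case Some(f) => all.filter(f.keep)
          case None    => all
        }
      }

      def main(args: Array[String]): Unit = {
        val filters = Some(Map[String, ToyFilter]("a.parquet" -> KeepOnly(Set(0L, 2L))))
        println(rowsToRead("a.parquet", 4L, filters))
        println(rowsToRead("b.parquet", 3L, filters))
      }
    }
    ```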
  73. def schema: StructType
    Definition Classes
    SnapshotDescriptor
  74. def sizeInBytes: Long

    Sum of table file sizes, in bytes.

    Definition Classes
    PreparedDeltaFileIndex → FileIndex
  75. def sizeInBytesIfKnown: Option[Long]
    Attributes
    protected[delta]
    Definition Classes
    TahoeFileIndexWithSnapshotDescriptor → SnapshotDescriptor
  76. val spark: SparkSession
    Definition Classes
    PreparedDeltaFileIndex → TahoeFileIndex
  77. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  78. def toString(): String
    Definition Classes
    TahoeFileIndex → FileIndex → AnyRef → Any
  79. def version: Long
  80. val versionScanned: Option[Long]
  81. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  82. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  83. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  84. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T

    Report a log to indicate some command is running.

    Definition Classes
    DeltaProgressReporter
