Packages

t

org.apache.spark.sql.delta.commands

VacuumCommandImpl

trait VacuumCommandImpl extends DeltaCommand

Linear Supertypes
Known Subclasses
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. VacuumCommandImpl
  2. DeltaCommand
  3. DeltaLogging
  4. DatabricksLogging
  5. DeltaProgressReporter
  6. LoggingShims
  7. Logging
  8. AnyRef
  9. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Type Members

  1. implicit class LogStringContext extends AnyRef
    Definition Classes
    LoggingShims

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def buildBaseRelation(spark: SparkSession, txn: OptimisticTransaction, actionType: String, rootPath: Path, inputLeafFiles: Seq[String], nameToAddFileMap: Map[String, AddFile]): HadoopFsRelation

    Build a base relation of files that need to be rewritten as part of an update/delete/merge operation.

    Build a base relation of files that need to be rewritten as part of an update/delete/merge operation.

    Attributes
    protected
    Definition Classes
    DeltaCommand
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  7. def createSetTransaction(sparkSession: SparkSession, deltaLog: DeltaLog, options: Option[DeltaOptions] = None): Option[SetTransaction]

    Returns SetTransaction if a valid app ID and version are present.

    Returns SetTransaction if a valid app ID and version are present. Otherwise returns an empty list.

    Attributes
    protected
    Definition Classes
    DeltaCommand
  8. def delete(diff: Dataset[String], spark: SparkSession, basePath: String, hadoopConf: Broadcast[SerializableConfiguration], parallel: Boolean, parallelPartitions: Int): Long

    Attempts to delete the list of candidate files.

    Attempts to delete the list of candidate files. Returns the number of files deleted.

    Attributes
    protected
  9. def deltaAssert(check: ⇒ Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit

    Helper method to check invariants in Delta code.

    Helper method to check invariants in Delta code. Fails when running in tests, records a delta assertion event and logs a warning otherwise.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  10. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  11. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  12. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  13. def generateCandidateFileMap(basePath: Path, candidateFiles: Seq[AddFile]): Map[String, AddFile]

    Generates a map of file names to add file entries for operations where we will need to rewrite files such as delete, merge, update.

    Generates a map of file names to add file entries for operations where we will need to rewrite files such as delete, merge, update. We expect file names to be unique, because each file contains a UUID.

    Definition Classes
    DeltaCommand
  14. def getActionRelativePath(action: FileAction, fs: FileSystem, basePath: Path, relativizeIgnoreError: Boolean): Option[String]
    Attributes
    protected
  15. def getAllSubdirs(base: String, file: String, fs: FileSystem): Iterator[String]

    Wrapper function for DeltaFileOperations.getAllSubDirectories returns all subdirectories that file has with respect to base.

    Wrapper function for DeltaFileOperations.getAllSubDirectories returns all subdirectories that file has with respect to base.

    Attributes
    protected
  16. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  17. def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
    Definition Classes
    DeltaLogging
  18. def getDeletionVectorRelativePathAndSize(action: FileAction): Option[(String, Long)]

    Returns the path of the on-disk deletion vector if it is stored relative to the basePath and it's size otherwise None.

    Returns the path of the on-disk deletion vector if it is stored relative to the basePath and it's size otherwise None.

    Attributes
    protected
  19. def getDeltaLog(spark: SparkSession, path: Option[String], tableIdentifier: Option[TableIdentifier], operationName: String, hadoopConf: Map[String, String] = Map.empty): DeltaLog

    Utility method to return the DeltaLog of an existing Delta table referred by either the given path or tableIdentifier.

    Utility method to return the DeltaLog of an existing Delta table referred by either the given path or tableIdentifier.

    spark

    SparkSession reference to use.

    path

    Table location. Expects a non-empty tableIdentifier or path.

    tableIdentifier

    Table identifier. Expects a non-empty tableIdentifier or path.

    operationName

    Operation that is getting the DeltaLog, used in error messages.

    hadoopConf

    Hadoop file system options used to build DeltaLog.

    returns

    DeltaLog of the table

    Attributes
    protected
    Definition Classes
    DeltaCommand
    Exceptions thrown

    AnalysisException If either no Delta table exists at the given path/identifier or there is neither path nor tableIdentifier is provided.

  20. def getDeltaTable(target: LogicalPlan, cmd: String): DeltaTableV2

    Extracts the DeltaTableV2 from a LogicalPlan iff the LogicalPlan is a ResolvedTable with either a DeltaTableV2 or a V1Table that is referencing a Delta table.

    Extracts the DeltaTableV2 from a LogicalPlan iff the LogicalPlan is a ResolvedTable with either a DeltaTableV2 or a V1Table that is referencing a Delta table. In all other cases this method will throw a "Table not found" exception.

    Definition Classes
    DeltaCommand
  21. def getDeltaTablePathOrIdentifier(target: LogicalPlan, cmd: String): (Option[TableIdentifier], Option[String])

    Helper method to extract the table id or path from a LogicalPlan representing a Delta table.

    Helper method to extract the table id or path from a LogicalPlan representing a Delta table. This uses DeltaCommand.getDeltaTable to convert the LogicalPlan to a DeltaTableV2 and then extracts either the path or identifier from it. If the DeltaTableV2 has a CatalogTable, the table identifier will be returned. Otherwise, the table's path will be returned. Throws an exception if the LogicalPlan does not represent a Delta table.

    Definition Classes
    DeltaCommand
  22. def getErrorData(e: Throwable): Map[String, Any]
    Definition Classes
    DeltaLogging
  23. def getRelativePath(path: String, fs: FileSystem, basePath: Path, relativizeIgnoreError: Boolean): Option[String]

    Returns the relative path of a file or None if the file lives outside of the table.

    Returns the relative path of a file or None if the file lives outside of the table.

    Attributes
    protected
  24. def getTableCatalogTable(target: LogicalPlan, cmd: String): Option[CatalogTable]

    Extracts CatalogTable metadata from a LogicalPlan if the plan is a ResolvedTable.

    Extracts CatalogTable metadata from a LogicalPlan if the plan is a ResolvedTable. The table can be a non delta table.

    Definition Classes
    DeltaCommand
  25. def getTablePathOrIdentifier(target: LogicalPlan, cmd: String): (Option[TableIdentifier], Option[String])

    Helper method to extract the table id or path from a LogicalPlan representing a resolved table or path.

    Helper method to extract the table id or path from a LogicalPlan representing a resolved table or path. This calls getDeltaTablePathOrIdentifier if the resolved table is a delta table. For non delta table with identifier, we extract its identifier. For non delta table with path, it expects the path to be wrapped in an ResolvedPathBasedNonDeltaTable and extracts it from there.

    Definition Classes
    DeltaCommand
  26. def getTouchedFile(basePath: Path, escapedFilePath: String, nameToAddFileMap: Map[String, AddFile]): AddFile

    Find the AddFile record corresponding to the file that was read as part of a delete/update/merge operation.

    Find the AddFile record corresponding to the file that was read as part of a delete/update/merge operation.

    basePath

    The path of the table. Must not be escaped.

    escapedFilePath

    The path to a file that can be either absolute or relative. All special chars in this path must be already escaped by URI standards.

    nameToAddFileMap

    Map generated through generateCandidateFileMap().

    Definition Classes
    DeltaCommand
  27. def getValidRelativePathsAndSubdirs(action: FileAction, fs: FileSystem, basePath: Path, relativizeIgnoreError: Boolean): Seq[String]

    Returns the relative paths of all files and subdirectories for this action that must be retained during GC.

    Returns the relative paths of all files and subdirectories for this action that must be retained during GC.

    Attributes
    protected
  28. def hasBeenExecuted(txn: OptimisticTransaction, sparkSession: SparkSession, options: Option[DeltaOptions] = None): Boolean

    Returns true if there is information in the spark session that indicates that this write has already been successfully written.

    Returns true if there is information in the spark session that indicates that this write has already been successfully written.

    Attributes
    protected
    Definition Classes
    DeltaCommand
  29. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  30. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  31. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. def isCatalogTable(analyzer: Analyzer, tableIdent: TableIdentifier): Boolean

    Use the analyzer to see whether the provided TableIdentifier is for a path based table or not

    Use the analyzer to see whether the provided TableIdentifier is for a path based table or not

    analyzer

    The session state analyzer to call

    tableIdent

    Table Identifier to determine whether is path based or not

    returns

    Boolean where true means that the table is a table in a metastore and false means the table is a path based table

    Definition Classes
    DeltaCommand
  33. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  34. def isPathIdentifier(tableIdent: TableIdentifier): Boolean

    Checks if the given identifier can be for a delta table's path

    Checks if the given identifier can be for a delta table's path

    tableIdent

    Table Identifier for which to check

    Attributes
    protected
    Definition Classes
    DeltaCommand
  35. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  36. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  37. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  38. def logDebug(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  39. def logDebug(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  40. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  41. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  42. def logError(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  43. def logError(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  44. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  45. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  46. def logInfo(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  47. def logInfo(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  48. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  49. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  50. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  51. def logTrace(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  52. def logTrace(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  53. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  54. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  55. def logVacuumEnd(deltaLog: DeltaLog, spark: SparkSession, path: Path, commandMetrics: Map[String, SQLMetric], filesDeleted: Option[Long] = None, dirCounts: Option[Long] = None): Unit

    Record Vacuum specific metrics in the commit log at the END of vacuum.

    Record Vacuum specific metrics in the commit log at the END of vacuum.

    deltaLog

    - DeltaLog of the table

    spark

    - spark session

    path

    - the (data) path to the root of the table

    filesDeleted

    - if the vacuum completed this will contain the number of files deleted. if the vacuum failed, this will be None.

    dirCounts

    - if the vacuum completed this will contain the number of directories vacuumed. if the vacuum failed, this will be None.

    Attributes
    protected
  56. def logVacuumStart(spark: SparkSession, deltaLog: DeltaLog, path: Path, diff: Dataset[String], sizeOfDataToDelete: Long, specifiedRetentionMillis: Option[Long], defaultRetentionMillis: Long): Unit

    Record Vacuum specific metrics in the commit log at the START of vacuum.

    Record Vacuum specific metrics in the commit log at the START of vacuum.

    spark

    - spark session

    deltaLog

    - DeltaLog of the table

    path

    - the (data) path to the root of the table

    diff

    - the list of paths (files, directories) that are safe to delete

    sizeOfDataToDelete

    - the amount of data (bytes) to be deleted

    specifiedRetentionMillis

    - the optional override retention period (millis) to keep logically removed files before deleting them

    defaultRetentionMillis

    - the default retention period (millis)

    Attributes
    protected
  57. def logWarning(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  58. def logWarning(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  59. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  60. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  61. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  62. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  63. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  64. def parsePredicates(spark: SparkSession, predicate: String): Seq[Expression]

    Converts string predicates into Expressions relative to a transaction.

    Converts string predicates into Expressions relative to a transaction.

    Attributes
    protected
    Definition Classes
    DeltaCommand
    Exceptions thrown

    AnalysisException if a non-partition column is referenced.

  65. def pathStringtoUrlEncodedString(path: String): String
    Attributes
    protected
  66. def pathToUrlEncodedString(path: Path): String
    Attributes
    protected
  67. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit

    Used to record the occurrence of a single event or report detailed, operation specific statistics.

    Used to record the occurrence of a single event or report detailed, operation specific statistics.

    path

    Used to log the path of the delta table when deltaLog is null.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  68. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a deltaLog.

    Used to report the duration as well as the success or failure of an operation on a deltaLog.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  69. def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  70. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  71. def recordFrameProfile[T](group: String, name: String)(thunk: ⇒ T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  72. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: ⇒ S): S
    Definition Classes
    DatabricksLogging
  73. def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  74. def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  75. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  76. def relativize(path: Path, fs: FileSystem, reservoirBase: Path, isDir: Boolean): String

    Attempts to relativize the path with respect to the reservoirBase and converts the path to a string.

    Attempts to relativize the path with respect to the reservoirBase and converts the path to a string.

    Attributes
    protected
  77. def removeFilesFromPaths(deltaLog: DeltaLog, nameToAddFileMap: Map[String, AddFile], filesToRewrite: Seq[String], operationTimestamp: Long): Seq[RemoveFile]

    This method provides the RemoveFile actions that are necessary for files that are touched and need to be rewritten in methods like Delete, Update, and Merge.

    This method provides the RemoveFile actions that are necessary for files that are touched and need to be rewritten in methods like Delete, Update, and Merge.

    deltaLog

    The DeltaLog of the table that is being operated on

    nameToAddFileMap

    A map generated using generateCandidateFileMap.

    filesToRewrite

    Absolute paths of the files that were touched. We will search for these in candidateFiles. Obtained as the output of the input_file_name function.

    operationTimestamp

    The timestamp of the operation

    Attributes
    protected
    Definition Classes
    DeltaCommand
  78. def resolveIdentifier(analyzer: Analyzer, identifier: TableIdentifier): LogicalPlan

    Use the analyzer to resolve the identifier provided

    Use the analyzer to resolve the identifier provided

    analyzer

    The session state analyzer to call

    identifier

    Table Identifier to determine whether is path based or not

    Attributes
    protected
    Definition Classes
    DeltaCommand
  79. def sendDriverMetrics(spark: SparkSession, metrics: Map[String, SQLMetric]): Unit

    Send the driver-side metrics.

    Send the driver-side metrics.

    This is needed to make the SQL metrics visible in the Spark UI. All metrics are default initialized with 0 so that's what we're reporting in case we skip an already executed action.

    Attributes
    protected
    Definition Classes
    DeltaCommand
  80. def setCommitClock(deltaLog: DeltaLog, version: Long): Unit
    Attributes
    protected
  81. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  82. def toString(): String
    Definition Classes
    AnyRef → Any
  83. def urlEncodedStringToPath(path: String): Path
    Attributes
    protected
  84. def verifyPartitionPredicates(spark: SparkSession, partitionColumns: Seq[String], predicates: Seq[Expression]): Unit
    Definition Classes
    DeltaCommand
  85. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  86. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  87. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  88. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T

    Report a log to indicate some command is running.

    Report a log to indicate some command is running.

    Definition Classes
    DeltaProgressReporter

Inherited from DeltaCommand

Inherited from DeltaLogging

Inherited from DatabricksLogging

Inherited from DeltaProgressReporter

Inherited from LoggingShims

Inherited from Logging

Inherited from AnyRef

Inherited from Any

Ungrouped