trait TransactionalWrite extends DeltaLogging

Adds the ability to write files out as part of a transaction. Checks are performed to ensure that the data being written matches either the current metadata or the new metadata being set by this transaction.
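A minimal usage sketch (illustrative rather than normative: it assumes the delta-spark artifact on the classpath and an existing Delta table at the hypothetical path /tmp/delta/events whose schema matches the dataframe; writeFiles is mixed into the transaction obtained from DeltaLog.withNewTransaction):

  import org.apache.spark.sql.{SaveMode, SparkSession}
  import org.apache.spark.sql.delta.{DeltaLog, DeltaOperations}

  val spark = SparkSession.builder().master("local[*]").getOrCreate()
  val df = spark.range(10).toDF("id")

  // Resolve the table's DeltaLog (the path is hypothetical).
  val deltaLog = DeltaLog.forTable(spark, "/tmp/delta/events")

  // writeFiles stages the data files and returns the resulting FileActions;
  // commit records them, together with the operation, in the transaction log.
  deltaLog.withNewTransaction { txn =>
    val actions = txn.writeFiles(df)
    txn.commit(actions, DeltaOperations.Write(SaveMode.Append))
  }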

Self Type
OptimisticTransactionImpl
Linear Supertypes
DeltaLogging, DatabricksLogging, DeltaProgressReporter, LoggingShims, Logging, AnyRef, Any

Type Members

  1. implicit class LogStringContext extends AnyRef
    Definition Classes
    LoggingShims
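
This interpolator enables structured log messages inside classes that mix in the logging traits. A hedged sketch, assuming Spark's structured-logging API (org.apache.spark.internal.{LogKeys, MDC}; the exact LogKeys names vary across versions):

  import org.apache.spark.internal.{Logging, LogKeys, MDC}

  class CommitReporter extends Logging {
    // LogStringContext turns the log"..." interpolator into a LogEntry,
    // so interpolated values arrive as structured MDC fields instead of
    // being flattened into the message string.
    def report(path: String): Unit =
      logInfo(log"Committing changes to ${MDC(LogKeys.PATH, path)}")
  }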

Abstract Value Members

  1. abstract def deltaLog: DeltaLog
  2. abstract def metadata: Metadata
    Attributes
    protected
  3. abstract def protocol: Protocol
  4. abstract def snapshot: Snapshot
    Attributes
    protected
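
Because of the OptimisticTransactionImpl self type, these members are supplied by the transaction the trait is mixed into. A simplified sketch of the pattern, using toy types rather than Delta's actual definitions:

  // Toy stand-ins for OptimisticTransactionImpl and TransactionalWrite.
  trait Txn {
    def currentVersion: Long
  }

  trait Writes { self: Txn =>
    // Members of the self type are in scope, so the mixin can read
    // transaction state without declaring it itself.
    def describe: String = s"writing at version $currentVersion"
  }

  class TxnImpl extends Txn with Writes {
    def currentVersion: Long = 0L
  }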

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def checkPartitionColumns(partitionSchema: StructType, output: Seq[Attribute], colsDropped: Boolean): Unit
    Attributes
    protected
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  7. def convertEmptyToNullIfNeeded(plan: SparkPlan, partCols: Seq[Attribute], constraints: Seq[Constraint]): SparkPlan

If there is any string partition column and there are constraints defined, add a projection to convert empty strings to null for that column. The empty strings would be converted to null eventually even without this conversion, but we want to do it earlier, before constraint checks, so that empty strings are correctly rejected. Note that this should not cause the downstream logic in FileFormatWriter to add duplicate conversions, because that logic checks the partition column against the original plan's output; once the plan is modified with additional projections, the partition-column check won't match and no further conversion is added.

    plan

    The original SparkPlan.

    partCols

    The partition columns.

    constraints

    The defined constraints.

    returns

    A SparkPlan potentially modified with an additional projection on top of plan

    Attributes
    protected
  8. def deltaAssert(check: ⇒ Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit

Helper method to check invariants in Delta code. Fails when running in tests; otherwise records a delta assertion event and logs a warning.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  9. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  10. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  11. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  13. def getCommitter(outputPath: Path): DelayedCommitProtocol
    Attributes
    protected
  14. def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
    Definition Classes
    DeltaLogging
  15. def getErrorData(e: Throwable): Map[String, Any]
    Definition Classes
    DeltaLogging
  16. def getOptionalStatsTrackerAndStatsCollection(output: Seq[Attribute], outputPath: Path, partitionSchema: StructType, data: DataFrame): (Option[DeltaJobStatisticsTracker], Option[StatisticsCollection])

Return the pair of an optional stats tracker and an optional stats collection class.

    Attributes
    protected
  17. def getPartitioningColumns(partitionSchema: StructType, output: Seq[Attribute]): Seq[Attribute]
    Attributes
    protected
  18. def getStatsColExpr(statsDataSchema: Seq[Attribute], statsCollection: StatisticsCollection): Expression
    Attributes
    protected
  19. def getStatsSchema(dataFrameOutput: Seq[Attribute], partitionSchema: StructType): (Seq[Attribute], Seq[Attribute])

Return a tuple of (outputStatsCollectionSchema, statsCollectionSchema). outputStatsCollectionSchema is the data source schema from the DataFrame used for stats collection: it contains the columns in the DataFrame output, excluding the partition columns. statsCollectionSchema is the schema to collect stats for: it contains the columns in the table schema, excluding the partition columns. Note: we only collect NULL_COUNT stats (as the number of rows) for columns that are in statsCollectionSchema but missing from outputStatsCollectionSchema.

    Attributes
    protected
  20. val hasWritten: Boolean
    Attributes
    protected
  21. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  22. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  23. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  24. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  25. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  26. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  27. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  28. def logDebug(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  29. def logDebug(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  30. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  31. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. def logError(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  33. def logError(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  34. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  35. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  36. def logInfo(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  37. def logInfo(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  38. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  39. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  40. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  41. def logTrace(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  42. def logTrace(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  43. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  44. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  45. def logWarning(entry: LogEntry, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  46. def logWarning(entry: LogEntry): Unit
    Attributes
    protected
    Definition Classes
    LoggingShims
  47. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  48. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  49. def makeOutputNullable(output: Seq[Attribute]): Seq[Attribute]

Makes the output attributes nullable, so that we don't write unreadable parquet files.

    Attributes
    protected
  50. def mapColumnAttributes(output: Seq[Attribute], mappingMode: DeltaColumnMappingMode): Seq[Attribute]

Replace the output attributes with the physical mapping information.

    Attributes
    protected
  51. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  52. def normalizeData(deltaLog: DeltaLog, options: Option[DeltaOptions], data: Dataset[_]): (QueryExecution, Seq[Attribute], Seq[Constraint], Set[String])

Normalize the schema of the query, and return the QueryExecution to execute. If the table has generated columns and users provide these columns in the output, we will also return constraints that should be respected. If any constraints are returned, the caller should apply them when writing the data.

    Note: The output attributes of the QueryExecution may not match the attributes we return as the output schema. This is because streaming queries create IncrementalExecution, which cannot be further modified. We can however have the Parquet writer use the physical plan from IncrementalExecution and the output schema provided through the attributes.

    Attributes
    protected
  53. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  54. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  55. def performCDCPartition(inputData: Dataset[_]): (DataFrame, StructType)

Returns a tuple of (data, partition schema). For CDC writes, an is_cdc column is added to the data, and is_cdc=true/false is added to the front of the partition schema.

    Attributes
    protected
  56. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit

Used to record the occurrence of a single event or to report detailed, operation-specific statistics.

    path

    Used to log the path of the delta table when deltaLog is null.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  57. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

Used to report the duration as well as the success or failure of an operation on a deltaLog (a usage sketch follows the member list).

    Attributes
    protected
    Definition Classes
    DeltaLogging
  58. def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  59. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  60. def recordFrameProfile[T](group: String, name: String)(thunk: ⇒ T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  61. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: ⇒ S): S
    Definition Classes
    DatabricksLogging
  62. def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  63. def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  64. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  65. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  66. def toString(): String
    Definition Classes
    AnyRef → Any
  67. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  68. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  69. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  70. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T

Logs a status message to indicate that some command is running.

    Definition Classes
    DeltaProgressReporter
  71. def writeFiles(inputData: Dataset[_], writeOptions: Option[DeltaOptions], isOptimize: Boolean, additionalConstraints: Seq[Constraint]): Seq[FileAction]

Writes out the dataframe after performing schema validation. Returns a list of actions to append these files to the reservoir (the table). A sketch of the option- and constraint-taking overloads follows the member list.

    inputData

    Data to write out.

    writeOptions

    Options to decide how to write out the data.

    isOptimize

Whether the operation writing these files is Optimize.

    additionalConstraints

    Additional constraints on the write.

  72. def writeFiles(data: Dataset[_], deltaOptions: Option[DeltaOptions], additionalConstraints: Seq[Constraint]): Seq[FileAction]
  73. def writeFiles(data: Dataset[_]): Seq[FileAction]
  74. def writeFiles(data: Dataset[_], writeOptions: Option[DeltaOptions]): Seq[FileAction]
  75. def writeFiles(data: Dataset[_], additionalConstraints: Seq[Constraint]): Seq[FileAction]
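
A hedged sketch of the overloads that take write options and additional constraints (these are internal APIs: the Map-based DeltaOptions constructor and Constraints.Check are taken from delta-spark's internals and may differ across versions; spark is the SparkSession from the earlier sketch and the table path is hypothetical):

  import org.apache.spark.sql.SaveMode
  import org.apache.spark.sql.functions.expr
  import org.apache.spark.sql.delta.{DeltaLog, DeltaOperations, DeltaOptions}
  import org.apache.spark.sql.delta.constraints.Constraints

  val df = spark.range(10).toDF("id")
  val deltaLog = DeltaLog.forTable(spark, "/tmp/delta/events")
  val writeOptions = new DeltaOptions(
    Map("mergeSchema" -> "true"), spark.sessionState.conf)

  // An extra CHECK constraint enforced for this write only, on top of the
  // invariants and constraints already recorded in the table metadata.
  val positiveIds = Constraints.Check("id_is_positive", expr("id > 0").expr)

  deltaLog.withNewTransaction { txn =>
    val actions = txn.writeFiles(df, Some(writeOptions), Seq(positiveIds))
    txn.commit(actions, DeltaOperations.Write(SaveMode.Append))
  }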

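And a sketch of how the inherited DeltaLogging and DeltaProgressReporter helpers are typically combined around an operation (the command class and the opType string are hypothetical):

  import org.apache.spark.sql.delta.DeltaLog
  import org.apache.spark.sql.delta.metering.DeltaLogging

  class CompactCommand extends DeltaLogging {
    def run(deltaLog: DeltaLog): Unit = {
      // recordDeltaOperation reports the duration and success/failure of
      // the thunk; withStatusCode surfaces a progress message while it runs.
      recordDeltaOperation(deltaLog, "delta.compactCommand.run") {
        withStatusCode("DELTA", "Compacting small files") {
          // ... perform the actual work here ...
        }
      }
    }
  }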