case class PreprocessTableMerge(conf: SQLConf) extends Rule[LogicalPlan] with UpdateExpressionsSupport with Product with Serializable
- PreprocessTableMerge
- Serializable
- Serializable
- Product
- Equals
- UpdateExpressionsSupport
- DeltaLogging
- DatabricksLogging
- DeltaProgressReporter
- LoggingShims
- AnalysisHelper
- Rule
- Logging
- SQLConfHelper
- AnyRef
- Any
Instance Constructors
- new PreprocessTableMerge(conf: SQLConf)
Type Members
-
implicit
class
LogStringContext extends AnyRef
- Definition Classes
- LoggingShims
-
case class
UpdateOperation(targetColNameParts: Seq[String], updateExpr: Expression) extends Product with Serializable
Specifies an operation that updates a target column with the given expression. The target column may or may not be a nested field, and it is specified either as a fully quoted name or as a sequence of name parts.
- Definition Classes
- UpdateExpressionsSupport
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
apply(mergeInto: DeltaMergeInto, transformToCommand: Boolean): LogicalPlan
-
def
apply(plan: LogicalPlan): LogicalPlan
- Definition Classes
- PreprocessTableMerge → Rule
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
castIfNeeded(fromExpression: Expression, dataType: DataType, allowStructEvolution: Boolean, columnName: String): Expression
Add a cast to the child expression if it differs from the specified data type. Note that structs here are cast by name, rather than the Spark SQL default of casting by position.
- fromExpression
the expression to cast
- dataType
The data type to cast to.
- allowStructEvolution
Whether to allow structs to evolve. When this is false (default), struct casting will throw an error if the target struct type contains more fields than the expression to cast.
- columnName
The name of the column written to. It is used for the error message.
- Attributes
- protected
- Definition Classes
- UpdateExpressionsSupport
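The difference between casting a struct by name and by position can be illustrated with a small stand-alone model. This is only a sketch: `Field`, `Struct`, `castByPosition`, and `castByName` are hypothetical names, not Delta or Spark classes, and filling a missing field with null when evolution is allowed is an assumption made for the illustration.

```scala
object StructCastSketch {
  // Toy stand-ins for a struct type and a struct value; not Spark classes.
  final case class Field(name: String)
  type Struct = Seq[(String, Any)] // ordered (fieldName, value) pairs

  // Positional cast: the i-th source value lands in the i-th target field.
  def castByPosition(source: Struct, target: Seq[Field]): Struct =
    target.zip(source).map { case (field, (_, value)) => field.name -> value }

  // By-name cast: each target field takes the value of the same-named source field.
  // A target field missing from the source is an error unless evolution is allowed
  // (assumption for this sketch: the missing field is then filled with null).
  def castByName(
      source: Struct,
      target: Seq[Field],
      allowStructEvolution: Boolean = false): Struct = {
    val byName = source.toMap
    target.map { field =>
      byName.get(field.name) match {
        case Some(value) => field.name -> value
        case None if allowStructEvolution => field.name -> null
        case None =>
          throw new IllegalArgumentException(
            s"Cannot cast struct: missing field '${field.name}'")
      }
    }
  }
}
```

With a source struct whose fields arrive in a different order than the target, the by-name variant keeps each value attached to its field name, while the positional variant silently swaps them.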
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
val
conf: SQLConf
- Definition Classes
- PreprocessTableMerge → SQLConfHelper
-
def
deltaAssert(check: ⇒ Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit
Helper method to check invariants in Delta code. Fails when running in tests, records a delta assertion event and logs a warning otherwise.
- Attributes
- protected
- Definition Classes
- DeltaLogging
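The "fail in tests, warn otherwise" behaviour described for deltaAssert follows a common soft-assertion pattern. The sketch below shows that pattern in isolation; `softAssert`, the `inTests` flag, and the in-memory `warnings` buffer are hypothetical stand-ins and do not reflect how Delta detects a test run or records its assertion events.

```scala
object AssertSketch {
  // Hypothetical sink standing in for a logger and event recorder.
  val warnings = scala.collection.mutable.ArrayBuffer.empty[String]

  // `check` is passed by name, so it is only evaluated inside this method.
  def softAssert(check: => Boolean, name: String, msg: String, inTests: Boolean): Unit = {
    if (!check) {
      if (inTests) {
        // In tests: fail loudly so the broken invariant is noticed.
        throw new AssertionError(s"$name: $msg")
      } else {
        // Elsewhere: record the violation and keep running.
        warnings += s"Assertion failed: $name: $msg"
        ()
      }
    }
  }
}
```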
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
generateUpdateExpressions(targetSchema: StructType, defaultExprs: Seq[NamedExpression], nameParts: Seq[Seq[String]], updateExprs: Seq[Expression], resolver: Resolver, generatedColumns: Seq[StructField]): Seq[Option[Expression]]
See docs on overloaded method.
- Attributes
- protected
- Definition Classes
- UpdateExpressionsSupport
-
def
generateUpdateExpressions(targetSchema: StructType, updateOps: Seq[UpdateOperation], defaultExprs: Seq[NamedExpression], resolver: Resolver, pathPrefix: Seq[String] = Nil, allowSchemaEvolution: Boolean = false, generatedColumns: Seq[StructField] = Nil): Seq[Option[Expression]]
Given a target schema and a set of update operations, generate a list of update expressions, which are aligned with the given schema.
For update operations to nested struct fields, this method recursively walks down the schema tree and applies the update expressions along the way. For example, assume table target has the following schema:
s1 struct<a: int, b: int, c: int>, s2 struct<a: int, b: int>, z int
Given an update command:
- UPDATE target SET s1.a = 1, s1.b = 2, z = 3
this method works as follows:
generateUpdateExpressions(
    targetSchema=[s1,s2,z], defaultExprs=[s1,s2, z], updateOps=[(s1.a, 1), (s1.b, 2), (z, 3)])
  -> generates expression for s1 - build recursively from child assignments
     generateUpdateExpressions(
       targetSchema=[a,b,c], defaultExprs=[a, b, c], updateOps=[(a, 1),(b, 2)], pathPrefix=["s1"])
       end-of-recursion
     -> returns (1, 2, a.c)
  -> generates expression for s2 - no child assignment and no update expression: use default expression s2
  -> generates expression for z - use available update expression 3
  -> returns ((1, 2, a.c), s2, 3)
- targetSchema
schema to follow to generate update expressions. Due to schema evolution, it may contain additional columns or fields not present in the original table schema.
- updateOps
a set of update operations.
- defaultExprs
the expressions to use when no update operation is provided for a column or field. This is typically the output from the base table.
- pathPrefix
the path from root to the current (nested) column. Only used for printing out full column path in error messages.
- allowSchemaEvolution
Whether to allow generating expressions for new columns or fields added by schema evolution.
- generatedColumns
the list of the generated columns in the table. When a column is a generated column and the user doesn't provide an update expression, its update expression in the return result will be None. If generatedColumns is empty, every option in the return result must be non-empty.
- returns
a sequence of expression options. The elements in the sequence are options because when a column is a generated column but the user doesn't provide an update expression for this column, we need to generate the update expression according to the generated column definition. But this method doesn't have enough context to do that. Hence, we return a None for this case so that the caller knows it should generate the update expression for such a column. For other cases, we will always return Some(expr).
- Attributes
- protected
- Definition Classes
- UpdateExpressionsSupport
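The recursive alignment described for generateUpdateExpressions can be sketched with a toy schema and expression model. Everything here is an illustrative assumption: `Col`, `Leaf`, `Struct`, `Expr`, `UpdateOp`, and `align` are hypothetical names, not Delta's types, and the sketch leaves out casting, schema evolution, and generated columns.

```scala
object UpdateAlignmentSketch {
  // Toy schema: a column is either a leaf or a struct with child columns.
  sealed trait Col { def name: String }
  final case class Leaf(name: String) extends Col
  final case class Struct(name: String, fields: Seq[Col]) extends Col

  // Toy expressions.
  sealed trait Expr
  final case class Lit(sql: String) extends Expr            // user-provided update expression
  final case class Default(path: Seq[String]) extends Expr  // existing value of the column
  final case class MakeStruct(children: Seq[Expr]) extends Expr

  final case class UpdateOp(targetColNameParts: Seq[String], updateExpr: Expr)

  // Returns one expression per column of `schema`, in schema order.
  def align(schema: Seq[Col], ops: Seq[UpdateOp], pathPrefix: Seq[String] = Nil): Seq[Expr] =
    schema.map { col =>
      val matching = ops.filter(_.targetColNameParts.headOption.contains(col.name))
      // `direct` ops assign the column itself; `nested` ops assign one of its fields.
      val (direct, nested) = matching.partition(_.targetColNameParts.size == 1)
      val path = pathPrefix :+ col.name
      (col, direct, nested) match {
        case (_, Seq(), Seq()) => Default(path)       // no update: keep the current value
        case (_, Seq(op), Seq()) => op.updateExpr     // exactly one direct update
        case (Struct(_, fields), Seq(), _) =>
          // Only nested updates: strip the leading name part and recurse into the struct.
          val childOps =
            nested.map(op => op.copy(targetColNameParts = op.targetColNameParts.tail))
          MakeStruct(align(fields, childOps, path))
        case _ =>
          throw new IllegalArgumentException(
            s"Conflicting or invalid updates for ${path.mkString(".")}")
      }
    }
}
```

Applied to the schema and UPDATE command from the example above, `align` yields a rebuilt struct for s1 (new values for a and b, the default for c), the default for s2, and the literal 3 for z.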
-
def
generateUpdateExprsForGeneratedColumns(updateTarget: LogicalPlan, generatedColumns: Seq[StructField], updateExprs: Seq[Option[Expression]], postEvolutionTargetSchema: Option[StructType] = None): Seq[Expression]
Generate update expressions for generated columns for which the user doesn't provide an update expression. For each item in updateExprs that's None, we will find its generation expression from generatedColumns. In order to resolve this generation expression, we will create a fake Project which contains all update expressions and resolve the generation expression with this project. Source columns of a generation expression will also be replaced with their corresponding update expressions.
For example, given a table that has a generated column g defined as c1 + 10. For the following update command:
UPDATE target SET c1 = c2 + 100, c2 = 1000
We will generate the update expression (c2 + 100) + 10 for column g. Note: in this update expression, we should use the old c2 attribute rather than its new value 1000.
- updateTarget
The logical plan of the table to be updated.
- generatedColumns
A list of generated columns.
- updateExprs
The aligned (with postEvolutionTargetSchema if not None, or updateTarget.output otherwise) update actions.
- postEvolutionTargetSchema
In case of UPDATE in MERGE when schema evolution happened, this is the final schema of the target table. This might not be the same as the output of updateTarget.
- returns
a sequence of update expressions for all of the columns in the table.
- Attributes
- protected
- Definition Classes
- UpdateExpressionsSupport
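The substitution rule in the example (use the update expression of c1, but the old value of c2) can be modelled with a small expression tree. This is a sketch under stated assumptions: `Expr`, `substitute`, and `fillGenerated` are hypothetical names, and the real method resolves expressions against a Project rather than a plain map.

```scala
object GeneratedColumnSketch {
  sealed trait Expr
  final case class Col(name: String) extends Expr  // reference to a column's old value
  final case class Lit(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Replace each column reference in `expr` with its update expression, if one exists.
  // The substituted expressions are not rewritten again, so any column they mention
  // still refers to that column's old value.
  def substitute(expr: Expr, updates: Map[String, Expr]): Expr = expr match {
    case Col(name) => updates.getOrElse(name, expr)
    case Add(l, r) => Add(substitute(l, updates), substitute(r, updates))
    case lit: Lit => lit
  }

  // Fill every None entry from the generation expression of the corresponding column.
  def fillGenerated(
      columns: Seq[String],
      updateExprs: Seq[Option[Expr]],
      generation: Map[String, Expr]): Seq[Expr] = {
    val provided: Map[String, Expr] =
      columns.zip(updateExprs).collect { case (c, Some(e)) => c -> e }.toMap
    columns.zip(updateExprs).map {
      case (_, Some(e)) => e
      case (c, None) => substitute(generation(c), provided)
    }
  }
}
```

For the UPDATE command above, the entry for g becomes `Add(Add(Col("c2"), Lit(100)), Lit(10))`: the reference to c1 is replaced by its update expression, and the inner `Col("c2")` is left alone rather than replaced by 1000.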
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
- Definition Classes
- DeltaLogging
-
def
getErrorData(e: Throwable): Map[String, Any]
- Definition Classes
- DeltaLogging
-
def
improveUnsupportedOpError(f: ⇒ Unit): Unit
- Attributes
- protected
- Definition Classes
- AnalysisHelper
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logConsole(line: String): Unit
- Definition Classes
- DatabricksLogging
-
def
logDebug(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logDebug(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logError(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logInfo(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logTrace(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(entry: LogEntry, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logWarning(entry: LogEntry): Unit
- Attributes
- protected
- Definition Classes
- LoggingShims
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit
Used to record the occurrence of a single event or report detailed, operation-specific statistics.
- path
Used to log the path of the delta table when deltaLog is null.
- Attributes
- protected
- Definition Classes
- DeltaLogging
-
def
recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A
Used to report the duration as well as the success or failure of an operation on a deltaLog.
- Attributes
- protected
- Definition Classes
- DeltaLogging
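The curried `(thunk: ⇒ A): A` signature is the usual shape for wrapping a block of code so that its duration and outcome can be reported. A minimal sketch of that pattern follows; `timedOperation`, `Record`, and the in-memory `records` buffer are hypothetical and are not how Delta reports its metrics.

```scala
import scala.util.Try

object OperationRecorderSketch {
  // Hypothetical record type and sink, used only for this illustration.
  final case class Record(opType: String, durationNanos: Long, succeeded: Boolean)
  val records = scala.collection.mutable.ArrayBuffer.empty[Record]

  // Runs `thunk`, records how long it took and whether it succeeded,
  // then returns its result or rethrows its exception.
  def timedOperation[A](opType: String)(thunk: => A): A = {
    val start = System.nanoTime()
    val result = Try(thunk) // captures non-fatal exceptions so they can be recorded
    records += Record(opType, System.nanoTime() - start, result.isSuccess)
    result.get // rethrows the original exception on failure
  }
}
```

Because the thunk is a by-name parameter in its own parameter list, a caller can write `timedOperation("some.op") { ... }` and have the whole block measured.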
-
def
recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A
Used to report the duration as well as the success or failure of an operation on a tablePath.
- Attributes
- protected
- Definition Classes
- DeltaLogging
-
def
recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
- Definition Classes
- DatabricksLogging
-
def
recordFrameProfile[T](group: String, name: String)(thunk: ⇒ T): T
- Attributes
- protected
- Definition Classes
- DeltaLogging
-
def
recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: ⇒ S): S
- Definition Classes
- DatabricksLogging
-
def
recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
- Definition Classes
- DatabricksLogging
-
def
recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
- Definition Classes
- DatabricksLogging
-
def
recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
- Definition Classes
- DatabricksLogging
-
def
resolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planProvidingAttrs: LogicalPlan): Seq[Expression]
Resolve expressions using the attributes provided by planProvidingAttrs. Throws an error if any expression fails to resolve.
- Attributes
- protected
- Definition Classes
- AnalysisHelper
-
lazy val
ruleId: RuleId
- Attributes
- protected
- Definition Classes
- Rule
-
val
ruleName: String
- Definition Classes
- Rule
-
val
supportMergeAndUpdateLegacyCastBehavior: Boolean
Whether casting behavior can revert to following 'spark.sql.ansi.enabled' instead of 'spark.sql.storeAssignmentPolicy' to preserve legacy behavior for UPDATE and MERGE. Legacy behavior is applied only if 'spark.databricks.delta.updateAndMergeCastingFollowsAnsiEnabledFlag' is set to true.
- Attributes
- protected
- Definition Classes
- PreprocessTableMerge → UpdateExpressionsSupport
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toDataset(sparkSession: SparkSession, logicalPlan: LogicalPlan): Dataset[Row]
- Attributes
- protected
- Definition Classes
- AnalysisHelper
-
def
tryResolveReferences(sparkSession: SparkSession)(expr: Expression, planContainingExpr: LogicalPlan): Expression
- Attributes
- protected
- Definition Classes
- AnalysisHelper
-
def
tryResolveReferencesForExpressions(sparkSession: SparkSession)(exprs: Seq[Expression], plansProvidingAttrs: Seq[LogicalPlan]): Seq[Expression]
Resolve expressions using the attributes provided by plansProvidingAttrs, ignoring errors.
- Attributes
- protected
- Definition Classes
- AnalysisHelper
-
def
tryResolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planContainingExpr: LogicalPlan): Seq[Expression]
- Attributes
- protected
- Definition Classes
- AnalysisHelper
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T
Report a log to indicate some command is running.
- Definition Classes
- DeltaProgressReporter