Packages

org.apache.spark.sql.delta.commands

MergeIntoCommand

case class MergeIntoCommand(source: LogicalPlan, target: LogicalPlan, catalogTable: Option[CatalogTable], targetFileIndex: TahoeFileIndex, condition: Expression, matchedClauses: Seq[DeltaMergeIntoMatchedClause], notMatchedClauses: Seq[DeltaMergeIntoNotMatchedClause], notMatchedBySourceClauses: Seq[DeltaMergeIntoNotMatchedBySourceClause], migratedSchema: Option[StructType], schemaEvolutionEnabled: Boolean = false) extends LogicalPlan with MergeIntoCommandBase with InsertOnlyMergeExecutor with ClassicMergeExecutor with Product with Serializable

Performs a merge of a source query/table into a Delta table.

Issues an error message when the ON search_condition of the MERGE statement can match a single row from the target table with multiple rows of the source table-reference.

Algorithm:

Phase 1: Find the input files in target that are touched by the rows that satisfy the condition and verify that no two source rows match with the same target row. This is implemented as an inner-join using the given condition. See ClassicMergeExecutor for more details.

Phase 2: Read the touched files again and write new files with updated and/or inserted rows.

Phase 3: Use the Delta protocol to atomically remove the touched files and add the new files.
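The three phases can be pictured with a toy model. The following Python sketch is illustrative only: the real command operates on Spark logical plans and Delta AddFile/RemoveFile actions, while here "files" are plain dicts of rows, the merge condition is equality on a single key column, and the `merge` function and `"part-new"` file name are hypothetical.

```python
def merge(target_files, source_rows, key):
    """Toy model of the three merge phases (not the Delta implementation)."""
    # Phase 1: find the target files touched by rows satisfying the
    # condition (modeled as equality on `key`, i.e. an inner join).
    source_keys = {row[key] for row in source_rows}
    touched = {name for name, rows in target_files.items()
               if any(r[key] in source_keys for r in rows)}

    # Phase 2: re-read the touched files and write new files containing
    # updated (matched) and inserted (not-matched) rows.
    source_by_key = {row[key]: row for row in source_rows}
    new_rows = []
    for name in touched:
        for r in target_files[name]:
            # Replace the row on a match, copy it unchanged otherwise.
            new_rows.append(source_by_key.get(r[key], r))
    matched_keys = {r[key] for name in touched for r in target_files[name]}
    new_rows += [row for row in source_rows if row[key] not in matched_keys]

    # Phase 3: atomic swap -- remove the touched files, add the new file.
    committed = {n: rows for n, rows in target_files.items() if n not in touched}
    committed["part-new"] = new_rows
    return committed
```

Note that untouched files (here `f2`-style files with no matching rows) are carried over as-is; only touched files pay the rewrite cost.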

source

Source data to merge from

target

Target table to merge into

targetFileIndex

TahoeFileIndex of the target table

condition

Condition for a source row to match with a target row

matchedClauses

All info related to matched clauses.

notMatchedClauses

All info related to not matched clauses.

notMatchedBySourceClauses

All info related to not matched by source clauses.

migratedSchema

The final schema of the target - may be changed by schema evolution.

Linear Supertypes
Serializable, Serializable, ClassicMergeExecutor, InsertOnlyMergeExecutor, MergeOutputGeneration, MergeIntoCommandBase, UpdateExpressionsSupport, AnalysisHelper, MergeIntoMaterializeSource, DeltaSparkPlanUtils, ImplicitMetadataOperation, PredicateHelper, AliasHelper, DeltaCommand, DeltaLogging, DatabricksLogging, DeltaProgressReporter, LeafRunnableCommand, LeafLike[LogicalPlan], RunnableCommand, Command, LogicalPlan, Logging, QueryPlanConstraints, ConstraintHelper, LogicalPlanDistinctKeys, LogicalPlanStats, AnalysisHelper, QueryPlan[LogicalPlan], SQLConfHelper, TreeNode[LogicalPlan], WithOrigin, TreePatternBits, Product, Equals, AnyRef, Any

Instance Constructors

  1. new MergeIntoCommand(source: LogicalPlan, target: LogicalPlan, catalogTable: Option[CatalogTable], targetFileIndex: TahoeFileIndex, condition: Expression, matchedClauses: Seq[DeltaMergeIntoMatchedClause], notMatchedClauses: Seq[DeltaMergeIntoNotMatchedClause], notMatchedBySourceClauses: Seq[DeltaMergeIntoNotMatchedBySourceClause], migratedSchema: Option[StructType], schemaEvolutionEnabled: Boolean = false)

    source

    Source data to merge from

    target

    Target table to merge into

    targetFileIndex

    TahoeFileIndex of the target table

    condition

    Condition for a source row to match with a target row

    matchedClauses

    All info related to matched clauses.

    notMatchedClauses

    All info related to not matched clauses.

    notMatchedBySourceClauses

    All info related to not matched by source clauses.

    migratedSchema

    The final schema of the target - may be changed by schema evolution.

Type Members

  1. case class UpdateOperation(targetColNameParts: Seq[String], updateExpr: Expression) extends Product with Serializable

Specifies an operation that updates a target column with the given expression. The target column may or may not be a nested field, and it is specified either as a fully quoted name or split into a sequence of name parts.

    Definition Classes
    UpdateExpressionsSupport
  2. type PlanOrExpression = Either[LogicalPlan, Expression]
    Definition Classes
    DeltaSparkPlanUtils
  3. case class ProcessedClause(condition: Option[Expression], actions: Seq[Expression]) extends Product with Serializable

Represents a merge clause after its condition and action expressions have been processed before generating the final output expression.

    condition

    Optional precomputed condition.

    actions

    List of output expressions generated from every action of the clause.

    Attributes
    protected
    Definition Classes
    MergeOutputGeneration

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. lazy val allAttributes: AttributeSeq
    Definition Classes
    QueryPlan
  5. def analyzed: Boolean
    Definition Classes
    AnalysisHelper
  6. def apply(number: Int): TreeNode[_]
    Definition Classes
    TreeNode
  7. def argString(maxFields: Int): String
    Definition Classes
    TreeNode
  8. def asCode: String
    Definition Classes
    TreeNode
  9. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  10. def assertNotAnalysisRule(): Unit
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  11. val attempt: Int

Tracks which attempt or retry this is in runWithMaterializedSourceAndRetries.

    Attributes
    protected
    Definition Classes
    MergeIntoMaterializeSource
  12. lazy val baseMetrics: Map[String, SQLMetric]
    Definition Classes
    MergeIntoCommandBase
  13. def buildBalancedPredicate(expressions: Seq[Expression], op: (Expression, Expression) ⇒ Expression): Expression
    Attributes
    protected
    Definition Classes
    PredicateHelper
  14. def buildBaseRelation(spark: SparkSession, txn: OptimisticTransaction, actionType: String, rootPath: Path, inputLeafFiles: Seq[String], nameToAddFileMap: Map[String, AddFile]): HadoopFsRelation

Build a base relation of files that need to be rewritten as part of an update/delete/merge operation.

    Attributes
    protected
    Definition Classes
    DeltaCommand
  15. def buildTargetPlanWithFiles(spark: SparkSession, deltaTxn: OptimisticTransaction, files: Seq[AddFile], columnsToDrop: Seq[String]): LogicalPlan

Builds a new logical plan to read the given files instead of the whole target table. The plan returned has the same output columns (exprIds) as the target logical plan, so that existing update/insert expressions can be applied on this new plan. Unneeded non-partition columns may be dropped.

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  16. def buildTargetPlanWithIndex(spark: SparkSession, fileIndex: TahoeFileIndex, columnsToDrop: Seq[String]): LogicalPlan

Builds a new logical plan to read the target table using the given fileIndex. The plan returned has the same output columns (exprIds) as the target logical plan, so that existing update/insert expressions can be applied on this new plan.

    columnsToDrop

    unneeded non-partition columns to be dropped

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  17. def canEvaluate(expr: Expression, plan: LogicalPlan): Boolean
    Attributes
    protected
    Definition Classes
    PredicateHelper
  18. def canEvaluateWithinJoin(expr: Expression): Boolean
    Attributes
    protected
    Definition Classes
    PredicateHelper
  19. val canMergeSchema: Boolean
  20. val canOverwriteSchema: Boolean
  21. final lazy val canonicalized: LogicalPlan
    Definition Classes
    QueryPlan
    Annotations
    @transient()
  22. def castIfNeeded(fromExpression: Expression, dataType: DataType, allowStructEvolution: Boolean, columnName: String): Expression

Add a cast to the child expression if it differs from the specified data type. Note that structs here are cast by name, rather than the Spark SQL default of casting by position.

    fromExpression

    the expression to cast

    dataType

    The data type to cast to.

    allowStructEvolution

    Whether to allow structs to evolve. When this is false (default), struct casting will throw an error if the target struct type contains more fields than the expression to cast.

    columnName

    The name of the column written to. It is used for the error message.

    Attributes
    protected
    Definition Classes
    UpdateExpressionsSupport
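Cast-by-name for structs, as opposed to Spark SQL's default cast-by-position, can be illustrated with a small sketch. This is a hypothetical Python model (dicts stand in for struct values, field-name lists for struct types), not the Catalyst cast; it also mimics the documented allowStructEvolution behavior of failing when the target type has fields the input lacks.

```python
def cast_struct_by_name(value, target_fields, allow_struct_evolution=False):
    """Recast a struct (dict) to `target_fields`, matching fields by NAME
    rather than by position. Illustrative sketch only."""
    missing = [f for f in target_fields if f not in value]
    if missing and not allow_struct_evolution:
        # Without struct evolution, a target type with extra fields is an error.
        raise ValueError(f"cannot cast: target has extra fields {missing}")
    # Select/reorder by field name; evolved fields default to None (null).
    return {f: value.get(f) for f in target_fields}
```

A by-position cast would have paired `{"b": 2, "a": 1}` with `["a", "b"]` incorrectly; matching by name makes field order in the input irrelevant.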
  23. val catalogTable: Option[CatalogTable]
  24. def checkNonDeterministicSource(spark: SparkSession): Unit

Throws an exception if merge metrics indicate that the source table changed between the first and the second source table scans.

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  25. final def children: Seq[LogicalPlan]
    Definition Classes
    LeafLike
  26. def childrenResolved: Boolean
    Definition Classes
    LogicalPlan
  27. def clauseDisjunction(clauses: Seq[DeltaMergeIntoClause]): Expression

Helper function that produces an expression by combining a sequence of clauses with OR. Requires the sequence to be non-empty.

    Attributes
    protected
    Definition Classes
    ClassicMergeExecutor
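For intuition, the disjunction can be sketched over string conditions. This is a hypothetical Python model: the real helper combines Catalyst Expressions, and the assumption that a clause without a condition contributes a TRUE literal (since such a clause always applies) is mine, not stated above.

```python
def clause_disjunction(clauses):
    """Combine the conditions of a non-empty sequence of merge clauses
    with OR, as a string expression. Sketch only."""
    assert clauses, "requires a non-empty sequence of clauses"
    # A clause with no condition matches everything, so it contributes TRUE.
    conds = [c.get("condition") or "TRUE" for c in clauses]
    return " OR ".join(f"({c})" for c in conds)
```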
  28. def clone(): LogicalPlan
    Definition Classes
    AnalysisHelper → TreeNode → AnyRef
  29. def collect[B](pf: PartialFunction[LogicalPlan, B]): Seq[B]
    Definition Classes
    TreeNode
  30. def collectFirst[In, Out](input: Iterable[In], recurse: (In) ⇒ Option[Out]): Option[Out]
    Attributes
    protected
    Definition Classes
    DeltaSparkPlanUtils
  31. def collectFirst[B](pf: PartialFunction[LogicalPlan, B]): Option[B]
    Definition Classes
    TreeNode
  32. def collectLeaves(): Seq[LogicalPlan]
    Definition Classes
    TreeNode
  33. def collectMergeStats(deltaTxn: OptimisticTransaction, materializeSourceReason: MergeIntoMaterializeSourceReason): MergeStats

Collects the merge operation stats and metrics into a MergeStats object that can be recorded with recordDeltaEvent. Merge stats should be collected after committing all new actions as metrics may still be updated during commit.

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  34. def collectWithSubqueries[B](f: PartialFunction[LogicalPlan, B]): Seq[B]
    Definition Classes
    QueryPlan
  35. val condition: Expression
    Definition Classes
    MergeIntoCommandMergeIntoCommandBase
  36. def conf: SQLConf
    Definition Classes
    SQLConfHelper
  37. lazy val constraints: ExpressionSet
    Definition Classes
    QueryPlanConstraints
  38. def constructIsNotNullConstraints(constraints: ExpressionSet, output: Seq[Attribute]): ExpressionSet
    Definition Classes
    ConstraintHelper
  39. final def containsAllPatterns(patterns: TreePattern*): Boolean
    Definition Classes
    TreePatternBits
  40. final def containsAnyPattern(patterns: TreePattern*): Boolean
    Definition Classes
    TreePatternBits
  41. lazy val containsChild: Set[TreeNode[_]]
    Definition Classes
    TreeNode
  42. def containsDeterministicUDF(expr: Expression): Boolean

Returns whether an expression contains any deterministic UDFs.

    Definition Classes
    DeltaSparkPlanUtils
  43. def containsDeterministicUDF(predicates: Seq[DeltaTableReadPredicate], partitionedOnly: Boolean): Boolean

Returns whether the read predicates of a transaction contain any deterministic UDFs.

    Definition Classes
    DeltaSparkPlanUtils
  44. final def containsPattern(t: TreePattern): Boolean
    Definition Classes
    TreePatternBits
    Annotations
    @inline()
  45. def copyTagsFrom(other: LogicalPlan): Unit
    Definition Classes
    TreeNode
  46. def createSetTransaction(sparkSession: SparkSession, deltaLog: DeltaLog, options: Option[DeltaOptions] = None): Option[SetTransaction]

Returns a SetTransaction action if a valid app ID and version are present. Otherwise returns None.

    Attributes
    protected
    Definition Classes
    DeltaCommand
  47. def deltaAssert(check: ⇒ Boolean, name: String, msg: String, deltaLog: DeltaLog = null, data: AnyRef = null, path: Option[Path] = None): Unit

Helper method to check invariants in Delta code. Fails when running in tests, records a delta assertion event and logs a warning otherwise.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  48. lazy val deterministic: Boolean
    Definition Classes
    QueryPlan
  49. lazy val distinctKeys: Set[ExpressionSet]
    Definition Classes
    LogicalPlanDistinctKeys
  50. def doCanonicalize(): LogicalPlan
    Attributes
    protected
    Definition Classes
    QueryPlan
  51. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  52. def exists(f: (LogicalPlan) ⇒ Boolean): Boolean
    Definition Classes
    TreeNode
  53. final def expressions: Seq[Expression]
    Definition Classes
    QueryPlan
  54. def extractPredicatesWithinOutputSet(condition: Expression, outputSet: AttributeSet): Option[Expression]
    Attributes
    protected
    Definition Classes
    PredicateHelper
  55. def fastEquals(other: TreeNode[_]): Boolean
    Definition Classes
    TreeNode
  56. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  57. def find(f: (LogicalPlan) ⇒ Boolean): Option[LogicalPlan]
    Definition Classes
    TreeNode
  58. def findExpressionAndTrackLineageDown(exp: Expression, plan: LogicalPlan): Option[(Expression, LogicalPlan)]
    Definition Classes
    PredicateHelper
  59. def findFirstNonDeltaScan(source: LogicalPlan): Option[LogicalPlan]
    Attributes
    protected
    Definition Classes
    DeltaSparkPlanUtils
  60. def findFirstNonDeterministicChildNode(children: Seq[Expression], checkDeterministicOptions: CheckDeterministicOptions): Option[PlanOrExpression]
    Attributes
    protected
    Definition Classes
    DeltaSparkPlanUtils
  61. def findFirstNonDeterministicNode(child: Expression, checkDeterministicOptions: CheckDeterministicOptions): Option[PlanOrExpression]
    Attributes
    protected
    Definition Classes
    DeltaSparkPlanUtils
  62. def findFirstNonDeterministicNode(plan: LogicalPlan, checkDeterministicOptions: CheckDeterministicOptions): Option[PlanOrExpression]

Returns a part of the plan that does not have a safe level of determinism. This is a conservative approximation of the plan being a truly deterministic query.

    Attributes
    protected
    Definition Classes
    DeltaSparkPlanUtils
  63. def findTouchedFiles(spark: SparkSession, deltaTxn: OptimisticTransaction): (Seq[AddFile], DeduplicateCDFDeletes)

Find the target table files that contain the rows that satisfy the merge condition. This is implemented as an inner-join between the source query/table and the target table using the merge condition.

    Attributes
    protected
    Definition Classes
    ClassicMergeExecutor
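Phase 1's scan, including the check that no target row matches more than one source row, can be sketched as a toy model. This hypothetical Python version assumes the merge condition is equality on a single key column; the real implementation runs a Spark inner join over the actual condition.

```python
def find_touched_files(target_files, source_rows, key):
    """Toy model of the Phase-1 scan: inner-join source and target on the
    merge condition (equality on `key` here), verify no target row matches
    more than one source row, and return the names of touched files."""
    touched = set()
    for fname, rows in target_files.items():
        for t in rows:
            # Count how many source rows this target row matches.
            n = sum(1 for s in source_rows if s[key] == t[key])
            if n > 1:
                # This is the ambiguity the MERGE cardinality error reports.
                raise ValueError("a single target row matches multiple source rows")
            if n == 1:
                touched.add(fname)
    return touched
```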
  64. def flatMap[A](f: (LogicalPlan) ⇒ TraversableOnce[A]): Seq[A]
    Definition Classes
    TreeNode
  65. def foreach(f: (LogicalPlan) ⇒ Unit): Unit
    Definition Classes
    TreeNode
  66. def foreachUp(f: (LogicalPlan) ⇒ Unit): Unit
    Definition Classes
    TreeNode
  67. def formattedNodeName: String
    Attributes
    protected
    Definition Classes
    QueryPlan
  68. def generateAllActionExprs(targetWriteCols: Seq[Expression], rowIdColumnExpressionOpt: Option[NamedExpression], rowCommitVersionColumnExpressionOpt: Option[NamedExpression], clausesWithPrecompConditions: Seq[DeltaMergeIntoClause], cdcEnabled: Boolean, shouldCountDeletedRows: Boolean): Seq[ProcessedClause]

Generate expressions for every output column and every merge clause based on the corresponding UPDATE, DELETE and/or INSERT action(s).

    targetWriteCols

    List of output column expressions from the target table. Used to generate CDC data for DELETE.

    rowIdColumnExpressionOpt

    The optional Row ID preservation column with the physical Row ID name, it stores stable Row IDs of the table.

    rowCommitVersionColumnExpressionOpt

    The optional Row Commit Version preservation column with the physical Row Commit Version name, it stores stable Row Commit Versions.

    clausesWithPrecompConditions

    List of merge clauses with precomputed conditions. Action expressions are generated for each of these clauses.

    cdcEnabled

    Whether the generated expressions should include CDC information.

    shouldCountDeletedRows

    Whether metrics for number of deleted rows should be incremented here.

    returns

    For each merge clause, a list of ProcessedClause each with a precomputed condition and N+2 action expressions (N output columns + ROW_DROPPED_COL + CDC_TYPE_COLUMN_NAME) to apply on a row when that clause matches.

    Attributes
    protected
    Definition Classes
    MergeOutputGeneration
  69. def generateCandidateFileMap(basePath: Path, candidateFiles: Seq[AddFile]): Map[String, AddFile]

Generates a map of file names to AddFile entries for operations that need to rewrite files, such as delete, merge, and update. We expect file names to be unique, because each file contains a UUID.

    Definition Classes
    DeltaCommand
  70. def generateCdcAndOutputRows(sourceDf: DataFrame, outputCols: Seq[Column], outputColNames: Seq[String], noopCopyExprs: Seq[Expression], rowIdColumnNameOpt: Option[String], rowCommitVersionColumnNameOpt: Option[String], deduplicateDeletes: DeduplicateCDFDeletes): DataFrame

Build the full output as an array of packed rows, then explode into the final result. Based on the CDC type as originally marked, we produce both rows for the CDC_TYPE_NOT_CDC partition to be written to the main table and rows for the CDC partitions to be written as CDC files.

    See CDCReader for general details on how partitioning on the CDC type column works.

    Attributes
    protected
    Definition Classes
    MergeOutputGeneration
  71. def generateClauseOutputExprs(numOutputCols: Int, clauses: Seq[ProcessedClause], noopExprs: Seq[Expression]): Seq[Expression]

Generate the output expression for each output column to apply the correct action for a type of merge clause. For each output column, the resulting expression dispatches the correct action based on all clause conditions.

    numOutputCols

    Number of output columns.

    clauses

    List of preprocessed merge clauses to bind together.

    noopExprs

    Default expression to apply when no condition holds.

    returns

    A list of one expression per output column to apply for a type of merge clause.

    Attributes
    protected
    Definition Classes
    MergeOutputGeneration
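The per-column dispatch can be pictured as a chain of condition checks, analogous to the CaseWhen chain the method builds. A minimal Python sketch (hypothetical representation: clauses are dicts holding a predicate and per-column action functions; the real method emits Catalyst expressions):

```python
def clause_output_expr(clauses, noop, col):
    """For one output column, build a dispatch function that applies the
    first clause whose condition holds, falling back to the no-op copy
    expression when no condition matches. Sketch only."""
    def expr(row):
        for clause in clauses:
            cond = clause["condition"]
            # A clause with no condition always applies.
            if cond is None or cond(row):
                return clause["actions"][col](row)
        return noop[col](row)
    return expr
```

The first-match semantics mirrors how merge clauses of one type are tried in order until one applies.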
  72. def generateFilterForModifiedRows(): Expression

Returns the expression that can be used for selecting the modified rows generated by the merge operation. The expression is designed to work irrespective of the join type used between the source and target tables.

The expression consists of two parts, one for each of the clause types that produce row modifications: MATCHED and NOT MATCHED BY SOURCE. All actions of the same clause type form a disjunction. The result is then conjoined with an expression that selects the rows of that particular clause type. For example:

    MERGE INTO t USING s ON s.id = t.id WHEN MATCHED AND id < 5 THEN ... WHEN MATCHED AND id > 10 THEN ... WHEN NOT MATCHED BY SOURCE AND id > 20 THEN ...

    Produces the following expression:

((s.id = t.id) AND (id < 5 OR id > 10)) OR ((SOURCE TABLE IS NULL) AND (id > 20))

    Attributes
    protected
    Definition Classes
    ClassicMergeExecutor
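The example above can be reproduced mechanically. The following Python sketch (hypothetical helper; the real method returns a Catalyst Expression, not a string) assembles the filter from the join condition and the per-clause-type condition lists:

```python
def filter_for_modified_rows(join_cond, matched_conds, not_matched_by_source_conds):
    """Build the modified-rows filter: one disjunct per clause type, each
    guard ANDed with the OR of that type's clause conditions. Sketch only."""
    def disjunct(guard, conds):
        # All conditions of one clause type form a disjunction,
        # conjoined with the guard selecting rows of that clause type.
        return f"(({guard}) AND ({' OR '.join(conds)}))"
    parts = []
    if matched_conds:
        parts.append(disjunct(join_cond, matched_conds))
    if not_matched_by_source_conds:
        parts.append(disjunct("SOURCE TABLE IS NULL", not_matched_by_source_conds))
    return " OR ".join(parts)
```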
  73. def generateFilterForNewRows(): Expression

Returns the expression that can be used for selecting the new rows generated by the merge operation.

    Attributes
    protected
    Definition Classes
    ClassicMergeExecutor
  74. def generatePrecomputedConditionsAndDF(sourceDF: DataFrame, clauses: Seq[DeltaMergeIntoClause]): (DataFrame, Seq[DeltaMergeIntoClause])

Precompute conditions in MATCHED and NOT MATCHED clauses and generate the source data frame with precomputed boolean columns.

    sourceDF

    the source DataFrame.

    clauses

    the merge clauses to precompute.

    returns

The generated sourceDF with precomputed boolean columns, along with the matched and insert clauses whose conditions may have been rewritten.

    Attributes
    protected
    Definition Classes
    MergeOutputGeneration
  75. def generateTreeString(depth: Int, lastChildren: ArrayList[Boolean], append: (String) ⇒ Unit, verbose: Boolean, prefix: String, addSuffix: Boolean, maxFields: Int, printNodeId: Boolean, indent: Int): Unit
    Definition Classes
    TreeNode
  76. def generateUpdateExpressions(targetSchema: StructType, defaultExprs: Seq[NamedExpression], nameParts: Seq[Seq[String]], updateExprs: Seq[Expression], resolver: Resolver, generatedColumns: Seq[StructField]): Seq[Option[Expression]]

See docs on overloaded method.

    Attributes
    protected
    Definition Classes
    UpdateExpressionsSupport
  77. def generateUpdateExpressions(targetSchema: StructType, updateOps: Seq[UpdateOperation], defaultExprs: Seq[NamedExpression], resolver: Resolver, pathPrefix: Seq[String] = Nil, allowSchemaEvolution: Boolean = false, generatedColumns: Seq[StructField] = Nil): Seq[Option[Expression]]

Given a target schema and a set of update operations, generate a list of update expressions aligned with the given schema.

For update operations on nested struct fields, this method recursively walks down the schema tree and applies the update expressions along the way. For example, assume the target table has the following schema: s1 struct<a: int, b: int, c: int>, s2 struct<a: int, b: int>, z int

    Given an update command:

    • UPDATE target SET s1.a = 1, s1.b = 2, z = 3

    this method works as follows:

generateUpdateExpressions(
  targetSchema = [s1, s2, z],
  defaultExprs = [s1, s2, z],
  updateOps = [(s1.a, 1), (s1.b, 2), (z, 3)])
-> generates the expression for s1, built recursively from its child assignments:
     generateUpdateExpressions(
       targetSchema = [a, b, c],
       defaultExprs = [a, b, c],
       updateOps = [(a, 1), (b, 2)],
       pathPrefix = ["s1"])
     end of recursion -> returns (1, 2, a.c)
-> generates the expression for s2: no child assignment and no update expression, so the default expression s2 is used
-> generates the expression for z: the available update expression 3 is used
-> returns ((1, 2, a.c), s2, 3)

    targetSchema

    schema to follow to generate update expressions. Due to schema evolution, it may contain additional columns or fields not present in the original table schema.

    updateOps

    a set of update operations.

    defaultExprs

    the expressions to use when no update operation is provided for a column or field. This is typically the output from the base table.

    pathPrefix

    the path from root to the current (nested) column. Only used for printing out full column path in error messages.

    allowSchemaEvolution

    Whether to allow generating expressions for new columns or fields added by schema evolution.

    generatedColumns

the list of generated columns in the table. When a column is a generated column and the user doesn't provide an update expression, its update expression in the return result will be None. If generatedColumns is empty, every option in the return result will be non-empty.

    returns

    a sequence of expression options. The elements in the sequence are options because when a column is a generated column but the user doesn't provide an update expression for this column, we need to generate the update expression according to the generated column definition. But this method doesn't have enough context to do that. Hence, we return a None for this case so that the caller knows it should generate the update expression for such column. For other cases, we will always return Some(expr).

    Attributes
    protected
    Definition Classes
    UpdateExpressionsSupport
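The recursive alignment described above can be modeled in miniature. This Python sketch is hypothetical (schemas are nested dicts, expressions are strings, unlike the Catalyst-based original) but follows the same walk: a direct leaf assignment wins, a struct column with child assignments recurses, and an untouched column falls back to its default expression.

```python
def generate_update_expressions(schema, defaults, update_ops):
    """Align update operations with a (possibly nested) schema.
    schema:     column name -> None (leaf) or nested schema dict
    update_ops: tuple of name parts -> update expression (string)
    defaults:   column name -> default expression (string)."""
    result = []
    for name, nested in schema.items():
        ops_here = {parts: e for parts, e in update_ops.items() if parts[0] == name}
        if (name,) in ops_here and nested is None:
            result.append(ops_here[(name,)])           # direct leaf assignment
        elif nested is not None and ops_here:
            # Recurse: rebuild the struct from child assignments plus
            # defaults that read the untouched fields from the base table.
            child_ops = {parts[1:]: e for parts, e in ops_here.items()}
            child_defaults = {k: f"{name}.{k}" for k in nested}
            fields = generate_update_expressions(nested, child_defaults, child_ops)
            result.append("struct(" + ", ".join(fields) + ")")
        else:
            result.append(defaults[name])              # untouched: keep default
    return result
```

Running the documented example (SET s1.a = 1, s1.b = 2, z = 3) yields a struct expression for s1 that keeps the untouched field c, the default s2, and the literal 3 for z.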
  78. def generateUpdateExprsForGeneratedColumns(updateTarget: LogicalPlan, generatedColumns: Seq[StructField], updateExprs: Seq[Option[Expression]], postEvolutionTargetSchema: Option[StructType] = None): Seq[Expression]

Generate update expressions for generated columns for which the user doesn't provide an update expression. For each item in updateExprs that's None, we will find its generation expression from generatedColumns. In order to resolve this generation expression, we will create a fake Project which contains all update expressions and resolve the generation expression with this project. Source columns of a generation expression will also be replaced with their corresponding update expressions.

    For example, given a table that has a generated column g defined as c1 + 10. For the following update command:

    UPDATE target SET c1 = c2 + 100, c2 = 1000

    We will generate the update expression (c2 + 100) + 10 for column g. Note: in this update expression, we should use the old c2 attribute rather than its new value 1000.

    updateTarget

    The logical plan of the table to be updated.

    generatedColumns

    A list of generated columns.

    updateExprs

    The aligned (with postEvolutionTargetSchema if not None, or updateTarget.output otherwise) update actions.

    postEvolutionTargetSchema

    In case of UPDATE in MERGE when schema evolution happened, this is the final schema of the target table. This might not be the same as the output of updateTarget.

    returns

    a sequence of update expressions for all of columns in the table.

    Attributes
    protected
    Definition Classes
    UpdateExpressionsSupport
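The generated-column example above (g defined as c1 + 10, with SET c1 = c2 + 100, c2 = 1000) comes down to substituting the update expressions into the generation expression, evaluated over the old column values. In this hypothetical Python sketch, expressions are plain functions over a row of old values rather than Catalyst expressions:

```python
def update_exprs_for_generated_columns(generation_exprs, update_exprs):
    """For each generated column whose update expression is None, derive one
    by substituting the other columns' update expressions into the column's
    generation expression. Sketch only."""
    def resolve(col, row):
        # A column's updated value: its update expression applied to the
        # OLD row, or the old value if the column is not updated.
        e = update_exprs.get(col)
        return e(row) if e is not None else row[col]
    derived = {}
    for g, gen in generation_exprs.items():
        if update_exprs.get(g) is None:
            derived[g] = lambda row, gen=gen: gen(
                {c: resolve(c, row) for c in row})
    return derived
```

With old values c1 = 5, c2 = 7, the derived expression for g evaluates (old c2 + 100) + 10 = 117, using the old c2 rather than its new value 1000, as the documentation notes.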
  79. def generateWriteAllChangesOutputCols(targetWriteCols: Seq[Expression], rowIdColumnExpressionOpt: Option[NamedExpression], rowCommitVersionColumnExpressionOpt: Option[NamedExpression], targetWriteColNames: Seq[String], noopCopyExprs: Seq[Expression], clausesWithPrecompConditions: Seq[DeltaMergeIntoClause], cdcEnabled: Boolean, shouldCountDeletedRows: Boolean = true): IndexedSeq[Column]

Generate the expressions to process the full-outer join output and generate target rows.

The output consists of N + 2 columns: the N output columns plus ROW_DROPPED_COL and the CDC type column. To generate these N + 2 columns, we generate N + 2 expressions and apply them on the joinedDF. The CDC column will be either used for CDC generation or dropped before performing the final write, and the other column will always be dropped after executing the increment metric expression and filtering on ROW_DROPPED_COL.

    Attributes
    protected
    Definition Classes
    MergeOutputGeneration
  80. def getAliasMap(exprs: Seq[NamedExpression]): AttributeMap[Alias]
    Attributes
    protected
    Definition Classes
    AliasHelper
  81. def getAliasMap(plan: Aggregate): AttributeMap[Alias]
    Attributes
    protected
    Definition Classes
    AliasHelper
  82. def getAliasMap(plan: Project): AttributeMap[Alias]
    Attributes
    protected
    Definition Classes
    AliasHelper
  83. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  84. def getCommonTags(deltaLog: DeltaLog, tahoeId: String): Map[TagDefinition, String]
    Definition Classes
    DeltaLogging
  85. def getDefaultTreePatternBits: BitSet
    Attributes
    protected
    Definition Classes
    TreeNode
  86. def getDeltaLog(spark: SparkSession, path: Option[String], tableIdentifier: Option[TableIdentifier], operationName: String, hadoopConf: Map[String, String] = Map.empty): DeltaLog

Utility method to return the DeltaLog of an existing Delta table referred to by either the given path or the given table identifier.

    spark

    SparkSession reference to use.

    path

    Table location. Expects a non-empty tableIdentifier or path.

    tableIdentifier

    Table identifier. Expects a non-empty tableIdentifier or path.

    operationName

    Operation that is getting the DeltaLog, used in error messages.

    hadoopConf

    Hadoop file system options used to build DeltaLog.

    returns

    DeltaLog of the table

    Attributes
    protected
    Definition Classes
    DeltaCommand
    Exceptions thrown

AnalysisException if either no Delta table exists at the given path/identifier, or neither a path nor a tableIdentifier is provided.

  87. def getDeltaTable(target: LogicalPlan, cmd: String): DeltaTableV2

Extracts the DeltaTableV2 from a LogicalPlan iff the LogicalPlan is a ResolvedTable with either a DeltaTableV2 or a V1Table that is referencing a Delta table. In all other cases this method will throw a "Table not found" exception.

    Definition Classes
    DeltaCommand
  88. def getDeltaTablePathOrIdentifier(target: LogicalPlan, cmd: String): (Option[TableIdentifier], Option[String])

    Helper method to extract the table id or path from a LogicalPlan representing a Delta table. This uses DeltaCommand.getDeltaTable to convert the LogicalPlan to a DeltaTableV2 and then extracts either the path or identifier from it. If the DeltaTableV2 has a CatalogTable, the table identifier will be returned. Otherwise, the table's path will be returned. Throws an exception if the LogicalPlan does not represent a Delta table.

    Definition Classes
    DeltaCommand
  89. def getErrorData(e: Throwable): Map[String, Any]
    Definition Classes
    DeltaLogging
  90. def getMergeSource: MergeSource

    Returns the prepared merge source.

    Attributes
    protected
    Definition Classes
    MergeIntoMaterializeSource
  91. def getMetadataAttributeByName(name: String): AttributeReference
    Definition Classes
    LogicalPlan
  92. def getMetadataAttributeByNameOpt(name: String): Option[AttributeReference]
    Definition Classes
    LogicalPlan
  93. final def getNewDomainMetadata(txn: OptimisticTransaction, canUpdateMetadata: Boolean, isReplacingTable: Boolean, clusterBySpecOpt: Option[ClusterBySpec] = None): Seq[DomainMetadata]

    Returns a sequence of new DomainMetadata if canUpdateMetadata is true and the operation is either create table or replace the whole table (not replaceWhere operation). This is because we only update Domain Metadata when creating or replacing table, and replace table for DDL and DataFrameWriterV2 are already handled in CreateDeltaTableCommand. In that case, canUpdateMetadata is false, so we don't update again.

    txn

    OptimisticTransaction being used to create or replace table.

    canUpdateMetadata

    true if the metadata is not updated yet.

    isReplacingTable

    true if the operation is replace table without replaceWhere option.

    clusterBySpecOpt

    optional ClusterBySpec containing user-specified clustering columns.

    Attributes
    protected
    Definition Classes
    ImplicitMetadataOperation
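The gating logic described above can be sketched as follows. This is an illustration, not the real implementation: the create/replace flags and the clustering-column derivation are simplified assumptions.

```scala
// Toy sketch: domain metadata is only emitted when the metadata may still be
// updated and the operation creates or fully replaces the table.
case class DomainMetadata(domain: String, config: String)

def newDomainMetadata(canUpdateMetadata: Boolean,
                      isCreatingTable: Boolean,
                      isReplacingTable: Boolean,
                      clusteringColumns: Seq[String]): Seq[DomainMetadata] =
  if (canUpdateMetadata && (isCreatingTable || isReplacingTable)) {
    // e.g. a clustering domain derived from user-specified clustering columns
    clusteringColumns.headOption.toSeq.map(_ =>
      DomainMetadata("delta.clustering", clusteringColumns.mkString(",")))
  } else Seq.empty
```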
  94. def getTableCatalogTable(target: LogicalPlan, cmd: String): Option[CatalogTable]

    Extracts CatalogTable metadata from a LogicalPlan if the plan is a ResolvedTable. The table need not be a Delta table.

    Definition Classes
    DeltaCommand
  95. def getTablePathOrIdentifier(target: LogicalPlan, cmd: String): (Option[TableIdentifier], Option[String])

    Helper method to extract the table id or path from a LogicalPlan representing a resolved table or path. This calls getDeltaTablePathOrIdentifier if the resolved table is a Delta table. For a non-Delta table with an identifier, we extract that identifier. For a non-Delta table with a path, it expects the path to be wrapped in a ResolvedPathBasedNonDeltaTable and extracts it from there.

    Definition Classes
    DeltaCommand
  96. def getTagValue[T](tag: TreeNodeTag[T]): Option[T]
    Definition Classes
    TreeNode
  97. def getTargetOnlyPredicates(spark: SparkSession): Seq[Expression]
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  98. def getTouchedFile(basePath: Path, escapedFilePath: String, nameToAddFileMap: Map[String, AddFile]): AddFile

    Find the AddFile record corresponding to the file that was read as part of a delete/update/merge operation.

    basePath

    The path of the table. Must not be escaped.

    escapedFilePath

    The path to a file that can be either absolute or relative. All special chars in this path must be already escaped by URI standards.

    nameToAddFileMap

    Map generated through generateCandidateFileMap().

    Definition Classes
    DeltaCommand
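The lookup above can be sketched without Spark. This is a hedged approximation: the real method handles more path forms, but the core steps are undoing URI escaping and resolving relative paths against the table root before consulting the candidate-file map.

```scala
// Toy sketch of getTouchedFile: unescape the scanned file path, make it
// absolute relative to the table root, and look it up among candidate files.
import java.net.URI

case class AddFile(path: String)

def findTouchedFile(basePath: String,
                    escapedFilePath: String,
                    nameToAddFileMap: Map[String, AddFile]): AddFile = {
  // Undo URI escaping (e.g. "%20" -> " "), then resolve relative paths.
  val filePath = new URI(escapedFilePath).getPath
  val absolute = if (filePath.startsWith("/")) filePath else s"$basePath/$filePath"
  nameToAddFileMap.getOrElse(absolute,
    throw new IllegalStateException(s"File $absolute not found among candidate files"))
}
```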
  99. def hasBeenExecuted(txn: OptimisticTransaction, sparkSession: SparkSession, options: Option[DeltaOptions] = None): Boolean

    Returns true if there is information in the spark session that indicates that this write has already been successfully written.

    Attributes
    protected
    Definition Classes
    DeltaCommand
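The idempotency check above can be sketched as follows, assuming the transaction tracks the latest committed version per application id (the real check reads this information from DeltaOptions and the session configuration):

```scala
// Toy sketch: a write identified by (txnAppId, txnVersion) has already been
// executed if a version >= txnVersion was committed for that application id.
def hasBeenExecuted(committedVersions: Map[String, Long],
                    txnAppId: Option[String],
                    txnVersion: Option[Long]): Boolean =
  (txnAppId, txnVersion) match {
    case (Some(app), Some(v)) => committedVersions.get(app).exists(_ >= v)
    case _                    => false // no idempotency info: always execute
  }
```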
  100. def hashCode(): Int
    Definition Classes
    TreeNode → AnyRef → Any
  101. def improveUnsupportedOpError(f: ⇒ Unit): Unit
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  102. def includesDeletes: Boolean

    Whether this merge statement includes delete statements.

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  103. def includesInserts: Boolean

    Whether this merge statement includes insert statements.

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  104. def incrementMetricAndReturnBool(name: String, valueToReturn: Boolean): Expression

    returns

    An Expression to increment a SQL metric

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
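The idea behind incrementMetricAndReturnBool can be illustrated without Catalyst: a per-row predicate that bumps a metric as a side effect and always returns a fixed boolean, so it can be folded into a filter condition to count rows as they flow past. AtomicLong here is a stand-in for Spark's SQLMetric.

```scala
// Toy sketch: a side-effecting predicate that counts every row it sees and
// returns a constant boolean, mimicking the metric-incrementing expression.
import java.util.concurrent.atomic.AtomicLong

def countingPredicate(metric: AtomicLong, valueToReturn: Boolean): Any => Boolean =
  _ => { metric.incrementAndGet(); valueToReturn }
```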
  105. def incrementMetricsAndReturnBool(names: Seq[String], valueToReturn: Boolean): Expression

    returns

    An Expression to increment SQL metrics

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  106. def inferAdditionalConstraints(constraints: ExpressionSet): ExpressionSet
    Definition Classes
    ConstraintHelper
  107. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  108. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  109. def innerChildren: Seq[QueryPlan[_]]
    Definition Classes
    QueryPlan → TreeNode
  110. def inputSet: AttributeSet
    Definition Classes
    QueryPlan
  111. final def invalidateStatsCache(): Unit
    Definition Classes
    LogicalPlanStats
  112. def isCanonicalizedPlan: Boolean
    Attributes
    protected
    Definition Classes
    QueryPlan
  113. def isCatalogTable(analyzer: Analyzer, tableIdent: TableIdentifier): Boolean

    Use the analyzer to see whether the provided TableIdentifier refers to a path-based table or not.

    analyzer

    The session state analyzer to call

    tableIdent

    Table identifier to determine whether it is path-based or not.

    returns

    true if the table is defined in a metastore; false if it is a path-based table.

    Definition Classes
    DeltaCommand
  114. def isCdcEnabled(deltaTxn: OptimisticTransaction): Boolean
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  115. def isInsertOnly: Boolean

    Whether this merge statement has only insert (NOT MATCHED) clauses.

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  116. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  117. def isLikelySelective(e: Expression): Boolean
    Definition Classes
    PredicateHelper
  118. def isMatchedOnly: Boolean

    Whether this merge statement has only MATCHED clauses.

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  119. def isNullIntolerant(expr: Expression): Boolean
    Attributes
    protected
    Definition Classes
    PredicateHelper
  120. val isOnlyOneUnconditionalDelete: Boolean
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  121. def isPathIdentifier(tableIdent: TableIdentifier): Boolean

    Checks whether the given identifier can refer to a Delta table's path.

    tableIdent

    Table identifier to check.

    Attributes
    protected
    Definition Classes
    DeltaCommand
  122. def isRuleIneffective(ruleId: RuleId): Boolean
    Attributes
    protected
    Definition Classes
    TreeNode
  123. def isStreaming: Boolean
    Definition Classes
    LogicalPlan
  124. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  125. def jsonFields: List[JField]
    Attributes
    protected
    Definition Classes
    TreeNode
  126. final def legacyWithNewChildren(newChildren: Seq[LogicalPlan]): LogicalPlan
    Attributes
    protected
    Definition Classes
    TreeNode
  127. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  128. def logConsole(line: String): Unit
    Definition Classes
    DatabricksLogging
  129. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  130. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  131. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  132. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  133. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  134. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  135. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  136. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  137. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  138. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  139. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  140. def makeCopy(newArgs: Array[AnyRef]): LogicalPlan
    Definition Classes
    TreeNode
  141. def map[A](f: (LogicalPlan) ⇒ A): Seq[A]
    Definition Classes
    TreeNode
  142. final def mapChildren(f: (LogicalPlan) ⇒ LogicalPlan): LogicalPlan
    Definition Classes
    LeafLike
  143. def mapExpressions(f: (Expression) ⇒ Expression): MergeIntoCommand.this.type
    Definition Classes
    QueryPlan
  144. def mapProductIterator[B](f: (Any) ⇒ B)(implicit arg0: ClassTag[B]): Array[B]
    Attributes
    protected
    Definition Classes
    TreeNode
  145. def markRuleAsIneffective(ruleId: RuleId): Unit
    Attributes
    protected
    Definition Classes
    TreeNode
  146. val matchedClauses: Seq[DeltaMergeIntoMatchedClause]
    Definition Classes
    MergeIntoCommand → MergeIntoCommandBase
  147. val materializedSourceRDD: Option[RDD[InternalRow]]

    If the source was materialized, reference to the checkpointed RDD.

    Attributes
    protected
    Definition Classes
    MergeIntoMaterializeSource
  148. def maxRows: Option[Long]
    Definition Classes
    LogicalPlan
  149. def maxRowsPerPartition: Option[Long]
    Definition Classes
    LogicalPlan
  150. def metadataOutput: Seq[Attribute]
    Definition Classes
    LogicalPlan
  151. lazy val metrics: Map[String, SQLMetric]
    Definition Classes
    MergeIntoCommandBase → RunnableCommand
  152. val migratedSchema: Option[StructType]
    Definition Classes
    MergeIntoCommand → MergeIntoCommandBase
  153. final def missingInput: AttributeSet
    Definition Classes
    QueryPlan
  154. def multiTransformDown(rule: PartialFunction[LogicalPlan, Seq[LogicalPlan]]): Stream[LogicalPlan]
    Definition Classes
    TreeNode
  155. def multiTransformDownWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[LogicalPlan, Seq[LogicalPlan]]): Stream[LogicalPlan]
    Definition Classes
    TreeNode
  156. val multipleMatchDeleteOnlyOvercount: Option[Long]
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  157. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  158. def nodeName: String
    Definition Classes
    TreeNode
  159. final val nodePatterns: Seq[TreePattern]
    Definition Classes
    Command → TreeNode
  160. val notMatchedBySourceClauses: Seq[DeltaMergeIntoNotMatchedBySourceClause]
    Definition Classes
    MergeIntoCommand → MergeIntoCommandBase
  161. val notMatchedClauses: Seq[DeltaMergeIntoNotMatchedClause]
    Definition Classes
    MergeIntoCommand → MergeIntoCommandBase
  162. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  163. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  164. def numberedTreeString: String
    Definition Classes
    TreeNode
  165. val origin: Origin
    Definition Classes
    TreeNode → WithOrigin
  166. def otherCopyArgs: Seq[AnyRef]
    Attributes
    protected
    Definition Classes
    TreeNode
  167. val output: Seq[Attribute]
    Definition Classes
    MergeIntoCommand → Command → QueryPlan
  168. def outputOrdering: Seq[SortOrder]
    Definition Classes
    QueryPlan
  169. lazy val outputSet: AttributeSet
    Definition Classes
    QueryPlan
    Annotations
    @transient()
  170. def outputWithNullability(output: Seq[Attribute], nonNullAttrExprIds: Seq[ExprId]): Seq[Attribute]
    Attributes
    protected
    Definition Classes
    PredicateHelper
  171. def p(number: Int): LogicalPlan
    Definition Classes
    TreeNode
  172. def parsePredicates(spark: SparkSession, predicate: String): Seq[Expression]

    Converts string predicates into Expressions relative to a transaction.

    Attributes
    protected
    Definition Classes
    DeltaCommand
    Exceptions thrown

    AnalysisException if a non-partition column is referenced.
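A much-simplified sketch of the contract above (the real method delegates to Spark's SQL parser and resolves columns against the transaction's schema; the naive `AND`/`=` splitting here is purely illustrative):

```scala
// Toy sketch: parse "col = value AND ..." clauses and reject any reference
// to a column that is not a partition column.
class AnalysisException(msg: String) extends Exception(msg)

def parsePartitionPredicates(predicate: String,
                             partitionColumns: Set[String]): Seq[(String, String)] =
  predicate.split("(?i) AND ").toSeq.map { clause =>
    val parts = clause.split("=").map(_.trim)
    val (col, value) = (parts(0), parts(1))
    if (!partitionColumns.contains(col))
      throw new AnalysisException(s"$col is not a partition column")
    col -> value
  }
```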

  173. val performedSecondSourceScan: Boolean
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  174. def planContainsOnlyDeltaScans(source: LogicalPlan): Boolean
    Attributes
    protected
    Definition Classes
    DeltaSparkPlanUtils
  175. def planContainsUdf(plan: LogicalPlan): Boolean
    Attributes
    protected
    Definition Classes
    DeltaSparkPlanUtils
  176. def planIsDeterministic(plan: LogicalPlan, checkDeterministicOptions: CheckDeterministicOptions): Boolean

    Returns true if plan has a safe level of determinism. This is a conservative approximation of plan being a truly deterministic query.

    Attributes
    protected
    Definition Classes
    DeltaSparkPlanUtils
  177. def postEvolutionTargetExpressions(makeNullable: Boolean = false): Seq[NamedExpression]

    Expressions to convert from a pre-evolution target row to the post-evolution target row. These expressions are used for columns that are not modified in updated rows or to copy rows that are not modified. There are two kinds of expressions here:

    * References to existing columns in the target dataframe. Note that these references may have a different data type than they originally did due to schema evolution, so we add a cast that supports schema evolution. The references will be marked as nullable if makeNullable is set to true, which allows the attributes to reference the output of an outer join.

    * Literal nulls, for new columns which are being added to the target table as part of this transaction, since new columns will have a value of null for all existing rows.

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
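The two expression kinds described above can be illustrated with a toy model in which column types are plain strings and expressions are rendered as SQL-like text (the real method produces Catalyst NamedExpressions):

```scala
// Toy sketch: for each post-evolution column, emit either a plain reference,
// a cast of the pre-evolution column to its evolved type, or a literal null
// for a column that is new to the target table.
def postEvolutionExprs(preSchema: Map[String, String],
                       postSchema: Seq[(String, String)]): Seq[String] =
  postSchema.map { case (name, postType) =>
    preSchema.get(name) match {
      case Some(preType) if preType == postType => name                          // unchanged reference
      case Some(_)  => s"CAST($name AS $postType)"                               // evolved type
      case None     => s"CAST(NULL AS $postType) AS $name"                       // new column
    }
  }
```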
  178. def prepareMergeSource(spark: SparkSession, source: LogicalPlan, condition: Expression, matchedClauses: Seq[DeltaMergeIntoMatchedClause], notMatchedClauses: Seq[DeltaMergeIntoNotMatchedClause], isInsertOnly: Boolean): Unit

    If the source needs to be materialized, prepare the materialized dataframe in sourceDF. Otherwise, prepare a regular dataframe.

    returns

    the source materialization reason

    Attributes
    protected
    Definition Classes
    MergeIntoMaterializeSource
  179. def prettyJson: String
    Definition Classes
    TreeNode
  180. def printSchema(): Unit
    Definition Classes
    QueryPlan
  181. def producedAttributes: AttributeSet
    Definition Classes
    Command → QueryPlan
  182. def recordDeltaEvent(deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty, data: AnyRef = null, path: Option[Path] = None): Unit

    Used to record the occurrence of a single event or to report detailed, operation-specific statistics.

    path

    Used to log the path of the delta table when deltaLog is null.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  183. def recordDeltaOperation[A](deltaLog: DeltaLog, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a deltaLog.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  184. def recordDeltaOperationForTablePath[A](tablePath: String, opType: String, tags: Map[TagDefinition, String] = Map.empty)(thunk: ⇒ A): A

    Used to report the duration as well as the success or failure of an operation on a tahoePath.

    Attributes
    protected
    Definition Classes
    DeltaLogging
  185. def recordEvent(metric: MetricDefinition, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  186. def recordFrameProfile[T](group: String, name: String)(thunk: ⇒ T): T
    Attributes
    protected
    Definition Classes
    DeltaLogging
  187. def recordMergeOperation[A](extraOpType: String = "", status: String = null, sqlMetricName: String = null)(thunk: ⇒ A): A

    Execute the given thunk and return its result while recording the time taken to do it and setting additional local properties for better UI visibility.

    extraOpType

    extra operation name recorded in the logs

    status

    human readable status string describing what the thunk is doing

    sqlMetricName

    name of SQL metric to update with the time taken by the thunk

    thunk

    the code to execute

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
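The timing wrapper above can be sketched as follows. A mutable map stands in for the command's SQL metrics; the real method also sets Spark local properties for UI visibility.

```scala
// Toy sketch: run the thunk, add the elapsed milliseconds to the named
// metric, and return the thunk's result.
import scala.collection.mutable

def recordTimed[A](metrics: mutable.Map[String, Long], sqlMetricName: String)(thunk: => A): A = {
  val start = System.nanoTime()
  try thunk
  finally {
    val elapsedMs = (System.nanoTime() - start) / 1000000
    metrics(sqlMetricName) = metrics.getOrElse(sqlMetricName, 0L) + elapsedMs
  }
}
```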
  188. def recordOperation[S](opType: OpType, opTarget: String = null, extraTags: Map[TagDefinition, String], isSynchronous: Boolean = true, alwaysRecordStats: Boolean = false, allowAuthTags: Boolean = false, killJvmIfStuck: Boolean = false, outputMetric: MetricDefinition = METRIC_OPERATION_DURATION, silent: Boolean = true)(thunk: ⇒ S): S
    Definition Classes
    DatabricksLogging
  189. def recordProductEvent(metric: MetricDefinition with CentralizableMetric, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, trimBlob: Boolean = true): Unit
    Definition Classes
    DatabricksLogging
  190. def recordProductUsage(metric: MetricDefinition with CentralizableMetric, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  191. def recordUsage(metric: MetricDefinition, quantity: Double, additionalTags: Map[TagDefinition, String] = Map.empty, blob: String = null, forceSample: Boolean = false, trimBlob: Boolean = true, silent: Boolean = false): Unit
    Definition Classes
    DatabricksLogging
  192. lazy val references: AttributeSet
    Definition Classes
    QueryPlan
    Annotations
    @transient()
  193. def refresh(): Unit
    Definition Classes
    LogicalPlan
  194. def removeFilesFromPaths(deltaLog: DeltaLog, nameToAddFileMap: Map[String, AddFile], filesToRewrite: Seq[String], operationTimestamp: Long): Seq[RemoveFile]

    This method provides the RemoveFile actions that are necessary for files that are touched and need to be rewritten in methods like Delete, Update, and Merge.

    deltaLog

    The DeltaLog of the table that is being operated on

    nameToAddFileMap

    A map generated using generateCandidateFileMap.

    filesToRewrite

    Absolute paths of the files that were touched. We will search for these in candidateFiles. Obtained as the output of the input_file_name function.

    operationTimestamp

    The timestamp of the operation

    Attributes
    protected
    Definition Classes
    DeltaCommand
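The rewrite bookkeeping described above can be sketched with simplified stand-ins for the action classes: each touched absolute path is looked up in the candidate-file map and turned into a remove action stamped with the operation timestamp.

```scala
// Toy sketch of removeFilesFromPaths: map touched absolute paths through the
// candidate-file map to RemoveFile actions.
case class AddFile(path: String)
case class RemoveFile(path: String, deletionTimestamp: Long)

def removeFiles(nameToAddFileMap: Map[String, AddFile],
                filesToRewrite: Seq[String],
                operationTimestamp: Long): Seq[RemoveFile] =
  filesToRewrite.map { absPath =>
    val add = nameToAddFileMap.getOrElse(absPath,
      throw new IllegalStateException(s"$absPath not in candidate files"))
    RemoveFile(add.path, operationTimestamp)
  }
```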
  195. def replaceAlias(expr: Expression, aliasMap: AttributeMap[Alias]): Expression
    Attributes
    protected
    Definition Classes
    AliasHelper
  196. def replaceAliasButKeepName(expr: NamedExpression, aliasMap: AttributeMap[Alias]): NamedExpression
    Attributes
    protected
    Definition Classes
    AliasHelper
  197. def resolve(nameParts: Seq[String], resolver: Resolver): Option[NamedExpression]
    Definition Classes
    LogicalPlan
  198. def resolve(schema: StructType, resolver: Resolver): Seq[Attribute]
    Definition Classes
    LogicalPlan
  199. def resolveChildren(nameParts: Seq[String], resolver: Resolver): Option[NamedExpression]
    Definition Classes
    LogicalPlan
  200. def resolveExpressions(r: PartialFunction[Expression, Expression]): LogicalPlan
    Definition Classes
    AnalysisHelper
  201. def resolveExpressionsWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[Expression, Expression]): LogicalPlan
    Definition Classes
    AnalysisHelper
  202. def resolveIdentifier(analyzer: Analyzer, identifier: TableIdentifier): LogicalPlan

    Use the analyzer to resolve the provided identifier.

    analyzer

    The session state analyzer to call

    identifier

    Table identifier to resolve.

    Attributes
    protected
    Definition Classes
    DeltaCommand
  203. def resolveOperators(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    AnalysisHelper
  204. def resolveOperatorsDown(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    AnalysisHelper
  205. def resolveOperatorsDownWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    AnalysisHelper
  206. def resolveOperatorsUp(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    AnalysisHelper
  207. def resolveOperatorsUpWithNewOutput(rule: PartialFunction[LogicalPlan, (LogicalPlan, Seq[(Attribute, Attribute)])]): LogicalPlan
    Definition Classes
    AnalysisHelper
  208. def resolveOperatorsUpWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    AnalysisHelper
  209. def resolveOperatorsWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    AnalysisHelper
  210. def resolveQuoted(name: String, resolver: Resolver): Option[NamedExpression]
    Definition Classes
    LogicalPlan
  211. def resolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planProvidingAttrs: LogicalPlan): Seq[Expression]

    Resolve expressions using the attributes provided by planProvidingAttrs. Throw an error if failing to resolve any expressions.

    Attributes
    protected
    Definition Classes
    AnalysisHelper
  212. lazy val resolved: Boolean
    Definition Classes
    LogicalPlan
  213. def rewriteAttrs(attrMap: AttributeMap[Attribute]): LogicalPlan
    Definition Classes
    QueryPlan
  214. def run(spark: SparkSession): Seq[Row]
    Definition Classes
    MergeIntoCommandBase → RunnableCommand
  215. def runMerge(spark: SparkSession): Seq[Row]
    Attributes
    protected
    Definition Classes
    MergeIntoCommand → MergeIntoCommandBase
  216. def runWithMaterializedSourceLostRetries(spark: SparkSession, deltaLog: DeltaLog, metrics: Map[String, SQLMetric], runMergeFunc: (SparkSession) ⇒ Seq[Row]): Seq[Row]

    Run the merge with retries in case it detects an RDD block lost error of the materialized source RDD. It also records an out-of-disk error if one occurs, possibly caused by increased disk pressure from the materialized source RDD.

    Attributes
    protected
    Definition Classes
    MergeIntoMaterializeSource
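The retry behavior described above can be sketched as a simple loop. The exception class is a hypothetical stand-in for however the real code detects a lost materialized-source RDD block; any other error propagates immediately.

```scala
// Toy sketch: rerun the merge when the (hypothetical) materialized-source-lost
// error is detected, up to a fixed number of attempts.
class MaterializedSourceLost extends Exception("materialized source RDD block lost")

def runWithRetries[A](maxAttempts: Int)(runMerge: () => A): A = {
  var attempt = 1
  while (true) {
    try return runMerge()
    catch {
      case _: MaterializedSourceLost if attempt < maxAttempts => attempt += 1
    }
  }
  throw new IllegalStateException("unreachable")
}
```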
  217. def sameOutput(other: LogicalPlan): Boolean
    Definition Classes
    LogicalPlan
  218. final def sameResult(other: LogicalPlan): Boolean
    Definition Classes
    QueryPlan
  219. lazy val sc: SparkContext
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
    Annotations
    @transient()
  220. lazy val schema: StructType
    Definition Classes
    QueryPlan
  221. val schemaEvolutionEnabled: Boolean
    Definition Classes
    MergeIntoCommand → MergeIntoCommandBase
  222. def schemaString: String
    Definition Classes
    QueryPlan
  223. final def semanticHash(): Int
    Definition Classes
    QueryPlan
  224. def sendDriverMetrics(spark: SparkSession, metrics: Map[String, SQLMetric]): Unit

    Send the driver-side metrics.

    This is needed to make the SQL metrics visible in the Spark UI. All metrics are initialized to 0 by default, so that is what we report when we skip an already-executed action.

    Attributes
    protected
    Definition Classes
    DeltaCommand
  225. def seqToString(exprs: Seq[Expression]): String
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  226. def setTagValue[T](tag: TreeNodeTag[T], value: T): Unit
    Definition Classes
    TreeNode
  227. def shouldMaterializeSource(spark: SparkSession, source: LogicalPlan, isInsertOnly: Boolean): (Boolean, MergeIntoMaterializeSourceReason)

    returns

    a pair: whether the source should be materialized, and the source materialization reason

    Attributes
    protected
    Definition Classes
    MergeIntoMaterializeSource
  228. def shouldOptimizeMatchedOnlyMerge(spark: SparkSession): Boolean
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  229. def shouldWritePersistentDeletionVectors(spark: SparkSession, txn: OptimisticTransaction): Boolean
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  230. def simpleString(maxFields: Int): String
    Definition Classes
    QueryPlan → TreeNode
  231. def simpleStringWithNodeId(): String
    Definition Classes
    QueryPlan → TreeNode
  232. val source: LogicalPlan
    Definition Classes
    MergeIntoCommand → MergeIntoCommandBase
  233. def splitConjunctivePredicates(condition: Expression): Seq[Expression]
    Attributes
    protected
    Definition Classes
    PredicateHelper
  234. def splitDisjunctivePredicates(condition: Expression): Seq[Expression]
    Attributes
    protected
    Definition Classes
    PredicateHelper
  235. def statePrefix: String
    Attributes
    protected
    Definition Classes
    LogicalPlan → QueryPlan
  236. def stats: Statistics
    Definition Classes
    Command → LogicalPlanStats
  237. val statsCache: Option[Statistics]
    Attributes
    protected
    Definition Classes
    LogicalPlanStats
  238. def stringArgs: Iterator[Any]
    Attributes
    protected
    Definition Classes
    TreeNode
  239. lazy val subqueries: Seq[LogicalPlan]
    Definition Classes
    QueryPlan
    Annotations
    @transient()
  240. def subqueriesAll: Seq[LogicalPlan]
    Definition Classes
    QueryPlan
  241. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  242. val target: LogicalPlan
    Definition Classes
    MergeIntoCommand → MergeIntoCommandBase
  243. lazy val targetDeltaLog: DeltaLog
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
    Annotations
    @transient()
  244. val targetFileIndex: TahoeFileIndex
    Definition Classes
    MergeIntoCommand → MergeIntoCommandBase
  245. def throwErrorOnMultipleMatches(hasMultipleMatches: Boolean, spark: SparkSession): Unit
    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
  246. def toDataset(sparkSession: SparkSession, logicalPlan: LogicalPlan): Dataset[Row]
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  247. def toJSON: String
    Definition Classes
    TreeNode
  248. def toString(): String
    Definition Classes
    TreeNode → AnyRef → Any
  249. def transform(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    TreeNode
  250. def transformAllExpressions(rule: PartialFunction[Expression, Expression]): MergeIntoCommand.this.type
    Definition Classes
    QueryPlan
  251. def transformAllExpressionsWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[Expression, Expression]): MergeIntoCommand.this.type
    Definition Classes
    AnalysisHelper → QueryPlan
  252. def transformAllExpressionsWithSubqueries(rule: PartialFunction[Expression, Expression]): MergeIntoCommand.this.type
    Definition Classes
    QueryPlan
  253. def transformDown(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    TreeNode
  254. def transformDownWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    AnalysisHelper → TreeNode
  255. def transformDownWithSubqueries(f: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    QueryPlan
  256. def transformDownWithSubqueriesAndPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(f: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    QueryPlan
  257. def transformExpressions(rule: PartialFunction[Expression, Expression]): MergeIntoCommand.this.type
    Definition Classes
    QueryPlan
  258. def transformExpressionsDown(rule: PartialFunction[Expression, Expression]): MergeIntoCommand.this.type
    Definition Classes
    QueryPlan
  259. def transformExpressionsDownWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[Expression, Expression]): MergeIntoCommand.this.type
    Definition Classes
    QueryPlan
  260. def transformExpressionsUp(rule: PartialFunction[Expression, Expression]): MergeIntoCommand.this.type
    Definition Classes
    QueryPlan
  261. def transformExpressionsUpWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[Expression, Expression]): MergeIntoCommand.this.type
    Definition Classes
    QueryPlan
  262. def transformExpressionsWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[Expression, Expression]): MergeIntoCommand.this.type
    Definition Classes
    QueryPlan
  263. def transformUp(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    TreeNode
  264. def transformUpWithBeforeAndAfterRuleOnChildren(cond: (LogicalPlan) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[(LogicalPlan, LogicalPlan), LogicalPlan]): LogicalPlan
    Definition Classes
    TreeNode
  265. def transformUpWithNewOutput(rule: PartialFunction[LogicalPlan, (LogicalPlan, Seq[(Attribute, Attribute)])], skipCond: (LogicalPlan) ⇒ Boolean, canGetOutput: (LogicalPlan) ⇒ Boolean): LogicalPlan
    Definition Classes
    AnalysisHelper → QueryPlan
  266. def transformUpWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    AnalysisHelper → TreeNode
  267. def transformUpWithSubqueries(f: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    QueryPlan
  268. def transformWithPruning(cond: (TreePatternBits) ⇒ Boolean, ruleId: RuleId)(rule: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    TreeNode
  269. def transformWithSubqueries(f: PartialFunction[LogicalPlan, LogicalPlan]): LogicalPlan
    Definition Classes
    QueryPlan
  270. lazy val treePatternBits: BitSet
    Definition Classes
    QueryPlan → TreeNode → TreePatternBits
  271. def treeString(append: (String) ⇒ Unit, verbose: Boolean, addSuffix: Boolean, maxFields: Int, printOperatorId: Boolean): Unit
    Definition Classes
    TreeNode
  272. final def treeString(verbose: Boolean, addSuffix: Boolean, maxFields: Int, printOperatorId: Boolean): String
    Definition Classes
    TreeNode
  273. final def treeString: String
    Definition Classes
    TreeNode
  274. def trimAliases(e: Expression): Expression
    Attributes
    protected
    Definition Classes
    AliasHelper
  275. def trimNonTopLevelAliases[T <: Expression](e: T): T
    Attributes
    protected
    Definition Classes
    AliasHelper
  276. def tryResolveReferences(sparkSession: SparkSession)(expr: Expression, planContainingExpr: LogicalPlan): Expression
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  277. def tryResolveReferencesForExpressions(sparkSession: SparkSession)(exprs: Seq[Expression], plansProvidingAttrs: Seq[LogicalPlan]): Seq[Expression]

    Resolve expressions using the attributes provided by plansProvidingAttrs, ignoring errors.

    Attributes
    protected
    Definition Classes
    AnalysisHelper
  278. def tryResolveReferencesForExpressions(sparkSession: SparkSession, exprs: Seq[Expression], planContainingExpr: LogicalPlan): Seq[Expression]
    Attributes
    protected
    Definition Classes
    AnalysisHelper
  279. def unsetTagValue[T](tag: TreeNodeTag[T]): Unit
    Definition Classes
    TreeNode
  280. final def updateMetadata(spark: SparkSession, txn: OptimisticTransaction, schema: StructType, partitionColumns: Seq[String], configuration: Map[String, String], isOverwriteMode: Boolean, rearrangeOnly: Boolean): Unit
    Attributes
    protected
    Definition Classes
    ImplicitMetadataOperation
  281. def updateOuterReferencesInSubquery(plan: LogicalPlan, attrMap: AttributeMap[Attribute]): LogicalPlan
    Definition Classes
    AnalysisHelper → QueryPlan
  282. lazy val validConstraints: ExpressionSet
    Attributes
    protected
    Definition Classes
    QueryPlanConstraints
  283. def verboseString(maxFields: Int): String
    Definition Classes
    QueryPlan → TreeNode
  284. def verboseStringWithOperatorId(): String
    Definition Classes
    QueryPlan
  285. def verboseStringWithSuffix(maxFields: Int): String
    Definition Classes
    LogicalPlan → TreeNode
  286. def verifyPartitionPredicates(spark: SparkSession, partitionColumns: Seq[String], predicates: Seq[Expression]): Unit
    Definition Classes
    DeltaCommand
  287. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  288. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  289. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  290. final def withNewChildren(newChildren: Seq[LogicalPlan]): LogicalPlan
    Definition Classes
    TreeNode
  291. def withNewChildrenInternal(newChildren: IndexedSeq[LogicalPlan]): LogicalPlan
    Definition Classes
    LeafLike
  292. def withStatusCode[T](statusCode: String, defaultMessage: String, data: Map[String, Any] = Map.empty)(body: ⇒ T): T

    Report a log message to indicate that a command is running while the given body executes.

    Definition Classes
    DeltaProgressReporter
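A hedged sketch of how an implementing command might wrap long-running work in this reporter. The status code, message, and the runMergePhases helper below are illustrative, not taken from the Delta source:

```scala
// Illustrative only: wraps a block of work so a status message is
// reported while it runs. "DELTA.MERGE" and runMergePhases() are
// hypothetical placeholders.
val result = withStatusCode(
    statusCode = "DELTA.MERGE",
    defaultMessage = "Merging changes into the target Delta table") {
  runMergePhases() // the long-running body whose progress is surfaced
}
```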
  293. def writeAllChanges(spark: SparkSession, deltaTxn: OptimisticTransaction, filesToRewrite: Seq[AddFile], deduplicateCDFDeletes: DeduplicateCDFDeletes, writeUnmodifiedRows: Boolean): Seq[FileAction]

    Write new files by reading the touched files and updating/inserting data using the source query/table. This is implemented as a full-outer-join on the merge condition.

    Note that unlike the insert-only code paths with just one control column ROW_DROPPED_COL, this method has a second control column CDC_TYPE_COL_NAME used for handling CDC when enabled.

    Attributes
    protected
    Definition Classes
    ClassicMergeExecutor
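The classic-merge path this method implements is what runs when a full MERGE, with both matched and not-matched clauses, is issued through the public DeltaTable API. A minimal sketch of such a merge; the table paths are examples only:

```scala
import io.delta.tables.DeltaTable

// Illustrative: a matched-update / not-matched-insert merge that exercises
// the classic path implemented by writeAllChanges. Paths are examples.
val target = DeltaTable.forPath(spark, "/tmp/delta/events")
val source = spark.read.format("delta").load("/tmp/delta/updates")

target.as("t")
  .merge(source.toDF.as("s"), "t.id = s.id") // the merge condition
  .whenMatched().updateAll()                 // matched clause: rewrite touched files
  .whenNotMatched().insertAll()              // not-matched clause: insert new rows
  .execute()
```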
  294. def writeDVs(spark: SparkSession, deltaTxn: OptimisticTransaction, filesToRewrite: Seq[AddFile]): Seq[FileAction]

    Writes Deletion Vectors for rows modified by the merge operation.

    Attributes
    protected
    Definition Classes
    ClassicMergeExecutor
  295. def writeFiles(spark: SparkSession, txn: OptimisticTransaction, outputDF: DataFrame): Seq[FileAction]

    Write the output data to files, repartitioning the output DataFrame by the partition columns if the table is partitioned and merge.repartitionBeforeWrite.enabled is set to true.

    Attributes
    protected
    Definition Classes
    MergeIntoCommandBase
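The repartition-before-write behavior is controlled by a SQL conf; a sketch of toggling it, assuming the conf key used by recent Delta releases:

```scala
// Repartition merge output by the table's partition columns before
// writing, which can reduce the number of small files produced.
spark.conf.set(
  "spark.databricks.delta.merge.repartitionBeforeWrite.enabled", "true")
```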
  296. def writeOnlyInserts(spark: SparkSession, deltaTxn: OptimisticTransaction, filterMatchedRows: Boolean, numSourceRowsMetric: String): Seq[FileAction]

    Optimization to write new files by inserting only new data.

    When the merge command has no matched clauses, target data is skipped using the merge condition, and a left anti join against the target finds the source rows to insert.

    When the merge matches no target rows, even though matched clauses are present, the source table is used directly to perform the inserts.

    spark

    The spark session.

    deltaTxn

    The existing transaction.

    filterMatchedRows

    Whether to filter away matched data or not.

    numSourceRowsMetric

    The name of the metric in which to record the number of source rows.

    Attributes
    protected
    Definition Classes
    InsertOnlyMergeExecutor
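The insert-only optimization applies when a user-level merge carries no matched clauses. A hedged sketch of a merge that qualifies for this path; the path and the newRows DataFrame are illustrative:

```scala
import io.delta.tables.DeltaTable

// Illustrative: with only a not-matched clause, the merge can take the
// insert-only path, so no target files need to be rewritten.
// "/tmp/delta/events" and newRows are example placeholders.
DeltaTable.forPath(spark, "/tmp/delta/events").as("t")
  .merge(newRows.as("s"), "t.id = s.id")
  .whenNotMatched().insertAll() // no matched clauses -> insert-only executor
  .execute()
```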
  297. object RetryHandling extends Enumeration
    Definition Classes
    MergeIntoMaterializeSource
  298. object SubqueryExpression

    Extractor object for the subquery plan of expressions that contain subqueries.

    Definition Classes
    DeltaSparkPlanUtils
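An extractor like this is typically used in a pattern match to pull the subquery plan out of an expression. A hedged sketch, assuming the unapply yields the contained LogicalPlan; the helper below is illustrative, not part of the Delta source:

```scala
// Illustrative: collect the plans of all subqueries referenced anywhere
// in an expression tree via the extractor's unapply.
def subqueryPlans(e: Expression): Seq[LogicalPlan] =
  e.collect { case SubqueryExpression(plan) => plan }
```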
