Class DecorrelateUnnest
java.lang.Object
io.trino.sql.planner.iterative.rule.DecorrelateUnnest
- All Implemented Interfaces:
Rule<CorrelatedJoinNode>
This rule decorrelates plans with correlated UnnestNode and optional EnforceSingleRowNode,
optional LimitNode, optional TopNNode and optional projections in the subquery.
The rule finds a correlated UnnestNode in CorrelatedJoinNode's subquery and folds it into an UnnestNode representing INNER or LEFT JOIN UNNEST. It transforms plans where:
- CorrelatedJoinNode is INNER or LEFT on true
- UnnestNode in subquery is based only on correlation symbols
- UnnestNode in subquery is INNER or LEFT without filter
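The three conditions above can be read as a single predicate. The following is a minimal sketch only: the class DecorrelateUnnestConditions, the Candidate record and the isApplicable method are hypothetical stand-ins invented for illustration, not part of Trino's planner API, and the actual rule matches plan nodes through its pattern rather than through a flat record like this.

```java
import java.util.Set;

// Hypothetical, simplified stand-ins for planner concepts; these are NOT Trino classes.
class DecorrelateUnnestConditions {
    enum JoinType { INNER, LEFT, RIGHT, FULL }

    // Flat description of a correlated join whose subquery contains an unnest (assumed shape).
    record Candidate(
            JoinType correlatedJoinType,
            boolean joinFilterIsTrue,
            Set<String> correlationSymbols,
            JoinType unnestType,
            Set<String> unnestInputSymbols,
            boolean unnestHasFilter) {}

    static boolean isApplicable(Candidate c) {
        // CorrelatedJoinNode is INNER or LEFT on true
        boolean joinOk = (c.correlatedJoinType() == JoinType.INNER
                || c.correlatedJoinType() == JoinType.LEFT)
                && c.joinFilterIsTrue();
        // UnnestNode is based only on correlation symbols
        boolean onlyCorrelation = c.correlationSymbols().containsAll(c.unnestInputSymbols());
        // UnnestNode is INNER or LEFT without filter
        boolean unnestOk = (c.unnestType() == JoinType.INNER
                || c.unnestType() == JoinType.LEFT)
                && !c.unnestHasFilter();
        return joinOk && onlyCorrelation && unnestOk;
    }
}
```

For example, a LEFT correlated join on true over an INNER unnest of the correlation symbol c satisfies the predicate, while the same plan with an unnest that also reads a non-correlation symbol does not.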
Transforms:
- CorrelatedJoin (INNER or LEFT) on true, correlation(c)
     - Input (a, c)
     [- EnforceSingleRow]
          [- Limit (5)]
               [- TopN (10) order by x]
                    [- Project x <- foo(u)]
                         - Unnest INNER or LEFT
                              u <- unnest(c)
                              replicate: ()
Into:
- Project (restrict outputs)
     [- Project [*1]
          a <- a
          c <- c
          x <- IF(ordinality IS NULL, null, x)]
          [- Filter (fail if row_number > 1)] [*2]
               [- Filter (row_number <= 5)] [*3]
                    [- Filter (row_number <= 10) [*4]
                         - Window partition by (unique), order by (x)
                           row_number <- row_number()]
                              [- Projection x <- foo(u)] [*5]
                                   - Unnest (LEFT or INNER) [*6] WITH ORDINALITY (ordinality)
                                        u <- unnest(c)
                                        replicate: (a, c, unique)
                                        - AssignUniqueId (unique)
                                             - Input (a, c)
Note: The RowNumberNodes and WindowNodes produced by the rewrite, along with the filters, can be further optimized to TopNRankingNodes.
[1] If UnnestNode is rewritten from INNER to LEFT, synthetic rows with nulls are added by the LEFT unnest at the bottom of the plan. In the correlated plan, they would be added in EnforceSingleRowNode or during the join, that is, near the root of the plan after all projections. This ProjectNode restores null values which might have been modified by projections. It uses the ordinality symbol to distinguish between unnested rows and synthetic rows: x <- IF(ordinality IS NULL, null, x)
[2] If the original plan has EnforceSingleRowNode in the subquery, it has to be restored. EnforceSingleRowNode is responsible for:
- adding a synthetic row of nulls where there are no rows,
- checking that there is no more than 1 row.
In this rewrite, if EnforceSingleRowNode is present in the original plan, the rewritten UnnestNode is LEFT. This ensures that there is at least 1 row for each input row. To restore the semantics of EnforceSingleRowNode, it is sufficient to add a check that there is no more than 1 row for each input row. This is achieved by RowNumberNode partitioned by input rows (unique) + FilterNode. If RowNumberNode is already present in the plan, only a FilterNode is added.
[3] If the original plan has LimitNode in the subquery, it has to be restored. This is achieved by RowNumberNode partitioned by input rows (unique) + FilterNode. If RowNumberNode is already present in the plan, only a FilterNode is added.
[4] If the original plan has TopNNode in the subquery, it has to be restored. This is achieved by the row_number() function over a window partitioned by input rows (unique) and ordered by TopNNode's ordering scheme + FilterNode. Even if RowNumberNode is present in the plan, the rowNumberSymbol cannot be reused because its numbering order might not match the TopNNode's ordering.
[5] All projections present in the subquery are restored on top of the rewritten UnnestNode. Apart from their original assignments, they pass through all the input symbols, the ordinality symbol and the row number symbol (if present), which might be useful in the upstream plan.
[6] The type of the rewritten UnnestNode is LEFT with one exception: if both CorrelatedJoinNode and the original UnnestNode are INNER, and there is no EnforceSingleRowNode in the subquery, then the type is INNER. If the unnest type is rewritten from INNER to LEFT, the INNER semantics is restored by a projection ([1]).
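To make the data flow of the rewritten plan more concrete, here is a toy model in plain Java. It is a sketch under stated assumptions, not the rule's implementation: the class DecorrelatedUnnestSketch, its records, the foo function and the run method are all hypothetical, the unnest is always modelled as LEFT, and the limit is modelled as row_number <= limit. It walks through AssignUniqueId, the LEFT unnest with ordinality, the per-input-row row number, the limit filter of [3], the single-row check of [2] and the null-restoring projection of [1].

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the rewritten plan's data flow; every name here is hypothetical.
class DecorrelatedUnnestSketch {
    record Input(String a, List<Integer> c) {}

    // ordinality is null for the synthetic row produced by the LEFT unnest
    record Output(long unique, String a, Integer x, Long ordinality, long rowNumber) {}

    // Hypothetical projection that maps null to a non-null value
    static Integer foo(Integer u) {
        return u == null ? -1 : u * 10;
    }

    static List<Output> run(List<Input> inputs, long limit, boolean enforceSingleRow) {
        List<Output> result = new ArrayList<>();
        long unique = 0;
        for (Input input : inputs) {
            unique++; // AssignUniqueId (unique)

            // LEFT unnest WITH ORDINALITY: an empty array yields one synthetic row of nulls
            List<Integer> elements = new ArrayList<>();
            List<Long> ordinalities = new ArrayList<>();
            if (input.c().isEmpty()) {
                elements.add(null);
                ordinalities.add(null);
            } else {
                for (int i = 0; i < input.c().size(); i++) {
                    elements.add(input.c().get(i));
                    ordinalities.add((long) i + 1);
                }
            }

            for (int i = 0; i < elements.size(); i++) {
                long rowNumber = i + 1; // row number within the partition given by unique
                if (rowNumber > limit) {
                    break; // Filter (row_number <= limit), see [3]
                }
                if (enforceSingleRow && rowNumber > 1) {
                    // Filter (fail if row_number > 1), see [2]
                    throw new IllegalStateException("more than one row for input row " + input.a());
                }
                Integer x = foo(elements.get(i)); // Projection x <- foo(u), see [5]
                // x <- IF(ordinality IS NULL, null, x), see [1]
                Integer restored = ordinalities.get(i) == null ? null : x;
                result.add(new Output(unique, input.a(), restored, ordinalities.get(i), rowNumber));
            }
        }
        return result;
    }
}
```

In this model the limit filter runs before the single-row check, mirroring the order of the Filter nodes in the plan, so a limit of 1 under EnforceSingleRow never fails. The synthetic row for an empty array keeps x as null even though foo would otherwise have turned it into a non-null value.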
Note: this rule captures and transforms different plans. Therefore there is some redundancy. Not every flow will use ordinality symbol or unique symbol. However, for simplicity, they are always present. It is up to other optimizer rules to prune unused symbols or redundant nodes.
Nested Class Summary
Nested classes/interfaces inherited from interface io.trino.sql.planner.iterative.Rule
Rule.Context, Rule.Result
Constructor Summary
Constructors
Method Summary
Modifier and Type | Method | Description
Rule.Result | apply(CorrelatedJoinNode correlatedJoinNode, Captures captures, Rule.Context context) |
Pattern<CorrelatedJoinNode> | getPattern() | Returns a pattern to which plan nodes this rule applies.
Constructor Details
DecorrelateUnnest
Method Details
getPattern
Description copied from interface: Rule
Returns a pattern to which plan nodes this rule applies.
- Specified by:
getPattern in interface Rule<CorrelatedJoinNode>