public class HiveAggregatePartitionIncrementalRewritingRule
extends org.apache.calcite.plan.RelOptRule
Rule to prepare the plan for incremental view maintenance when the materialized view is partitioned and insert-only:
INSERT OVERWRITE only the partitions affected since the last rebuild and leave the
remaining partitions intact.
Assume a materialized view partitioned on column a, and that the writeId was 1 at the last rebuild:
CREATE MATERIALIZED VIEW mat1 PARTITIONED ON (a) STORED AS ORC TBLPROPERTIES ("transactional"="true", "transactional_properties"="insert_only") AS
SELECT a, b, sum(c) sumc FROM t1 GROUP BY b, a;
1. Query all rows added to the source tables since the last rebuild.
2. Query all rows from the MV that belong to any of the partitions returned by step 1.
3. Take the union of the rows from steps 1 and 2 and apply the same aggregations defined in the MV:
SELECT a, b, sum(sumc) FROM (
    SELECT a, b, sumc FROM mat1
    LEFT SEMI JOIN (SELECT a, b, sum(c) FROM t1 WHERE ROW__ID.writeId > 1 GROUP BY b, a) q ON (mat1.a <=> q.a)
    UNION ALL
    SELECT a, b, sum(c) sumc FROM t1 WHERE ROW__ID.writeId > 1 GROUP BY b, a
) sub
GROUP BY b, a
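
The three steps above can be illustrated with a minimal in-memory sketch. It is not the rule's implementation and does not use Calcite or Hive APIs: the class PartitionIncrementalRebuildSketch, the records SourceRow and MvRow, and the sample data are all hypothetical and exist only to show the data flow of the rewritten query. The writeId field stands in for ROW__ID.writeId, and the null-tolerant HashSet lookup stands in for the null-safe <=> comparison in the semi join. Records require Java 16 or later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PartitionIncrementalRebuildSketch {

    /** Hypothetical row of t1; writeId stands in for ROW__ID.writeId. */
    record SourceRow(String a, String b, long c, long writeId) {}

    /** Hypothetical row of mat1: partition column a, group column b, aggregate sumc. */
    record MvRow(String a, String b, long sumc) {}

    /** Sums sumc per (a, b) group, like "SELECT a, b, sum(sumc) ... GROUP BY b, a". */
    static List<MvRow> aggregate(List<MvRow> rows) {
        // Arrays.asList tolerates null elements, so null keys group together as in SQL.
        Map<List<String>, Long> sums = new LinkedHashMap<>();
        for (MvRow r : rows) {
            sums.merge(Arrays.asList(r.a(), r.b()), r.sumc(), Long::sum);
        }
        List<MvRow> result = new ArrayList<>();
        for (Map.Entry<List<String>, Long> e : sums.entrySet()) {
            result.add(new MvRow(e.getKey().get(0), e.getKey().get(1), e.getValue()));
        }
        return result;
    }

    /** Returns the new contents of the affected partitions only. */
    static List<MvRow> rebuildAffectedPartitions(
            List<SourceRow> t1, List<MvRow> mat1, long lastWriteId) {
        // Step 1: aggregate the source rows written since the last rebuild.
        List<MvRow> newRows = new ArrayList<>();
        for (SourceRow r : t1) {
            if (r.writeId() > lastWriteId) {
                newRows.add(new MvRow(r.a(), r.b(), r.c()));
            }
        }
        List<MvRow> delta = aggregate(newRows);

        // Partition values touched by the delta. HashSet matches null with null,
        // which mirrors the null-safe <=> comparison.
        Set<String> affected = new HashSet<>();
        for (MvRow r : delta) {
            affected.add(r.a());
        }

        // Step 2: keep only the MV rows in affected partitions (the semi join).
        List<MvRow> union = new ArrayList<>();
        for (MvRow r : mat1) {
            if (affected.contains(r.a())) {
                union.add(r);
            }
        }

        // Step 3: UNION ALL with the delta, then re-apply the MV's aggregation.
        union.addAll(delta);
        return aggregate(union);
    }

    public static void main(String[] args) {
        List<SourceRow> t1 = List.of(
                new SourceRow("p1", "x", 10, 1),
                new SourceRow("p2", "y", 5, 1),
                new SourceRow("p1", "x", 3, 2),   // written after the last rebuild
                new SourceRow("p1", "z", 4, 2));  // written after the last rebuild
        List<MvRow> mat1 = List.of(new MvRow("p1", "x", 10), new MvRow("p2", "y", 5));

        System.out.println(rebuildAffectedPartitions(t1, mat1, 1L));
    }
}
```

With this sample data only partition p1 is affected: the result holds the rows (p1, x, 13) and (p1, z, 4), which would overwrite partition p1, while partition p2 never appears in the result and is left as it was.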