Class DeterminePartitionCount
java.lang.Object
io.trino.sql.planner.optimizations.DeterminePartitionCount
- All Implemented Interfaces:
PlanOptimizer
This rule looks at the amount of data read and processed by the query to determine the partition count
used for remote partitioned exchanges. It helps to increase the concurrency of the engine on large clusters.
Because statistics may be missing or incorrect, the rule is deliberately cautious: it skips plans that contain
input-multiplying nodes such as CROSS JOIN or UNNEST.
E.g. 1:
    Given query: SELECT count(column_a) FROM table_with_stats_a group by column_b
    config: MIN_INPUT_SIZE_PER_TASK: 500 MB
    Input table data size: 1000 MB
    Estimated partition count: Input table data size / MIN_INPUT_SIZE_PER_TASK => 2

E.g. 2:
    Given query: SELECT * FROM table_with_stats_a as a JOIN table_with_stats_b as b ON a.column_b = b.column_b
    config: MIN_INPUT_SIZE_PER_TASK: 500 MB
    Input tables data size: 1000 MB
    Join output data size: 5000 MB
    Estimated partition count: max((Input table data size / MIN_INPUT_SIZE_PER_TASK), (Join output data size / MIN_INPUT_SIZE_PER_TASK)) => 10
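
The arithmetic in the two examples above can be sketched as follows. This is a minimal illustration, not the optimizer's actual code: the class name PartitionCountSketch and the method estimatePartitionCount are hypothetical, sizes are passed as plain megabyte counts, and rounding up with ceiling division is an assumption (both documented examples divide evenly, so they do not show how remainders are handled).

```java
// Hypothetical sketch of the estimate described in the examples above.
// None of these names come from Trino; they exist only for illustration.
final class PartitionCountSketch {
    private PartitionCountSketch() {}

    // Returns max(input / minPerTask, output / minPerTask), as in E.g. 2.
    // Pass 0 for outputMegabytes when no intermediate output estimate applies (E.g. 1).
    static long estimatePartitionCount(long inputMegabytes, long outputMegabytes, long minInputSizePerTaskMegabytes) {
        if (inputMegabytes < 0 || outputMegabytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        if (minInputSizePerTaskMegabytes <= 0) {
            throw new IllegalArgumentException("minimum input size per task must be positive");
        }
        long fromInput = ceilDiv(inputMegabytes, minInputSizePerTaskMegabytes);
        long fromOutput = ceilDiv(outputMegabytes, minInputSizePerTaskMegabytes);
        return Math.max(fromInput, fromOutput);
    }

    // Ceiling division for non-negative a and positive b (assumed rounding rule).
    private static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    public static void main(String[] args) {
        // E.g. 1: 1000 MB input, 500 MB per task
        System.out.println(estimatePartitionCount(1000, 0, 500));    // prints 2
        // E.g. 2: 1000 MB input, 5000 MB join output, 500 MB per task
        System.out.println(estimatePartitionCount(1000, 5000, 500)); // prints 10
    }
}
```

The sketch covers only the size-based estimate shown in the examples; it does not model the statistics checks or the skipping of input-multiplying nodes described above.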
Nested Class Summary
Nested classes/interfaces inherited from interface io.trino.sql.planner.optimizations.PlanOptimizer
PlanOptimizer.Context
Constructor Summary
Constructors
Constructor    Description
DeterminePartitionCount(StatsCalculator statsCalculator, TaskCountEstimator taskCountEstimator)
Method Summary
-
Constructor Details
-
DeterminePartitionCount
public DeterminePartitionCount(StatsCalculator statsCalculator, TaskCountEstimator taskCountEstimator)
-
-
Method Details
-
optimize
- Specified by:
optimize in interface PlanOptimizer
-