Class DeterminePartitionCount
java.lang.Object
io.trino.sql.planner.optimizations.DeterminePartitionCount
All Implemented Interfaces:
PlanOptimizer
This rule looks at the amount of data read and processed by the query to determine the partition count
used for remote partitioned exchanges. It helps to increase the concurrency of the engine on large clusters.
The rule is also cautious about missing or incorrect statistics; therefore, it skips plans that contain input-multiplying nodes such as
CROSS JOIN or UNNEST.
E.g. 1:
Given query: SELECT count(column_a) FROM table_with_stats_a GROUP BY column_b
config:
MIN_INPUT_SIZE_PER_TASK: 500 MB
Input table data size: 1000 MB
Estimated partition count: Input table data size / MIN_INPUT_SIZE_PER_TASK => 2
E.g. 2:
Given query: SELECT * FROM table_with_stats_a as a JOIN table_with_stats_b as b ON a.column_b = b.column_b
config:
MIN_INPUT_SIZE_PER_TASK: 500 MB
Input tables data size: 1000 MB
Join output data size: 5000 MB
Estimated partition count: max((Input tables data size / MIN_INPUT_SIZE_PER_TASK), (Join output data size / MIN_INPUT_SIZE_PER_TASK)) => max(2, 10) = 10
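The arithmetic in the two examples above can be sketched in a few lines of Java. This is an illustration only, not Trino's actual implementation: the class PartitionCountSketch and the methods partitionsFor and estimate are hypothetical names, and rounding up to a whole partition is an assumption that the examples (which divide evenly) do not confirm.

```java
// Illustrative sketch only; the names below are hypothetical and are not part of Trino.
final class PartitionCountSketch {
    private static final long MB = 1024L * 1024L;

    private PartitionCountSketch() {}

    // Data size divided by the minimum input size per task.
    // Rounding up is an assumption; the documented examples divide evenly.
    static long partitionsFor(long dataSizeBytes, long minInputSizePerTaskBytes) {
        if (dataSizeBytes < 0 || minInputSizePerTaskBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and the minimum must be positive");
        }
        return (dataSizeBytes + minInputSizePerTaskBytes - 1) / minInputSizePerTaskBytes;
    }

    // E.g. 2: take the larger of the counts derived from the input size and the join output size.
    static long estimate(long inputBytes, long joinOutputBytes, long minInputSizePerTaskBytes) {
        return Math.max(
                partitionsFor(inputBytes, minInputSizePerTaskBytes),
                partitionsFor(joinOutputBytes, minInputSizePerTaskBytes));
    }

    public static void main(String[] args) {
        long minInputSizePerTask = 500 * MB;
        // E.g. 1: 1000 MB of input
        System.out.println(partitionsFor(1000 * MB, minInputSizePerTask)); // prints 2
        // E.g. 2: 1000 MB of input, 5000 MB of join output
        System.out.println(estimate(1000 * MB, 5000 * MB, minInputSizePerTask)); // prints 10
    }
}
```

Taking the maximum means the largest data volume seen in the plan drives the count, so a join that expands its input is not under-partitioned.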
Constructor Summary
Constructors
Method Summary
Modifier and Type    Method    Description
PlanNode    optimize(PlanNode plan, Session session, TypeProvider types, SymbolAllocator symbolAllocator, PlanNodeIdAllocator idAllocator, WarningCollector warningCollector, PlanOptimizersStatsCollector planOptimizersStatsCollector, TableStatsProvider tableStatsProvider)
Constructor Details
DeterminePartitionCount
Method Details
optimize
public PlanNode optimize(PlanNode plan, Session session, TypeProvider types, SymbolAllocator symbolAllocator, PlanNodeIdAllocator idAllocator, WarningCollector warningCollector, PlanOptimizersStatsCollector planOptimizersStatsCollector, TableStatsProvider tableStatsProvider)
Specified by:
optimize in interface PlanOptimizer