Class DeterminePartitionCount
java.lang.Object
io.trino.sql.planner.optimizations.DeterminePartitionCount
- All Implemented Interfaces:
PlanOptimizer
This rule uses the amount of data read and processed by the query to determine the partition count
used for remote exchanges. It helps increase the concurrency of the engine on large clusters.
Because statistics may be missing or incorrect, the rule is cautious: it is skipped for plans that contain
input-multiplying nodes such as CROSS JOIN or UNNEST.
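The skip condition described above can be pictured as a walk over the plan tree. The following is only a minimal sketch: `Kind`, `Node`, and `containsInputMultiplyingNode` are hypothetical stand-ins invented for illustration and are not Trino's plan node classes or the actual logic of this optimizer.

```java
import java.util.List;

public class InputMultiplyingCheckSketch {
    // Hypothetical node kinds; Trino's real plan nodes are separate classes.
    enum Kind { TABLE_SCAN, FILTER, AGGREGATION, JOIN, CROSS_JOIN, UNNEST }

    // Hypothetical, simplified plan node: a kind plus its source nodes.
    static final class Node {
        final Kind kind;
        final List<Node> sources;

        Node(Kind kind, List<Node> sources) {
            this.kind = kind;
            this.sources = sources;
        }
    }

    // Returns true if this node or any node below it can multiply its input rows.
    static boolean containsInputMultiplyingNode(Node node) {
        if (node.kind == Kind.CROSS_JOIN || node.kind == Kind.UNNEST) {
            return true;
        }
        for (Node source : node.sources) {
            if (containsInputMultiplyingNode(source)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node scanA = new Node(Kind.TABLE_SCAN, List.of());
        Node scanB = new Node(Kind.TABLE_SCAN, List.of());
        Node aggregation = new Node(Kind.AGGREGATION, List.of(scanA));
        Node crossJoin = new Node(Kind.CROSS_JOIN, List.of(scanA, scanB));

        // A plan for which the rule would be skipped returns true here.
        System.out.println(containsInputMultiplyingNode(aggregation));
        System.out.println(containsInputMultiplyingNode(crossJoin));
    }
}
```

In this sketch, a plan for which the check returns true would be left unchanged, since size estimates below a row-multiplying node say little about the data volume above it.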
E.g. 1:
Given query: SELECT count(column_a) FROM table_with_stats_a
config:
MIN_INPUT_SIZE_PER_TASK: 500 MB
Input table data size: 1000 MB
Estimated partition count: Input table data size / MIN_INPUT_SIZE_PER_TASK => 2
E.g. 2:
Given query: SELECT * FROM table_with_stats_a as a JOIN table_with_stats_b as b ON a.column_b = b.column_b
config:
MIN_INPUT_SIZE_PER_TASK: 500 MB
Input tables data size: 1000 MB
Join output data size: 5000 MB
Estimated partition count: max((Input tables data size / MIN_INPUT_SIZE_PER_TASK), (Join output data size / MIN_INPUT_SIZE_PER_TASK)) => max(2, 10) => 10
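The arithmetic in both examples can be reproduced with a short sketch. The class and method names below are hypothetical, and rounding up with a floor of one partition is an assumption of this sketch; the description above does not state how non-integral results are handled, and the actual optimizer may apply further bounds that are ignored here.

```java
public class PartitionCountSketch {
    static final long MB = 1024L * 1024L;

    // Hypothetical helper: data size divided by MIN_INPUT_SIZE_PER_TASK.
    // Assumption: round up and never return fewer than one partition.
    static int partitionsFor(long dataSizeBytes, long minInputSizePerTaskBytes) {
        if (dataSizeBytes < 0 || minInputSizePerTaskBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and the per-task size positive");
        }
        long count = (dataSizeBytes + minInputSizePerTaskBytes - 1) / minInputSizePerTaskBytes;
        return Math.toIntExact(Math.max(1L, count));
    }

    // Takes the larger of the input-based and output-based estimates, as in E.g. 2.
    static int estimate(long inputBytes, long outputBytes, long minInputSizePerTaskBytes) {
        return Math.max(
                partitionsFor(inputBytes, minInputSizePerTaskBytes),
                partitionsFor(outputBytes, minInputSizePerTaskBytes));
    }

    public static void main(String[] args) {
        // E.g. 1: 1000 MB input, 500 MB per task
        System.out.println(partitionsFor(1000 * MB, 500 * MB)); // prints 2

        // E.g. 2: 1000 MB input, 5000 MB join output, 500 MB per task
        System.out.println(estimate(1000 * MB, 5000 * MB, 500 * MB)); // prints 10
    }
}
```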
Constructor Summary
Constructors
Method Summary
Modifier and Type: PlanNode
Method: optimize(PlanNode plan, Session session, TypeProvider types, SymbolAllocator symbolAllocator, PlanNodeIdAllocator idAllocator, WarningCollector warningCollector, TableStatsProvider tableStatsProvider)
Constructor Details
-
DeterminePartitionCount
Method Details
-
optimize
public PlanNode optimize(PlanNode plan, Session session, TypeProvider types, SymbolAllocator symbolAllocator, PlanNodeIdAllocator idAllocator, WarningCollector warningCollector, TableStatsProvider tableStatsProvider)
- Specified by:
optimize in interface PlanOptimizer