Class SkewedPartitionRebalancer

java.lang.Object
io.trino.operator.output.SkewedPartitionRebalancer

@ThreadSafe public class SkewedPartitionRebalancer extends Object
Helps in distributing big or skewed partitions across available tasks to improve the performance of partitioned writes.

This rebalancer initializes a set of buckets for each task based on a given taskBucketCount and then tries to distribute partitions uniformly across those buckets. This addresses two problems:

1. It mitigates skew across tasks.
2. It scales a few big partitions across tasks even if there is no skew among them. This essentially speeds up local scaling without much impact on overall resource utilization.

Example:

Before: 3 tasks, 3 buckets per task, and 2 skewed partitions

Task1               Task2               Task3
Bucket1 (Part 1)    Bucket1 (Part 2)    Bucket1
Bucket2             Bucket2             Bucket2
Bucket3             Bucket3             Bucket3

After rebalancing:

Task1               Task2               Task3
Bucket1 (Part 1)    Bucket1 (Part 2)    Bucket1 (Part 1)
Bucket2 (Part 2)    Bucket2 (Part 1)    Bucket2 (Part 2)
Bucket3             Bucket3             Bucket3
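
The example above can be modeled with a small toy sketch. This is not the code of SkewedPartitionRebalancer: the class name BucketRebalanceSketch, the scale method, the round-robin initial placement, and the "least occupied task" rule are all assumptions made for illustration. The sketch also does not model how the real class decides which partitions are skewed or when to rebalance. Partition ids are 0-based here, so Part 1 is 0 and Part 2 is 1.

```java
import java.util.Arrays;

/** Toy model of bucket-based partition scaling; hypothetical, not Trino's implementation. */
final class BucketRebalanceSketch {
    static final int EMPTY = -1;

    // buckets[task][bucket] holds a partition id, or EMPTY if the bucket is unused.
    final int[][] buckets;

    BucketRebalanceSketch(int taskCount, int taskBucketCount, int partitionCount) {
        if (partitionCount > taskCount * taskBucketCount) {
            throw new IllegalArgumentException("not enough buckets for all partitions");
        }
        buckets = new int[taskCount][taskBucketCount];
        for (int[] task : buckets) {
            Arrays.fill(task, EMPTY);
        }
        // Assumed initial placement: one bucket per partition, round-robin over tasks.
        for (int p = 0; p < partitionCount; p++) {
            place(p, p % taskCount);
        }
    }

    /**
     * Gives the partition one more bucket, in the task with the fewest occupied
     * buckets that does not hold the partition yet (ties go to the lowest task index).
     * Returns false if no such task has a free bucket.
     */
    boolean scale(int partition) {
        int best = -1;
        for (int t = 0; t < buckets.length; t++) {
            int used = used(t);
            if (holds(t, partition) || used == buckets[t].length) {
                continue;
            }
            if (best < 0 || used < used(best)) {
                best = t;
            }
        }
        if (best < 0) {
            return false;
        }
        place(partition, best);
        return true;
    }

    // Buckets are filled in order, so the occupied count is also the next free index.
    private void place(int partition, int task) {
        buckets[task][used(task)] = partition;
    }

    private int used(int task) {
        int count = 0;
        for (int b : buckets[task]) {
            if (b != EMPTY) {
                count++;
            }
        }
        return count;
    }

    private boolean holds(int task, int partition) {
        for (int b : buckets[task]) {
            if (b == partition) {
                return true;
            }
        }
        return false;
    }
}
```

With new BucketRebalanceSketch(3, 3, 2), calling scale(0), scale(1), scale(0), scale(1) in that order reproduces the "After rebalancing" layout: each partition occupies one bucket in every task, and the third bucket of each task stays free.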