Class SegmentToMoveCalculator


  • public class SegmentToMoveCalculator
    extends Object
    Calculates the maximum, minimum and required number of segments to move in a Coordinator run for balancing.
    • Method Detail

      • computeNumSegmentsToMoveInTier

        public static int computeNumSegmentsToMoveInTier(String tier,
                                                         List<ServerHolder> historicals,
                                                         int maxSegmentsToMoveInTier)
        Calculates the number of segments to be picked for moving in the given tier, based on the level of skew between the historicals in the tier.
        Parameters:
        tier - Name of the tier, used only for logging purposes.
        historicals - Active historicals in the tier.
        maxSegmentsToMoveInTier - Maximum number of segments allowed to be moved in the tier.
        Returns:
        Number of segments to move in the tier in the range [MIN_SEGMENTS_TO_MOVE, maxSegmentsToMoveInTier].
      • computeMinSegmentsToMoveInTier

        public static int computeMinSegmentsToMoveInTier(int totalSegmentsInTier)
        Calculates the minimum number of segments that should be considered for moving in a tier, so that the cluster is always balancing itself.

        This value must be calculated separately for every tier.

        Parameters:
        totalSegmentsInTier - Total number of all replicas of all segments loaded or queued across all historicals in the tier.
        Returns:
        minSegmentsToMoveInTier in the range [MIN_SEGMENTS_TO_MOVE, ~0.15% of totalSegmentsInTier].
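        The documented range can be illustrated with a minimal sketch. It assumes the upper end is exactly 0.15% of totalSegmentsInTier and takes the lower bound as a parameter, because the value of MIN_SEGMENTS_TO_MOVE is not stated here. The class and method names are hypothetical, and this is not necessarily how the calculator itself is implemented.

```java
final class MinSegmentsToMoveSketch
{
  /**
   * Returns roughly 0.15% of the replicas in the tier, but never less than
   * the given floor. The exact ratio and the floor are assumptions made for
   * this illustration.
   */
  static int minSegmentsToMove(int totalSegmentsInTier, int floor)
  {
    // 15 / 10_000 = 0.15%; long arithmetic avoids int overflow.
    final long fraction = (totalSegmentsInTier * 15L) / 10_000L;
    return (int) Math.max(floor, fraction);
  }

  public static void main(String[] args)
  {
    // Large tier: the 0.15% fraction dominates.
    System.out.println(minSegmentsToMove(1_000_000, 100));
    // Small tier: the floor dominates.
    System.out.println(minSegmentsToMove(10_000, 100));
  }
}
```

        With a floor of 100, a tier holding 1,000,000 replicas yields 1,500, while a tier holding 10,000 replicas falls back to the floor.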
      • computeMaxSegmentsToMovePerTier

        public static int computeMaxSegmentsToMovePerTier(int totalSegments,
                                                          int numBalancerThreads,
                                                          org.joda.time.Duration coordinatorPeriod)
        Calculates the maximum number of segments that can be picked for moving in the cluster in a single coordinator run.

        This value must be calculated at the cluster level and then applied to every tier so that the total computation time is estimated correctly.

        Each balancer thread can perform 1 billion computations in 20s (see #14584). Allowing a buffer of 10s, each thread is budgeted 1 billion computations in every 30s, so:

         numComputations = maxSegmentsToMove * totalSegments
        
         maxSegmentsToMove = numComputations / totalSegments
                           = (nThreads * 1B) / totalSegments
         
        Parameters:
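        totalSegments - Total number of all replicas of all segments loaded or queued across all historicals in the cluster.
        numBalancerThreads - Number of balancer threads (nThreads in the formula above).
        coordinatorPeriod - Period of the coordinator run.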
        Returns:
        maxSegmentsToMove per tier in the range [MIN_SEGMENTS_TO_MOVE, ~20% of totalSegments].
        See Also:
        #14584
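        The formula and the documented range can be worked through with a short sketch. It assumes the upper end is exactly 20% of totalSegments, takes the lower bound as a parameter because the value of MIN_SEGMENTS_TO_MOVE is not stated here, and ignores coordinatorPeriod because the description does not say how it enters the calculation. All names are hypothetical, and this is an illustration rather than the actual implementation.

```java
final class MaxSegmentsToMoveSketch
{
  // Budget from the description: 1 billion computations per thread per 30s.
  static final long COMPUTATIONS_PER_THREAD = 1_000_000_000L;

  /**
   * Applies maxSegmentsToMove = (nThreads * 1B) / totalSegments and clamps
   * the result to [floor, ~20% of totalSegments]. The clamping rule is an
   * assumption made for this illustration.
   */
  static int maxSegmentsToMove(int totalSegments, int numBalancerThreads, int floor)
  {
    if (totalSegments <= 0) {
      return floor;
    }
    final long fromBudget = (numBalancerThreads * COMPUTATIONS_PER_THREAD) / totalSegments;
    // If 20% of totalSegments is below the floor, the floor wins.
    final long cap = Math.max(floor, totalSegments / 5L);
    return (int) Math.max(floor, Math.min(fromBudget, cap));
  }

  public static void main(String[] args)
  {
    System.out.println(maxSegmentsToMove(1_000_000, 1, 100));   // budget-bound
    System.out.println(maxSegmentsToMove(10_000, 1, 100));      // capped at 20%
    System.out.println(maxSegmentsToMove(100_000_000, 1, 100)); // raised to the floor
  }
}
```

        For one thread and 1,000,000 replicas the budget allows 1B / 1M = 1,000 segments. For 10,000 replicas the budget would allow 100,000, so the 20% cap of 2,000 applies. For 100,000,000 replicas the budget allows only 10, so the floor applies.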
      • computeNumSegmentsToMoveToBalanceTier

        public static int computeNumSegmentsToMoveToBalanceTier(String tier,
                                                                List<ServerHolder> historicals)
        Computes the number of segments that need to be moved across the historicals in a tier to attain balance in terms of disk usage and segment counts per data source.
        Parameters:
        tier - Name of the tier used only for logging purposes
        historicals - List of historicals in the tier
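
        The description does not say how the skew is measured. As an illustration only, one simple estimate for the segment-count part is to take, for each data source, half the gap between the most and least loaded historical, and sum that over data sources. The sketch below uses plain per-server counts instead of ServerHolder, does not consider disk usage, and uses hypothetical names throughout. It should not be read as the method's actual algorithm.

```java
import java.util.List;
import java.util.Map;

final class CountSkewSketch
{
  /**
   * For one data source, moving half the difference between the largest and
   * smallest per-server count evens out those two servers.
   */
  static int segmentsToMoveForDatasource(List<Integer> segmentCountsPerServer)
  {
    if (segmentCountsPerServer.isEmpty()) {
      return 0;
    }
    int max = Integer.MIN_VALUE;
    int min = Integer.MAX_VALUE;
    for (int count : segmentCountsPerServer) {
      max = Math.max(max, count);
      min = Math.min(min, count);
    }
    return (max - min) / 2;
  }

  /** Sums the per-data-source estimates across the tier. */
  static int segmentsToMove(Map<String, List<Integer>> countsByDatasource)
  {
    int total = 0;
    for (List<Integer> counts : countsByDatasource.values()) {
      total += segmentsToMoveForDatasource(counts);
    }
    return total;
  }

  public static void main(String[] args)
  {
    // "wiki" is skewed across three servers; "logs" is already even.
    System.out.println(segmentsToMove(Map.of(
        "wiki", List.of(10, 4, 6),
        "logs", List.of(5, 5)
    )));
  }
}
```

        Here "wiki" contributes (10 - 4) / 2 = 3 and "logs" contributes 0, so the estimate for the tier is 3 segments.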