Class ThetaSketchAccumulator


  • public class ThetaSketchAccumulator
    extends Object
    Intermediate state used by DistinctCountThetaSketchAggregationFunction that gives the end user more control over how sketches are merged, for performance. The end user can set parameters that trade higher memory usage for more pre-aggregation. This permits use of the Union "early-stop" optimisation: for an ordered sketch, entries at or beyond the minimum Theta value require no further processing. The union operation initialises an empty "gadget" bookkeeping sketch and updates it with the hashed entries that fall below the minimum Theta value across all input sketches (the "Broder Rule"). Further gains can be realised when the initial Theta value is set to that minimum immediately.
    • Constructor Detail

      • ThetaSketchAccumulator

        public ThetaSketchAccumulator()
      • ThetaSketchAccumulator

        public ThetaSketchAccumulator(org.apache.datasketches.theta.SetOperationBuilder setOperationBuilder,
                                      int threshold)
    • Method Detail

      • setSetOperationBuilder

        public void setSetOperationBuilder(org.apache.datasketches.theta.SetOperationBuilder setOperationBuilder)
      • setThreshold

        public void setThreshold(int threshold)
      • isEmpty

        public boolean isEmpty()
      • getResult

        @Nonnull
        public org.apache.datasketches.theta.Sketch getResult()
      • apply

        public void apply(org.apache.datasketches.theta.Sketch sketch)
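
The merge strategy outlined in the class description can be illustrated with a toy model. What follows is a minimal sketch under stated assumptions, not the Pinot or DataSketches implementation: ToySketch, ToyAccumulator, ToyAccumulatorDemo and the hashesVisited counter are hypothetical names, hashes are modelled as doubles in [0, 1), and the "gadget" is a plain TreeSet. It buffers inputs until a threshold is reached, sorts the buffer by Theta so the union's Theta drops to the minimum first, and then stops reading each ordered input at the first hash that is not below that Theta.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Toy stand-in for a sketch: a theta in (0, 1] plus ordered hashes below theta. */
final class ToySketch {
    final double theta;
    final double[] sortedHashes;

    ToySketch(double theta, double... hashes) {
        this.theta = theta;
        this.sortedHashes = hashes.clone();
        Arrays.sort(this.sortedHashes); // "ordered" sketch: hashes ascending
    }
}

/** Hypothetical accumulator: buffers sketches, merges once the threshold is hit. */
final class ToyAccumulator {
    private final int threshold;
    private final List<ToySketch> buffer = new ArrayList<>();
    private final TreeSet<Double> gadget = new TreeSet<>(); // bookkeeping set
    private double unionTheta = 1.0;
    int hashesVisited = 0; // instrumentation only, to show the early stop

    ToyAccumulator(int threshold) {
        this.threshold = threshold;
    }

    boolean isEmpty() {
        return buffer.isEmpty() && gadget.isEmpty();
    }

    void apply(ToySketch sketch) {
        buffer.add(sketch);
        if (buffer.size() >= threshold) {
            merge();
        }
    }

    ToySketch getResult() {
        merge(); // flush anything still buffered
        double[] retained = gadget.stream().mapToDouble(Double::doubleValue).toArray();
        return new ToySketch(unionTheta, retained);
    }

    private void merge() {
        // Lowest theta first, so the union theta reaches its minimum immediately.
        buffer.sort(Comparator.comparingDouble(s -> s.theta));
        for (ToySketch s : buffer) {
            unionTheta = Math.min(unionTheta, s.theta);
            for (double h : s.sortedHashes) {
                if (h >= unionTheta) {
                    break; // early stop: the rest of an ordered input is >= theta
                }
                hashesVisited++;
                gadget.add(h); // duplicates collapse in the set
            }
        }
        // Entries kept by earlier merges may now sit at or above the new theta.
        gadget.tailSet(unionTheta, true).clear();
        buffer.clear();
    }
}

final class ToyAccumulatorDemo {
    public static void main(String[] args) {
        ToyAccumulator acc = new ToyAccumulator(3);
        acc.apply(new ToySketch(1.0, 0.1, 0.5, 0.9));
        acc.apply(new ToySketch(0.4, 0.1, 0.2, 0.3));
        acc.apply(new ToySketch(0.6, 0.05, 0.5));
        ToySketch r = acc.getResult();
        // prints "theta=0.4 retained=4 visited=5" (5 of the 8 input hashes are read)
        System.out.println("theta=" + r.theta + " retained=" + r.sortedHashes.length
            + " visited=" + acc.hashesVisited);
    }
}
```

In this model a larger threshold keeps more sketches in memory at once, but gives the sort more inputs to order, so more of them are read against an already-minimal Theta. With the real class, the corresponding knobs are the SetOperationBuilder and threshold arguments listed above.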