Class DubboMergingDigest

java.lang.Object
com.tdunning.math.stats.TDigest
org.apache.dubbo.metrics.aggregate.DubboAbstractTDigest
org.apache.dubbo.metrics.aggregate.DubboMergingDigest
All Implemented Interfaces:
Serializable

public class DubboMergingDigest extends DubboAbstractTDigest
Maintains a t-digest by collecting new points in a buffer that is then sorted occasionally and merged into a sorted array that contains previously computed centroids.

This can be very fast because the cost of sorting and merging is amortized over several insertions. If we keep N centroids in total and the input array is k long, then the amortized cost per insertion is something like

N/k + log k

These costs even out when N/k = log k. Balancing costs is often a good place to start in optimizing an algorithm. For different values of compression factor, the following table shows estimated asymptotic values of N and suggested values of k:

Compression    N      k
50             78     25
100            157    42
200            314    73
Sizing considerations for t-digest
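
The balance point N/k = log k can be checked numerically. The following is a minimal sketch, not part of the library: the class and method names (BufferSizing, balancedK) are hypothetical, and it assumes the logarithm is the natural log, which is consistent with the table (for example, 157/42 is about 3.74, close to ln 42). It solves k * ln(k) = N by bisection.

```java
public class BufferSizing {

    /**
     * Hypothetical helper: returns the k at which N / k equals ln(k),
     * found by bisection on the increasing function k * ln(k).
     */
    public static double balancedK(double n) {
        double lo = 1.0;               // 1 * ln(1) = 0, which is below any positive n
        double hi = Math.max(3.0, n);  // hi * ln(hi) >= n for every n > 0
        for (int iter = 0; iter < 100; iter++) {
            double mid = 0.5 * (lo + hi);
            if (mid * Math.log(mid) < n) {
                lo = mid;              // root lies above mid
            } else {
                hi = mid;              // root lies at or below mid
            }
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        // N values taken from the sizing table above.
        int[] ns = {78, 157, 314};
        for (int n : ns) {
            System.out.printf("N=%d  balanced k is about %.1f%n", n, balancedK(n));
        }
    }
}
```

For the N values in the table, the computed balance points fall within one unit of the suggested k values.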

The virtues of this kind of t-digest implementation include:

  • No allocation is required after initialization
  • The data structure automatically compresses existing centroids when possible
  • No Java object overhead is incurred for centroids since data is kept in primitive arrays

The current implementation takes the liberty of using ping-pong buffers to implement the merge, which results in a substantial memory penalty. The complexity of an in-place merge was not considered worthwhile: even with this overhead, the memory cost is less than 40 bytes per centroid, which is much less than half of what the AVLTreeDigest uses, and no dynamic allocation is required at all.
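
The buffer-then-merge scheme with primitive arrays and ping-pong buffers described above can be illustrated with a deliberately simplified sketch. This is not the DubboMergingDigest implementation: the class BufferedMergeSketch and all of its members are hypothetical, every point has weight 1, and a uniform weight cap per centroid stands in for the scale function that a real t-digest uses. The quantile estimate also skips interpolation.

```java
import java.util.Arrays;

public class BufferedMergeSketch {
    private final int maxCentroids;   // hypothetical stand-in for the compression setting
    private final double[] buffer;    // unsorted incoming points, each of weight 1
    private int buffered = 0;
    // Two pairs of primitive arrays: one holds the live centroids, the other
    // receives the merge output. They swap roles after every merge.
    private double[] means, weights, spareMeans, spareWeights;
    private int count = 0;            // number of live centroids
    private double total = 0;         // total weight merged so far

    public BufferedMergeSketch(int maxCentroids, int bufferSize) {
        this.maxCentroids = maxCentroids;
        this.buffer = new double[bufferSize];
        int cap = 2 * maxCentroids;   // the greedy merge below stays within this capacity
        means = new double[cap];
        weights = new double[cap];
        spareMeans = new double[cap];
        spareWeights = new double[cap];
    }

    /** Buffers one point; a merge happens only when the buffer is full. */
    public void add(double x) {
        if (buffered == buffer.length) {
            flush();
        }
        buffer[buffered++] = x;
    }

    /** Sorts the buffer and merges it with the existing centroids. */
    public void flush() {
        if (buffered == 0) {
            return;
        }
        Arrays.sort(buffer, 0, buffered);
        double limit = (total + buffered) / maxCentroids; // uniform weight cap (simplification)
        int i = 0, j = 0, out = -1;
        while (i < count || j < buffered) {
            double m, w;
            // Take the smaller of the next old centroid and the next buffered point.
            if (j >= buffered || (i < count && means[i] <= buffer[j])) {
                m = means[i]; w = weights[i]; i++;
            } else {
                m = buffer[j]; w = 1; j++;
            }
            if (out >= 0 && spareWeights[out] + w <= limit) {
                // Fold into the current output centroid, updating its weighted mean.
                spareWeights[out] += w;
                spareMeans[out] += (m - spareMeans[out]) * w / spareWeights[out];
            } else {
                out++;
                spareMeans[out] = m;
                spareWeights[out] = w;
            }
        }
        total += buffered;
        buffered = 0;
        count = out + 1;
        // Ping-pong: the output arrays become the live arrays, with no allocation.
        double[] t = means; means = spareMeans; spareMeans = t;
        t = weights; weights = spareWeights; spareWeights = t;
    }

    public int centroidCount() { flush(); return count; }

    public double totalWeight() { flush(); return total; }

    /** Crude estimate: mean of the centroid in which the cumulative weight reaches q * total. */
    public double quantile(double q) {
        flush();
        double target = q * total, seen = 0;
        for (int c = 0; c < count; c++) {
            seen += weights[c];
            if (seen >= target) {
                return means[c];
            }
        }
        return count == 0 ? Double.NaN : means[count - 1];
    }
}
```

All arrays are allocated once in the constructor, which mirrors the "no allocation after initialization" property listed above. The price is the second pair of arrays, which is the memory penalty that the ping-pong approach accepts.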

  • Field Details

    • useAlternatingSort

      public boolean useAlternatingSort
    • useTwoLevelCompression

      public boolean useTwoLevelCompression
    • useWeightLimit

      public static boolean useWeightLimit
  • Constructor Details

    • DubboMergingDigest

      public DubboMergingDigest(double compression)
      Allocates a buffer-merging t-digest. This is the constructor normally used; it allocates default-sized internal arrays. Other versions are available, but should only be used for special cases.
      Parameters:
      compression - The compression factor
    • DubboMergingDigest

      public DubboMergingDigest(double compression, int bufferSize)
      If you know the size of the temporary buffer for incoming points, you can use this entry point.
      Parameters:
      compression - Compression factor for t-digest. Same as 1/\delta in the paper.
      bufferSize - How many samples to retain before merging.
    • DubboMergingDigest

      public DubboMergingDigest(double compression, int bufferSize, int size)
      Fully specified constructor. Normally only used for deserializing a buffer t-digest.
      Parameters:
      compression - Compression factor
      bufferSize - Number of temporary centroids
      size - Size of main buffer
  • Method Details

    • getMin

      public double getMin()
      Overrides:
      getMin in class com.tdunning.math.stats.TDigest
    • getMax

      public double getMax()
      Overrides:
      getMax in class com.tdunning.math.stats.TDigest
    • recordAllData

      public com.tdunning.math.stats.TDigest recordAllData()
      Turns on internal data recording.
      Overrides:
      recordAllData in class DubboAbstractTDigest
    • add

      public void add(double x, int w)
      Specified by:
      add in class com.tdunning.math.stats.TDigest
    • add

      public void add(List<? extends com.tdunning.math.stats.TDigest> others)
      Specified by:
      add in class com.tdunning.math.stats.TDigest
    • compress

      public void compress()
      Merges any pending inputs and compresses the data down to the public setting. Note that this typically loses a bit of precision, so it should not be done routinely. It is best done only when results are about to be shown to the outside world.
      Specified by:
      compress in class com.tdunning.math.stats.TDigest
    • size

      public long size()
      Specified by:
      size in class com.tdunning.math.stats.TDigest
    • cdf

      public double cdf(double x)
      Specified by:
      cdf in class com.tdunning.math.stats.TDigest
    • quantile

      public double quantile(double q)
      Specified by:
      quantile in class com.tdunning.math.stats.TDigest
    • centroidCount

      public int centroidCount()
      Specified by:
      centroidCount in class com.tdunning.math.stats.TDigest
    • centroids

      public Collection<com.tdunning.math.stats.Centroid> centroids()
      Specified by:
      centroids in class com.tdunning.math.stats.TDigest
    • compression

      public double compression()
      Specified by:
      compression in class com.tdunning.math.stats.TDigest
    • byteSize

      public int byteSize()
      Specified by:
      byteSize in class com.tdunning.math.stats.TDigest
    • smallByteSize

      public int smallByteSize()
      Specified by:
      smallByteSize in class com.tdunning.math.stats.TDigest
    • getScaleFunction

      public com.tdunning.math.stats.ScaleFunction getScaleFunction()
    • setScaleFunction

      public void setScaleFunction(com.tdunning.math.stats.ScaleFunction scaleFunction)
      Overrides:
      setScaleFunction in class com.tdunning.math.stats.TDigest
    • asBytes

      public void asBytes(ByteBuffer buf)
      Specified by:
      asBytes in class com.tdunning.math.stats.TDigest
    • asSmallBytes

      public void asSmallBytes(ByteBuffer buf)
      Specified by:
      asSmallBytes in class com.tdunning.math.stats.TDigest
    • toString

      public String toString()
      Overrides:
      toString in class Object