public class DDSketch extends java.lang.Object implements QuantileSketch<DDSketch>
QuantileSketch with relative-error guarantees. This sketch computes quantile values with an
approximation error that is relative to the actual quantile value. It works on non-negative input values.
For instance, using DDSketch with a relative accuracy guarantee set to 1%, if the expected quantile
value is 100, the computed quantile value is guaranteed to be between 99 and 101. If the expected quantile value
is 1000, the computed quantile value is guaranteed to be between 990 and 1010.
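
The arithmetic behind these bounds can be sketched as follows. This is a minimal illustration, not part of the DDSketch API: the class `RelativeAccuracyBounds` and its methods are hypothetical names introduced here, and the guarantee is assumed to mean |computed - exact| <= relativeAccuracy * exact.

```java
/** Illustrative helper (hypothetical, not part of the DDSketch API). */
final class RelativeAccuracyBounds {

    /** Smallest value that may be returned for the given exact quantile value. */
    static double lowerBound(double exactValue, double relativeAccuracy) {
        return exactValue * (1 - relativeAccuracy);
    }

    /** Largest value that may be returned for the given exact quantile value. */
    static double upperBound(double exactValue, double relativeAccuracy) {
        return exactValue * (1 + relativeAccuracy);
    }

    /** True if computedValue lies within the relative accuracy of exactValue (non-negative inputs). */
    static boolean isWithinGuarantee(double exactValue, double computedValue, double relativeAccuracy) {
        return Math.abs(computedValue - exactValue) <= relativeAccuracy * exactValue;
    }

    public static void main(String[] args) {
        double relativeAccuracy = 0.01; // 1%
        for (double exactValue : new double[] {100, 1000}) {
            System.out.printf("exact %.0f -> between %.0f and %.0f%n",
                    exactValue,
                    lowerBound(exactValue, relativeAccuracy),
                    upperBound(exactValue, relativeAccuracy));
        }
    }
}
```

The width of the interval grows with the quantile value itself, which is what distinguishes a relative-error guarantee from a rank-error or absolute-error one.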
DDSketch works by mapping floating-point input values to bins and counting the number of values for each
bin. The mapping to bins is handled by IndexMapping, while the underlying structure that keeps track of
bin counts is Store. memoryOptimal(double) constructs a sketch with a logarithmic index mapping, hence low
memory footprint, whereas fast(double) and balanced(double) offer faster ingestion speeds at the cost of
larger memory footprints. The size of the sketch can be upper-bounded by using collapsing stores. For instance,
memoryOptimalCollapsingLowest(double, int) is the version of DDSketch described in the paper, and also
implemented in Go and Python. It collapses the lowest bins when the maximum number of buckets is reached. To use a
specific IndexMapping or a specific implementation of Store, use the constructor
DDSketch(IndexMapping, Supplier).
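
The bin mapping and counting described above can be sketched in a few lines. This is a minimal illustration that follows the logarithmic mapping as formulated in the DDSketch paper (gamma = (1 + accuracy) / (1 - accuracy), bin index = ceil(log_gamma(value))), not the code of this library: the class `LogBinSketch` is a hypothetical stand-in, a `TreeMap` plays the role of the Store, and only strictly positive values are handled for brevity.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a sketch with a logarithmic index mapping and a map-based store. */
final class LogBinSketch {
    private final double gamma;
    private final double logGamma;
    private final TreeMap<Integer, Long> bins = new TreeMap<>(); // bin index -> count
    private long count;

    LogBinSketch(double relativeAccuracy) {
        if (!(relativeAccuracy > 0 && relativeAccuracy < 1)) {
            throw new IllegalArgumentException("The relative accuracy must be between 0 and 1.");
        }
        this.gamma = (1 + relativeAccuracy) / (1 - relativeAccuracy);
        this.logGamma = Math.log(gamma);
    }

    /** Maps a positive value to the index of the bin (gamma^(i-1), gamma^i] that contains it. */
    int index(double value) {
        return (int) Math.ceil(Math.log(value) / logGamma);
    }

    /** Representative value of a bin, within the relative accuracy of every value in that bin. */
    double value(int index) {
        return 2 * Math.pow(gamma, index) / (gamma + 1);
    }

    void accept(double value) {
        if (!(value > 0)) {
            throw new IllegalArgumentException("This simplified sketch only tracks positive values.");
        }
        bins.merge(index(value), 1L, Long::sum);
        count++;
    }

    double getValueAtQuantile(double quantile) {
        if (quantile < 0 || quantile > 1) {
            throw new IllegalArgumentException("The quantile must be between 0 and 1.");
        }
        if (count == 0) {
            throw new java.util.NoSuchElementException("The sketch is empty.");
        }
        double rank = quantile * (count - 1);
        long cumulative = 0;
        // Walk the bins in increasing index order until the requested rank is covered.
        for (Map.Entry<Integer, Long> bin : bins.entrySet()) {
            cumulative += bin.getValue();
            if (cumulative > rank) {
                return value(bin.getKey());
            }
        }
        return value(bins.lastKey());
    }

    public static void main(String[] args) {
        LogBinSketch sketch = new LogBinSketch(0.01);
        for (int v = 1; v <= 1000; v++) {
            sketch.accept(v);
        }
        System.out.println("approximate median: " + sketch.getValueAtQuantile(0.5));
    }
}
```

Because every value in a bin is within the relative accuracy of the bin's representative value, returning that representative for the bin that covers the requested rank yields the relative-error guarantee.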
The memory size of the sketch depends on the range that is covered by the input values: the larger that range, the
more bins are needed to keep track of the input values. As a rough estimate, if working on durations using
memoryOptimal(double) with a relative accuracy of 2%, about 2kB (275 bins) are needed to cover values between 1
millisecond and 1 minute, and about 6kB (802 bins) to cover values between 1 nanosecond and 1 day. The number of
bins that are maintained can be upper-bounded using collapsing stores (see for example
memoryOptimalCollapsingLowest(double, int) and memoryOptimalCollapsingHighest(double, int)).
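
The bin counts quoted above can be reproduced with a short calculation. This is a rough sketch under two assumptions that are not stated on this page: the logarithmic mapping uses gamma = (1 + accuracy) / (1 - accuracy) as in the DDSketch paper, so covering the range from min to max takes about log(max / min) / log(gamma) bins, and each bin counter takes roughly 8 bytes. The class `BinCountEstimate` is a hypothetical name.

```java
/** Hypothetical helper estimating how many logarithmic bins cover a range of values. */
final class BinCountEstimate {

    /** Approximate number of bins needed to cover [minValue, maxValue] at the given relative accuracy. */
    static double estimateBins(double minValue, double maxValue, double relativeAccuracy) {
        double gamma = (1 + relativeAccuracy) / (1 - relativeAccuracy);
        return Math.log(maxValue / minValue) / Math.log(gamma);
    }

    public static void main(String[] args) {
        double relativeAccuracy = 0.02;
        // Durations expressed in nanoseconds.
        double millisecondToMinute = estimateBins(1e6, 60e9, relativeAccuracy);
        double nanosecondToDay = estimateBins(1, 8.64e13, relativeAccuracy);
        System.out.printf("1 ms to 1 min: about %d bins, %d bytes at 8 bytes per bin%n",
                Math.round(millisecondToMinute), 8 * Math.round(millisecondToMinute));
        System.out.printf("1 ns to 1 day: about %d bins, %d bytes at 8 bytes per bin%n",
                Math.round(nanosecondToDay), 8 * Math.round(nanosecondToDay));
    }
}
```

The estimate depends only on the ratio between the largest and smallest values, which is why widening the covered range by several orders of magnitude only triples the bin count.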
Note that this implementation is not thread-safe.
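
One way to work with a non-thread-safe sketch is to give each thread its own instance and merge the partial results once the threads have finished, relying on the merge operation documented under mergeWith(DDSketch). The following is a minimal sketch of that pattern, not the library's code: `MergeableBins` is a hypothetical stand-in that only keeps bin counts, it assumes the paper's logarithmic mapping, and `ingestInParallel` is an illustrative helper.

```java
import java.util.TreeMap;

/** Hypothetical stand-in for a sketch: bin counts keyed by logarithmic bin index. */
final class MergeableBins {
    private final double gamma;
    private final TreeMap<Integer, Long> bins = new TreeMap<>();

    MergeableBins(double relativeAccuracy) {
        this.gamma = (1 + relativeAccuracy) / (1 - relativeAccuracy);
    }

    private int index(double value) {
        return (int) Math.ceil(Math.log(value) / Math.log(gamma));
    }

    /** Not thread-safe: must be called from a single thread. Expects a positive value. */
    void accept(double value) {
        bins.merge(index(value), 1L, Long::sum);
    }

    /** Adds the other instance's bin counts to this one; both must use the same mapping. */
    void mergeWith(MergeableBins other) {
        if (Double.compare(gamma, other.gamma) != 0) {
            throw new IllegalArgumentException("The other instance does not use the same index mapping.");
        }
        other.bins.forEach((binIndex, binCount) -> bins.merge(binIndex, binCount, Long::sum));
    }

    long countAt(double value) {
        return bins.getOrDefault(index(value), 0L);
    }

    long totalCount() {
        long total = 0;
        for (long binCount : bins.values()) {
            total += binCount;
        }
        return total;
    }

    /** Ingests each shard on its own thread with its own instance, then merges the results. */
    static MergeableBins ingestInParallel(double[][] shards, double relativeAccuracy) {
        MergeableBins[] partial = new MergeableBins[shards.length];
        Thread[] workers = new Thread[shards.length];
        for (int i = 0; i < shards.length; i++) {
            final int shard = i;
            partial[shard] = new MergeableBins(relativeAccuracy);
            workers[shard] = new Thread(() -> {
                for (double value : shards[shard]) {
                    partial[shard].accept(value);
                }
            });
            workers[shard].start();
        }
        MergeableBins merged = new MergeableBins(relativeAccuracy);
        try {
            for (int i = 0; i < shards.length; i++) {
                workers[i].join(); // join() makes the worker's writes visible before merging
                merged.mergeWith(partial[i]);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return merged;
    }
}
```

No instance is ever touched by two threads at once, so no locking is needed; the alternative is to guard a single shared instance with external synchronization.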
| Constructor | Description |
|---|---|
| DDSketch(IndexMapping indexMapping, java.util.function.Supplier<Store> storeSupplier) | Constructs an initially empty quantile sketch using the specified IndexMapping and Store supplier. |
| DDSketch(IndexMapping indexMapping, java.util.function.Supplier<Store> storeSupplier, double minIndexedValue) | Constructs an initially empty quantile sketch using the specified IndexMapping and Store supplier. |
| Modifier and Type | Method | Description |
|---|---|---|
| void | accept(double value) | Adds a value to the sketch. |
| void | accept(double value, double count) | Adds a value to the sketch with a floating-point count. |
| void | accept(double value, long count) | Adds a value to the sketch as many times as specified by count. |
| static DDSketch | balanced(double relativeAccuracy) | Constructs a balanced instance of DDSketch, with high ingestion speed and low memory footprint. |
| static DDSketch | balancedCollapsingHighest(double relativeAccuracy, int maxNumBins) | Constructs a balanced instance of DDSketch, with high ingestion speed and low memory footprint, using a limited number of bins. |
| static DDSketch | balancedCollapsingLowest(double relativeAccuracy, int maxNumBins) | Constructs a balanced instance of DDSketch, with high ingestion speed and low memory footprint, using a limited number of bins. |
| DDSketch | copy() | |
| static DDSketch | fast(double relativeAccuracy) | Constructs a fast instance of DDSketch, with optimized ingestion speed, at the cost of higher memory usage. |
| static DDSketch | fastCollapsingHighest(double relativeAccuracy, int maxNumBins) | Constructs a fast instance of DDSketch, with optimized ingestion speed, at the cost of higher memory usage, using a limited number of bins. |
| static DDSketch | fastCollapsingLowest(double relativeAccuracy, int maxNumBins) | Constructs a fast instance of DDSketch, with optimized ingestion speed, at the cost of higher memory usage, using a limited number of bins. |
| static DDSketch | fromProto(java.util.function.Supplier<? extends Store> storeSupplier, com.datadoghq.sketch.ddsketch.proto.DDSketch proto) | Builds a new instance of DDSketch based on the provided protobuf representation, assuming it encodes non-negative values only. |
| double | getCount() | |
| IndexMapping | getIndexMapping() | |
| double | getMaxValue() | |
| double | getMinValue() | |
| Store | getStore() | |
| double | getValueAtQuantile(double quantile) | |
| double[] | getValuesAtQuantiles(double[] quantiles) | |
| boolean | isEmpty() | |
| static DDSketch | memoryOptimal(double relativeAccuracy) | Constructs a memory-optimal instance of DDSketch, with optimized memory usage, at the cost of lower ingestion speed. |
| static DDSketch | memoryOptimalCollapsingHighest(double relativeAccuracy, int maxNumBins) | Constructs a memory-optimal instance of DDSketch, with optimized memory usage, at the cost of lower ingestion speed, using a limited number of bins. |
| static DDSketch | memoryOptimalCollapsingLowest(double relativeAccuracy, int maxNumBins) | Constructs a memory-optimal instance of DDSketch, with optimized memory usage, at the cost of lower ingestion speed, using a limited number of bins. |
| void | mergeWith(DDSketch other) | Merges the other sketch into this one. |
| com.datadoghq.sketch.ddsketch.proto.DDSketch | toProto() | Generates a protobuf representation of this DDSketch. |
public DDSketch(IndexMapping indexMapping, java.util.function.Supplier<Store> storeSupplier)
Constructs an initially empty quantile sketch using the specified IndexMapping and Store supplier.
Parameters:
indexMapping - the mapping between floating-point values and integer indices to be used by the sketch
storeSupplier - the store constructor for keeping track of added values
See Also:
balanced(double), balancedCollapsingLowest(double, int), balancedCollapsingHighest(double, int), fast(double), fastCollapsingLowest(double, int), fastCollapsingHighest(double, int), memoryOptimal(double), memoryOptimalCollapsingLowest(double, int), memoryOptimalCollapsingHighest(double, int)

public DDSketch(IndexMapping indexMapping, java.util.function.Supplier<Store> storeSupplier, double minIndexedValue)
Constructs an initially empty quantile sketch using the specified IndexMapping and Store supplier.
Parameters:
indexMapping - the mapping between floating-point values and integer indices to be used by the sketch
storeSupplier - the store constructor for keeping track of added values
minIndexedValue - the least value that should be distinguished from zero
See Also:
balanced(double), balancedCollapsingLowest(double, int), balancedCollapsingHighest(double, int), fast(double), fastCollapsingLowest(double, int), fastCollapsingHighest(double, int), memoryOptimal(double), memoryOptimalCollapsingLowest(double, int), memoryOptimalCollapsingHighest(double, int)

public IndexMapping getIndexMapping()
public Store getStore()
public void accept(double value)
Adds a value to the sketch.
Specified by:
accept in interface QuantileSketch<DDSketch>
accept in interface java.util.function.DoubleConsumer
Parameters:
value - the value to be added
Throws:
java.lang.IllegalArgumentException - if the value is outside the range that is tracked by the sketch

public void accept(double value, long count)
Adds a value to the sketch as many times as specified by count.
Specified by:
accept in interface QuantileSketch<DDSketch>
Parameters:
value - the value to be added
count - the number of times the value is to be added
Throws:
java.lang.IllegalArgumentException - if the value is outside the range that is tracked by the sketch

public void accept(double value, double count)
Adds a value to the sketch with a floating-point count.
Parameters:
value - the value to be added
count - the weight associated with the value to be added
Throws:
java.lang.IllegalArgumentException - if count is negative

public void mergeWith(DDSketch other)
Merges the other sketch into this one.
Specified by:
mergeWith in interface QuantileSketch<DDSketch>
Parameters:
other - the sketch to be merged into this one
Throws:
java.lang.IllegalArgumentException - if the other sketch does not use the same index mapping

public DDSketch copy()
Specified by:
copy in interface QuantileSketch<DDSketch>

public boolean isEmpty()
Specified by:
isEmpty in interface QuantileSketch<DDSketch>

public double getCount()
Specified by:
getCount in interface QuantileSketch<DDSketch>

public double getMinValue()
Specified by:
getMinValue in interface QuantileSketch<DDSketch>

public double getMaxValue()
Specified by:
getMaxValue in interface QuantileSketch<DDSketch>

public double getValueAtQuantile(double quantile)
Specified by:
getValueAtQuantile in interface QuantileSketch<DDSketch>
Parameters:
quantile - a number between 0 and 1 (both included)

public double[] getValuesAtQuantiles(double[] quantiles)
Specified by:
getValuesAtQuantiles in interface QuantileSketch<DDSketch>
Parameters:
quantiles - numbers between 0 and 1 (both included)

public com.datadoghq.sketch.ddsketch.proto.DDSketch toProto()
Generates a protobuf representation of this DDSketch.
Returns:
a protobuf representation of this DDSketch

public static DDSketch fromProto(java.util.function.Supplier<? extends Store> storeSupplier, com.datadoghq.sketch.ddsketch.proto.DDSketch proto)
Builds a new instance of DDSketch based on the provided protobuf representation, assuming it encodes non-negative values only.
Parameters:
storeSupplier - the constructor of the Store implementation to be used for encoding bin counters
proto - the protobuf representation of a sketch
Returns:
an instance of DDSketch that matches the protobuf representation
Throws:
java.lang.IllegalArgumentException - if the protobuf representation contains negative values

public static DDSketch balanced(double relativeAccuracy)
Constructs a balanced instance of DDSketch, with high ingestion speed and low memory footprint.
Parameters:
relativeAccuracy - the relative accuracy guaranteed by the sketch
Returns:
a balanced instance of DDSketch

public static DDSketch balancedCollapsingLowest(double relativeAccuracy, int maxNumBins)
Constructs a balanced instance of DDSketch, with high ingestion speed and low memory footprint, using a limited number of bins. When the maximum number of bins is reached, the bins with the lowest indices are collapsed, which causes the relative accuracy guarantee to be lost on the lowest quantiles.
Parameters:
relativeAccuracy - the relative accuracy guaranteed by the sketch (for non-collapsed bins)
maxNumBins - the maximum number of bins to be maintained
Returns:
a balanced instance of DDSketch using a limited number of bins

public static DDSketch balancedCollapsingHighest(double relativeAccuracy, int maxNumBins)
Constructs a balanced instance of DDSketch, with high ingestion speed and low memory footprint, using a limited number of bins. When the maximum number of bins is reached, the bins with the highest indices are collapsed, which causes the relative accuracy guarantee to be lost on the highest quantiles.
Parameters:
relativeAccuracy - the relative accuracy guaranteed by the sketch (for non-collapsed bins)
maxNumBins - the maximum number of bins to be maintained
Returns:
a balanced instance of DDSketch using a limited number of bins

public static DDSketch fast(double relativeAccuracy)
Constructs a fast instance of DDSketch, with optimized ingestion speed, at the cost of higher memory usage.
Parameters:
relativeAccuracy - the relative accuracy guaranteed by the sketch
Returns:
a fast instance of DDSketch

public static DDSketch fastCollapsingLowest(double relativeAccuracy, int maxNumBins)
Constructs a fast instance of DDSketch, with optimized ingestion speed, at the cost of higher memory usage, using a limited number of bins. When the maximum number of bins is reached, the bins with the lowest indices are collapsed, which causes the relative accuracy guarantee to be lost on the lowest quantiles.
Parameters:
relativeAccuracy - the relative accuracy guaranteed by the sketch (for non-collapsed bins)
maxNumBins - the maximum number of bins to be maintained
Returns:
a fast instance of DDSketch using a limited number of bins

public static DDSketch fastCollapsingHighest(double relativeAccuracy, int maxNumBins)
Constructs a fast instance of DDSketch, with optimized ingestion speed, at the cost of higher memory usage, using a limited number of bins. When the maximum number of bins is reached, the bins with the highest indices are collapsed, which causes the relative accuracy guarantee to be lost on the highest quantiles.
Parameters:
relativeAccuracy - the relative accuracy guaranteed by the sketch (for non-collapsed bins)
maxNumBins - the maximum number of bins to be maintained
Returns:
a fast instance of DDSketch using a limited number of bins

public static DDSketch memoryOptimal(double relativeAccuracy)
Constructs a memory-optimal instance of DDSketch, with optimized memory usage, at the cost of lower ingestion speed.
Parameters:
relativeAccuracy - the relative accuracy guaranteed by the sketch
Returns:
a memory-optimal instance of DDSketch

public static DDSketch memoryOptimalCollapsingLowest(double relativeAccuracy, int maxNumBins)
Constructs a memory-optimal instance of DDSketch, with optimized memory usage, at the cost of lower ingestion speed, using a limited number of bins. When the maximum number of bins is reached, the bins with the lowest indices are collapsed, which causes the relative accuracy guarantee to be lost on the lowest quantiles.
Parameters:
relativeAccuracy - the relative accuracy guaranteed by the sketch (for non-collapsed bins)
maxNumBins - the maximum number of bins to be maintained
Returns:
a memory-optimal instance of DDSketch using a limited number of bins

public static DDSketch memoryOptimalCollapsingHighest(double relativeAccuracy, int maxNumBins)
Constructs a memory-optimal instance of DDSketch, with optimized memory usage, at the cost of lower ingestion speed, using a limited number of bins. When the maximum number of bins is reached, the bins with the highest indices are collapsed, which causes the relative accuracy guarantee to be lost on the highest quantiles.
Parameters:
relativeAccuracy - the relative accuracy guaranteed by the sketch (for non-collapsed bins)
maxNumBins - the maximum number of bins to be maintained
Returns:
a memory-optimal instance of DDSketch using a limited number of bins
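
The collapsing behavior shared by the CollapsingLowest variants can be sketched as follows. This is a minimal illustration of the idea as described in the DDSketch paper (fold the lowest bin into the next lowest one whenever the limit is exceeded), not the Store implementations shipped with this library: `CollapsingLowestBins` is a hypothetical class backed by a `TreeMap`.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical bin store that keeps at most maxNumBins bins by collapsing the lowest ones. */
final class CollapsingLowestBins {
    private final int maxNumBins;
    private final TreeMap<Integer, Long> bins = new TreeMap<>(); // bin index -> count

    CollapsingLowestBins(int maxNumBins) {
        if (maxNumBins < 1) {
            throw new IllegalArgumentException("The maximum number of bins must be at least 1.");
        }
        this.maxNumBins = maxNumBins;
    }

    /** Increments the count of the given bin, then collapses the lowest bins if there are too many. */
    void add(int index) {
        bins.merge(index, 1L, Long::sum);
        while (bins.size() > maxNumBins) {
            // Remove the lowest bin and add its count to the next lowest one.
            Map.Entry<Integer, Long> lowest = bins.pollFirstEntry();
            bins.merge(bins.firstKey(), lowest.getValue(), Long::sum);
        }
    }

    int numBins() {
        return bins.size();
    }

    int lowestIndex() {
        return bins.firstKey();
    }

    long countAt(int index) {
        return bins.getOrDefault(index, 0L);
    }

    long totalCount() {
        long total = 0;
        for (long binCount : bins.values()) {
            total += binCount;
        }
        return total;
    }
}
```

The total count is preserved, but values from collapsed bins are attributed to a higher bin, which is why the relative accuracy guarantee is lost on the lowest quantiles. A CollapsingHighest variant would mirror this by folding the highest bin into the next highest one.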