public abstract class Sketch extends Object
| Modifier and Type | Method and Description |
|---|---|
| `abstract CompactSketch` | `compact()` Converts this sketch to an ordered CompactSketch on the Java heap. |
| `abstract CompactSketch` | `compact(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)` Converts this sketch to a CompactSketch in the chosen form. |
| `int` | `getCountLessThanTheta(double theta)` Gets the number of hash values less than the given theta. |
| `abstract int` | `getCurrentBytes(boolean compact)` Returns the number of storage bytes required for this Sketch in its current state. |
| `double` | `getEstimate()` Gets the unique count estimate. |
| `abstract Family` | `getFamily()` Returns the Family that this sketch belongs to. |
| `double` | `getLowerBound(int numStdDev)` Gets the approximate lower error bound given the specified number of Standard Deviations. |
| `static int` | `getMaxCompactSketchBytes(int numberOfEntries)` Returns the maximum number of storage bytes required for a CompactSketch with the given number of actual entries. |
| `static int` | `getMaxUpdateSketchBytes(int nomEntries)` Returns the maximum number of storage bytes required for an UpdateSketch with the given number of nominal entries (power of 2). |
| `int` | `getRetainedEntries()` Returns the number of valid entries that have been retained by the sketch. |
| `abstract int` | `getRetainedEntries(boolean valid)` Returns the number of entries that have been retained by the sketch. |
| `static int` | `getSerializationVersion(org.apache.datasketches.memory.Memory mem)` Returns the serialization version from the given Memory. |
| `double` | `getTheta()` Gets the value of theta as a double between zero and one. |
| `abstract long` | `getThetaLong()` Gets the value of theta as a long. |
| `double` | `getUpperBound(int numStdDev)` Gets the approximate upper error bound given the specified number of Standard Deviations. |
| `abstract boolean` | `hasMemory()` Returns true if this sketch's data structure is backed by Memory or WritableMemory. |
| `static Sketch` | `heapify(org.apache.datasketches.memory.Memory srcMem)` Heapify takes the sketch image in Memory and instantiates an on-heap Sketch using the Default Update Seed. |
| `static Sketch` | `heapify(org.apache.datasketches.memory.Memory srcMem, long seed)` Heapify takes the sketch image in Memory and instantiates an on-heap Sketch using the given seed. |
| `abstract boolean` | `isCompact()` Returns true if this sketch is in compact form. |
| `abstract boolean` | `isDirect()` Returns true if this sketch's internal data structure is backed by direct (off-heap) Memory. |
| `abstract boolean` | `isEmpty()` |
| `boolean` | `isEstimationMode()` Returns true if the sketch is in Estimation Mode (as opposed to Exact Mode). |
| `abstract boolean` | `isOrdered()` Returns true if the internal cache is ordered. |
| `boolean` | `isSameResource(org.apache.datasketches.memory.Memory that)` Returns true if the backing resource of this is identical with the backing resource of that. |
| `abstract HashIterator` | `iterator()` Returns a HashIterator that can be used to iterate over the retained hash values of the Theta sketch. |
| `abstract byte[]` | `toByteArray()` Serializes this sketch to a byte array form. |
| `String` | `toString()` Returns a human readable summary of the sketch. |
| `String` | `toString(boolean sketchSummary, boolean dataDetail, int width, boolean hexMode)` Gets a human readable listing of contents and summary of the given sketch. |
| `static String` | `toString(byte[] byteArr)` Returns a human readable string of the preamble of a byte array image of a Theta Sketch. |
| `static String` | `toString(org.apache.datasketches.memory.Memory mem)` Returns a human readable string of the preamble of a Memory image of a Theta Sketch. |
| `static Sketch` | `wrap(org.apache.datasketches.memory.Memory srcMem)` Wrap takes the sketch image in Memory and refers to it directly. |
| `static Sketch` | `wrap(org.apache.datasketches.memory.Memory srcMem, long seed)` Wrap takes the sketch image in Memory and refers to it directly. |
public static Sketch heapify(org.apache.datasketches.memory.Memory srcMem)
srcMem - an image of a Sketch where the image seed hash matches the default seed hash. See Memory.
public static Sketch heapify(org.apache.datasketches.memory.Memory srcMem, long seed)
srcMem - an image of a Sketch where the image seed hash matches the given seed hash. See Memory.
seed - See Update Hash Seed. Compact sketches store a 16-bit hash of the seed, but not the seed itself.
public static Sketch wrap(org.apache.datasketches.memory.Memory srcMem)
Assumes Util.DEFAULT_UPDATE_SEED. See Default Update Seed.
srcMem - an image of a Sketch where the image seed hash matches the default seed hash. See Memory.
public static Sketch wrap(org.apache.datasketches.memory.Memory srcMem, long seed)
srcMem - an image of a Sketch where the image seed hash matches the given seed hash. See Memory.
seed - See Update Hash Seed. Compact sketches store a 16-bit hash of the seed, but not the seed itself.
public abstract CompactSketch compact()
If this sketch is already in compact form this operation returns this.
public abstract CompactSketch compact(boolean dstOrdered, org.apache.datasketches.memory.WritableMemory dstMem)
If this sketch is already in compact form this operation returns this.
Otherwise, this compacting process converts the hash table form of an UpdateSketch to a simple list of the valid hash values from the hash table. Any hash values equal to or greater than theta will be discarded. The number of valid values remaining in the Compact Sketch depends on a number of factors, but may be larger or smaller than Nominal Entries (or k). It will never exceed 2k. If it is critical to always limit the size to no more than k, then rebuild() should be called on the UpdateSketch prior to this.
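The compacting step described above can be pictured with a small stand-alone model. This is only a sketch under stated assumptions, not the library's implementation: the class `CompactionModel` and its `compact` method are hypothetical names, empty hash-table slots are assumed to be marked with 0, and theta is assumed to be held as a long threshold.

```java
import java.util.Arrays;

/** Hypothetical stand-alone model of compaction; not the DataSketches implementation. */
public class CompactionModel {

    /**
     * Copies the valid hash values out of an update-style hash table.
     * Assumption: a slot value of 0 means "empty".
     * Values equal to or greater than thetaLong are discarded.
     */
    static long[] compact(long[] hashTable, long thetaLong, boolean dstOrdered) {
        long[] valid = Arrays.stream(hashTable)
                .filter(h -> h > 0 && h < thetaLong)
                .toArray();
        if (dstOrdered) {
            Arrays.sort(valid); // ordered form: ascending hash values
        }
        return valid;
    }

    public static void main(String[] args) {
        long[] table = {0L, 900L, 15L, 0L, 400L, 1200L, 7L, 0L};
        long thetaLong = 1000L;
        // 1200 is at or above theta and is dropped; the zeros are empty slots.
        System.out.println(Arrays.toString(compact(table, thetaLong, true)));
        // prints [7, 15, 400, 900]
    }
}
```

With `dstOrdered` set to false, the same values are returned in their hash-table order instead of being sorted.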
dstOrdered - See Destination Ordered.
dstMem - See Destination Memory.
public int getCountLessThanTheta(double theta)
theta - the given theta as a double between zero and one.
public abstract int getCurrentBytes(boolean compact)
compact - if true, returns the bytes required for compact form. If this sketch is already in compact form, this parameter is ignored.
public double getEstimate()
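As a rough illustration of what the unique count estimate conveys: in a theta sketch, the estimate is commonly computed as the number of retained valid entries divided by theta, where theta is the fraction of the hash space being sampled. The following is a sketch under that assumption; `EstimateModel` and its `estimate` method are hypothetical names and not part of the library.

```java
/** Hypothetical model of a theta-sketch estimate; not the DataSketches implementation. */
public class EstimateModel {

    /**
     * Assumption: estimate = (valid retained entries) / theta.
     * With theta == 1.0 nothing has been sampled away, so the count is exact.
     */
    static double estimate(int retainedEntries, double theta) {
        if (theta <= 0.0 || theta > 1.0) {
            throw new IllegalArgumentException("theta must be in (0, 1]: " + theta);
        }
        return retainedEntries / theta;
    }

    public static void main(String[] args) {
        System.out.println(estimate(4096, 1.0));  // prints 4096.0 (exact mode)
        System.out.println(estimate(4096, 0.25)); // prints 16384.0 (estimation mode)
    }
}
```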
public abstract Family getFamily()
public abstract HashIterator iterator()
public double getLowerBound(int numStdDev)
numStdDev - See Number of Standard Deviations.
public static int getMaxCompactSketchBytes(int numberOfEntries)
numberOfEntries - the actual number of entries stored with the CompactSketch.
public static int getMaxUpdateSketchBytes(int nomEntries)
nomEntries - Nominal Entries. If this value is not a power of 2, it is rounded up to the next power of 2.
public int getRetainedEntries()
public abstract int getRetainedEntries(boolean valid)
valid - if true, returns the number of valid entries, which are less than theta and used for estimation. Otherwise, returns the number of all entries, valid or not, that are currently in the internal sketch cache.
public static int getSerializationVersion(org.apache.datasketches.memory.Memory mem)
mem - the sketch Memory
public double getTheta()
public abstract long getThetaLong()
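The two theta accessors describe the same threshold in two forms, and getCountLessThanTheta takes the double form. A minimal sketch of the relationship, assuming the long form is scaled so that Long.MAX_VALUE corresponds to 1.0; `ThetaModel` and its methods are hypothetical names used only for illustration.

```java
/** Hypothetical model of theta conversions; not the DataSketches implementation. */
public class ThetaModel {

    /** Assumption: theta as a fraction is thetaLong divided by Long.MAX_VALUE. */
    static double thetaAsDouble(long thetaLong) {
        return thetaLong / (double) Long.MAX_VALUE;
    }

    /** Inverse conversion; the cast saturates at Long.MAX_VALUE when theta is 1.0. */
    static long thetaAsLong(double theta) {
        return (long) (theta * Long.MAX_VALUE);
    }

    /** Counts the hash values strictly below the given fractional theta. */
    static int countLessThanTheta(long[] hashes, double theta) {
        long thetaLong = thetaAsLong(theta);
        int count = 0;
        for (long h : hashes) {
            if (h < thetaLong) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(thetaAsDouble(Long.MAX_VALUE)); // prints 1.0
        System.out.println(thetaAsDouble(1L << 62));       // prints 0.5
        long[] hashes = {1L << 60, 1L << 61, (1L << 62) + 1, 3L << 61};
        System.out.println(countLessThanTheta(hashes, 0.5)); // prints 2
    }
}
```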
public double getUpperBound(int numStdDev)
numStdDev - See Number of Standard Deviations.
public abstract boolean hasMemory()
public abstract boolean isCompact()
public abstract boolean isDirect()
public abstract boolean isEmpty()
public boolean isEstimationMode()
public abstract boolean isOrdered()
public boolean isSameResource(org.apache.datasketches.memory.Memory that)
that - A different non-null object
public abstract byte[] toByteArray()
public String toString()
public String toString(boolean sketchSummary, boolean dataDetail, int width, boolean hexMode)
sketchSummary - If true, the sketch summary will be output at the end.
dataDetail - If true, includes all valid hash values in the sketch.
width - The number of columns of hash values. Default is 8.
hexMode - If true, hashes will be output in hex.
public static String toString(byte[] byteArr)
byteArr - the given byte array
public static String toString(org.apache.datasketches.memory.Memory mem)
mem - the given Memory object
Copyright © 2015–2020 The Apache Software Foundation. All rights reserved.