Package io.trino.type.setdigest
Class SetDigest
java.lang.Object
io.trino.type.setdigest.SetDigest
For the MinHash algorithm, see "On the resemblance and containment of documents" by Andrei Z. Broder,
and the Wikipedia page: http://en.wikipedia.org/wiki/MinHash#Variant_with_a_single_hash_function
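The single-hash variant referenced above keeps only the k smallest hash values of a set and estimates resemblance from them. The following is a minimal Java sketch of that idea, not Trino's implementation: the class name `BottomKSketch`, its methods, and the SplitMix64-style mixing function are all assumptions made for illustration.

```java
import java.util.TreeSet;

// Hypothetical illustration class; not part of io.trino.type.setdigest.
final class BottomKSketch {
    private final int k;                                   // maximum number of hashes retained
    private final TreeSet<Long> minHashes = new TreeSet<>();

    BottomKSketch(int k) {
        this.k = k;
    }

    // Stand-in 64-bit mixer (SplitMix64 finalizer); an assumption, not Trino's hash function.
    static long hash(long value) {
        long z = value + 0x9E3779B97F4A7C15L;
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // Record a value, keeping only the k smallest hashes seen so far.
    void add(long value) {
        minHashes.add(hash(value));
        if (minHashes.size() > k) {
            minHashes.pollLast();                          // drop the largest retained hash
        }
    }

    // Estimate |A ∩ B| / |A ∪ B| from the k smallest hashes of the union.
    static double jaccard(BottomKSketch a, BottomKSketch b) {
        int limit = Math.min(a.k, b.k);
        TreeSet<Long> union = new TreeSet<>(a.minHashes);
        union.addAll(b.minHashes);
        int examined = 0;
        int shared = 0;
        for (long h : union) {                             // ascending order
            if (examined == limit) {
                break;
            }
            if (a.minHashes.contains(h) && b.minHashes.contains(h)) {
                shared++;
            }
            examined++;
        }
        return examined == 0 ? 0.0 : (double) shared / examined;
    }

    public static void main(String[] args) {
        BottomKSketch a = new BottomKSketch(1024);
        BottomKSketch b = new BottomKSketch(1024);
        for (long v = 0; v < 10_000; v++) a.add(v);
        for (long v = 5_000; v < 15_000; v++) b.add(v);
        System.out.println(jaccard(a, b));                 // estimate of the true value 1/3
    }
}
```

When the union of the two sets has at most k distinct values, this sketch returns the exact Jaccard index; for larger sets it returns an estimate whose error shrinks as k grows.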
Field Summary
Fields
Modifier and Type    Field                 Description
static final int     DEFAULT_MAX_HASHES
static final int     NUMBER_OF_BUCKETS
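Constructor Summary
Constructors
SetDigest()
SetDigest(int maxHashes, int numHllBuckets)
SetDigest(int maxHashes, io.airlift.stats.cardinality.HyperLogLog hll, it.unimi.dsi.fastutil.longs.Long2ShortSortedMap minhash)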
Method Summary
Modifier and Type                           Method
void                                        add(long value)
void                                        add(io.airlift.slice.Slice value)
long                                        cardinality()
int                                         estimatedInMemorySize()
int                                         estimatedSerializedSize()
static long                                 exactIntersectionCardinality
                                            getHashCounts
io.airlift.stats.cardinality.HyperLogLog    getHll()
boolean                                     isExact()
static double                               jaccardIndex(SetDigest a, SetDigest b)
void                                        mergeWith
static SetDigest                            newInstance(io.airlift.slice.Slice serialized)
io.airlift.slice.Slice                      serialize()
Field Details
NUMBER_OF_BUCKETS
public static final int NUMBER_OF_BUCKETS
DEFAULT_MAX_HASHES
public static final int DEFAULT_MAX_HASHES
Constructor Details
SetDigest
public SetDigest()
SetDigest
public SetDigest(int maxHashes, int numHllBuckets)
SetDigest
public SetDigest(int maxHashes, io.airlift.stats.cardinality.HyperLogLog hll, it.unimi.dsi.fastutil.longs.Long2ShortSortedMap minhash)
Method Details
newInstance
public static SetDigest newInstance(io.airlift.slice.Slice serialized)
serialize
public io.airlift.slice.Slice serialize()
getHll
public io.airlift.stats.cardinality.HyperLogLog getHll()
estimatedInMemorySize
public int estimatedInMemorySize()
estimatedSerializedSize
public int estimatedSerializedSize()
isExact
public boolean isExact()
cardinality
public long cardinality()
exactIntersectionCardinality
jaccardIndex
public static double jaccardIndex(SetDigest a, SetDigest b)
add
public void add(long value)
add
public void add(io.airlift.slice.Slice value)
mergeWith
getHashCounts
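This page does not show the signatures or behavior of mergeWith and getHashCounts. As a rough illustration of how a MinHash structure that maps each retained hash to a count could be merged, the sketch below sums the counts of matching hashes and then trims back to the smallest `maxHashes` entries. The class `MergeableSketch` and all of its members are hypothetical, and the count-summing rule is an assumption suggested by the `Long2ShortSortedMap minhash` constructor argument, not a statement of what Trino does.

```java
import java.util.TreeMap;

// Hypothetical illustration class; not part of io.trino.type.setdigest.
final class MergeableSketch {
    private final int maxHashes;                           // maximum number of hashes retained
    private final TreeMap<Long, Integer> hashCounts = new TreeMap<>();

    MergeableSketch(int maxHashes) {
        this.maxHashes = maxHashes;
    }

    // Record one occurrence of an already-hashed value.
    void addHash(long hash) {
        hashCounts.merge(hash, 1, Integer::sum);
        trim();
    }

    // Fold another sketch into this one: sum counts, then keep the smallest hashes.
    void mergeWith(MergeableSketch other) {
        other.hashCounts.forEach((hash, count) -> hashCounts.merge(hash, count, Integer::sum));
        trim();
    }

    // Drop the largest hashes until at most maxHashes remain.
    private void trim() {
        while (hashCounts.size() > maxHashes) {
            hashCounts.pollLastEntry();
        }
    }

    // Defensive copy of the retained hash -> count mapping.
    TreeMap<Long, Integer> hashCounts() {
        return new TreeMap<>(hashCounts);
    }

    public static void main(String[] args) {
        MergeableSketch a = new MergeableSketch(3);
        MergeableSketch b = new MergeableSketch(3);
        for (long h : new long[] {10, 20, 30}) a.addHash(h);
        for (long h : new long[] {5, 20, 40}) b.addHash(h);
        a.mergeWith(b);
        System.out.println(a.hashCounts());
    }
}
```

Under these assumptions, merging two sketches gives the same retained hashes as building one sketch over the combined input, which is what makes this kind of structure suitable for partial aggregation.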