org.apache.spark.sql.catalyst.util
HyperLogLogPlusPlusHelper
Companion object HyperLogLogPlusPlusHelper
class HyperLogLogPlusPlusHelper extends Serializable
Linear Supertypes: Serializable, AnyRef, Any
Instance Constructors
- new HyperLogLogPlusPlusHelper(relativeSD: Double)
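The `relativeSD` parameter drives everything else: it determines the precision p, the number of registers m = 2^p, and the number of storage words. A minimal Java sketch of that mapping (not Spark's actual code; the 1.106 constant and the 10-registers-per-long packing are assumptions taken from the general HLL++ storage scheme described on this page):

```java
public class PrecisionSketch {
    // Smallest precision p whose guaranteed relative standard deviation
    // is at most relativeSD (the 1.106 factor is an assumed constant).
    static int precision(double relativeSD) {
        return (int) Math.ceil(2.0 * Math.log(1.106 / relativeSD) / Math.log(2.0));
    }

    public static void main(String[] args) {
        int p = precision(0.05);        // request 5% relative error
        int m = 1 << p;                 // number of 6-bit registers
        int numWords = (m + 9) / 10;    // 10 registers packed per 64-bit word
        System.out.println(p + " " + m + " " + numWords);
    }
}
```

A larger `relativeSD` means fewer registers and less memory; rounding p up is why the guaranteed error can be better than requested (see `trueRsd` below).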
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- def estimateBias(e: Double): Double
Estimate the bias using the raw estimates with their respective biases from the HLL++ appendix. We currently use KNN interpolation to determine the bias (as suggested in the paper).
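The KNN interpolation idea can be sketched as follows in plain Java. The lookup tables here are tiny hypothetical stand-ins; the real per-precision tables come from the HLL++ paper's appendix and are much larger:

```java
import java.util.Arrays;

public class BiasSketch {
    // Hypothetical stand-ins for the per-precision appendix tables.
    static final double[] RAW_ESTIMATES = {10, 20, 30, 40, 50, 60, 70, 80};
    static final double[] BIASES        = { 5,  4,  3,  2,  1, 0.5, 0.2, 0.1};
    static final int K = 6; // number of nearest neighbours to average

    static double estimateBias(double e) {
        Integer[] idx = new Integer[RAW_ESTIMATES.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        // Sort table indices by how close their raw estimate is to e.
        Arrays.sort(idx, (a, b) -> Double.compare(
            Math.abs(RAW_ESTIMATES[a] - e), Math.abs(RAW_ESTIMATES[b] - e)));
        // Average the biases of the K nearest raw estimates.
        double sum = 0.0;
        for (int i = 0; i < K; i++) sum += BIASES[idx[i]];
        return sum / K;
    }
}
```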
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- def merge(buffer1: InternalRow, buffer2: InternalRow, offset1: Int, offset2: Int): Unit
Merge the HLL buffers by iterating through the registers in both buffers and select the maximum number of leading zeros for each register.
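The per-register maximum described above can be sketched in Java, assuming the word layout this page describes (6-bit registers, 10 per 64-bit word). This is an illustration, not Spark's buffer code, which operates on `InternalRow` offsets:

```java
public class MergeSketch {
    static final int REGISTER_SIZE = 6;
    static final int REGISTERS_PER_WORD = 10; // only 60 of 64 bits used
    static final long REGISTER_MASK = (1L << REGISTER_SIZE) - 1;

    // Merge buf2 into buf1: for each register, keep the larger value.
    static void merge(long[] buf1, long[] buf2) {
        for (int word = 0; word < buf1.length; word++) {
            long merged = 0L;
            for (int reg = 0; reg < REGISTERS_PER_WORD; reg++) {
                int shift = reg * REGISTER_SIZE;
                long v1 = (buf1[word] >>> shift) & REGISTER_MASK;
                long v2 = (buf2[word] >>> shift) & REGISTER_MASK;
                merged |= Math.max(v1, v2) << shift;
            }
            buf1[word] = merged;
        }
    }
}
```

Taking the maximum is what makes HLL merging lossless: merging two sketches yields exactly the sketch of the union of their inputs.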
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- val numWords: Int
The number of words used to store the registers. We use Longs for storage because this is the most compact way of storage; Spark aligns to 8-byte words or uses Long wrappers.
We only store whole registers per word in order to prevent overly complex bitwise operations. In practice this means we only use 60 out of 64 bits.
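The whole-registers-per-word layout makes register access a shift and a mask. A self-contained Java sketch of get/set under that layout (illustrative names, not Spark's API):

```java
public class RegisterSketch {
    static final int REGISTER_SIZE = 6;                              // bits per register
    static final int REGISTERS_PER_WORD = Long.SIZE / REGISTER_SIZE; // 10, so 60 of 64 bits
    static final long REGISTER_MASK = (1L << REGISTER_SIZE) - 1;     // 0x3F

    static long get(long[] words, int idx) {
        int shift = (idx % REGISTERS_PER_WORD) * REGISTER_SIZE;
        return (words[idx / REGISTERS_PER_WORD] >>> shift) & REGISTER_MASK;
    }

    static void set(long[] words, int idx, long value) {
        int word = idx / REGISTERS_PER_WORD;
        int shift = (idx % REGISTERS_PER_WORD) * REGISTER_SIZE;
        // Clear the old 6-bit slot, then write the new value into it.
        words[word] = (words[word] & ~(REGISTER_MASK << shift)) | (value << shift);
    }
}
```

Because registers never straddle a word boundary, no access ever touches two words, at the cost of 4 unused bits per word.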
- def query(buffer: InternalRow, bufferOffset: Int): Long
Compute the HyperLogLog estimate.
Variable names in the HLL++ paper match variable names in the code.
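A Java sketch of the core estimate, using the paper's variable names (alpha_m, Z, E). This omits HLL++'s bias correction and shows only the classic raw estimate plus the small-range linear-counting fallback; the alpha_m constant below assumes m >= 128:

```java
public class EstimateSketch {
    static double rawEstimate(int[] registers) {
        int m = registers.length;
        double alphaM = 0.7213 / (1.0 + 1.079 / m); // standard value for m >= 128
        double z = 0.0;   // Z = sum over registers of 2^(-M[j])
        int zeros = 0;    // V = number of empty registers
        for (int r : registers) {
            z += Math.pow(2.0, -r);
            if (r == 0) zeros++;
        }
        double e = alphaM * m * (double) m / z;
        // Small-range correction: fall back to linear counting.
        if (e <= 2.5 * m && zeros > 0) {
            e = m * Math.log((double) m / zeros);
        }
        return e;
    }
}
```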
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- def trueRsd: Double
The rsd of HLL++ is always equal to or better than the rsd requested. This method returns the rsd this instance actually guarantees.
- returns
- the actual rsd.
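The guarantee follows from precision rounding: p is chosen as the smallest integer meeting the requested rsd, and the actual rsd is then determined by m = 2^p. A sketch, assuming the standard 1.04 / sqrt(m) HyperLogLog error bound:

```java
public class RsdSketch {
    // The rsd actually guaranteed by precision p, with m = 2^p registers.
    static double trueRsd(int p) {
        return 1.04 / Math.sqrt(1 << p);
    }
}
```

For example, requesting rsd = 0.05 rounds up to p = 9 (m = 512), whose true rsd of about 0.046 is slightly better than requested.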
- def update(buffer: InternalRow, bufferOffset: Int, _value: Any, dataType: DataType): Unit
Update the HLL++ buffer.
Variable names in the HLL++ paper match variable names in the code.
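Using the paper's names, an update hashes the value, takes the first p bits as the register index idx, and records rho(w), the position of the leftmost 1-bit in the remaining bits w. A Java sketch over a plain register array (illustrative only; the real method writes into an `InternalRow` buffer and hashes per `DataType`):

```java
public class UpdateSketch {
    static void update(int[] registers, long hash, int p) {
        int idx = (int) (hash >>> (64 - p)); // first p bits pick the register
        long w = hash << p;                  // remaining 64 - p bits
        // rho = position of the leftmost 1-bit (1-based); if w == 0 this
        // sketch saturates at the maximum meaningful value, 64 - p + 1.
        int rho = Math.min(Long.numberOfLeadingZeros(w) + 1, 64 - p + 1);
        registers[idx] = Math.max(registers[idx], rho);
    }
}
```

Each register thus tracks the maximum rho ever seen for its slice of the hash space, which is exactly the quantity `merge` combines and `query` aggregates.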
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()