@InterfaceAudience.Private public final class BloomFilterUtil extends Object
| Modifier and Type | Field and Description |
|---|---|
static byte[] |
bitvals
Bit-value lookup array to prevent doing the same work over and over
|
static double |
LOG2_SQUARED
Used in computing the optimal Bloom filter size.
|
static String |
PREFIX_LENGTH_KEY |
static String |
STATS_RECORD_SEP
Record separator for the Bloom filter statistics human-readable string
|
| Modifier and Type | Method and Description |
|---|---|
static double |
actualErrorRate(long maxKeys,
long bitSize,
int functionCount)
Computes the actual error rate for the given number of elements, number of bits, and number of
hash functions.
|
static long |
computeBitSize(long maxKeys,
double errorRate) |
static int |
computeFoldableByteSize(long bitSize,
int foldFactor)
Increases the given byte size of a Bloom filter until it can be folded by the given factor.
|
static long |
computeMaxKeys(long bitSize,
double errorRate,
int hashCount)
The maximum number of keys we can put into a Bloom filter of a certain size to get the given
error rate, with the given number of hash functions.
|
static boolean |
contains(byte[] buf,
int offset,
int length,
ByteBuff bloomBuf,
int bloomOffset,
int bloomSize,
Hash hash,
int hashCount) |
static boolean |
contains(Cell cell,
ByteBuff bloomBuf,
int bloomOffset,
int bloomSize,
Hash hash,
int hashCount,
BloomType type) |
static BloomFilterChunk |
createBySize(int byteSizeHint,
double errorRate,
int hashType,
int foldFactor,
BloomType bloomType)
Creates a Bloom filter chunk of the given size.
|
static String |
formatStats(BloomFilterBase bloomFilter)
A human-readable string with statistics for the given Bloom filter.
|
static byte[] |
getBloomFilterParam(BloomType bloomFilterType,
org.apache.hadoop.conf.Configuration conf) |
static long |
idealMaxKeys(long bitSize,
double errorRate)
The maximum number of keys we can put into a Bloom filter of a certain size to maintain the
given error rate, assuming the number of hash functions is chosen optimally and does not even
have to be an integer (hence the "ideal" in the function name).
|
static int |
optimalFunctionCount(int maxKeys,
long bitSize) |
static void |
setRandomGeneratorForTest(Random random)
Sets a random generator to be used for look-ups instead of computing hashes.
|
static String |
toString(BloomFilterChunk bloomFilter) |
public static final String STATS_RECORD_SEP
public static final double LOG2_SQUARED
public static final String PREFIX_LENGTH_KEY
public static final byte[] bitvals
public static long computeBitSize(long maxKeys,
double errorRate)
public static void setRandomGeneratorForTest(Random random)
This gets used in contains(ByteBuff, int, int, Hash, int, HashKey)
random - The random number source to use, or null to compute actual hashespublic static long idealMaxKeys(long bitSize,
double errorRate)
for a more precise estimatepublic static long computeMaxKeys(long bitSize,
double errorRate,
int hashCount)
public static double actualErrorRate(long maxKeys,
long bitSize,
int functionCount)
public static int computeFoldableByteSize(long bitSize,
int foldFactor)
public static int optimalFunctionCount(int maxKeys,
long bitSize)
public static BloomFilterChunk createBySize(int byteSizeHint, double errorRate, int hashType, int foldFactor, BloomType bloomType)
byteSizeHint - the desired number of bytes for the Bloom filter bit array. Will be
increased so that folding is possible.errorRate - target false positive rate of the Bloom filterhashType - Bloom filter hash function typepublic static boolean contains(byte[] buf,
int offset,
int length,
ByteBuff bloomBuf,
int bloomOffset,
int bloomSize,
Hash hash,
int hashCount)
public static boolean contains(Cell cell, ByteBuff bloomBuf, int bloomOffset, int bloomSize, Hash hash, int hashCount, BloomType type)
public static String formatStats(BloomFilterBase bloomFilter)
bloomFilter - the Bloom filter to output statistics for;STATS_RECORD_SEP.public static String toString(BloomFilterChunk bloomFilter)
public static byte[] getBloomFilterParam(BloomType bloomFilterType, org.apache.hadoop.conf.Configuration conf) throws IllegalArgumentException
IllegalArgumentExceptionCopyright © 2007–2020 The Apache Software Foundation. All rights reserved.