brickhouse.udf.bloom
Class BloomFilter
java.lang.Object
org.apache.hadoop.util.bloom.Filter
brickhouse.udf.bloom.BloomFilter
- All Implemented Interfaces:
- org.apache.hadoop.io.Writable
public class BloomFilter
- extends org.apache.hadoop.util.bloom.Filter
Implements a Bloom filter, as defined by Bloom in 1970.
The Bloom filter is a data structure that the networking research community has widely adopted
for the bandwidth efficiency it offers when transmitting set membership information between
networked hosts. A sender encodes
the information into a bit vector, the Bloom filter, that is more compact than a conventional
representation. Computation and space costs for construction are linear in the number of elements.
The receiver uses the filter to test whether various elements are members of the set. Though the
filter will occasionally return a false positive, it will never return a false negative. When creating
the filter, the sender can choose its desired point in a trade-off between the false positive rate and the size.
Originally created by European Commission One-Lab Project 034819.
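The encoding and membership test described above can be illustrated with a minimal, self-contained sketch. It is not the brickhouse or Hadoop implementation: the class name BloomSketch, the String keys, and the double-hashing scheme are assumptions chosen to keep the example short. It shows why a key that was added is always found, while an unknown key is usually, but not always, rejected.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical teaching sketch; not the brickhouse or Hadoop implementation. */
final class BloomSketch {
    private final BitSet bits;
    private final int vectorSize; // number of bits in the vector
    private final int nbHash;     // number of bit positions set per key

    BloomSketch(int vectorSize, int nbHash) {
        this.bits = new BitSet(vectorSize);
        this.vectorSize = vectorSize;
        this.nbHash = nbHash;
    }

    /** Sets the nbHash bit positions derived from the key. */
    void add(String key) {
        byte[] raw = key.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < nbHash; i++) {
            bits.set(position(raw, i));
        }
    }

    /** Returns false only if the key was definitely never added. */
    boolean membershipTest(String key) {
        byte[] raw = key.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < nbHash; i++) {
            if (!bits.get(position(raw, i))) {
                return false; // a cleared bit proves the key is absent
            }
        }
        return true; // all bits set: present, or a false positive
    }

    // Assumption: double hashing (h1 + i * h2) stands in for the hash
    // family that a production filter would select through hashType.
    private int position(byte[] raw, int i) {
        int h1 = Arrays.hashCode(raw);
        int h2 = fnv1a(raw);
        return Math.floorMod(h1 + i * h2, vectorSize);
    }

    // 32-bit FNV-1a, used here only as a second independent-looking hash.
    private static int fnv1a(byte[] raw) {
        int h = 0x811c9dc5;
        for (byte b : raw) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }

    public static void main(String[] args) {
        BloomSketch filter = new BloomSketch(1024, 3);
        filter.add("alpha");
        filter.add("beta");
        System.out.println(filter.membershipTest("alpha")); // true: added keys are always found
        System.out.println(filter.membershipTest("beta"));  // true
    }
}
```

Because add only ever sets bits and membershipTest only reports true when every derived bit is set, a false negative cannot occur. A false positive occurs when other keys happen to have set all of the bits that an absent key maps to.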
- See Also:
The general behavior of a filter,
Space/Time Trade-Offs in Hash Coding with Allowable Errors
Fields inherited from class org.apache.hadoop.util.bloom.Filter
hash, hashType, nbHash, vectorSize
Constructor Summary
BloomFilter()
- Default constructor - use with readFields
BloomFilter(int vectorSize, int nbHash, int hashType)
- Constructor
Methods inherited from class org.apache.hadoop.util.bloom.Filter
add, add, add
BloomFilter
public BloomFilter()
- Default constructor; use with readFields to populate the filter from serialized data.
BloomFilter
public BloomFilter(int vectorSize,
int nbHash,
int hashType)
- Constructor
- Parameters:
vectorSize - The vector size of this filter.
nbHash - The number of hash functions to consider.
hashType - The type of the hashing function (see Hash).
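Choosing vectorSize and nbHash is the trade-off between false positive rate and size mentioned in the class description. The sketch below applies the standard Bloom filter estimates to derive both values from an expected number of keys and a target false positive rate. The helper class BloomSizing and its method names are hypothetical and are not part of this API.

```java
/** Hypothetical sizing helper; the formulas are the standard Bloom filter estimates. */
final class BloomSizing {
    /** Bits needed for n expected keys at target false positive rate p. */
    static int optimalVectorSize(int n, double p) {
        return (int) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /** Number of hash functions that minimizes the false positive rate. */
    static int optimalNbHash(int n, int vectorSize) {
        return Math.max(1, (int) Math.round((double) vectorSize / n * Math.log(2)));
    }

    /** Estimated false positive rate: (1 - e^(-k*n/m))^k. */
    static double falsePositiveRate(int n, int vectorSize, int nbHash) {
        return Math.pow(1 - Math.exp(-1.0 * nbHash * n / vectorSize), nbHash);
    }

    public static void main(String[] args) {
        int n = 1000;
        int vectorSize = optimalVectorSize(n, 0.01);
        int nbHash = optimalNbHash(n, vectorSize);
        System.out.println(vectorSize + " bits, " + nbHash + " hash functions");
        // prints "9586 bits, 7 hash functions"
        System.out.printf("estimated false positive rate: %.4f%n",
                falsePositiveRate(n, vectorSize, nbHash));
        // Assumption: with Hadoop on the classpath, these values could be passed as
        // new BloomFilter(vectorSize, nbHash, Hash.MURMUR_HASH)
    }
}
```

For 1000 keys at a 1% target rate this works out to roughly 9.6 bits per key. Halving the target rate costs only about 1.4 additional bits per key, which is why the size side of the trade-off is usually cheap to adjust.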
add
public void add(org.apache.hadoop.util.bloom.Key key)
- Specified by:
add in class org.apache.hadoop.util.bloom.Filter
and
public void and(org.apache.hadoop.util.bloom.Filter filter)
- Specified by:
and in class org.apache.hadoop.util.bloom.Filter
membershipTest
public boolean membershipTest(org.apache.hadoop.util.bloom.Key key)
- Specified by:
membershipTest in class org.apache.hadoop.util.bloom.Filter
not
public void not()
- Specified by:
not in class org.apache.hadoop.util.bloom.Filter
or
public void or(org.apache.hadoop.util.bloom.Filter filter)
- Specified by:
or in class org.apache.hadoop.util.bloom.Filter
xor
public void xor(org.apache.hadoop.util.bloom.Filter filter)
- Specified by:
xor in class org.apache.hadoop.util.bloom.Filter
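The and, not, or, and xor methods above are declared by the Hadoop Filter class as logical operations between filters. As a rough illustration of what that means at the bit-vector level, the sketch below applies the same operations to plain java.util.BitSet values. It assumes both vectors have the same size and were filled using the same hash functions; the class BitVectorOps is hypothetical and does not reproduce this class's argument checks.

```java
import java.util.BitSet;

/** Hypothetical sketch of the bitwise operations on plain bit vectors. */
final class BitVectorOps {
    static BitSet and(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.and(b); // keeps only bits set in both vectors
        return r;
    }

    static BitSet or(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.or(b); // keeps bits set in either vector
        return r;
    }

    static BitSet xor(BitSet a, BitSet b) {
        BitSet r = (BitSet) a.clone();
        r.xor(b); // keeps bits set in exactly one vector
        return r;
    }

    static BitSet not(BitSet a, int vectorSize) {
        BitSet r = (BitSet) a.clone();
        r.flip(0, vectorSize); // inverts every bit within the vector
        return r;
    }

    /** Builds a vector with the given bit positions set. */
    static BitSet of(int... positions) {
        BitSet r = new BitSet();
        for (int p : positions) {
            r.set(p);
        }
        return r;
    }

    public static void main(String[] args) {
        BitSet a = of(1, 5);
        BitSet b = of(5, 9);
        System.out.println(or(a, b));   // {1, 5, 9}
        System.out.println(and(a, b));  // {5}
        System.out.println(xor(a, b));  // {1, 9}
        System.out.println(not(a, 10)); // {0, 2, 3, 4, 6, 7, 8, 9}
    }
}
```

Of the four, or has the most direct set interpretation: the result has exactly the bits that would be set by adding every key from both filters to a single filter, so it behaves as their union.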
toString
public String toString()
- Overrides:
toString in class Object
getVectorSize
public int getVectorSize()
- Returns:
- the vector size of the Bloom filter
write
public void write(DataOutput out)
throws IOException
- Specified by:
write in interface org.apache.hadoop.io.Writable
- Overrides:
write in class org.apache.hadoop.util.bloom.Filter
- Throws:
IOException
readFields
public void readFields(DataInput in)
throws IOException
- Specified by:
readFields in interface org.apache.hadoop.io.Writable
- Overrides:
readFields in class org.apache.hadoop.util.bloom.Filter
- Throws:
IOException
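The write and readFields pair follows the Writable pattern: an instance serializes its state to a DataOutput, and an empty instance created with the default constructor restores that state from a DataInput. The sketch below mirrors the pattern with a hypothetical class, WritableStyleSketch, using only java.io so that it runs without Hadoop. Its wire format (vector size, byte count, raw bytes) is an assumption and is not the format this class writes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.BitSet;

/** Hypothetical stand-in that mimics the write/readFields contract. */
final class WritableStyleSketch {
    private int vectorSize;
    private BitSet bits = new BitSet();

    /** Default constructor: leaves the instance empty until readFields is called. */
    WritableStyleSketch() { }

    WritableStyleSketch(int vectorSize) {
        this.vectorSize = vectorSize;
        this.bits = new BitSet(vectorSize);
    }

    void set(int position) { bits.set(position); }
    boolean get(int position) { return bits.get(position); }
    int getVectorSize() { return vectorSize; }

    /** Serializes the state; the layout here is an assumption for illustration. */
    void write(DataOutput out) throws IOException {
        byte[] raw = bits.toByteArray();
        out.writeInt(vectorSize);
        out.writeInt(raw.length);
        out.write(raw);
    }

    /** Replaces the current state with the state read from the input. */
    void readFields(DataInput in) throws IOException {
        vectorSize = in.readInt();
        byte[] raw = new byte[in.readInt()];
        in.readFully(raw);
        bits = BitSet.valueOf(raw);
    }

    /** Writes the source to a byte buffer and reads it back into a fresh instance. */
    static WritableStyleSketch roundTrip(WritableStyleSketch source) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            source.write(new DataOutputStream(buffer));
            WritableStyleSketch copy = new WritableStyleSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WritableStyleSketch original = new WritableStyleSketch(16);
        original.set(3);
        original.set(7);
        WritableStyleSketch copy = roundTrip(original);
        System.out.println(copy.getVectorSize() + " " + copy.get(3) + " " + copy.get(4));
        // prints "16 true false"
    }
}
```

The same shape applies to a real filter: build it with the three-argument constructor, call write, and on the receiving side create an instance with the default constructor and call readFields.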
Copyright © 2013. All rights reserved.