Package brickhouse.udf.bloom

Class Summary
BloomAndUDF  
BloomContainsUDF Returns true if the bloom (probably) contains the string.
BloomFactory Utility class for construction and serialization of BloomFilters ...
BloomFilter Implements a Bloom filter, as defined by Bloom in 1970.
BloomNotUDF  
BloomOrUDF  
BloomUDAF Constructs a BloomFilter by aggregating on keys. Uses the Hadoop util BloomFilter class. Use with bloom_contains( key, bloomfile ). Example: insert overwrite local directory bloomfile select bloom( ks_uid ) from big_table where premise = true; add file bloomfile; select ks_uid from other_big_table where bloom_contains( key, distributed_bloom('bloomfile') );
BloomUDAF.BloomUDAFEvaluator  
DistributedBloomUDF UDF to access a bloom stored in a file in the distributed cache. Assumes the file is a tab-separated file of name-value pairs that has been placed in the distributed cache using the "add file" command. Example: INSERT OVERWRITE LOCAL DIRECTORY mybloom select bloom(key) from my_map_table where premise=true; ADD FILE mybloom; select * from my_big_table where bloom_contains( key, distributed_bloom('mybloom') ) == true;
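
The classes above cover building a Bloom filter, testing membership, and combining filters with AND and OR. A minimal Java sketch of those operations follows. It is an illustration only: the class name BloomSketch, its methods, and its hashing scheme are hypothetical and are not taken from brickhouse or from the Hadoop BloomFilter class. It assumes both filters in a combination share the same bit count and hash count.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical illustration; not the brickhouse or Hadoop implementation.
class BloomSketch {
    private final BitSet bits;
    private final int size;       // number of bits in the filter
    private final int numHashes;  // number of bit positions set per key

    BloomSketch(int size, int numHashes) {
        this(size, numHashes, new BitSet(size));
    }

    private BloomSketch(int size, int numHashes, BitSet bits) {
        this.size = size;
        this.numHashes = numHashes;
        this.bits = bits;
    }

    // Derive the i-th bit position from two simple base hashes (double hashing).
    private int index(String key, int i) {
        int h1 = 0;
        int h2 = 0;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h1 = 31 * h1 + b;
            h2 = 131 * h2 + b;
        }
        return Math.floorMod(h1 + i * (h2 | 1), size);
    }

    // Record a key by setting all of its bit positions.
    void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(key, i));
        }
    }

    // True if the key is probably present; false means definitely absent.
    boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Union: the result reports every key added to either filter.
    BloomSketch or(BloomSketch other) {
        checkCompatible(other);
        BitSet merged = (BitSet) bits.clone();
        merged.or(other.bits);
        return new BloomSketch(size, numHashes, merged);
    }

    // Intersection: the result reports every key added to both filters
    // (and possibly some extra false positives).
    BloomSketch and(BloomSketch other) {
        checkCompatible(other);
        BitSet merged = (BitSet) bits.clone();
        merged.and(other.bits);
        return new BloomSketch(size, numHashes, merged);
    }

    private void checkCompatible(BloomSketch other) {
        if (size != other.size || numHashes != other.numHashes) {
            throw new IllegalArgumentException("filters must share size and hash count");
        }
    }

    public static void main(String[] args) {
        BloomSketch a = new BloomSketch(1024, 3);
        a.add("alice");
        a.add("bob");
        BloomSketch b = new BloomSketch(1024, 3);
        b.add("bob");
        b.add("carol");
        System.out.println(a.mightContain("alice"));       // true
        System.out.println(a.or(b).mightContain("carol")); // true
        System.out.println(a.and(b).mightContain("bob"));  // true
    }
}
```

A key that was added is always reported as present, while a key that was never added may occasionally be reported as present too, which is why bloom_contains is described as returning true when the bloom "probably" contains the string.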
 



Copyright © 2013. All rights reserved.