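Workaround for Hive bug HIVE-1955:
https://issues.apache.org/jira/browse/HIVE-1955
Indexing with a non-constant expression fails with
FAILED: Error in semantic analysis: Line 4:3 Non-constant expressions for array indexes not supported key
Use this function instead of the [ ] syntax when the index is not a constant.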
Constructs a Bloom filter by aggregating over keys.
Uses the Hadoop util BloomFilter class.
Use the result with bloom_contains( key, distributed_bloom('bloomfile') ), for example:
insert overwrite local directory 'bloomfile'
select bloom( ks_uid )
from big_table
where premise = true;
add file bloomfile;
select ks_uid
from other_big_table
where bloom_contains( key, distributed_bloom('bloomfile') );
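To make the aggregation idea concrete, here is a minimal Java sketch of a Bloom filter that can be built per partition and then merged. It is a simplified stand-in, not the Hadoop util BloomFilter class the UDF actually uses; the class name BloomSketch, its methods, and the hashing scheme are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

/** Simplified Bloom filter for illustration only; NOT Hadoop's BloomFilter. */
final class BloomSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** FNV-1a over the key bytes, used as the second base hash. */
    private static int fnv1a(byte[] data) {
        int hash = 0x811c9dc5;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 16777619;
        }
        return hash;
    }

    /** Derives the i-th bit position from two base hashes (double hashing). */
    private int position(String key, int i) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        int h1 = Arrays.hashCode(data);
        int h2 = fnv1a(data);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Sets every bit position derived from the key. */
    void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    /** False means definitely absent; true means possibly present. */
    boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** Two filters of identical shape merge with a bitwise OR. */
    void merge(BloomSketch other) {
        if (other.numBits != numBits || other.numHashes != numHashes) {
            throw new IllegalArgumentException("filters must have the same shape");
        }
        bits.or(other.bits);
    }
}
```

Because merging is a bitwise OR, partial filters built on different rows can be combined in any order, which is the property an aggregate function relies on. A filter never reports a key it has seen as absent, but it may report an unseen key as present, so the query above can let through some rows that a true join would reject.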
UDF for combining two lists or two maps together
across multiple rows (in a grouping),
so that state can be stored and we can calculate
things like "previous actors"
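The combining step can be pictured as a fold over the rows of one group. The Java sketch below is only an illustration under stated assumptions: lists are concatenated in row order, and for maps a later row overwrites an earlier value for the same key. The source does not specify either behaviour, and the names CombineSketch, combineLists, and combineMaps are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative fold over the rows of one group; not the UDF's actual code. */
final class CombineSketch {

    /** Concatenates the list from each row, skipping null rows (assumed semantics). */
    static <T> List<T> combineLists(List<List<T>> rows) {
        List<T> combined = new ArrayList<>();
        for (List<T> row : rows) {
            if (row != null) {
                combined.addAll(row);
            }
        }
        return combined;
    }

    /** Merges the map from each row; later rows win on duplicate keys (assumed semantics). */
    static <K, V> Map<K, V> combineMaps(List<Map<K, V>> rows) {
        Map<K, V> combined = new LinkedHashMap<>();
        for (Map<K, V> row : rows) {
            if (row != null) {
                combined.putAll(row);
            }
        }
        return combined;
    }
}
```

Carrying the combined collection forward is what lets a later row see, for instance, every actor recorded by the earlier rows of the same group.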
UDF to access a Bloom filter from a file stored in distributed cache
Assumes the file is a tab-separated file of name-value pairs,
which has been placed in distributed cache using the "add file" command
Example
INSERT OVERWRITE LOCAL DIRECTORY 'mybloom' select bloom(key) from my_map_table where premise=true;
ADD FILE mybloom;
select *
from my_big_table
where bloom_contains( key, distributed_bloom('mybloom') ) == true;
UDF to access a distributed map file
Assumes the file is a tab-separated file of name-value pairs,
which has been placed in distributed cache using the "add file" command
Example
INSERT OVERWRITE LOCAL DIRECTORY 'mymap' select key,value from my_map_table;
ADD FILE mymap;
select key, val * distributed_map( key, 'mymap' ) from the_table;
If only one argument is passed in, it is assumed to be the filename of
a map file, and the entire map is returned.
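As an illustration of the file format described above, the following Java sketch parses tab-separated name-value lines into an in-memory map. It is not the UDF's implementation. The class name MapFileSketch and the method parse are hypothetical, and treating values as doubles is an assumption suggested only by the multiplication in the example query.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative parser for a tab-separated name-value file; not the UDF's code. */
final class MapFileSketch {

    /** Builds a lookup map from lines of the form "name<TAB>value". */
    static Map<String, Double> parse(List<String> lines) {
        Map<String, Double> map = new HashMap<>();
        for (String line : lines) {
            // Split on the first tab only, so the value field is kept whole.
            String[] fields = line.split("\t", 2);
            if (fields.length < 2) {
                continue; // skip blank or malformed lines
            }
            // Assumption: values are numeric and parsed as doubles.
            map.put(fields[0], Double.parseDouble(fields[1].trim()));
        }
        return map;
    }
}
```

The lines could come from java.nio.file.Files.readAllLines on the local copy of the file. A keyed lookup then corresponds to the two-argument form, and returning the whole map corresponds to the one-argument form.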