brickhouse.hbase
Class CachedGetUDF
java.lang.Object
org.apache.hadoop.hive.ql.udf.generic.GenericUDF
brickhouse.hbase.CachedGetUDF
public class CachedGetUDF
extends org.apache.hadoop.hive.ql.udf.generic.GenericUDF
Loads data from HBase and caches it locally in memory for faster access.
This is similar to using a distributed map, except that the values are stored in HBase
and can be sharded across multiple nodes, so one can process data sets that
would not fit into memory on a single node.
This may be useful in situations where you would otherwise have a Cartesian
product (Bayesian topic assignment, similarity clustering) and
want to avoid an extra join.
One can cache strings, or arbitrary Hive structures, by storing values as
JSON strings and using a template object similar to the one used in
the from_json UDF. Examples include a map holding a bag-of-words,
or an array<string> holding a sketch set.
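The caching behaviour described above can be illustrated with a minimal read-through cache. This is only a sketch and not the Brickhouse implementation: the class name ReadThroughCacheSketch, the remoteGet function (a stand-in for an HBase lookup) and the call counter are all hypothetical names introduced for illustration, and values are assumed to be plain or JSON strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of a read-through cache; not the Brickhouse source.
class ReadThroughCacheSketch {
    // Local in-memory cache, keyed by the lookup key.
    private final Map<String, String> cache = new HashMap<>();
    // Assumption: stands in for a remote HBase lookup of a single key.
    private final Function<String, String> remoteGet;
    // Counts how often the backing store was actually consulted.
    private int remoteCalls = 0;

    ReadThroughCacheSketch(Function<String, String> remoteGet) {
        this.remoteGet = remoteGet;
    }

    // Returns the cached value if present; otherwise fetches it once and caches it.
    String getValue(String key) {
        if (cache.containsKey(key)) {
            return cache.get(key);
        }
        remoteCalls++;
        String value = remoteGet.apply(key);
        // Null results are cached too, so missing keys are not re-fetched.
        cache.put(key, value);
        return value;
    }

    int getRemoteCalls() {
        return remoteCalls;
    }

    public static void main(String[] args) {
        // A HashMap plays the role of the remote table in this sketch.
        Map<String, String> fakeTable = new HashMap<>();
        fakeTable.put("doc1", "{\"hive\":2,\"hbase\":1}");

        ReadThroughCacheSketch lookup = new ReadThroughCacheSketch(fakeTable::get);
        System.out.println(lookup.getValue("doc1"));
        System.out.println(lookup.getValue("doc1"));
        System.out.println("remote calls: " + lookup.getRemoteCalls());
    }
}
```

In this sketch the second lookup of the same key is served from memory, so the backing store is consulted only once per distinct key. An unbounded HashMap is used for brevity; whether the real UDF bounds or evicts its cache is not stated here.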
Nested classes/interfaces inherited from class org.apache.hadoop.hive.ql.udf.generic.GenericUDF
org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredJavaObject, org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject

Method Summary
Object evaluate(org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject[] arg0)
String getDisplayString(String[] arg0)
Object getValue(String key)
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector initialize(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector[] parameters)
    User should pass in a constant Map of HBase parameters,
    the String key to look up,

Methods inherited from class org.apache.hadoop.hive.ql.udf.generic.GenericUDF
getRequiredFiles, getRequiredJars, initializeAndFoldConstants

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CachedGetUDF
public CachedGetUDF()
evaluate
public Object evaluate(org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject[] arg0)
throws org.apache.hadoop.hive.ql.metadata.HiveException
- Specified by:
evaluate in class org.apache.hadoop.hive.ql.udf.generic.GenericUDF
- Throws:
org.apache.hadoop.hive.ql.metadata.HiveException
getValue
public Object getValue(String key)
getDisplayString
public String getDisplayString(String[] arg0)
- Specified by:
getDisplayString in class org.apache.hadoop.hive.ql.udf.generic.GenericUDF
initialize
public org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector initialize(org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector[] parameters)
throws org.apache.hadoop.hive.ql.exec.UDFArgumentException
- User should pass in a constant Map of HBase parameters,
the String key to look up,
- Specified by:
initialize in class org.apache.hadoop.hive.ql.udf.generic.GenericUDF
- Throws:
org.apache.hadoop.hive.ql.exec.UDFArgumentException
Copyright © 2013. All rights reserved.