brickhouse.udf.sketch
Class SetSimilarityUDF
java.lang.Object
org.apache.hadoop.hive.ql.exec.UDF
brickhouse.udf.sketch.SetSimilarityUDF
public class SetSimilarityUDF
extends org.apache.hadoop.hive.ql.exec.UDF
Computes the Jaccard similarity of two sketch sets.
Jaccard similarity is defined as the size of the intersection of two sets divided by the
size of their union. Because sketches are only approximate measures, the result is an
estimate, and the calculation only makes sense when the two sets are roughly the same size.
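To make the definition concrete, here is a minimal sketch of an exact Jaccard computation over two string lists. It is not the Brickhouse implementation: it works on the full sets rather than on sketches, the class name JaccardExample is hypothetical, and returning 0.0 when both inputs are empty is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration class; not part of brickhouse.udf.sketch.
class JaccardExample {

    /** Exact Jaccard similarity |A ∩ B| / |A ∪ B|, treating each list as a set. */
    static double jaccard(List<String> a, List<String> b) {
        Set<String> setA = new HashSet<>(a);
        Set<String> setB = new HashSet<>(b);

        // Assumption: two empty sets are given similarity 0.0 to avoid dividing by zero.
        if (setA.isEmpty() && setB.isEmpty()) {
            return 0.0;
        }

        // Elements present in both sets.
        Set<String> intersection = new HashSet<>(setA);
        intersection.retainAll(setB);

        // |A ∪ B| = |A| + |B| - |A ∩ B|
        int unionSize = setA.size() + setB.size() - intersection.size();
        return (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("a", "b", "c");
        List<String> b = Arrays.asList("b", "c", "d");
        // Intersection {b, c} has 2 elements, union {a, b, c, d} has 4.
        System.out.println(jaccard(a, b)); // prints 0.5
    }
}
```

A sketch-based estimate trades this exactness for bounded memory, which is why the caveat about comparably sized sets applies to the UDF but not to the exact version shown here.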
Methods inherited from class org.apache.hadoop.hive.ql.exec.UDF:
getRequiredFiles, getRequiredJars, getResolver, setResolver
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SetSimilarityUDF
public SetSimilarityUDF()
evaluate
public Double evaluate(List<String> a, List<String> b)
Copyright © 2013. All rights reserved.