brickhouse.udf.sketch
Class SetSimilarityUDF

java.lang.Object
  extended by org.apache.hadoop.hive.ql.exec.UDF
      extended by brickhouse.udf.sketch.SetSimilarityUDF

public class SetSimilarityUDF
extends org.apache.hadoop.hive.ql.exec.UDF

Computes the Jaccard similarity of two sketch sets. Jaccard similarity is defined as the size of the intersection of two sets divided by the size of their union. Because sketches are approximate, the result is meaningful only when the two sets are roughly the same size.
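
To make the definition above concrete, the following is a minimal sketch of an exact Jaccard computation over two string lists. It is not the Brickhouse implementation, which operates on sketch sets and is not shown on this page. The class name JaccardExample, the method name jaccard, and the choice to return 0.0 when both inputs are empty are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Brickhouse.
class JaccardExample {

    // Exact Jaccard similarity: |A ∩ B| / |A ∪ B|, with duplicates ignored.
    static double jaccard(List<String> a, List<String> b) {
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            // Assumed convention: two empty inputs have similarity 0.0.
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(new HashSet<>(b));
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        List<String> x = Arrays.asList("a", "b", "c");
        List<String> y = Arrays.asList("b", "c", "d");
        // Intersection {b, c} has 2 elements; union {a, b, c, d} has 4.
        System.out.println(jaccard(x, y)); // prints 0.5
    }
}
```

An exact computation like this needs every element of both sets in memory. A sketch-based estimate trades that cost for approximation error, which is why the size caveat above applies.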


Constructor Summary
SetSimilarityUDF()
           
 
Method Summary
 Double evaluate(List<String> a, List<String> b)
           
 
Methods inherited from class org.apache.hadoop.hive.ql.exec.UDF
getRequiredFiles, getRequiredJars, getResolver, setResolver
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SetSimilarityUDF

public SetSimilarityUDF()

Method Detail

evaluate

public Double evaluate(List<String> a,
                       List<String> b)


Copyright © 2013. All rights reserved.