public class SourcedSoftTFIDF extends SourcedTFIDF
On the WHIRL datasets, thresholding JaroWinkler at 0.9 or 0.95 seems to be about right.
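To make the role of the token match threshold concrete, here is a minimal, self-contained sketch. It is not this class's implementation: `SoftMatchSketch`, `jaroWinkler`, and `countSoftMatches` are hypothetical names, the Jaro-Winkler code is the textbook formulation (the library's own JaroWinkler may differ in details), and TF-IDF weighting is left out entirely. It only shows how a threshold such as 0.9 decides whether two tokens count as a match.

```java
import java.util.List;

// Hypothetical illustration only; not part of the documented library.
final class SoftMatchSketch {

    // Textbook Jaro similarity in [0, 1].
    static double jaro(String s, String t) {
        if (s.equals(t)) return 1.0;
        int ls = s.length(), lt = t.length();
        if (ls == 0 || lt == 0) return 0.0;
        int window = Math.max(0, Math.max(ls, lt) / 2 - 1);
        boolean[] sm = new boolean[ls], tm = new boolean[lt];
        int matches = 0;
        // Greedily pair equal characters that lie within the match window.
        for (int i = 0; i < ls; i++) {
            int lo = Math.max(0, i - window), hi = Math.min(lt - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!tm[j] && s.charAt(i) == t.charAt(j)) {
                    sm[i] = tm[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) return 0.0;
        // Count matched characters that appear in a different order.
        int k = 0, outOfOrder = 0;
        for (int i = 0; i < ls; i++) {
            if (!sm[i]) continue;
            while (!tm[k]) k++;
            if (s.charAt(i) != t.charAt(k)) outOfOrder++;
            k++;
        }
        double m = matches;
        return (m / ls + m / lt + (m - outOfOrder / 2.0) / m) / 3.0;
    }

    // Winkler adjustment: boost by common prefix length (at most 4), scale 0.1.
    static double jaroWinkler(String s, String t) {
        double j = jaro(s, t);
        int max = Math.min(4, Math.min(s.length(), t.length()));
        int prefix = 0;
        while (prefix < max && s.charAt(prefix) == t.charAt(prefix)) prefix++;
        return j + prefix * 0.1 * (1.0 - j);
    }

    // Counts tokens of s whose best-matching token in t reaches the threshold.
    static int countSoftMatches(List<String> sTokens, List<String> tTokens, double threshold) {
        int count = 0;
        for (String a : sTokens) {
            double best = 0.0;
            for (String b : tTokens) {
                best = Math.max(best, jaroWinkler(a, b));
            }
            if (best >= threshold) count++;
        }
        return count;
    }
}
```

Under this sketch, "martha" and "marhta" score about 0.961, so they count as a match at a threshold of 0.9 or 0.95 but not at 0.97.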
Inherited nested classes: SourcedTFIDF.UnitVector
Inherited fields: collectionSize, documentFrequency, totalTokenCount, tokenizer

| Constructor and Description |
|---|
| SourcedSoftTFIDF() |
| SourcedSoftTFIDF(SourcedTokenizer tokenizer, StringDistance tokenDistance, double tokenMatchThreshold) |
| SourcedSoftTFIDF(StringDistance tokenDistance) |
| SourcedSoftTFIDF(StringDistance tokenDistance, double tokenMatchThreshold) |

| Modifier and Type | Method and Description |
|---|---|
| String | explainScore(StringWrapper s, StringWrapper t) Explain how the distance was computed. |
| double | getTokenMatchThreshold() |
| double | score(StringWrapper s0, StringWrapper t0) This method needs to be implemented by subclasses. |
| void | setTokenMatchThreshold(double d) |
| void | setTokenMatchThreshold(Double d) |
| String | toString() |

Inherited methods:
asUnitVector, getCollectionSize, getDocumentFrequency, getTokens, getWeight, main, prepare, setCollectionSize, setDocumentFrequency
checkTrainingHasHappened, train
asBagOfSourcedTokens, prepare, setStringWrapperPool
addExample, doMain, explainScore, getDistance, hasNextQuery, nextQuery, prepare, score, setDistanceInstancePool

public SourcedSoftTFIDF(SourcedTokenizer tokenizer, StringDistance tokenDistance, double tokenMatchThreshold)
public SourcedSoftTFIDF(StringDistance tokenDistance, double tokenMatchThreshold)
public SourcedSoftTFIDF(StringDistance tokenDistance)
public SourcedSoftTFIDF()
public void setTokenMatchThreshold(double d)
public void setTokenMatchThreshold(Double d)
public double getTokenMatchThreshold()
public double score(StringWrapper s0, StringWrapper t0)
Description copied from class AbstractStringDistance: This method needs to be implemented by subclasses.
Specified by: score in interface StringDistance
Overrides: score in class SourcedTFIDF
public String explainScore(StringWrapper s, StringWrapper t)
Specified by: explainScore in interface StringDistance
Overrides: explainScore in class SourcedTFIDF
public String toString()
Overrides: toString in class SourcedTFIDF
Copyright © 2016. All rights reserved.