public abstract class TextSimilarity extends Object implements Similarity, SimilarityRanker
| Modifier and Type | Field and Description |
|---|---|
| protected boolean | filterStopWord |
| protected static org.slf4j.Logger | LOGGER |

Inherited fields: thresholdRate

| Constructor and Description |
|---|
| TextSimilarity() |
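
| Modifier and Type | Method and Description |
|---|---|
| protected abstract double | scoreImpl(List<Word> words1, List<Word> words2): Computes the similarity score. |
| void | setSegmentationAlgorithm(SegmentationAlgorithm segmentationAlgorithm) |
| double | similarScore(List<Word> words1, List<Word> words2): Similarity score of word list 1 and word list 2. |
| double | similarScore(String text1, String text2): Similarity score of text 1 and text 2. |
| protected void | taggingWeightWithWordFrequency(List<Word> words1, List<Word> words2): If no weights are specified, word frequency is used by default to tag each word's weight. Where does the frequency come from? A word's weight in word list 1 is the number of times it occurs in word list 1, and its weight in word list 2 is the number of times it occurs in word list 2. The tagged weights are stored in the weight field of the Word class. |
| protected Map<String,Float> | toFastSearchMap(List<Word> words): Builds a container for fast weight lookup. |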
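
The frequency-based weighting and the fast-lookup map described in the table can be sketched roughly as follows. This is not the library's code: `SimpleWord`, `WordFrequencyWeights`, `wordsOf`, and `tagWithFrequency` are hypothetical names introduced for illustration, `toFastSearchMap` only mirrors the signature listed above, and leaving explicitly set weights untouched is an assumption about what "if no weights are specified" means.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WordFrequencyWeights {

    /** Builds a word list from raw strings (helper for this sketch only). */
    static List<SimpleWord> wordsOf(String... texts) {
        List<SimpleWord> words = new ArrayList<>();
        for (String text : texts) {
            words.add(new SimpleWord(text));
        }
        return words;
    }

    /** Tags both lists; each list is counted independently of the other. */
    static void tagWithFrequency(List<SimpleWord> words1, List<SimpleWord> words2) {
        tagOneList(words1);
        tagOneList(words2);
    }

    private static void tagOneList(List<SimpleWord> words) {
        // Count how many times each word text occurs in this list.
        Map<String, Integer> counts = new HashMap<>();
        for (SimpleWord word : words) {
            counts.merge(word.text, 1, Integer::sum);
        }
        // Store the count as the weight on every word that has none yet.
        for (SimpleWord word : words) {
            if (word.weight == null) { // assumption: keep weights that were set explicitly
                word.weight = counts.get(word.text).floatValue();
            }
        }
    }

    /** Maps each word text to its weight so lookups avoid scanning the list. */
    static Map<String, Float> toFastSearchMap(List<SimpleWord> words) {
        Map<String, Float> weights = new HashMap<>();
        for (SimpleWord word : words) {
            if (word.weight != null) {
                weights.put(word.text, word.weight);
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        List<SimpleWord> words1 = wordsOf("a", "b", "a");
        List<SimpleWord> words2 = wordsOf("a", "c");
        tagWithFrequency(words1, words2);
        System.out.println(toFastSearchMap(words1));
        System.out.println(toFastSearchMap(words2));
    }
}

/** Hypothetical stand-in for the library's Word type: a text plus an optional weight. */
final class SimpleWord {
    final String text;
    Float weight; // null means no weight has been assigned yet

    SimpleWord(String text) {
        this.text = text;
    }
}
```

In this sketch the word "a" gets weight 2 in the first list and weight 1 in the second, because each list is counted on its own. A subclass's scoreImpl could then compare the two lookup maps instead of walking both lists repeatedly.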

Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from the implemented interfaces:
main, rank, rank
isSimilar, isSimilar, isSimilar, isSimilar, similarScore, similarScore

protected static final org.slf4j.Logger LOGGER

protected boolean filterStopWord

public void setSegmentationAlgorithm(SegmentationAlgorithm segmentationAlgorithm)

public double similarScore(String text1, String text2)
Specified by: similarScore in interface Similarity
Parameters: text1 - text 1; text2 - text 2

public double similarScore(List<Word> words1, List<Word> words2)
Specified by: similarScore in interface Similarity
Parameters: words1 - word list 1; words2 - word list 2

protected abstract double scoreImpl(List<Word> words1, List<Word> words2)
Parameters: words1 - word list 1; words2 - word list 2

protected void taggingWeightWithWordFrequency(List<Word> words1, List<Word> words2)
Parameters: words1 - word list 1; words2 - word list 2

Copyright © 2014–2015 APDPlat. All rights reserved.