| Interface | Description |
|---|---|
| CharacterSubstitutionInterface | Used to indicate the cost of character substitution. |

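
As a minimal sketch of how a substitution-cost callback can plug into an edit distance, the example below prices each substitution through a callback while insertions and deletions cost 1.0. The names `SubstitutionCost`, `WeightedEditDistanceSketch`, and `distance` are hypothetical and chosen for illustration only; they are not taken from this package, whose actual method signatures are not shown here.

```java
/** Hypothetical callback: the cost of replacing character a with character b. */
interface SubstitutionCost {
    double cost(char a, char b);
}

final class WeightedEditDistanceSketch {

    /**
     * Levenshtein-style distance in which substitutions are priced by the
     * callback. Insertions and deletions cost 1.0 each (an assumption of
     * this sketch), and matching characters cost nothing.
     */
    static double distance(String s, String t, SubstitutionCost costs) {
        double[] prev = new double[t.length() + 1];
        double[] curr = new double[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j; // reach the first j characters of t by j insertions
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i; // remove the first i characters of s by i deletions
            for (int j = 1; j <= t.length(); j++) {
                char a = s.charAt(i - 1);
                char b = t.charAt(j - 1);
                double substitute = prev[j - 1] + (a == b ? 0.0 : costs.cost(a, b));
                double delete = prev[j] + 1.0;
                double insert = curr[j - 1] + 1.0;
                curr[j] = Math.min(substitute, Math.min(delete, insert));
            }
            // Reuse the two rows instead of allocating a full matrix.
            double[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    public static void main(String[] args) {
        // Assumed weighting: 't' and 'r' are adjacent keys, so confusing them is cheap.
        SubstitutionCost keyboard = (a, b) ->
                ((a == 't' && b == 'r') || (a == 'r' && b == 't')) ? 0.5 : 1.0;
        System.out.println(distance("String1", "Srring1", keyboard)); // prints 0.5
        System.out.println(distance("String1", "Spring1", keyboard)); // prints 1.0
    }
}
```

Passing the cost as a callback keeps the dynamic-programming loop independent of any particular weighting, so the same code can model keyboard typos, OCR confusions, or uniform costs.
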
| Class | Description |
|---|---|
| Cosine | |
| Damerau | Implementation of Damerau-Levenshtein distance, computed as the minimum number of operations needed to transform one string into the other, where an operation is defined as an insertion, deletion, or substitution of a single character, or a transposition of two adjacent characters. |
| Jaccard | |
| JaroWinkler | |
| KShingling | k-shingling is the operation of transforming a string (or text document) into a set of n-grams, which can be used to measure the similarity between two strings or documents. |
| Levenshtein | The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. |
| LongestCommonSubsequence | The longest common subsequence (LCS) problem consists of finding the longest subsequence common to two (or more) sequences. |
| NGram | N-Gram Similarity as defined by Kondrak, "N-Gram Similarity and Distance", String Processing and Information Retrieval, Lecture Notes in Computer Science, vol. 3772, 2005, pp. 115-126. |
| NormalizedLevenshtein | |
| QGram | |
| SorensenDice | |
| StringProfile | Profile of a string, computed using shingling. |
| StringSet | |
| WeightedLevenshtein | Implementation of Levenshtein that allows different weights to be defined for different character substitutions. |

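
The definitions in the table above can be illustrated with a short, self-contained sketch. It is not this package's implementation: the class `StringMetricsSketch` and its method names are hypothetical. Note also that `damerauOsa` implements the simpler optimal string alignment variant, which allows adjacent transpositions but never edits a substring twice, so it can differ from the full Damerau-Levenshtein distance on some inputs.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class StringMetricsSketch {

    /** Levenshtein distance: insertions, deletions and substitutions cost 1 each. */
    static int levenshtein(String s, String t) {
        return editDistance(s, t, false);
    }

    /** Optimal string alignment: Levenshtein plus swaps of two adjacent characters. */
    static int damerauOsa(String s, String t) {
        return editDistance(s, t, true);
    }

    private static int editDistance(String s, String t, boolean transpositions) {
        int[][] d = new int[s.length() + 1][t.length() + 1];
        for (int i = 0; i <= s.length(); i++) d[i][0] = i; // i deletions
        for (int j = 0; j <= t.length(); j++) d[0][j] = j; // j insertions
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j - 1] + cost,
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1));
                // Adjacent transposition: "xy" in s lines up with "yx" in t.
                if (transpositions && i > 1 && j > 1
                        && s.charAt(i - 1) == t.charAt(j - 2)
                        && s.charAt(i - 2) == t.charAt(j - 1)) {
                    d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
                }
            }
        }
        return d[s.length()][t.length()];
    }

    /** Length of the longest common subsequence of s and t. */
    static int lcsLength(String s, String t) {
        int[][] l = new int[s.length() + 1][t.length() + 1];
        for (int i = 1; i <= s.length(); i++) {
            for (int j = 1; j <= t.length(); j++) {
                l[i][j] = s.charAt(i - 1) == t.charAt(j - 1)
                        ? l[i - 1][j - 1] + 1
                        : Math.max(l[i - 1][j], l[i][j - 1]);
            }
        }
        return l[s.length()][t.length()];
    }

    /** k-shingling: the set of all substrings of length k, in first-seen order. */
    static Set<String> shingles(String s, int k) {
        if (k <= 0) throw new IllegalArgumentException("k must be positive");
        Set<String> result = new LinkedHashSet<>();
        for (int i = 0; i + k <= s.length(); i++) {
            result.add(s.substring(i, i + k));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
        System.out.println(levenshtein("ABCDEF", "ABDCEF"));  // prints 2
        System.out.println(damerauOsa("ABCDEF", "ABDCEF"));   // prints 1
        System.out.println(lcsLength("AGCAT", "GAC"));        // prints 2
        System.out.println(shingles("ABCAB", 2));             // prints [AB, BC, CA]
    }
}
```

The `ABCDEF` / `ABDCEF` pair shows the effect of transpositions: swapping the adjacent `C` and `D` counts as one operation when transpositions are allowed, but as two substitutions under plain Levenshtein.
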
Copyright © 2015. All rights reserved.