Class LMSimilarity

Direct Known Subclasses:
LMDirichletSimilarity, LMJelinekMercerSimilarity

public abstract class LMSimilarity extends SimilarityBase
Abstract superclass for language modeling Similarities. The following inner types are introduced:
  • LMSimilarity.LMStats, which defines a new statistic, the probability that the collection language model generates the current term;
  • LMSimilarity.CollectionModel, which is a strategy interface for objects that compute the collection language model p(w|C);
  • LMSimilarity.DefaultCollectionModel, an implementation of LMSimilarity.CollectionModel that computes the term probability p(w|C) as the number of occurrences of the term in the collection divided by the total number of tokens in the collection.
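
The strategy pattern described in the list above can be illustrated with a minimal, standalone sketch. It does not use the Lucene classes: the names CollectionModelSketch and RatioCollectionModel, and the two-argument computeProbability signature, are assumptions made for illustration only. The sketch follows the description literally (term occurrences divided by total tokens); the library's own implementation may differ in details such as smoothing.

```java
// Illustrative sketch only: these types are hypothetical stand-ins,
// not the Lucene LMSimilarity inner types.
public class CollectionModelSketch {

    /** Hypothetical strategy interface: computes the collection language model p(w|C). */
    interface CollectionModel {
        double computeProbability(long totalTermFreq, long numberOfFieldTokens);
    }

    /**
     * Hypothetical default strategy: occurrences of the term in the collection
     * divided by the total number of tokens in the collection.
     */
    static final class RatioCollectionModel implements CollectionModel {
        @Override
        public double computeProbability(long totalTermFreq, long numberOfFieldTokens) {
            if (numberOfFieldTokens <= 0) {
                throw new IllegalArgumentException("numberOfFieldTokens must be positive");
            }
            if (totalTermFreq < 0 || totalTermFreq > numberOfFieldTokens) {
                throw new IllegalArgumentException("totalTermFreq must be in [0, numberOfFieldTokens]");
            }
            // Cast before dividing to avoid integer division.
            return (double) totalTermFreq / numberOfFieldTokens;
        }
    }

    public static void main(String[] args) {
        CollectionModel model = new RatioCollectionModel();
        // A term occurring 5 times in a collection of 100 tokens.
        System.out.println("p(w|C) = " + model.computeProbability(5, 100));
    }
}
```

Because the model sits behind an interface, a subclass of the similarity can be handed a different strategy without changing the scoring code that consumes p(w|C).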