Class RussianAnalyzer

All Implemented Interfaces:
Closeable, AutoCloseable

public final class RussianAnalyzer extends StopwordAnalyzerBase
Analyzer for the Russian language.

Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified.
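
To make the effect of a stopword list concrete, here is a minimal, library-free sketch of stopword removal. It is not Lucene's implementation: the class name `StopwordSketch`, the method `indexableTerms`, the regex-based splitting, and the three-word stopword set are all illustrative assumptions. The real analyzer tokenizes with StandardTokenizer and uses the Snowball stopword list by default.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; this class is not part of Lucene.
final class StopwordSketch {

    /** Splits on runs of non-letter/non-digit characters, lowercases, and drops stopwords. */
    static List<String> indexableTerms(String text, Set<String> stopwords) {
        List<String> terms = new ArrayList<>();
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            String term = raw.toLowerCase(Locale.ROOT);
            if (!stopwords.contains(term)) {
                terms.add(term); // only non-stopwords would reach the index
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Tiny made-up stopword set; the real default list is much longer.
        Set<String> stopwords = Set.of("и", "в", "на");
        // prints [кот, собака, доме]
        System.out.println(indexableTerms("Кот и собака в доме", stopwords));
    }
}
```

Passing a different set to `indexableTerms` mirrors supplying an alternative stopword list instead of the default one.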

You must specify the required Version compatibility when creating RussianAnalyzer:

  • As of 3.1, StandardTokenizer is used, Snowball stemming is done with SnowballFilter, and Snowball stopwords are used by default.

  • Field Details

    • DEFAULT_STOPWORD_FILE

      public static final String DEFAULT_STOPWORD_FILE
      File containing default Russian stopwords.
  • Constructor Details

    • RussianAnalyzer

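      public RussianAnalyzer(Version matchVersion)
      Builds an analyzer with the default stop words.
      Parameters:
      matchVersion - Lucene compatibility version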
    • RussianAnalyzer

      public RussianAnalyzer(Version matchVersion, CharArraySet stopwords)
      Builds an analyzer with the given stop words.
      Parameters:
      matchVersion - Lucene compatibility version
      stopwords - a stopword set
    • RussianAnalyzer

      public RussianAnalyzer(Version matchVersion, CharArraySet stopwords, CharArraySet stemExclusionSet)
      Builds an analyzer with the given stop words and a set of words that are excluded from stemming.
      Parameters:
      matchVersion - Lucene compatibility version
      stopwords - a stopword set
      stemExclusionSet - a set of words not to be stemmed
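
The role of a stem exclusion set can be sketched without Lucene as follows. This is a simplified illustration under stated assumptions: `StemExclusionSketch`, `toyStem`, and `stemExcept` are hypothetical names, and the toy stemmer merely strips one trailing vowel. It is not the Snowball algorithm that the analyzer actually applies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; this class is not part of Lucene.
final class StemExclusionSketch {

    /** Toy stemmer (NOT Snowball): strips a single trailing vowel from terms longer than 3 chars. */
    static String toyStem(String term) {
        String vowels = "аеиоуы";
        int last = term.length() - 1;
        if (term.length() > 3 && vowels.indexOf(term.charAt(last)) >= 0) {
            return term.substring(0, last);
        }
        return term;
    }

    /** Stems every term except those listed in the exclusion set, which pass through unchanged. */
    static List<String> stemExcept(List<String> terms, Set<String> stemExclusionSet) {
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            out.add(stemExclusionSet.contains(term) ? term : toyStem(term));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = List.of("книги", "москва");
        // prints [книг, москв]
        System.out.println(stemExcept(terms, Set.of()));
        // prints [книг, москва]
        System.out.println(stemExcept(terms, Set.of("москва")));
    }
}
```

Excluding a term is useful for proper nouns or product names that should match only in their exact form.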
  • Method Details

    • getDefaultStopSet

      public static CharArraySet getDefaultStopSet()
      Returns an unmodifiable instance of the default stop-words set.
      Returns:
      an unmodifiable instance of the default stop-words set.
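
Because the returned set is unmodifiable, a caller who wants to extend the defaults has to copy them first. The sketch below models that pattern with plain `java.util` collections. It is an analogy, not the Lucene API: `DefaultStopSetSketch`, `extendedStopSet`, and the three-word default set are assumptions made for illustration, and `Collections.unmodifiableSet` stands in for the unmodifiable CharArraySet.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; this class is not part of Lucene.
final class DefaultStopSetSketch {

    // Stand-in for the default stop set; the real list is loaded from DEFAULT_STOPWORD_FILE.
    private static final Set<String> DEFAULT_STOP_SET =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("и", "в", "на")));

    /** Returns the shared, unmodifiable default set. */
    static Set<String> getDefaultStopSet() {
        return DEFAULT_STOP_SET;
    }

    /** Copies the defaults into a fresh mutable set and adds the extra words to the copy. */
    static Set<String> extendedStopSet(String... extra) {
        Set<String> copy = new HashSet<>(getDefaultStopSet());
        Collections.addAll(copy, extra);
        return copy;
    }

    public static void main(String[] args) {
        try {
            getDefaultStopSet().add("это"); // mutation of the shared set is rejected
        } catch (UnsupportedOperationException e) {
            System.out.println("default set is read-only");
        }
        System.out.println(extendedStopSet("это").contains("это")); // prints true
    }
}
```

The extended copy could then be handed to a constructor that accepts a stopword set, leaving the shared defaults untouched.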