Class HindiAnalyzer

All Implemented Interfaces:
Closeable, AutoCloseable

public final class HindiAnalyzer extends StopwordAnalyzerBase
Analyzer for Hindi.

You must specify the required Version compatibility when creating HindiAnalyzer:

  • As of 3.6, StandardTokenizer is used for tokenization.
  • Field Details

    • DEFAULT_STOPWORD_FILE

      public static final String DEFAULT_STOPWORD_FILE
      File containing the default Hindi stopwords. The default stopword list is from http://members.unine.ch/jacques.savoy/clef/index.html and is BSD-licensed.
  • Constructor Details

    • HindiAnalyzer

      public HindiAnalyzer(Version version, CharArraySet stopwords, CharArraySet stemExclusionSet)
      Builds an analyzer with the given stop words and stemming exclusion set.
      Parameters:
      version - Lucene compatibility version
      stopwords - a stopword set
      stemExclusionSet - a stemming exclusion set
    • HindiAnalyzer

      public HindiAnalyzer(Version version, CharArraySet stopwords)
      Builds an analyzer with the given stop words.
      Parameters:
      version - Lucene compatibility version
      stopwords - a stopword set
    • HindiAnalyzer

      public HindiAnalyzer(Version version)
      Builds an analyzer with the default stop words: DEFAULT_STOPWORD_FILE.
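
The roles of the stopwords and stemExclusionSet parameters can be illustrated with a small standalone sketch. It does not use Lucene and is not the analyzer's actual implementation: the class StopAndStemSketch, the method toyStem, and the romanized tokens are hypothetical placeholders chosen so that the example stays plain ASCII, and the ordering (stopwords dropped first, then stemming skipped for excluded terms) is an assumption about the general behaviour of such an analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Conceptual sketch only: models what a stopword set and a stemming
// exclusion set do to a token list. Not Lucene code.
final class StopAndStemSketch {

    // Hypothetical toy stemmer: strips a trailing "on" from longer terms.
    // A real Hindi stemmer works on Devanagari suffixes instead.
    static String toyStem(String term) {
        return term.endsWith("on") && term.length() > 2
                ? term.substring(0, term.length() - 2)
                : term;
    }

    // Drops stopwords, then stems every remaining token unless it is
    // listed in the stemming exclusion set.
    static List<String> analyze(List<String> tokens,
                                Set<String> stopwords,
                                Set<String> stemExclusionSet) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (stopwords.contains(token)) {
                continue; // stopword: removed entirely
            }
            out.add(stemExclusionSet.contains(token) ? token : toyStem(token));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("ladkon", "ka", "ghar", "kitabon");
        Set<String> stopwords = Set.of("ka");
        Set<String> stemExclusionSet = Set.of("kitabon");
        // prints [ladk, ghar, kitabon]
        System.out.println(analyze(tokens, stopwords, stemExclusionSet));
    }
}
```

In this sketch "ka" disappears because it is a stopword, "ladkon" is stemmed, and "kitabon" is kept unchanged because it is in the exclusion set.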
  • Method Details

    • getDefaultStopSet

      public static CharArraySet getDefaultStopSet()
      Returns an unmodifiable instance of the default stop-words set.
      Returns:
      an unmodifiable instance of the default stop-words set.
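
Because the returned set is unmodifiable, a caller who wants a customized stopword list would typically copy it first and then pass the copy to one of the constructors. The sketch below shows that pattern with java.util collections standing in for CharArraySet. The class DefaultStopSetSketch, its placeholder entries, and the assumption that a rejected modification surfaces as UnsupportedOperationException are illustrative and not taken from this page.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Conceptual sketch only: java.util sets stand in for CharArraySet.
final class DefaultStopSetSketch {

    // Hypothetical placeholder entries, not the real default Hindi list.
    private static final Set<String> DEFAULT_STOP_SET =
            Collections.unmodifiableSet(new HashSet<>(Set.of("ka", "ke", "mein")));

    // Mirrors the shape of getDefaultStopSet(): a shared, read-only view.
    static Set<String> getDefaultStopSet() {
        return DEFAULT_STOP_SET;
    }

    // Copies the default set into a fresh mutable set and adds one entry.
    static Set<String> customized(String extraStopword) {
        Set<String> copy = new HashSet<>(getDefaultStopSet());
        copy.add(extraStopword);
        return copy;
    }

    // Returns true if adding to the shared default set is rejected.
    static boolean rejectsDirectModification() {
        try {
            getDefaultStopSet().add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```

Copying keeps the shared default set intact for every other caller while still allowing per-analyzer additions.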