Package org.apache.lucene.analysis.cz
Class CzechAnalyzer
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.util.StopwordAnalyzerBase
org.apache.lucene.analysis.cz.CzechAnalyzer
- All Implemented Interfaces:
Closeable,AutoCloseable
Analyzer for the Czech language.
Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified.
You must specify the required Version compatibility when creating CzechAnalyzer:
- As of 3.1, words are stemmed with CzechStemFilter
- As of 2.9, StopFilter preserves position increments
- As of 2.4, tokens incorrectly identified as acronyms are corrected (see LUCENE-1068)
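The stopword handling and the "preserves position increments" behavior noted above can be illustrated with a small, self-contained sketch. This is a conceptual illustration in plain Java, not Lucene code: the class StopwordSketch, its Token type, and the one-word stop set are all hypothetical names chosen for the example. It assumes that each dropped stopword adds one to the position increment of the next kept term, so the gap left by the removed word stays visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Conceptual sketch only; none of these names are Lucene classes. */
final class StopwordSketch {

    /** A kept term plus its distance, in positions, from the previously kept term. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /** Drops stopwords and records the gaps they leave as larger position increments. */
    static List<Token> removeStopwords(List<String> words, Set<String> stopwords) {
        List<Token> kept = new ArrayList<>();
        int skipped = 0; // stopwords dropped since the last kept term
        for (String word : words) {
            if (stopwords.contains(word)) {
                skipped++;
                continue; // a stopword is not indexed at all
            }
            kept.add(new Token(word, 1 + skipped));
            skipped = 0;
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical one-word stop set, used only for this illustration.
        Set<String> stopwords = new HashSet<>(Arrays.asList("a"));
        List<String> words = Arrays.asList("Praha", "a", "Brno");
        System.out.println(removeStopwords(words, stopwords));
    }
}
```

Running main prints [Praha(+1), Brno(+2)]: the stopword "a" is gone, but the increment of 2 on "Brno" records that a position was skipped, which matters for phrase and proximity matching.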
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.GlobalReuseStrategy, Analyzer.PerFieldReuseStrategy, Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
Field Summary
Fields
- DEFAULT_STOPWORD_FILE: File containing default Czech stopwords.
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
Constructor Summary
Constructors
- CzechAnalyzer(Version matchVersion): Builds an analyzer with the default stop words (getDefaultStopSet()).
- CzechAnalyzer(Version matchVersion, CharArraySet stopwords): Builds an analyzer with the given stop words.
- CzechAnalyzer(Version matchVersion, CharArraySet stopwords, CharArraySet stemExclusionTable): Builds an analyzer with the given stop words and a set of words to be excluded from the CzechStemFilter.
Method Summary
Methods
- static final CharArraySet getDefaultStopSet(): Returns a set of default Czech stopwords.
Methods inherited from class org.apache.lucene.analysis.util.StopwordAnalyzerBase
getStopwordSet
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, tokenStream, tokenStream
-
Field Details
-
DEFAULT_STOPWORD_FILE
File containing default Czech stopwords.
-
-
Constructor Details
-
CzechAnalyzer
Builds an analyzer with the default stop words (getDefaultStopSet()).
Parameters:
matchVersion - Lucene version to match. See the version compatibility notes above.
-
CzechAnalyzer
Builds an analyzer with the given stop words.
Parameters:
matchVersion - Lucene version to match. See the version compatibility notes above.
stopwords - a stopword set
-
CzechAnalyzer
Builds an analyzer with the given stop words and a set of words to be excluded from the CzechStemFilter.
Parameters:
matchVersion - Lucene version to match. See the version compatibility notes above.
stopwords - a stopword set
stemExclusionTable - a stemming exclusion set
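The role of a stemming exclusion set can be sketched in plain Java as follows. This is a conceptual illustration under stated assumptions, not Lucene code: StemExclusionSketch and toyStem are hypothetical names, and the toy stemmer (which only strips one trailing "y") is a stand-in that does not reproduce the CzechStemFilter algorithm. The only point shown is that a term found in the exclusion set passes through unchanged, while every other term is handed to the stemmer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Conceptual sketch only; none of these names are Lucene classes. */
final class StemExclusionSketch {

    /** Hypothetical toy stemmer: strips one trailing "y". NOT the CzechStemFilter algorithm. */
    static String toyStem(String term) {
        return term.length() > 3 && term.endsWith("y")
                ? term.substring(0, term.length() - 1)
                : term;
    }

    /** Returns the term untouched if it is excluded from stemming, otherwise its stem. */
    static String stemUnlessExcluded(String term, Set<String> stemExclusions) {
        return stemExclusions.contains(term) ? term : toyStem(term);
    }

    public static void main(String[] args) {
        // Hypothetical exclusion set, used only for this illustration.
        Set<String> stemExclusions = new HashSet<>(Arrays.asList("mosty"));
        System.out.println(stemUnlessExcluded("hrady", stemExclusions)); // stemmed
        System.out.println(stemUnlessExcluded("mosty", stemExclusions)); // left as-is
    }
}
```

Running main prints hrad and then mosty. An exclusion set of this kind is useful for terms such as proper names or product identifiers that should be indexed exactly as written.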
-
-
Method Details
-
getDefaultStopSet
Returns a set of default Czech stopwords.
Returns:
a set of default Czech stopwords
-