Package org.elasticsearch.index.analysis
Class CustomAnalyzer
- java.lang.Object
-
- org.apache.lucene.analysis.Analyzer
-
- org.elasticsearch.index.analysis.CustomAnalyzer
-
- All Implemented Interfaces:
Closeable, AutoCloseable, AnalyzerComponentsProvider
public final class CustomAnalyzer extends Analyzer implements AnalyzerComponentsProvider
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
-
Field Summary
-
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
-
Constructor Summary
Constructors

CustomAnalyzer(TokenizerFactory tokenizerFactory, CharFilterFactory[] charFilters, TokenFilterFactory[] tokenFilters)

CustomAnalyzer(TokenizerFactory tokenizerFactory, CharFilterFactory[] charFilters, TokenFilterFactory[] tokenFilters, int positionIncrementGap, int offsetGap)
-
Method Summary
All Methods | Instance Methods | Concrete Methods

CharFilterFactory[] charFilters()

protected Analyzer.TokenStreamComponents createComponents(String fieldName)
    Creates a new Analyzer.TokenStreamComponents instance for this analyzer.

AnalysisMode getAnalysisMode()

AnalyzerComponents getComponents()

int getOffsetGap(String field)
    Just like Analyzer.getPositionIncrementGap(java.lang.String), except for Token offsets instead.

int getPositionIncrementGap(String fieldName)
    Invoked before indexing an IndexableField instance if terms have already been added to that field.

protected Reader initReader(String fieldName, Reader reader)
    Override this if you want to add a CharFilter chain.

protected Reader initReaderForNormalization(String fieldName, Reader reader)
    Wrap the given Reader with CharFilters that make sense for normalization.

protected TokenStream normalize(String fieldName, TokenStream in)
    Wrap the given TokenStream in order to apply normalization filters.

TokenFilterFactory[] tokenFilters()

TokenizerFactory tokenizerFactory()
-
Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getReuseStrategy, getVersion, normalize, setVersion, tokenStream, tokenStream
-
-
-
-
Constructor Detail
-
CustomAnalyzer
public CustomAnalyzer(TokenizerFactory tokenizerFactory, CharFilterFactory[] charFilters, TokenFilterFactory[] tokenFilters)
-
CustomAnalyzer
public CustomAnalyzer(TokenizerFactory tokenizerFactory, CharFilterFactory[] charFilters, TokenFilterFactory[] tokenFilters, int positionIncrementGap, int offsetGap)
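The factories passed to these constructors define the analysis chain: char filters transform the raw input, the tokenizer splits it into tokens, and token filters transform each token, all applied in the array order given. A minimal plain-Java sketch of that ordering (the functional types and names here are illustrative stand-ins, not the real Elasticsearch factory interfaces):

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

// Hypothetical model of the chain a CustomAnalyzer assembles: char filters
// first, then the tokenizer, then token filters, each in array order.
public class AnalysisChainSketch {
    public static List<String> analyze(String text,
                                       List<UnaryOperator<String>> charFilters,
                                       Function<String, List<String>> tokenizer,
                                       List<UnaryOperator<String>> tokenFilters) {
        for (UnaryOperator<String> cf : charFilters) {
            text = cf.apply(text);                        // char filters run first
        }
        List<String> tokens = tokenizer.apply(text);      // then the tokenizer
        for (UnaryOperator<String> tf : tokenFilters) {   // token filters run last
            tokens = tokens.stream().map(tf).collect(Collectors.toList());
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> out = analyze(
            "Hello, World",
            List.of(s -> s.replace(",", "")),            // char filter: strip commas
            s -> Arrays.asList(s.trim().split("\\s+")),  // whitespace tokenizer
            List.of(s -> s.toLowerCase()));              // token filter: lowercase
        System.out.println(out); // [hello, world]
    }
}
```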
-
-
Method Detail
-
tokenizerFactory
public TokenizerFactory tokenizerFactory()
-
tokenFilters
public TokenFilterFactory[] tokenFilters()
-
charFilters
public CharFilterFactory[] charFilters()
-
getPositionIncrementGap
public int getPositionIncrementGap(String fieldName)
Description copied from class: Analyzer
Invoked before indexing an IndexableField instance if terms have already been added to that field. This allows custom analyzers to place an automatic position increment gap between IndexableField instances using the same field name. The default position increment gap is 0. With a 0 position increment gap and the typical default token position increment of 1, all terms in a field, including across IndexableField instances, are in successive positions, allowing exact PhraseQuery matches, for instance, across IndexableField instance boundaries.
- Overrides:
getPositionIncrementGap in class Analyzer
- Parameters:
fieldName - IndexableField name being indexed.
- Returns:
position increment gap, added to the next token emitted from Analyzer.tokenStream(String, Reader). This value must be >= 0.
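The effect of the gap on token positions can be illustrated with simple arithmetic (this is a standalone simulation, not Elasticsearch code):

```java
import java.util.ArrayList;
import java.util.List;

// Simulates how a position increment gap separates tokens from multiple
// IndexableField values of the same field. With a gap of 0 and the default
// increment of 1, positions run on consecutively across value boundaries
// (so a PhraseQuery could match across them); a large gap prevents that.
public class PositionGapSketch {
    public static List<Integer> positions(List<List<String>> fieldValues, int gap) {
        List<Integer> result = new ArrayList<>();
        int pos = -1;          // position before the first token
        boolean first = true;
        for (List<String> value : fieldValues) {
            if (!first) {
                pos += gap;    // gap added before the next value's first token
            }
            first = false;
            for (String token : value) {
                pos += 1;      // default token position increment is 1
                result.add(pos);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> values = List.of(List.of("quick", "fox"), List.of("lazy", "dog"));
        System.out.println(positions(values, 0));   // [0, 1, 2, 3]
        System.out.println(positions(values, 100)); // [0, 1, 102, 103]
    }
}
```

With gap 0, "fox lazy" would match as a phrase across the two values; with gap 100 it would not.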
-
getOffsetGap
public int getOffsetGap(String field)
Description copied from class: Analyzer
Just like Analyzer.getPositionIncrementGap(java.lang.String), except for Token offsets instead. By default this returns 1. This method is only called if the field produced at least one token for indexing.
- Overrides:
getOffsetGap in class Analyzer
- Parameters:
field - the field just indexed
- Returns:
offset gap, added to the next token emitted from Analyzer.tokenStream(String, Reader). This value must be >= 0.
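Analogous to the position gap, the offset gap shifts character offsets of later field values; a standalone sketch of the arithmetic (not Elasticsearch code):

```java
import java.util.Arrays;

// Simulates the offset gap: the character offsets of each subsequent
// IndexableField value are shifted by the previous value's length plus the
// offset gap (1 by default), keeping values apart for e.g. highlighting.
public class OffsetGapSketch {
    public static int[] startOffsets(String[] values, int offsetGap) {
        int[] starts = new int[values.length];
        int base = 0;
        for (int i = 0; i < values.length; i++) {
            starts[i] = base;                        // offset of the value's first char
            base += values[i].length() + offsetGap;  // gap added after each value
        }
        return starts;
    }

    public static void main(String[] args) {
        String[] values = {"quick fox", "lazy dog"};
        // "quick fox" is 9 chars, plus a gap of 1 -> second value starts at 10
        System.out.println(Arrays.toString(startOffsets(values, 1))); // [0, 10]
    }
}
```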
-
getAnalysisMode
public AnalysisMode getAnalysisMode()
-
getComponents
public AnalyzerComponents getComponents()
- Specified by:
getComponents in interface AnalyzerComponentsProvider
-
createComponents
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
Description copied from class: Analyzer
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
- Specified by:
createComponents in class Analyzer
- Parameters:
fieldName - the name of the field whose content is passed to the Analyzer.TokenStreamComponents sink as a reader
- Returns:
the Analyzer.TokenStreamComponents for this analyzer.
-
initReader
protected Reader initReader(String fieldName, Reader reader)
Description copied from class: Analyzer
Override this if you want to add a CharFilter chain. The default implementation returns reader unchanged.
- Overrides:
initReader in class Analyzer
- Parameters:
fieldName - IndexableField name being indexed
reader - original Reader
- Returns:
reader, optionally decorated with CharFilter(s)
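Conceptually, the override decorates the incoming Reader before tokenization. The sketch below uses a plain JDK FilterReader (lowercasing) as a stand-in for a real Lucene CharFilter; the class and method names are illustrative:

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Sketch of what an initReader override does: wrap the original Reader in
// one or more decorators and return the outermost one.
public class InitReaderSketch {
    // JDK stand-in for a CharFilter: lowercases everything it reads.
    static class LowerCaseReader extends FilterReader {
        LowerCaseReader(Reader in) { super(in); }
        @Override public int read() throws IOException {
            int c = super.read();
            return c == -1 ? -1 : Character.toLowerCase(c);
        }
        @Override public int read(char[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            for (int i = off; i < off + n; i++) buf[i] = Character.toLowerCase(buf[i]);
            return n;
        }
    }

    // Analogue of initReader(fieldName, reader): decorate and return.
    public static Reader initReader(String fieldName, Reader reader) {
        return new LowerCaseReader(reader);
    }

    // Helper: read a Reader to a String.
    public static String drain(Reader r) {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = r.read()) != -1) sb.append((char) c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(drain(initReader("body", new StringReader("HeLLo")))); // hello
    }
}
```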
-
initReaderForNormalization
protected Reader initReaderForNormalization(String fieldName, Reader reader)
Description copied from class: Analyzer
Wrap the given Reader with CharFilters that make sense for normalization. This is typically a subset of the CharFilters that are applied in Analyzer.initReader(String, Reader). This is used by Analyzer.normalize(String, String).
- Overrides:
initReaderForNormalization in class Analyzer
-
normalize
protected TokenStream normalize(String fieldName, TokenStream in)
Description copied from class: Analyzer
Wrap the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).
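The idea behind normalization is to treat a value as a single token and apply only the filters that still make sense in that setting (lowercasing, for example, but not tokenizing or stemming). A standalone sketch, with the token stream modeled as a plain String and the filter list purely hypothetical:

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Sketch of normalize(): apply only normalization-safe filters, in order,
// to a value treated as one token. None of this is the real Lucene API.
public class NormalizeSketch {
    public static String normalize(String value, List<UnaryOperator<String>> normalizingFilters) {
        for (UnaryOperator<String> f : normalizingFilters) {
            value = f.apply(value); // each filter transforms the whole value
        }
        return value;
    }

    public static void main(String[] args) {
        // Lowercasing qualifies as a normalization filter; a stemmer would not.
        System.out.println(normalize("HeLLo", List.of(s -> s.toLowerCase()))); // hello
    }
}
```

This mirrors how keyword fields are normalized for exact matching without full analysis.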
-
-