org.elasticsearch.index.analysis
Class ICUFoldingFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.elasticsearch.index.analysis.ICUNormalizer2Filter
org.elasticsearch.index.analysis.ICUFoldingFilter
- All Implemented Interfaces:
- java.io.Closeable
public final class ICUFoldingFilter
- extends ICUNormalizer2Filter
A TokenFilter that applies search term folding to Unicode text,
using the foldings defined in UTR#30 Character Foldings.
This filter applies the following foldings from the report to Unicode text:
- Accent removal
- Case folding
- Canonical duplicates folding
- Dashes folding
- Diacritic removal (including stroke, hook, descender)
- Greek letterforms folding
- Han Radical folding
- Hebrew Alternates folding
- Jamo folding
- Letterforms folding
- Math symbol folding
- Multigraph Expansions: All
- Native digit folding
- No-break folding
- Overline folding
- Positional forms folding
- Small forms folding
- Space folding
- Spacing Accents folding
- Subscript folding
- Superscript folding
- Suzhou Numeral folding
- Symbol folding
- Underline folding
- Vertical forms folding
- Width folding
Additionally, Default Ignorables are removed, and text is normalized to NFKC.
All foldings, case folding, and normalization mappings are applied recursively
to ensure a fully folded and normalized result.
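The effect of a few of these foldings can be approximated with the JDK alone. The sketch below is not the ICU implementation used by this filter: it is a minimal stand-in that assumes `java.text.Normalizer` plus lowercasing is close enough to illustrate accent removal, case folding, width folding and NFKC normalization. The class name `FoldingSketch` and the methods `foldOnce` and `fold` are hypothetical, and most of the foldings listed above (dashes, native digits, Han radicals, and so on) are not covered.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical illustration only; ICUFoldingFilter itself relies on ICU data.
public class FoldingSketch {

    // One pass of an approximate fold using only JDK classes.
    static String foldOnce(String text) {
        // Compatibility decomposition separates base letters from combining
        // marks and maps width variants and ligatures to their plain forms.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFKD);
        // Remove combining marks (a rough form of accent removal).
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Lowercasing stands in for full Unicode case folding (an assumption).
        String lowered = stripped.toLowerCase(Locale.ROOT);
        // Recompose to NFKC, the form the filter normalizes to.
        return Normalizer.normalize(lowered, Normalizer.Form.NFKC);
    }

    // Repeat the pass until the text stops changing, mirroring the idea that
    // mappings are applied recursively to reach a fully folded result.
    static String fold(String text) {
        String current = text;
        String next = foldOnce(current);
        while (!next.equals(current)) {
            current = next;
            next = foldOnce(current);
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(fold("R\u00e9sum\u00e9"));           // prints "resume"
        System.out.println(fold("\uFF26\uFF35\uFF2C\uFF2C"));   // fullwidth "FULL", prints "full"
        System.out.println(fold("\uFB01n"));                    // "fi" ligature + "n", prints "fin"
    }
}
```

Because the same function would be applied to both indexed text and query text, variants such as "Résumé" and "resume" end up as the same term, which is the purpose of search term folding.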
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
Constructor Summary
ICUFoldingFilter(org.apache.lucene.analysis.TokenStream input)
Create a new ICUFoldingFilter on the specified input
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, end, reset
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
ICUFoldingFilter
public ICUFoldingFilter(org.apache.lucene.analysis.TokenStream input)
- Create a new ICUFoldingFilter that folds the tokens produced by the specified input TokenStream.