org.apache.lucene.analysis.miscellaneous
Class CodepointCountFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.util.FilteringTokenFilter
org.apache.lucene.analysis.miscellaneous.CodepointCountFilter
- All Implemented Interfaces:
- Closeable
public final class CodepointCountFilter
- extends FilteringTokenFilter
Removes words that are too long or too short from the stream.
Note: Length is calculated as the number of Unicode codepoints.
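To illustrate the note above, here is a minimal plain-Java sketch (no Lucene classes involved) of how a code point count can differ from the UTF-16 `char` count. The class name `CodepointLengthDemo` and the helper `codePoints` are hypothetical names chosen for this example only.

```java
final class CodepointLengthDemo {
    // Number of Unicode code points in the string (not UTF-16 chars).
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String plain = "lucene";
        // U+1D11E MUSICAL SYMBOL G CLEF: one code point, stored as a surrogate pair.
        String clef = "\uD834\uDD1E";
        System.out.println(plain.length() + " chars, " + codePoints(plain) + " code points");
        System.out.println(clef.length() + " chars, " + codePoints(clef) + " code point(s)");
    }
}
```

A token made only of supplementary characters is therefore measured as shorter by this filter than its `char` length would suggest.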
Methods inherited from class org.apache.lucene.util.AttributeSource:
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
CodepointCountFilter
public CodepointCountFilter(Version version,
TokenStream in,
int min,
int max)
- Create a new
CodepointCountFilter. This will filter out tokens whose
CharTermAttribute is either too short (Character.codePointCount(char[], int, int)
< min) or too long (Character.codePointCount(char[], int, int) > max).
- Parameters:
version - the Lucene match version
in - the TokenStream to consume
min - the minimum length
max - the maximum length
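The length rule described for the constructor can be sketched in plain Java as follows. This is an illustration of the documented comparison only, not the filter's actual source; the class `CodepointRangeSketch` and the method `inRange` are hypothetical, and the `buffer`/`length` parameters are assumed to stand in for the term's character buffer and its valid length.

```java
final class CodepointRangeSketch {
    // Mirrors the documented rule: a term is dropped when its code point
    // count is < min or > max, so it is kept when min <= count <= max.
    static boolean inRange(char[] buffer, int length, int min, int max) {
        // Third argument is a char count; with offset 0 it equals the end index.
        int count = Character.codePointCount(buffer, 0, length);
        return count >= min && count <= max;
    }

    public static void main(String[] args) {
        char[] term = "token".toCharArray();
        System.out.println(inRange(term, term.length, 2, 5));  // true: 5 code points
        System.out.println(inRange(term, term.length, 6, 10)); // false: shorter than min
    }
}
```

Under this reading both bounds are inclusive: a term with exactly `min` or exactly `max` code points is kept.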
accept
public boolean accept()
- Description copied from class:
FilteringTokenFilter
- Override this method and return true if the current input token should be returned by
FilteringTokenFilter.incrementToken().
- Specified by:
accept in class FilteringTokenFilter
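The relationship between accept() and incrementToken() can be pictured with a small plain-Java analogy. Everything here is a hypothetical stand-in, not a Lucene class: `FilteringSketch` iterates over strings, and its `accept` takes the token as an argument, whereas the real accept() takes no arguments and inspects the stream's attributes. Position-increment handling is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the filtering pattern; not part of Lucene.
abstract class FilteringSketch {
    private final Iterator<String> input;
    private String current;

    FilteringSketch(Iterator<String> input) {
        this.input = input;
    }

    // Return true to keep the current token, false to drop it.
    protected abstract boolean accept(String token);

    // Skip rejected tokens; return false once the input is exhausted.
    boolean incrementToken() {
        while (input.hasNext()) {
            current = input.next();
            if (accept(current)) {
                return true;
            }
        }
        return false;
    }

    String term() {
        return current;
    }

    // Collects the tokens whose code point count lies in [min, max].
    static List<String> keepByCodepoints(List<String> tokens, final int min, final int max) {
        FilteringSketch filter = new FilteringSketch(tokens.iterator()) {
            @Override
            protected boolean accept(String token) {
                int n = token.codePointCount(0, token.length());
                return n >= min && n <= max;
            }
        };
        List<String> kept = new ArrayList<>();
        while (filter.incrementToken()) {
            kept.add(filter.term());
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "quick", "extraordinarily");
        System.out.println(keepByCodepoints(tokens, 2, 10)); // [quick]
    }
}
```

The point of the pattern is that a subclass only decides whether to keep a token, while the loop that skips rejected tokens lives in the base class.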
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.