Class CapitalizationFilterFactory
java.lang.Object
org.apache.lucene.analysis.util.AbstractAnalysisFactory
org.apache.lucene.analysis.util.TokenFilterFactory
org.apache.lucene.analysis.miscellaneous.CapitalizationFilterFactory
Factory for CapitalizationFilter.
The factory takes parameters:
"onlyFirstWord" - true or false. If true, only the first word of the token is capitalized; if false, every word is.
"keep" - a list of words to keep unchanged, separated by whitespace.
"keepIgnoreCase" - true or false. If true, the keep list is matched case-insensitively.
"forceFirstLetter" - force the first letter to be capitalized even if the word is in the keep list.
"okPrefix" - do not change a word's capitalization if it begins with an entry in this list. For example, if "McK" is in the okPrefix list, the word "McKinley" is not changed to "Mckinley".
"minWordLength" - the minimum length a word must have for capitalization to be applied. If minWordLength is 3, "and" becomes "And" but "or" stays "or".
"maxWordCount" - if the token contains more than maxWordCount words, its capitalization is assumed to be correct and is left unchanged.
<fieldType name="text_cptlztn" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CapitalizationFilterFactory" onlyFirstWord="true"
            keep="java solr lucene" keepIgnoreCase="false"
            okPrefix="McK McD McA"/>
  </analyzer>
</fieldType>
Since:
solr 1.3
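The parameter rules listed above can be illustrated with a small plain-Java sketch. It is not the Lucene implementation: the class name CapitalizationSketch, the order in which the rules are checked, the initial field values, and the choice to lowercase the later words when onlyFirstWord is true are all assumptions made for illustration. The maxTokenLength setting is not modeled.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only; not the Lucene CapitalizationFilter implementation.
final class CapitalizationSketch {
    // Field names mirror the documented parameters. The initial values are
    // choices made for this sketch, not defaults taken from the documentation.
    boolean onlyFirstWord = true;
    Set<String> keep = Set.of();
    boolean keepIgnoreCase = false;
    boolean forceFirstLetter = true;
    List<String> okPrefix = List.of();
    int minWordLength = 0;
    int maxWordCount = Integer.MAX_VALUE;

    // Applies the rules to one token, which may contain several
    // space-separated words.
    String capitalize(String token) {
        String[] words = token.split(" ");
        // Too many words: the capitalization is assumed to be correct.
        if (words.length > maxWordCount) return token;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) out.append(' ');
            out.append(processWord(words[i], i));
        }
        return out.toString();
    }

    private String processWord(String word, int index) {
        if (word.isEmpty()) return word;
        // Assumption: words after the first are lowercased when onlyFirstWord is set.
        if (onlyFirstWord && index > 0) return word.toLowerCase(Locale.ROOT);
        // Kept words stay as they are, unless the first letter is forced.
        if (isKept(word)) {
            return (index == 0 && forceFirstLetter) ? upperFirst(word) : word;
        }
        // Short words are left alone.
        if (word.length() < minWordLength) return word;
        // Words with an accepted prefix are left alone.
        for (String prefix : okPrefix) {
            if (word.startsWith(prefix)) return word;
        }
        return upperFirst(word.toLowerCase(Locale.ROOT));
    }

    private boolean isKept(String word) {
        if (!keepIgnoreCase) return keep.contains(word);
        for (String k : keep) {
            if (k.equalsIgnoreCase(word)) return true;
        }
        return false;
    }

    private static String upperFirst(String w) {
        return w.substring(0, 1).toUpperCase(Locale.ROOT) + w.substring(1);
    }
}
```

With onlyFirstWord set to false and minWordLength set to 3, this sketch turns "war and peace or not" into "War And Peace or Not", matching the "and"/"or" example in the parameter list. With "McK" in okPrefix, "McKinley" is returned unchanged.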
Field Summary
Fields
Modifier and Type      Field
static final String    KEEP
static final String    KEEP_IGNORE_CASE
static final String    OK_PREFIX
static final String    MIN_WORD_LENGTH
static final String    MAX_WORD_COUNT
static final String    MAX_TOKEN_LENGTH
static final String    ONLY_FIRST_WORD
static final String    FORCE_FIRST_LETTER
Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM
Constructor Summary
Constructors
Constructor                    Description
CapitalizationFilterFactory    Creates a new CapitalizationFilterFactory
Method Summary
Method                       Description
create(TokenStream input)    Transform the specified input TokenStream
Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory
availableTokenFilters, forName, lookupClass, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getChar, getClassArg, getLuceneMatchVersion, getOriginalArgs, getSet, isExplicitLuceneMatchVersion, require, require, require, requireChar, setExplicitLuceneMatchVersion
Field Details
KEEP
KEEP_IGNORE_CASE
OK_PREFIX
MIN_WORD_LENGTH
MAX_WORD_COUNT
MAX_TOKEN_LENGTH
ONLY_FIRST_WORD
FORCE_FIRST_LETTER
Constructor Details
CapitalizationFilterFactory
Creates a new CapitalizationFilterFactory
Method Details
create
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
Specified by:
create in class TokenFilterFactory
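The general shape of a factory whose create method transforms an input stream can be sketched without any Lucene types. Everything here is hypothetical and chosen for illustration: the class UpperFirstFactorySketch, the use of Iterator<String> in place of TokenStream, the string-to-string settings map, and the fallback of "0" for minWordLength. It shows only the wrapping pattern, not the real CapitalizationFilterFactory.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a token filter factory; not Lucene's API.
final class UpperFirstFactorySketch {
    private final int minWordLength;

    // Settings arrive as strings, as they would from configuration attributes.
    // The fallback "0" is an assumption of this sketch.
    UpperFirstFactorySketch(Map<String, String> args) {
        this.minWordLength = Integer.parseInt(args.getOrDefault("minWordLength", "0"));
    }

    // Wraps the input in a new iterator that transforms tokens lazily,
    // one at a time, as the caller consumes them.
    Iterator<String> create(Iterator<String> input) {
        return new Iterator<>() {
            @Override
            public boolean hasNext() {
                return input.hasNext();
            }

            @Override
            public String next() {
                String token = input.next();
                if (token.isEmpty() || token.length() < minWordLength) {
                    return token;
                }
                return token.substring(0, 1).toUpperCase(Locale.ROOT) + token.substring(1);
            }
        };
    }

    // Convenience for trying the sketch: runs all tokens through create.
    List<String> applyAll(List<String> tokens) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = create(tokens.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }
}
```

The factory parses its settings once in the constructor, and each call to create returns a fresh wrapper around the given input, so one configured factory can serve many streams.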