Class HyphenationCompoundWordTokenFilterFactory
- java.lang.Object
  - org.apache.lucene.analysis.util.AbstractAnalysisFactory
    - org.apache.lucene.analysis.util.TokenFilterFactory
      - org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilterFactory

All Implemented Interfaces:
ResourceLoaderAware
public class HyphenationCompoundWordTokenFilterFactory extends TokenFilterFactory implements ResourceLoaderAware
Factory for HyphenationCompoundWordTokenFilter. This factory accepts the following parameters:
- hyphenator (mandatory): path to the FOP XML hyphenation pattern. See http://offo.sourceforge.net/hyphenation/.
- encoding (optional): encoding of the XML hyphenation file. Defaults to UTF-8.
- dictionary (optional): dictionary of words. Defaults to no dictionary.
- minWordSize (optional): minimal word length that gets decomposed. Defaults to 5.
- minSubwordSize (optional): minimum length of subwords. Defaults to 2.
- maxSubwordSize (optional): maximum length of subwords. Defaults to 15.
- onlyLongestMatch (optional): if true, adds only the longest matching subword to the stream. Defaults to false.
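To make the interaction of the size parameters concrete, the following is a simplified sketch of how candidate subwords could be selected. It is an illustration only, not Lucene's implementation: the class `SubwordSketch`, the method `decompose`, and the `breaks` array are hypothetical, and the break positions are supplied by hand here, whereas the real filter derives them from the FOP hyphenation patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the parameter semantics; not part of Lucene.
final class SubwordSketch {

    /**
     * Enumerates subwords of `word` that start and end at the given break positions.
     * `breaks` must be ascending and include 0 and word.length().
     * A null dictionary means "no dictionary": every candidate of a valid length is kept.
     */
    static List<String> decompose(String word, int[] breaks, Set<String> dictionary,
                                  int minWordSize, int minSubwordSize, int maxSubwordSize,
                                  boolean onlyLongestMatch) {
        List<String> out = new ArrayList<>();
        // Words shorter than minWordSize are not decomposed at all.
        if (word.length() < minWordSize) {
            return out;
        }
        for (int i = 0; i < breaks.length; i++) {
            String longest = null;
            for (int j = i + 1; j < breaks.length; j++) {
                int len = breaks[j] - breaks[i];
                if (len > maxSubwordSize) {
                    break; // later break points only make the candidate longer
                }
                if (len < minSubwordSize) {
                    continue; // too short, try the next break point
                }
                String part = word.substring(breaks[i], breaks[j]);
                if (dictionary != null && !dictionary.contains(part)) {
                    continue; // a dictionary is configured and rejects this candidate
                }
                if (onlyLongestMatch) {
                    longest = part; // candidates grow with j, so the last match is the longest
                } else {
                    out.add(part);
                }
            }
            if (longest != null) {
                out.add(longest);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] breaks = {0, 4, 11}; // assumed break points for "rindfleisch"
        System.out.println(decompose("rindfleisch", breaks,
                Set.of("rind", "fleisch"), 5, 2, 15, false));
    }
}
```

In this sketch a dictionary narrows the candidates to known words, while `onlyLongestMatch` keeps at most one candidate per starting position.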
<fieldType name="text_hyphncomp" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.HyphenationCompoundWordTokenFilterFactory" hyphenator="hyphenator.xml" encoding="UTF-8"
            dictionary="dictionary.txt" minWordSize="5" minSubwordSize="2" maxSubwordSize="15" onlyLongestMatch="false"/>
  </analyzer>
</fieldType>

See Also:
HyphenationCompoundWordTokenFilter

Field Summary
Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM
Constructor Summary
HyphenationCompoundWordTokenFilterFactory(java.util.Map<java.lang.String,java.lang.String> args)
    Creates a new HyphenationCompoundWordTokenFilterFactory
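As a rough sketch of what the constructor's `args` map might contain, the block below mirrors the attributes of the Solr `<filter>` element from the class description. The class `FactoryArgsSketch` and the method `exampleArgs` are hypothetical names. A mutable `HashMap` is used on the assumption that the factory consumes entries while parsing them. The Lucene calls appear only in comments, since they need the Lucene analysis module on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; only the parameter names and values come from the documentation above.
final class FactoryArgsSketch {

    /** Builds the argument map corresponding to the example <filter> element. */
    static Map<String, String> exampleArgs() {
        Map<String, String> args = new HashMap<>();
        args.put("hyphenator", "hyphenator.xml");   // mandatory
        args.put("encoding", "UTF-8");              // optional, default UTF-8
        args.put("dictionary", "dictionary.txt");   // optional, default: no dictionary
        args.put("minWordSize", "5");               // optional, default 5
        args.put("minSubwordSize", "2");            // optional, default 2
        args.put("maxSubwordSize", "15");           // optional, default 15
        args.put("onlyLongestMatch", "false");      // optional, default false
        return args;
    }

    public static void main(String[] unused) {
        Map<String, String> args = exampleArgs();
        System.out.println(args);
        // With Lucene available, the documented lifecycle would be roughly:
        //   HyphenationCompoundWordTokenFilterFactory factory =
        //       new HyphenationCompoundWordTokenFilterFactory(args);
        //   factory.inform(loader);   // a ResourceLoader able to find hyphenator.xml
        //   TokenStream filtered = factory.create(input);
    }
}
```

All values are strings because the constructor takes a `Map<String,String>`; the factory is responsible for parsing numbers and booleans.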
Method Summary
HyphenationCompoundWordTokenFilter create(TokenStream input)
    Transform the specified input TokenStream
void inform(ResourceLoader loader)
    Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
Methods inherited from class org.apache.lucene.analysis.util.TokenFilterFactory
availableTokenFilters, forName, lookupClass, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
get, get, get, get, get, getChar, getClassArg, getLuceneMatchVersion, getOriginalArgs, getSet, isExplicitLuceneMatchVersion, require, require, require, requireChar, setExplicitLuceneMatchVersion
Method Detail
inform
public void inform(ResourceLoader loader) throws java.io.IOException
Description copied from interface: ResourceLoaderAware
Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
Specified by: inform in interface ResourceLoaderAware
Throws: java.io.IOException

create
public HyphenationCompoundWordTokenFilter create(TokenStream input)
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
Specified by: create in class TokenFilterFactory