Class CompoundWordTokenFilterBase
java.lang.Object
co.elastic.clients.elasticsearch._types.analysis.TokenFilterBase
co.elastic.clients.elasticsearch._types.analysis.CompoundWordTokenFilterBase
- All Implemented Interfaces:
JsonpSerializable
- Direct Known Subclasses:
DictionaryDecompounderTokenFilter, HyphenationDecompounderTokenFilter
-
Nested Class Summary
Nested Classes
static class CompoundWordTokenFilterBase.AbstractBuilder<BuilderT extends CompoundWordTokenFilterBase.AbstractBuilder<BuilderT>>
-
Constructor Summary
Constructors
protected CompoundWordTokenFilterBase
-
Method Summary
Modifier and Type, Method, Description
final Integer maxSubwordSize(): Maximum subword character length.
final Integer minSubwordSize(): Minimum subword character length.
final Integer minWordSize(): Minimum word character length.
final Boolean onlyLongestMatch(): If true, only include the longest matching subword.
protected void serializeInternal(jakarta.json.stream.JsonGenerator generator, JsonpMapper mapper)
protected static <BuilderT extends CompoundWordTokenFilterBase.AbstractBuilder<BuilderT>> void setupCompoundWordTokenFilterBaseDeserializer(ObjectDeserializer<BuilderT> op)
final List<String> wordList(): A list of subwords to look for in the token stream.
final String wordListPath(): Path to a file that contains a list of subwords to find in the token stream.
Methods inherited from class co.elastic.clients.elasticsearch._types.analysis.TokenFilterBase
serialize, setupTokenFilterBaseDeserializer, toString, version
-
Constructor Details
-
CompoundWordTokenFilterBase
-
-
Method Details
-
maxSubwordSize
Maximum subword character length. Longer subword tokens are excluded from the output. Defaults to 15.
API name: max_subword_size
-
minSubwordSize
Minimum subword character length. Shorter subword tokens are excluded from the output. Defaults to 2.
API name: min_subword_size
-
minWordSize
Minimum word character length. Shorter word tokens are excluded from the output. Defaults to 5.
API name: min_word_size
-
onlyLongestMatch
If true, only include the longest matching subword. Defaults to false.
API name: only_longest_match
-
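The size and matching settings above (max_subword_size, min_subword_size, min_word_size, only_longest_match) interact when a token is scanned for subwords. The following is a minimal sketch of that interaction, not the Elasticsearch or Lucene implementation. It assumes a brute-force scan over every start position, a lowercase word list, and that only_longest_match is applied per start position. The class `DecompoundSketch` and its `decompound` method are hypothetical names, and the method returns only the subwords that would be added, leaving the handling of the original token out of scope.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; not part of the Elasticsearch Java client.
final class DecompoundSketch {

    // Returns the subwords of `token` found in `wordList`, honoring the size limits.
    // Assumes minSubwordSize >= 1 and a word list that is already lowercase.
    static List<String> decompound(String token, Set<String> wordList,
                                   int minWordSize, int minSubwordSize,
                                   int maxSubwordSize, boolean onlyLongestMatch) {
        List<String> subwords = new ArrayList<>();
        // Tokens shorter than min_word_size are not scanned at all.
        if (token.length() < minWordSize) {
            return subwords;
        }
        String lower = token.toLowerCase(Locale.ROOT);
        for (int start = 0; start + minSubwordSize <= lower.length(); start++) {
            String longest = null;
            // Candidate lengths are bounded by min_subword_size and max_subword_size.
            for (int len = minSubwordSize; len <= maxSubwordSize; len++) {
                if (start + len > lower.length()) {
                    break;
                }
                String candidate = lower.substring(start, start + len);
                if (!wordList.contains(candidate)) {
                    continue;
                }
                if (onlyLongestMatch) {
                    longest = candidate; // later candidates are always longer
                } else {
                    subwords.add(candidate);
                }
            }
            if (longest != null) {
                subwords.add(longest);
            }
        }
        return subwords;
    }

    public static void main(String[] args) {
        Set<String> words = Set.of("donau", "dampf", "schiff", "dampfschiff");
        System.out.println(decompound("Donaudampfschiff", words, 5, 2, 15, false));
        System.out.println(decompound("Donaudampfschiff", words, 5, 2, 15, true));
    }
}
```

In this sketch, enabling `onlyLongestMatch` drops `dampf` because the longer `dampfschiff` starts at the same position, and lowering `maxSubwordSize` to 10 would drop `dampfschiff` instead.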
wordList
A list of subwords to look for in the token stream. If found, the subword is included in the token output. Either this parameter or word_list_path must be specified.
API name: word_list
-
wordListPath
Path to a file that contains a list of subwords to find in the token stream. If found, the subword is included in the token output. This path must be absolute or relative to the config location, and the file must be UTF-8 encoded. Each token in the file must be separated by a line break. Either this parameter or word_list must be specified.
API name: word_list_path
-
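The rules for word_list and word_list_path can be illustrated with a small sketch. This is an assumption-laden illustration rather than how Elasticsearch resolves the file: `WordListSketch`, `resolve`, and the `configDir` parameter are hypothetical, the sketch prefers the inline list when both are given, and it trims whitespace and skips blank lines, none of which the description above specifies.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of the Elasticsearch Java client.
final class WordListSketch {

    // Returns the inline list if present, otherwise reads the file at wordListPath.
    static List<String> resolve(List<String> wordList, String wordListPath, Path configDir)
            throws IOException {
        if (wordList != null) {
            return List.copyOf(wordList); // assumption: inline list wins if both are set
        }
        if (wordListPath == null) {
            throw new IllegalArgumentException(
                    "Either word_list or word_list_path must be specified");
        }
        // Path.resolve returns an absolute argument unchanged, so this handles
        // both absolute paths and paths relative to the config location.
        Path file = configDir.resolve(wordListPath);
        List<String> words = new ArrayList<>();
        // The file must be UTF-8 encoded with one token per line.
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String word = line.trim(); // assumption: surrounding whitespace is ignored
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        Path configDir = Files.createTempDirectory("config");
        Files.write(configDir.resolve("subwords.txt"),
                List.of("donau", "dampf", "schiff"), StandardCharsets.UTF_8);
        System.out.println(resolve(null, "subwords.txt", configDir));
    }
}
```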
serializeInternal
- Overrides:
serializeInternal in class TokenFilterBase
-
setupCompoundWordTokenFilterBaseDeserializer
protected static <BuilderT extends CompoundWordTokenFilterBase.AbstractBuilder<BuilderT>> void setupCompoundWordTokenFilterBaseDeserializer(ObjectDeserializer<BuilderT> op)
-
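The recursive bound `BuilderT extends CompoundWordTokenFilterBase.AbstractBuilder<BuilderT>` is the self-bounded builder idiom: setters declared on the shared base return the concrete builder type, so calls can be chained with subclass-specific setters. The sketch below shows the idiom in isolation. Every name in it (`FilterBaseSketch`, `ConcreteFilterSketch`, the `label` field) is hypothetical, and it is not the client's source code.

```java
// Hypothetical stand-in for a shared base class with an abstract builder.
abstract class FilterBaseSketch {
    private final Integer minSubwordSize;
    private final Boolean onlyLongestMatch;

    protected FilterBaseSketch(AbstractBuilder<?> builder) {
        this.minSubwordSize = builder.minSubwordSize;
        this.onlyLongestMatch = builder.onlyLongestMatch;
    }

    public final Integer minSubwordSize() { return minSubwordSize; }

    public final Boolean onlyLongestMatch() { return onlyLongestMatch; }

    abstract static class AbstractBuilder<BuilderT extends AbstractBuilder<BuilderT>> {
        private Integer minSubwordSize;
        private Boolean onlyLongestMatch;

        // Shared setters return BuilderT, not AbstractBuilder, so chaining keeps the subtype.
        public final BuilderT minSubwordSize(Integer value) {
            this.minSubwordSize = value;
            return self();
        }

        public final BuilderT onlyLongestMatch(Boolean value) {
            this.onlyLongestMatch = value;
            return self();
        }

        // Each concrete builder returns `this` with its own static type.
        protected abstract BuilderT self();
    }
}

// Hypothetical concrete subclass with one extra, subclass-only setting.
final class ConcreteFilterSketch extends FilterBaseSketch {
    private final String label;

    private ConcreteFilterSketch(Builder builder) {
        super(builder);
        this.label = builder.label;
    }

    public String label() { return label; }

    static final class Builder extends FilterBaseSketch.AbstractBuilder<Builder> {
        private String label;

        public Builder label(String value) {
            this.label = value;
            return this;
        }

        @Override
        protected Builder self() { return this; }

        public ConcreteFilterSketch build() { return new ConcreteFilterSketch(this); }
    }

    public static void main(String[] args) {
        // label(...) is reachable after the base setters because they return Builder.
        ConcreteFilterSketch filter = new Builder()
                .minSubwordSize(3)
                .onlyLongestMatch(true)
                .label("demo")
                .build();
        System.out.println(filter.minSubwordSize() + " " + filter.label());
    }
}
```

Without the recursive bound, `minSubwordSize(3)` would return the base builder type and the subsequent `label("demo")` call would not compile.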