Package org.dizitart.no2.index.fulltext
Interface TextTokenizer
- All Known Implementing Classes:
BaseTextTokenizer, EnglishTextTokenizer, UniversalTextTokenizer
public interface TextTokenizer
An interface representing a stop-word based text tokenizer.
- Since:
- 1.0
- Author:
- Anindya Chatterjee.
- See Also:
EnglishTextTokenizer
Method Summary
Modifier and Type    Method                   Description
Languages            getLanguage()            Gets the language for the tokenizer.
Set<String>          stopWords()              Gets all stop-words for a language.
Set<String>          tokenize(String text)    Tokenizes the given text and discards all stop-words from it.
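The three methods above can be illustrated with a minimal, self-contained sketch. It does not depend on the library: the nested `Languages` enum and `TextTokenizer` interface are local stand-ins that only mirror the signatures listed in the method summary, and `SimpleEnglishTokenizer`, its tiny stop-word list, the lower-casing, and the splitting rule are all assumptions made for illustration, not the behavior of `EnglishTextTokenizer` or any other shipped implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

public class TokenizerSketch {

    // Stand-in for the library's Languages type (assumption: one constant is enough here).
    enum Languages { English }

    // Stand-in mirroring the three methods from the method summary.
    interface TextTokenizer {
        Languages getLanguage();
        Set<String> stopWords();
        Set<String> tokenize(String text);
    }

    // Hypothetical implementation; the stop-word list is deliberately tiny.
    static class SimpleEnglishTokenizer implements TextTokenizer {
        private static final Set<String> STOP_WORDS =
                new LinkedHashSet<>(Arrays.asList("a", "an", "and", "of", "the"));

        @Override
        public Languages getLanguage() {
            return Languages.English;
        }

        @Override
        public Set<String> stopWords() {
            return STOP_WORDS;
        }

        @Override
        public Set<String> tokenize(String text) {
            // LinkedHashSet keeps first-seen order and drops duplicate tokens.
            Set<String> tokens = new LinkedHashSet<>();
            if (text == null) {
                return tokens;
            }
            // Assumption: lower-case the input and split on anything that is
            // not a letter or a digit.
            for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
                if (!word.isEmpty() && !STOP_WORDS.contains(word)) {
                    tokens.add(word);
                }
            }
            return tokens;
        }
    }

    public static void main(String[] args) {
        TextTokenizer tokenizer = new SimpleEnglishTokenizer();
        // prints [quick, fox, lazy, dog]
        System.out.println(tokenizer.tokenize("The quick fox and the lazy dog"));
    }
}
```

Returning a `Set<String>` means repeated words collapse into a single token, which matches the return type in the method summary; whether a real implementation preserves order or case is not stated on this page.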