Package opennlp.tools.tokenize
Class WhitespaceTokenizer
java.lang.Object
opennlp.tools.tokenize.WhitespaceTokenizer
All Implemented Interfaces:
Tokenizer
This tokenizer splits the input text on whitespace.
To obtain an instance of this tokenizer, use the static final
INSTANCE field.
Field Summary
Fields
Modifier and Type | Field | Description
static final WhitespaceTokenizer | INSTANCE | Use this static reference to retrieve an instance of the WhitespaceTokenizer.
Method Summary
Field Details
INSTANCE
Use this static reference to retrieve an instance of the WhitespaceTokenizer.
Method Details
tokenizePos
Description copied from interface: Tokenizer
Finds the boundaries of atomic parts in a string.
Parameters:
d - The string to be tokenized.
Returns:
The Span[] with the spans (offsets into d) for each token as the individual array elements.
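To make "spans (offsets into the string)" concrete, the following is a minimal, self-contained sketch of how whitespace-delimited token boundaries could be computed. It is a stand-in, not the OpenNLP implementation: the class name WhitespaceSpanSketch is hypothetical, plain int pairs take the place of Span, the end offset is assumed to be exclusive, and java.lang.Character.isWhitespace is assumed as the whitespace test, which may differ from the library's own definition.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not the OpenNLP implementation. */
final class WhitespaceSpanSketch {

    /** Returns {start, end} offset pairs into d, with end exclusive (an assumption). */
    static List<int[]> tokenizePos(String d) {
        List<int[]> spans = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < d.length(); i++) {
            if (Character.isWhitespace(d.charAt(i))) {
                if (start >= 0) {
                    // Whitespace closes the token that began at start.
                    spans.add(new int[] {start, i});
                    start = -1;
                }
            } else if (start < 0) {
                // First non-whitespace character opens a new token.
                start = i;
            }
        }
        if (start >= 0) {
            // The input ended while still inside a token.
            spans.add(new int[] {start, d.length()});
        }
        return spans;
    }

    public static void main(String[] args) {
        String d = "The  quick fox";
        for (int[] span : tokenizePos(d)) {
            System.out.println("[" + span[0] + ".." + span[1] + ") "
                    + d.substring(span[0], span[1]));
        }
        // prints:
        // [0..3) The
        // [5..10) quick
        // [11..14) fox
    }
}
```

With OpenNLP on the classpath, the corresponding call would be WhitespaceTokenizer.INSTANCE.tokenizePos(d), which returns a Span[] rather than the int pairs used in this sketch.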
tokenize
Description copied from interface: Tokenizer
Splits a string into its atomic parts.
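As an illustration of what splitting on whitespace yields, here is a small self-contained sketch. It is not the OpenNLP code: the class name WhitespaceTokenizeSketch is hypothetical, and java.lang.Character.isWhitespace is an assumed whitespace test. Note that punctuation stays attached to the neighboring word, because only whitespace separates tokens.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for illustration; not the OpenNLP implementation. */
final class WhitespaceTokenizeSketch {

    /** Splits s into maximal runs of non-whitespace characters. */
    static String[] tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current token, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            // Flush the last token when the input does not end in whitespace.
            tokens.add(current.toString());
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Leading, trailing, and repeated whitespace produce no empty tokens.
        System.out.println(String.join("|", tokenize("  Hello,   world!\tBye\n")));
        // prints: Hello,|world!|Bye
    }
}
```

With OpenNLP on the classpath, the corresponding call would be WhitespaceTokenizer.INSTANCE.tokenize(s), which returns a String[].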