Package opennlp.tools.tokenize
Class TokenSampleStream
- All Implemented Interfaces:
AutoCloseable, ObjectStream<TokenSample>
This class is a stream filter that reads string-encoded samples and creates
TokenSamples from them. Each input string is split into tokens wherever
whitespace or the special separator sequence occurs.
Sample:
"token1 token2 token3<SPLIT>token4"
The tokens token1 and token2 are separated by whitespace; token3 and token4
are separated by the special character sequence, in this case the default
split sequence.
The sequence must be unique in the input string and is not escaped.
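The splitting rule described above can be illustrated with a minimal, standalone sketch. It does not use OpenNLP and is not the library's implementation: SplitSketch and tokenize are hypothetical names introduced here, and the sketch returns plain strings, whereas TokenSampleStream produces TokenSample objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class SplitSketch {

    // Hypothetical helper (an assumption, not an OpenNLP method):
    // returns the tokens implied by a string-encoded sample.
    static List<String> tokenize(String sample, String separator) {
        List<String> tokens = new ArrayList<>();
        // Step 1: split the sample on runs of whitespace.
        for (String chunk : sample.trim().split("\\s+")) {
            // Step 2: split each chunk on the literal separator sequence.
            // Pattern.quote prevents the separator being read as a regex.
            for (String token : chunk.split(Pattern.quote(separator))) {
                if (!token.isEmpty()) {
                    tokens.add(token);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [token1, token2, token3, token4]
        System.out.println(tokenize("token1 token2 token3<SPLIT>token4", "<SPLIT>"));
    }
}
```

Because the separator is matched literally and cannot be escaped, any occurrence of it in the input is treated as a token boundary, which is why it must not appear in the text for any other purpose.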
-
Constructor Summary
Constructors
TokenSampleStream(ObjectStream<String> sentences)
TokenSampleStream(ObjectStream<String> sampleStrings, String separatorChars)
-
Method Summary
Methods inherited from class opennlp.tools.util.FilterObjectStream
close, reset
-
Constructor Details
-
TokenSampleStream
-
TokenSampleStream
-
-
Method Details
-
read
Description copied from interface: ObjectStream
Returns the next object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
- Returns:
- the next object or null to signal that the stream is exhausted
- Throws:
IOException - if there is an error during reading
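The read contract above (call repeatedly until null is returned) suggests a simple consumption loop. The following is a sketch under stated assumptions: NullTerminatedStream is a hypothetical stand-in declared here so the example runs without OpenNLP on the classpath, and drain is an illustrative helper, not a library method.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ReadLoopSketch {

    // Hypothetical stand-in for the read() contract; not the OpenNLP interface.
    interface NullTerminatedStream<T> {
        T read() throws IOException;
    }

    // Collects every object the stream yields, stopping at the first null.
    static <T> List<T> drain(NullTerminatedStream<T> stream) {
        List<T> items = new ArrayList<>();
        try {
            T item;
            while ((item = stream.read()) != null) {
                items.add(item);
            }
        } catch (IOException e) {
            // Wrapped only to keep this sketch's call sites short.
            throw new UncheckedIOException(e);
        }
        return items;
    }

    public static void main(String[] args) {
        Iterator<String> source = List.of("a b", "c<SPLIT>d").iterator();
        NullTerminatedStream<String> stream = () -> source.hasNext() ? source.next() : null;
        // Prints [a b, c<SPLIT>d]
        System.out.println(drain(stream));
    }
}
```

With the real class, the same loop shape would apply to a TokenSampleStream, with TokenSample as the element type and the IOException either handled or propagated by the caller.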
-