Package it.unimi.dsi.io
Class LineWordReader
- java.lang.Object
  - it.unimi.dsi.io.LineWordReader
- All Implemented Interfaces:
  WordReader, Serializable
public class LineWordReader extends Object implements WordReader, Serializable
A trivial WordReader that considers each line of a document a single word.
This class is intended for indexing data such as lists of document identifiers: if the identifiers contain nonalphabetical characters, the default FastBufferedReader might do a poor job.
Note that the non-word returned by next(MutableString, MutableString) is always empty.
- See Also:
  Serialized Form
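
The line-as-word behaviour described above can be illustrated with a small self-contained sketch. It does not use the dsiutils classes: `LineAsWordSketch` and its `words` helper are hypothetical names, and `StringBuilder` stands in for `MutableString`. The sketch only mimics the documented contract (one word per line, non-word always empty), so identifiers containing punctuation survive as single words.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the dsiutils class: it only mimics the documented
// behaviour "each line is one word, the non-word is always empty".
final class LineAsWordSketch {
    private BufferedReader in;

    // Same shape as setReader(Reader): start reading from a new source, return this.
    LineAsWordSketch setReader(Reader reader) {
        in = new BufferedReader(reader);
        return this;
    }

    // Same shape as next(word, nonWord), with StringBuilder in place of MutableString.
    // On end of input it returns false and leaves both buffers unchanged.
    boolean next(StringBuilder word, StringBuilder nonWord) throws IOException {
        String line = in.readLine();
        if (line == null) return false;
        word.setLength(0);
        word.append(line);
        nonWord.setLength(0); // the non-word is always empty
        return true;
    }

    // Collects every word of a document held in a string.
    static List<String> words(String document) {
        LineAsWordSketch reader = new LineAsWordSketch().setReader(new StringReader(document));
        StringBuilder word = new StringBuilder();
        StringBuilder nonWord = new StringBuilder();
        List<String> result = new ArrayList<>();
        try {
            while (reader.next(word, nonWord)) result.add(word.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Identifiers with non-alphabetical characters stay intact, one per line.
        System.out.println(words("doc-001\nhttp://example.org/a?b=1\nid_42"));
    }
}
```

With the real library, the same loop would presumably be written against `LineWordReader` and `MutableString`; the sketch avoids them only to stay runnable without the dsiutils jar.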
Constructor Summary
Constructor and Description:
- LineWordReader()
Method Summary
Modifier and Type, Method, and Description:
- LineWordReader copy(): Returns a copy of this word reader.
- boolean next(MutableString word, MutableString nonWord): Extracts the next word and non-word.
- LineWordReader setReader(Reader reader): Resets the internal state of this word reader, which will start again reading from the given reader.
Method Detail
next
public boolean next(MutableString word, MutableString nonWord) throws IOException
Description copied from interface: WordReader
Extracts the next word and non-word.
If this method returns true, a new non-empty word, and possibly a new non-word, have been extracted. It is acceptable that the first call to this method after creation or after a call to WordReader.setReader(Reader) returns an empty word. In other words, both word and nonWord are maximal.
- Specified by:
  next in interface WordReader
- Parameters:
  word - the next word returned by the underlying reader.
  nonWord - the nonword following the next word returned by the underlying reader.
- Returns:
  true if a new word was processed, false otherwise (in which case both word and nonWord are unchanged).
- Throws:
  IOException
setReader
public LineWordReader setReader(Reader reader)
Description copied from interface: WordReader
Resets the internal state of this word reader, which will start again reading from the given reader.
- Specified by:
  setReader in interface WordReader
- Parameters:
  reader - the new reader providing characters.
- Returns:
  this word reader.
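
Because setReader resets the internal state and returns the word reader itself, one instance can be reused across many documents. The sketch below illustrates that pattern under stated assumptions: `ReusableLineReaderSketch` and `countLines` are hypothetical names, the class is a stand-in rather than the dsiutils implementation, and `next` is simplified to a single `StringBuilder` argument.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a line-based word reader; NOT the dsiutils implementation.
final class ReusableLineReaderSketch {
    private BufferedReader in;

    // Drops any state tied to the previous source and returns this for chaining.
    ReusableLineReaderSketch setReader(Reader reader) {
        in = new BufferedReader(reader);
        return this;
    }

    // Reads one line into word; returns false at end of input, leaving word untouched.
    boolean next(StringBuilder word) throws IOException {
        String line = in.readLine();
        if (line == null) return false;
        word.setLength(0);
        word.append(line);
        return true;
    }

    // Counts the words (lines) of each document, reusing one reader instance throughout.
    static int[] countLines(String... documents) {
        ReusableLineReaderSketch reader = new ReusableLineReaderSketch();
        StringBuilder word = new StringBuilder();
        int[] counts = new int[documents.length];
        try {
            for (int i = 0; i < documents.length; i++) {
                reader.setReader(new StringReader(documents[i]));
                while (reader.next(word)) counts[i]++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLines("a\nb\nc", "x\ny");
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

Reusing the reader this way avoids allocating a new word reader per document; whether that matters in practice depends on the workload.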
copy
public LineWordReader copy()
Description copied from interface: WordReader
Returns a copy of this word reader.
This method must return a word reader with a behaviour that matches exactly that of this word reader.
- Specified by:
  copy in interface WordReader
- Returns:
  a copy of this word reader.