Class LineRecordReader
- java.lang.Object
  - org.datavec.api.records.reader.BaseRecordReader
    - org.datavec.api.records.reader.impl.LineRecordReader
- All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, Configurable, RecordReader
- Direct Known Subclasses:
CSVRecordReader, JacksonLineRecordReader, RegexLineRecordReader, SVMLightRecordReader
public class LineRecordReader extends BaseRecordReader
Reads files line by line.
- Author:
Adam Gibson
- See Also:
Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field):
- protected String charset
- protected Configuration conf
- protected boolean initialized
- protected int lineIndex
- protected URI[] locations
- protected int splitIndex
Fields inherited from class org.datavec.api.records.reader.BaseRecordReader
inputSplit, listeners, streamCreatorFn
-
Fields inherited from interface org.datavec.api.records.reader.RecordReader
APPEND_LABEL, LABELS, NAME_SPACE
-
-
Constructor Summary
Constructors:
- LineRecordReader()
-
Method Summary
All Methods, Instance Methods, Concrete Methods (Modifier and Type, Method: Description):
- void close()
- protected void closeIfRequired(Iterator<String> iterator)
- Configuration getConf(): Return the configuration used by this object.
- protected Iterator<String> getIterator(int location)
- List<String> getLabels(): List of label strings.
- boolean hasNext(): Whether there are any more records.
- void initialize(Configuration conf, InputSplit split): Called once at initialization.
- void initialize(InputSplit split): Called once at initialization.
- List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas): Load multiple records from the given list of RecordMetaData instances.
- Record loadFromMetaData(RecordMetaData recordMetaData): Load a single record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
- List<Writable> next(): Get the next record.
- Record nextRecord(): Similar to RecordReader.next(), but returns a Record object, which may include metadata such as the source of the data.
- protected void onLocationOpen(URI location)
- List<Writable> record(URI uri, DataInputStream dataInputStream): Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
- void reset(): Reset record reader iterator.
- boolean resetSupported()
- void setConf(Configuration conf): Set the configuration to be used by this object.
Methods inherited from class org.datavec.api.records.reader.BaseRecordReader
batchesSupported, getListeners, invokeListeners, next, setListeners, setListeners
-
-
-
-
Field Detail
-
locations
protected URI[] locations
-
splitIndex
protected int splitIndex
-
lineIndex
protected int lineIndex
-
conf
protected Configuration conf
-
initialized
protected boolean initialized
-
charset
protected String charset
-
-
Method Detail
-
initialize
public void initialize(InputSplit split) throws IOException, InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
- Specified by:
initialize in interface RecordReader
- Overrides:
initialize in class BaseRecordReader
- Parameters:
split - the split that defines the range of records to read
- Throws:
IOException
InterruptedException
-
initialize
public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
- Parameters:
conf - a configuration for initialization
split - the split that defines the range of records to read
- Throws:
IOException
InterruptedException
-
next
public List<Writable> next()
Description copied from interface: RecordReader
Get the next record.
-
hasNext
public boolean hasNext()
Description copied from interface: RecordReader
Whether there are any more records.
-
onLocationOpen
protected void onLocationOpen(URI location)
-
close
public void close() throws IOException
- Throws:
IOException
-
setConf
public void setConf(Configuration conf)
Description copied from interface: Configurable
Set the configuration to be used by this object.
-
getConf
public Configuration getConf()
Description copied from interface: Configurable
Return the configuration used by this object.
-
getLabels
public List<String> getLabels()
Description copied from interface: RecordReader
List of label strings.
-
reset
public void reset()
Description copied from interface: RecordReader
Reset record reader iterator.
-
resetSupported
public boolean resetSupported()
- Returns:
- True if the record reader can be reset, false otherwise. Note that some record readers cannot be reset - for example, if they are backed by a non-resettable input split (such as certain types of streams)
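The hasNext(), next(), reset() and resetSupported() contract can be illustrated with a small stand-in that uses only the JDK. This is a sketch, not the DataVec implementation: the class name LineReaderSketch, the linesRead() helper, and the in-memory lists standing in for file locations are all assumptions made for illustration. The field names splitIndex and lineIndex are borrowed from the Field Summary only to suggest how such a reader might track its position.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in (not the DataVec class): reads in-memory "locations" line by line. */
final class LineReaderSketch {
    private final List<List<String>> locations; // one list of lines per location
    private int splitIndex;                     // index of the location currently open
    private int lineIndex;                      // number of lines returned since the last reset
    private Iterator<String> iter;              // iterator over the current location

    LineReaderSketch(List<List<String>> locations) {
        this.locations = locations;
        reset();
    }

    /** True if a line remains in the current location or in any later one. */
    boolean hasNext() {
        // Skip over exhausted (or empty) locations until a line is found.
        while (!iter.hasNext() && splitIndex < locations.size() - 1) {
            splitIndex++;
            iter = locations.get(splitIndex).iterator();
        }
        return iter.hasNext();
    }

    /** Returns the next line, advancing the reader's internal state. */
    String next() {
        if (!hasNext()) {
            throw new NoSuchElementException("No more lines");
        }
        lineIndex++;
        return iter.next();
    }

    /** Rewinds to the first line of the first location. */
    void reset() {
        splitIndex = 0;
        lineIndex = 0;
        iter = locations.isEmpty()
                ? Collections.<String>emptyIterator()
                : locations.get(0).iterator();
    }

    /** In-memory lists can always be re-read, so this sketch is always resettable. */
    boolean resetSupported() {
        return true;
    }

    int linesRead() {
        return lineIndex;
    }
}
```

A caller would typically loop with `while (reader.hasNext()) { reader.next(); }` and check resetSupported() before calling reset() for another pass over the data.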
-
record
public List<Writable> record(URI uri, DataInputStream dataInputStream) throws IOException
Description copied from interface: RecordReader
Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
- Throws:
IOException - if an error occurs during reading from the input stream
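The two constraints stated for record(URI, DataInputStream), that reader state is left untouched and that the stream is not closed, can be sketched with the JDK alone. The following assumes a record is a single line of text and uses a plain String in place of a Writable. The names StreamRecordSketch and firstLineOf are hypothetical, and this is not presented as the library's implementation.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: load one single-field record from a stream the caller owns. */
final class StreamRecordSketch {

    /** Reads one line as a one-element record. Static, so no reader state is modified. */
    static List<String> record(DataInputStream in) throws IOException {
        // Deliberately no try-with-resources here: closing the BufferedReader
        // would also close the caller's DataInputStream.
        BufferedReader br = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String line = br.readLine();
        if (line == null) {
            throw new IOException("Stream contained no line to read");
        }
        return Collections.singletonList(line);
    }

    /** Convenience wrapper for the example: the caller opens and closes the stream. */
    static List<String> firstLineOf(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return record(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

One caveat of this approach: BufferedReader reads ahead, so bytes after the first line may already have been consumed from the stream when record returns.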
-
nextRecord
public Record nextRecord()
Description copied from interface: RecordReader
Similar to RecordReader.next(), but returns a Record object, which may include metadata such as the source of the data.
- Returns:
- next record
-
loadFromMetaData
public Record loadFromMetaData(RecordMetaData recordMetaData) throws IOException
Description copied from interface: RecordReader
Load a single record from the given RecordMetaData instance.
Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
- Parameters:
recordMetaData - Metadata for the record that we want to load from
- Returns:
- Single record for the given RecordMetaData instance
- Throws:
IOException - If an I/O error occurs during loading
-
loadFromMetaData
public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException
Description copied from interface: RecordReader
Load multiple records from the given list of RecordMetaData instances.
- Parameters:
recordMetaDatas - Metadata for the records that we want to load from
- Returns:
- Multiple records for the given RecordMetaData instances
- Throws:
IOException - If an I/O error occurs during loading
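The note that loading several records at once is more efficient for non-splittable text can be made concrete with a sketch. Reaching line N of a text source means scanning lines 0 to N-1 first, so grouping requests by source and sorting them allows a single forward pass per source. Everything here is an assumption for illustration: BatchLoadSketch, LineRef (a stand-in for record metadata), and the in-memory map of sources are hypothetical, and this is not claimed to be how DataVec implements loadFromMetaData(List).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: fetch many lines while scanning each source only once. */
final class BatchLoadSketch {

    /** Stand-in for record metadata: a source name plus a zero-based line number. */
    static final class LineRef {
        final String source;
        final int lineNumber;

        LineRef(String source, int lineNumber) {
            if (lineNumber < 0) {
                throw new IllegalArgumentException("lineNumber must be >= 0");
            }
            this.source = source;
            this.lineNumber = lineNumber;
        }
    }

    /**
     * Returns the requested lines grouped by source name and ordered by line number
     * (a choice of this sketch, not request order). Duplicate requests collapse to one.
     */
    static List<String> loadAll(Map<String, List<String>> sources, List<LineRef> refs) {
        // Group the wanted line numbers per source; TreeSet keeps them ascending.
        Map<String, TreeSet<Integer>> wanted = new TreeMap<>();
        for (LineRef ref : refs) {
            wanted.computeIfAbsent(ref.source, k -> new TreeSet<>()).add(ref.lineNumber);
        }

        List<String> out = new ArrayList<>();
        for (Map.Entry<String, TreeSet<Integer>> entry : wanted.entrySet()) {
            List<String> lines = sources.get(entry.getKey());
            if (lines == null) {
                throw new IllegalArgumentException("Unknown source: " + entry.getKey());
            }
            // One forward pass: the iterator is never rewound between targets.
            Iterator<String> it = lines.iterator();
            int current = 0; // index of the next line the iterator will return
            for (int target : entry.getValue()) {
                String line = null;
                while (current <= target) {
                    if (!it.hasNext()) {
                        throw new IllegalArgumentException(
                                "Line " + target + " is past the end of " + entry.getKey());
                    }
                    line = it.next();
                    current++;
                }
                out.add(line);
            }
        }
        return out;
    }
}
```

Loading the same lines one request at a time would restart the scan from line 0 for every request, which is the cost the note above warns about.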
-
-