Package org.datavec.api.records.reader
Interface RecordReader
-
- All Superinterfaces:
AutoCloseable, Closeable, Configurable, Serializable
- All Known Subinterfaces:
SequenceRecordReader
- All Known Implementing Classes:
BaseRecordReader, CollectionRecordReader, CollectionSequenceRecordReader, ComposableRecordReader, ConcatenatingRecordReader, CSVLineSequenceRecordReader, CSVMultiSequenceRecordReader, CSVNLinesSequenceRecordReader, CSVRecordReader, CSVRegexRecordReader, CSVSequenceRecordReader, CSVVariableSlidingWindowRecordReader, FileBatchRecordReader, FileBatchSequenceRecordReader, FileRecordReader, InMemoryRecordReader, InMemorySequenceRecordReader, JacksonLineRecordReader, JacksonLineSequenceRecordReader, JacksonRecordReader, LibSvmRecordReader, LineRecordReader, ListStringRecordReader, MatlabRecordReader, RegexLineRecordReader, RegexSequenceRecordReader, SVMLightRecordReader, TransformProcessRecordReader, TransformProcessSequenceRecordReader
public interface RecordReader extends Closeable, Serializable, Configurable
-
-
Field Summary
static String APPEND_LABEL
static String LABELS
static String NAME_SPACE
-
Method Summary
boolean batchesSupported(): Returns true if the next(int) signature is supported by this RecordReader implementation.
List<String> getLabels(): List of label strings.
List<RecordListener> getListeners(): Get the record listeners for this record reader.
boolean hasNext(): Whether there are any more records.
void initialize(Configuration conf, InputSplit split): Called once at initialization.
void initialize(InputSplit split): Called once at initialization.
List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas): Load multiple records from the given list of RecordMetaData instances.
Record loadFromMetaData(RecordMetaData recordMetaData): Load a single record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using loadFromMetaData(List).
List<Writable> next(): Get the next record.
List<List<Writable>> next(int num): Used if batchesSupported() returns true.
Record nextRecord()
List<Writable> record(URI uri, DataInputStream dataInputStream): Load the record from the given DataInputStream. Unlike next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
void reset(): Reset the record reader iterator.
boolean resetSupported()
void setListeners(Collection<RecordListener> listeners): Set the record listeners for this record reader.
void setListeners(RecordListener... listeners): Set the record listeners for this record reader.
-
Methods inherited from interface org.datavec.api.conf.Configurable
getConf, setConf
-
-
-
-
Method Detail
-
initialize
void initialize(InputSplit split) throws IOException, InterruptedException
Called once at initialization.
- Parameters:
split - the split that defines the range of records to read
- Throws:
IOException
InterruptedException
-
initialize
void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException
Called once at initialization.
- Parameters:
conf - a configuration for initialization
split - the split that defines the range of records to read
- Throws:
IOException
InterruptedException
-
batchesSupported
boolean batchesSupported()
Returns true if the next(int) signature is supported by this RecordReader implementation.
- Returns:
- True if next(int) is supported, false otherwise
-
next
List<List<Writable>> next(int num)
This method is used only if batchesSupported() returns true.
- Parameters:
num - the number of records to read in one call
- Returns:
- The records read, one List<Writable> per record
-
hasNext
boolean hasNext()
Whether there are any more records.
- Returns:
- True if at least one more record is available, false otherwise
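The relationship between batchesSupported(), next(int), hasNext() and next() can be illustrated with a small sketch. It does not use DataVec itself: MiniRecordReader and ListBackedReader are hypothetical stand-ins that mirror only the method signatures listed above, with String in place of Writable so the example compiles without the library. It assumes that a caller should fall back to repeated next() calls when next(int) is not supported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in mirroring only the iteration methods documented above.
// The real interface returns List<Writable>; String is used to avoid the DataVec dependency.
interface MiniRecordReader {
    boolean batchesSupported();
    List<List<String>> next(int num);
    boolean hasNext();
    List<String> next();
}

// Hypothetical in-memory reader that does not support batches.
class ListBackedReader implements MiniRecordReader {
    private final Iterator<List<String>> it;
    ListBackedReader(List<List<String>> records) { this.it = records.iterator(); }
    public boolean batchesSupported() { return false; }
    public List<List<String>> next(int num) {
        throw new UnsupportedOperationException("next(int) is not supported");
    }
    public boolean hasNext() { return it.hasNext(); }
    public List<String> next() { return it.next(); }
}

class BatchReadSketch {
    // Reads up to num records, calling next(int) only when the reader reports support for it.
    static List<List<String>> readBatch(MiniRecordReader reader, int num) {
        if (reader.batchesSupported()) {
            return reader.next(num);
        }
        List<List<String>> batch = new ArrayList<>();
        while (batch.size() < num && reader.hasNext()) {
            batch.add(reader.next());
        }
        return batch;
    }

    public static void main(String[] args) {
        MiniRecordReader reader = new ListBackedReader(Arrays.asList(
                Arrays.asList("1", "a"), Arrays.asList("2", "b"), Arrays.asList("3", "c")));
        System.out.println(readBatch(reader, 2)); // first two records
        System.out.println(readBatch(reader, 2)); // the single remaining record
    }
}
```

Checking hasNext() inside the loop means the final batch may be shorter than num rather than failing when the reader runs out of records.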
-
reset
void reset()
Reset record reader iterator
-
resetSupported
boolean resetSupported()
- Returns:
- True if the record reader can be reset, false otherwise. Note that some record readers cannot be reset - for example, if they are backed by a non-resettable input split (such as certain types of streams)
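A minimal sketch of how a caller might guard reset() with resetSupported() when making several passes over the data. ResettableReader and InMemoryReader are hypothetical stand-ins that mirror the documented signatures (with String in place of Writable); they are not DataVec classes, and the multi-pass loop is only one possible usage pattern.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in mirroring hasNext(), next(), reset() and resetSupported() as documented above.
interface ResettableReader {
    boolean hasNext();
    List<String> next();
    void reset();
    boolean resetSupported();
}

// Hypothetical in-memory reader; it tracks an index so that reset() can rewind it.
class InMemoryReader implements ResettableReader {
    private final List<List<String>> records;
    private int position = 0;
    InMemoryReader(List<List<String>> records) { this.records = records; }
    public boolean hasNext() { return position < records.size(); }
    public List<String> next() { return records.get(position++); }
    public void reset() { position = 0; }
    public boolean resetSupported() { return true; }
}

class ResetSketch {
    // Makes up to `passes` full passes over the reader and returns the number of records read.
    // Stops after the current pass if the reader cannot be reset.
    static int countOverPasses(ResettableReader reader, int passes) {
        int total = 0;
        for (int pass = 0; pass < passes; pass++) {
            while (reader.hasNext()) {
                reader.next();
                total++;
            }
            boolean morePassesWanted = pass < passes - 1;
            if (morePassesWanted) {
                if (!reader.resetSupported()) {
                    break; // e.g. a stream-backed reader: only one pass is possible
                }
                reader.reset();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        ResettableReader reader = new InMemoryReader(Arrays.asList(
                Arrays.asList("1", "a"), Arrays.asList("2", "b")));
        System.out.println(countOverPasses(reader, 3)); // 2 records over 3 passes
    }
}
```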
-
record
List<Writable> record(URI uri, DataInputStream dataInputStream) throws IOException
Load the record from the given DataInputStream. Unlike next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
- Throws:
IOException - if an error occurs while reading from the input stream
-
nextRecord
Record nextRecord()
Similar to next(), but returns a Record object that may include metadata such as the source of the data.
- Returns:
- next record
-
loadFromMetaData
Record loadFromMetaData(RecordMetaData recordMetaData) throws IOException
Load a single record from the given RecordMetaData instance.
Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using loadFromMetaData(List).
- Parameters:
recordMetaData - Metadata for the record that we want to load from
- Returns:
- Single record for the given RecordMetaData instance
- Throws:
IOException - if an I/O error occurs during loading
-
loadFromMetaData
List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException
Load multiple records from the given list of RecordMetaData instances.
- Parameters:
recordMetaDatas - Metadata for the records that we want to load from
- Returns:
- Multiple records for the given RecordMetaData instances
- Throws:
IOException - if an I/O error occurs during loading
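The efficiency note on the single-record variant can be illustrated with a toy model. The sketch below is not how any DataVec reader is implemented: LineMeta and ScanningLoader are hypothetical stand-ins for RecordMetaData and a reader over unsplittable text, and the scans counter exists only to make the cost visible. It assumes that locating a record requires scanning the source from the start.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for RecordMetaData: here it only carries a line index.
class LineMeta {
    final int lineIndex;
    LineMeta(int lineIndex) { this.lineIndex = lineIndex; }
}

// Hypothetical loader over unsplittable text: finding any line requires a scan from the start.
class ScanningLoader {
    private final List<String> lines;
    int scans = 0; // number of full passes made over the source

    ScanningLoader(List<String> lines) { this.lines = lines; }

    // Single-record variant: costs one scan per call.
    List<String> loadFromMetaData(LineMeta meta) {
        return loadFromMetaData(Arrays.asList(meta)).get(0);
    }

    // List variant: one scan serves every requested record.
    List<List<String>> loadFromMetaData(List<LineMeta> metas) {
        scans++;
        Map<Integer, List<String>> found = new HashMap<>();
        for (LineMeta m : metas) {
            found.put(m.lineIndex, null); // mark the lines we need
        }
        for (int i = 0; i < lines.size(); i++) {
            if (found.containsKey(i)) {
                found.put(i, Arrays.asList(lines.get(i).split(",")));
            }
        }
        List<List<String>> out = new ArrayList<>();
        for (LineMeta m : metas) {
            out.add(found.get(m.lineIndex)); // preserve the requested order
        }
        return out;
    }
}

class MetaDataLoadSketch {
    public static void main(String[] args) {
        List<String> source = Arrays.asList("1,a", "2,b", "3,c");
        List<LineMeta> wanted = Arrays.asList(new LineMeta(2), new LineMeta(0));

        ScanningLoader oneByOne = new ScanningLoader(source);
        for (LineMeta m : wanted) {
            oneByOne.loadFromMetaData(m); // one scan for each record
        }

        ScanningLoader batched = new ScanningLoader(source);
        List<List<String>> records = batched.loadFromMetaData(wanted); // a single scan

        System.out.println(records);
        System.out.println(oneByOne.scans + " scans vs " + batched.scans);
    }
}
```

In this model the one-by-one loader scans the source once per requested record, while the list variant scans it once in total.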
-
getListeners
List<RecordListener> getListeners()
Get the record listeners for this record reader.
-
setListeners
void setListeners(RecordListener... listeners)
Set the record listeners for this record reader.
-
setListeners
void setListeners(Collection<RecordListener> listeners)
Set the record listeners for this record reader.
-
-