Class CSVVariableSlidingWindowRecordReader
- java.lang.Object
- org.datavec.api.records.reader.BaseRecordReader
- org.datavec.api.records.reader.impl.LineRecordReader
- org.datavec.api.records.reader.impl.csv.CSVRecordReader
- org.datavec.api.records.reader.impl.csv.CSVVariableSlidingWindowRecordReader
- All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, Configurable, RecordReader, SequenceRecordReader
public class CSVVariableSlidingWindowRecordReader extends CSVRecordReader implements SequenceRecordReader
- See Also:
- Serialized Form
-
-
Field Summary
Fields (Modifier and Type, Field):
- static String LINES_PER_SEQUENCE
Fields inherited from class org.datavec.api.records.reader.impl.csv.CSVRecordReader
DEFAULT_DELIMITER, DEFAULT_QUOTE, DELIMITER, QUOTE, SKIP_NUM_LINES, skipNumLines
-
Fields inherited from class org.datavec.api.records.reader.impl.LineRecordReader
charset, conf, initialized, lineIndex, locations, splitIndex
-
Fields inherited from class org.datavec.api.records.reader.BaseRecordReader
inputSplit, listeners, streamCreatorFn
-
Fields inherited from interface org.datavec.api.records.reader.RecordReader
APPEND_LABEL, LABELS, NAME_SPACE
-
-
Constructor Summary
Constructors (Constructor: Description):
- CSVVariableSlidingWindowRecordReader(): No-arg constructor with the default number of lines per sequence (10)
- CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence)
- CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int stride)
- CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int skipNumLines, int stride, String delimiter)
- CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int stride, String delimiter)
-
Method Summary
Methods (Modifier and Type, Method: Description):
- boolean hasNext(): Whether there are any more records
- void initialize(Configuration conf, InputSplit split): Called once at initialization.
- List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas): Load multiple records from the given list of RecordMetaData instances
- Record loadFromMetaData(RecordMetaData recordMetaData): Load a single record from the given RecordMetaData instance. Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List)
- List<SequenceRecord> loadSequenceFromMetaData(List<RecordMetaData> recordMetaDatas): Load multiple sequence records from the given list of RecordMetaData instances
- SequenceRecord loadSequenceFromMetaData(RecordMetaData recordMetaData): Load a single sequence record from the given RecordMetaData instance. Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List)
- SequenceRecord nextSequence(): Similar to SequenceRecordReader.sequenceRecord(), but returns a Record object that may include metadata such as the source of the data
- void reset(): Reset record reader iterator
- List<List<Writable>> sequenceRecord(): Returns a sequence record.
- List<List<Writable>> sequenceRecord(URI uri, DataInputStream dataInputStream): Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
Methods inherited from class org.datavec.api.records.reader.impl.csv.CSVRecordReader
batchesSupported, next, next, nextRecord, onLocationOpen, parseLine, readStringLine, record
-
Methods inherited from class org.datavec.api.records.reader.impl.LineRecordReader
close, closeIfRequired, getConf, getIterator, getLabels, initialize, resetSupported, setConf
-
Methods inherited from class org.datavec.api.records.reader.BaseRecordReader
getListeners, invokeListeners, setListeners, setListeners
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.datavec.api.conf.Configurable
getConf, setConf
-
Methods inherited from interface org.datavec.api.records.reader.RecordReader
batchesSupported, getLabels, getListeners, initialize, next, next, nextRecord, record, resetSupported, setListeners, setListeners
-
-
-
-
Field Detail
-
LINES_PER_SEQUENCE
public static final String LINES_PER_SEQUENCE
-
-
Constructor Detail
-
CSVVariableSlidingWindowRecordReader
public CSVVariableSlidingWindowRecordReader()
No-arg constructor with the default number of lines per sequence (10)
-
CSVVariableSlidingWindowRecordReader
public CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence)
- Parameters:
maxLinesPerSequence - Number of lines in each sequence; the default delimiter (,) is used between entries in the same line
-
CSVVariableSlidingWindowRecordReader
public CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int stride)
- Parameters:
maxLinesPerSequence - Number of lines in each sequence; the default delimiter (,) is used between entries in the same line
stride - Number of lines between records (use a value > 1 to advance the window by more than one line)
-
CSVVariableSlidingWindowRecordReader
public CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int stride, String delimiter)
- Parameters:
maxLinesPerSequence - Number of lines in each sequence
stride - Number of lines between records (use a value > 1 to advance the window by more than one line)
delimiter - Delimiter between entries in the same line, for example ","
-
CSVVariableSlidingWindowRecordReader
public CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int skipNumLines, int stride, String delimiter)
- Parameters:
maxLinesPerSequence - Number of lines in each sequence
skipNumLines - Number of lines to skip at the start of the file (only skipped once, not per sequence)
stride - Number of lines between records (use a value > 1 to advance the window by more than one line)
delimiter - Delimiter between entries in the same line, for example ","
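The constructor parameters above can be illustrated with a small stand-alone sketch. It uses only the JDK and is not the DataVec implementation: the class name SlidingWindowSketch and the method windows are hypothetical, and the windowing rule (each window starts stride lines after the previous one and holds at most maxLinesPerSequence lines) is an assumed reading of the parameter descriptions. This page does not say how the real reader sizes windows at the start or end of a file, so that behavior is not modeled here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Stand-alone illustration only; this is NOT the DataVec implementation.
final class SlidingWindowSketch {

    /**
     * Splits lines into windows of at most maxLinesPerSequence lines.
     * skipNumLines is applied once at the start; each window begins
     * stride lines after the previous one (assumed semantics).
     */
    static List<List<List<String>>> windows(List<String> lines, int maxLinesPerSequence,
                                            int skipNumLines, int stride, String delimiter) {
        if (maxLinesPerSequence < 1 || stride < 1 || skipNumLines < 0) {
            throw new IllegalArgumentException(
                    "maxLinesPerSequence and stride must be >= 1, skipNumLines >= 0");
        }
        List<List<List<String>>> result = new ArrayList<>();
        String quoted = Pattern.quote(delimiter); // treat the delimiter literally, not as a regex
        for (int start = skipNumLines; start < lines.size(); start += stride) {
            int end = Math.min(start + maxLinesPerSequence, lines.size());
            List<List<String>> window = new ArrayList<>();
            for (String line : lines.subList(start, end)) {
                // One inner list per line; -1 keeps trailing empty entries.
                window.add(Arrays.asList(line.split(quoted, -1)));
            }
            result.add(window);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a,b", "1,2", "3,4", "5,6", "7,8");
        // Skip the header once, windows of up to 2 lines, advancing 2 lines each time.
        System.out.println(windows(lines, 2, 1, 2, ","));
    }
}
```

In this sketch, a stride of 1 makes consecutive windows overlap by all but one line, while a stride equal to maxLinesPerSequence yields non-overlapping windows.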
-
-
Method Detail
-
initialize
public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
- Specified by:
initialize in interface RecordReader
- Overrides:
initialize in class CSVRecordReader
- Parameters:
conf - a configuration for initialization
split - the split that defines the range of records to read
- Throws:
IOException
InterruptedException
-
hasNext
public boolean hasNext()
Description copied from interface: RecordReader
Whether there are any more records
- Specified by:
hasNext in interface RecordReader
- Overrides:
hasNext in class CSVRecordReader
-
sequenceRecord
public List<List<Writable>> sequenceRecord()
Description copied from interface: SequenceRecordReader
Returns a sequence record.
- Specified by:
sequenceRecord in interface SequenceRecordReader
- Returns:
- a sequence of records
-
sequenceRecord
public List<List<Writable>> sequenceRecord(URI uri, DataInputStream dataInputStream) throws IOException
Description copied from interface: SequenceRecordReader
Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
- Specified by:
sequenceRecord in interface SequenceRecordReader
- Throws:
IOException - if an error occurs during reading from the input stream
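The "should not close the DataInputStream" contract can be shown with a JDK-only sketch. This is an illustration of the ownership rule, not DataVec code: StreamContractSketch, readSequence, and CloseTrackingStream are hypothetical names, and the comma delimiter and UTF-8 charset are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-alone illustration of the "do not close the stream" contract; not DataVec code.
final class StreamContractSketch {

    /** Reads every line as one time step, leaving the stream open for the caller. */
    static List<List<String>> readSequence(DataInputStream in) throws IOException {
        // Deliberately no try-with-resources: closing the reader would close 'in' as well.
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        List<List<String>> sequence = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            sequence.add(Arrays.asList(line.split(",", -1)));
        }
        return sequence;
    }

    /** Helper stream that records whether close() was ever called. */
    static final class CloseTrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        CloseTrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    public static void main(String[] args) throws IOException {
        CloseTrackingStream raw =
                new CloseTrackingStream("1,2\n3,4\n".getBytes(StandardCharsets.UTF_8));
        DataInputStream in = new DataInputStream(raw); // the caller owns this stream
        System.out.println(readSequence(in));
        System.out.println("closed by reader? " + raw.closed);
        in.close(); // the caller closes it when done
    }
}
```

Leaving the stream open keeps ownership with the caller, who may want to reuse or close it on its own schedule.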
-
nextSequence
public SequenceRecord nextSequence()
Description copied from interface: SequenceRecordReader
Similar to SequenceRecordReader.sequenceRecord(), but returns a Record object that may include metadata such as the source of the data
- Specified by:
nextSequence in interface SequenceRecordReader
- Returns:
- next sequence record
-
loadSequenceFromMetaData
public SequenceRecord loadSequenceFromMetaData(RecordMetaData recordMetaData) throws IOException
Description copied from interface: SequenceRecordReader
Load a single sequence record from the given RecordMetaData instance
Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List)
- Specified by:
loadSequenceFromMetaData in interface SequenceRecordReader
- Parameters:
recordMetaData - Metadata for the sequence record that we want to load from
- Returns:
- Single sequence record for the given RecordMetaData instance
- Throws:
IOException - If an I/O error occurs during loading
-
loadSequenceFromMetaData
public List<SequenceRecord> loadSequenceFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException
Description copied from interface: SequenceRecordReader
Load multiple sequence records from the given list of RecordMetaData instances
- Specified by:
loadSequenceFromMetaData in interface SequenceRecordReader
- Parameters:
recordMetaDatas - Metadata for the records that we want to load from
- Returns:
- Multiple sequence records for the given RecordMetaData instances
- Throws:
IOException - If an I/O error occurs during loading
-
loadFromMetaData
public Record loadFromMetaData(RecordMetaData recordMetaData)
Description copied from interface: RecordReader
Load a single record from the given RecordMetaData instance
Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List)
- Specified by:
loadFromMetaData in interface RecordReader
- Overrides:
loadFromMetaData in class CSVRecordReader
- Parameters:
recordMetaData - Metadata for the record that we want to load from
- Returns:
- Single record for the given RecordMetaData instance
-
loadFromMetaData
public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas)
Description copied from interface: RecordReader
Load multiple records from the given list of RecordMetaData instances
- Specified by:
loadFromMetaData in interface RecordReader
- Overrides:
loadFromMetaData in class CSVRecordReader
- Parameters:
recordMetaDatas - Metadata for the records that we want to load from
- Returns:
- Multiple records for the given RecordMetaData instances
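The efficiency note on the single-record variant (text data must be scanned, so loading several records at once is cheaper) can be sketched with the JDK alone. The example below is an assumption-laden illustration, not how DataVec resolves RecordMetaData: BatchLoadSketch and loadLines are hypothetical names, and a zero-based line index stands in for the metadata. It fetches all requested lines in one pass instead of rescanning the text once per record.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Stand-alone illustration of batched loading from scan-only text; not DataVec code.
final class BatchLoadSketch {

    /** Returns the requested lines (zero-based indices) in request order, scanning the text once. */
    static List<String> loadLines(String text, List<Integer> wanted) throws IOException {
        TreeSet<Integer> pending = new TreeSet<>(wanted);
        Map<Integer, String> found = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            int index = 0;
            // Stop early once every requested index has been seen.
            while (!pending.isEmpty() && (line = reader.readLine()) != null) {
                if (pending.remove(index)) {
                    found.put(index, line);
                }
                index++;
            }
        }
        List<String> out = new ArrayList<>();
        for (int i : wanted) {
            String line = found.get(i);
            if (line == null) {
                throw new IllegalArgumentException("No line at index " + i);
            }
            out.add(line);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Two records are fetched with a single scan of the four-line text.
        System.out.println(loadLines("a\nb\nc\nd", List.of(3, 1)));
    }
}
```

Loading the same two lines one at a time would scan the text from the beginning twice, which is the cost the note warns about.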
-
reset
public void reset()
Description copied from interface: RecordReader
Reset record reader iterator
- Specified by:
reset in interface RecordReader
- Overrides:
reset in class CSVRecordReader
-
-