Class CSVRecordReader
- java.lang.Object
-
- org.datavec.api.records.reader.BaseRecordReader
-
- org.datavec.api.records.reader.impl.LineRecordReader
-
- org.datavec.api.records.reader.impl.csv.CSVRecordReader
-
- All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, Configurable, RecordReader
- Direct Known Subclasses:
CSVLineSequenceRecordReader, CSVMultiSequenceRecordReader, CSVNLinesSequenceRecordReader, CSVRegexRecordReader, CSVVariableSlidingWindowRecordReader
public class CSVRecordReader extends LineRecordReader
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type, Field
static char DEFAULT_DELIMITER
static char DEFAULT_QUOTE
static String DELIMITER
static String QUOTE
static String SKIP_NUM_LINES
protected int skipNumLines
Fields inherited from class org.datavec.api.records.reader.impl.LineRecordReader
charset, conf, initialized, lineIndex, locations, splitIndex
-
Fields inherited from class org.datavec.api.records.reader.BaseRecordReader
inputSplit, listeners, streamCreatorFn
-
Fields inherited from interface org.datavec.api.records.reader.RecordReader
APPEND_LABEL, LABELS, NAME_SPACE
-
-
Constructor Summary
Constructors
Constructor: Description
CSVRecordReader()
CSVRecordReader(char delimiter): Create a CSVRecordReader with the specified delimiter
CSVRecordReader(int skipNumLines): Skip first n lines
CSVRecordReader(int skipNumLines, char delimiter): Skip lines and use delimiter
CSVRecordReader(int skipNumLines, char delimiter, char quote): Skip lines, use delimiter, and strip quotes
CSVRecordReader(int skipNumLines, String delimiter): Deprecated. This constructor is deprecated; use CSVRecordReader(int, char) or CSVRecordReader(int, char, char)
CSVRecordReader(int skipNumLines, String delimiter, String quote): Deprecated. This constructor is deprecated; use CSVRecordReader(int, char) or CSVRecordReader(int, char, char)
-
Method Summary
All Methods, Instance Methods, Concrete Methods
Modifier and Type, Method: Description
boolean batchesSupported(): Returns true if the next(int) signature is supported by this RecordReader implementation.
boolean hasNext(): Whether there are any more records
void initialize(Configuration conf, InputSplit split): Called once at initialization.
List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas): Load multiple records from the given list of RecordMetaData instances
Record loadFromMetaData(RecordMetaData recordMetaData): Load a single record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List)
List<Writable> next(): Get the next record
List<List<Writable>> next(int num): This method is used if batchesSupported() returns true.
Record nextRecord(): Similar to RecordReader.next(), but returns a Record object that may include metadata such as the source of the data
protected void onLocationOpen(URI location)
protected List<Writable> parseLine(String line)
protected String readStringLine()
List<Writable> record(URI uri, DataInputStream dataInputStream): Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream
void reset(): Reset record reader iterator
Methods inherited from class org.datavec.api.records.reader.impl.LineRecordReader
close, closeIfRequired, getConf, getIterator, getLabels, initialize, resetSupported, setConf
-
Methods inherited from class org.datavec.api.records.reader.BaseRecordReader
getListeners, invokeListeners, setListeners, setListeners
-
Field Detail
-
skipNumLines
protected int skipNumLines
-
DEFAULT_DELIMITER
public static final char DEFAULT_DELIMITER
- See Also:
- Constant Field Values
-
DEFAULT_QUOTE
public static final char DEFAULT_QUOTE
- See Also:
- Constant Field Values
-
SKIP_NUM_LINES
public static final String SKIP_NUM_LINES
-
DELIMITER
public static final String DELIMITER
-
QUOTE
public static final String QUOTE
-
-
Constructor Detail
-
CSVRecordReader
public CSVRecordReader(int skipNumLines)
Skip first n lines
Parameters:
skipNumLines - the number of lines to skip
-
CSVRecordReader
public CSVRecordReader(char delimiter)
Create a CSVRecordReader with the specified delimiter
Parameters:
delimiter - Delimiter character for CSV
-
CSVRecordReader
public CSVRecordReader(int skipNumLines, char delimiter)
Skip lines and use delimiter
Parameters:
skipNumLines - the number of lines to skip
delimiter - the delimiter
-
CSVRecordReader
@Deprecated public CSVRecordReader(int skipNumLines, String delimiter)
Deprecated. This constructor is deprecated; use CSVRecordReader(int, char) or CSVRecordReader(int, char, char)
Parameters:
skipNumLines - Number of lines to skip
delimiter - Delimiter to use
-
CSVRecordReader
public CSVRecordReader(int skipNumLines, char delimiter, char quote)
Skip lines, use delimiter, and strip quotes
Parameters:
skipNumLines - the number of lines to skip
delimiter - the delimiter
quote - the quote to strip
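As a rough illustration of what skipNumLines, delimiter, and quote control, the following standalone sketch uses only the Java standard library. It is not DataVec's parser: CsvLineSketch, splitLine, and readAll are hypothetical names, fields are returned as plain strings rather than Writable values, and escaped quotes are not handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-in for the skip/delimiter/quote behaviour.
// This is an assumption-based sketch, not the library's implementation.
public class CsvLineSketch {

    /** Splits a line on the delimiter, ignoring delimiters inside quoted spans and dropping quote characters. */
    public static List<String> splitLine(String line, char delimiter, char quote) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                inQuotes = !inQuotes;            // quote characters are stripped, not copied
            } else if (c == delimiter && !inQuotes) {
                fields.add(current.toString());  // an unquoted delimiter ends the field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());          // last field has no trailing delimiter
        return fields;
    }

    /** Skips the first skipNumLines lines (for example a header), then splits the remaining lines. */
    public static List<List<String>> readAll(List<String> lines, int skipNumLines, char delimiter, char quote) {
        List<List<String>> records = new ArrayList<>();
        for (int i = skipNumLines; i < lines.size(); i++) {
            records.add(splitLine(lines.get(i), delimiter, quote));
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("id;name", "1;'Smith; J.'", "2;Lee");
        // Skip one header line, split on ';', strip single quotes.
        System.out.println(readAll(lines, 1, ';', '\''));
    }
}
```

In this sketch the quoted field keeps its embedded delimiter, which is the usual reason to pass a quote character alongside the delimiter.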
-
CSVRecordReader
@Deprecated public CSVRecordReader(int skipNumLines, String delimiter, String quote)
Deprecated. This constructor is deprecated; use CSVRecordReader(int, char) or CSVRecordReader(int, char, char)
Skip lines, use delimiter, and strip quotes
Parameters:
skipNumLines - the number of lines to skip
delimiter - the delimiter
quote - the quote to strip
-
CSVRecordReader
public CSVRecordReader()
-
-
Method Detail
-
initialize
public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
Specified by:
initialize in interface RecordReader
Overrides:
initialize in class LineRecordReader
Parameters:
conf - a configuration for initialization
split - the split that defines the range of records to read
Throws:
IOException
InterruptedException
-
batchesSupported
public boolean batchesSupported()
Description copied from interface: RecordReader
This method returns true if the next(int) signature is supported by this RecordReader implementation.
Specified by:
batchesSupported in interface RecordReader
Overrides:
batchesSupported in class BaseRecordReader
Returns:
true if next(int) is supported by this implementation
-
hasNext
public boolean hasNext()
Description copied from interface: RecordReader
Whether there are any more records
Specified by:
hasNext in interface RecordReader
Overrides:
hasNext in class LineRecordReader
Returns:
true if there are more records to read
-
next
public List<List<Writable>> next(int num)
Description copied from interface: RecordReader
This method is used if batchesSupported() returns true.
Specified by:
next in interface RecordReader
Overrides:
next in class BaseRecordReader
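A batch read of this kind can be pictured as repeatedly pulling single records until either num records have been collected or the input is exhausted. The sketch below is a standard-library illustration under that assumption; BatchReadSketch and nextBatch are hypothetical names, and the page above does not state how this class handles a final, short batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of batch reading on top of a line iterator.
// Assumption: a batch holds at most num records and may be shorter at end of input.
public class BatchReadSketch {

    public static List<List<String>> nextBatch(Iterator<String> lines, int num, String delimiter) {
        List<List<String>> batch = new ArrayList<>();
        String splitRegex = Pattern.quote(delimiter);   // treat the delimiter literally, not as a regex
        while (batch.size() < num && lines.hasNext()) {
            // limit -1 keeps trailing empty fields such as "a,b,"
            batch.add(Arrays.asList(lines.next().split(splitRegex, -1)));
        }
        return batch;
    }

    public static void main(String[] args) {
        Iterator<String> lines = Arrays.asList("1,2", "3,4", "5,6").iterator();
        System.out.println(nextBatch(lines, 2, ","));   // first batch: two records
        System.out.println(nextBatch(lines, 2, ","));   // second batch: the one remaining record
    }
}
```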
-
next
public List<Writable> next()
Description copied from interface: RecordReader
Get the next record
Specified by:
next in interface RecordReader
Overrides:
next in class LineRecordReader
Returns:
the next record
-
readStringLine
protected String readStringLine()
-
nextRecord
public Record nextRecord()
Description copied from interface: RecordReader
Similar to RecordReader.next(), but returns a Record object that may include metadata such as the source of the data
Specified by:
nextRecord in interface RecordReader
Overrides:
nextRecord in class LineRecordReader
Returns:
next record
-
loadFromMetaData
public Record loadFromMetaData(RecordMetaData recordMetaData) throws IOException
Description copied from interface: RecordReader
Load a single record from the given RecordMetaData instance
Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List)
Specified by:
loadFromMetaData in interface RecordReader
Overrides:
loadFromMetaData in class LineRecordReader
Parameters:
recordMetaData - Metadata for the record that we want to load from
Returns:
Single record for the given RecordMetaData instance
Throws:
IOException - If an I/O error occurs during loading
-
loadFromMetaData
public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException
Description copied from interface: RecordReader
Load multiple records from the given list of RecordMetaData instances
Specified by:
loadFromMetaData in interface RecordReader
Overrides:
loadFromMetaData in class LineRecordReader
Parameters:
recordMetaDatas - Metadata for the records that we want to load from
Returns:
Multiple records for the given RecordMetaData instances
Throws:
IOException - If an I/O error occurs during loading
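The efficiency note on loading several records at once can be made concrete with a small standard-library sketch. Text that must be scanned line by line cannot be addressed directly by record, so fetching several line indices in one pass avoids rescanning from the start for each one. MetaDataLoadSketch and loadLines are hypothetical names, and identifying records by line index is an assumption made for illustration only.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: fetch several lines by index in a single scan.
// Assumption: each requested record is identified by its zero-based line index.
public class MetaDataLoadSketch {

    public static Map<Integer, String> loadLines(Iterator<String> source, Collection<Integer> lineIndices) {
        TreeSet<Integer> wanted = new TreeSet<>(lineIndices);  // de-duplicated set of requested indices
        Map<Integer, String> found = new TreeMap<>();
        int index = 0;
        // Stop as soon as every requested index has been seen.
        while (source.hasNext() && !wanted.isEmpty()) {
            String line = source.next();
            if (wanted.remove(index)) {
                found.put(index, line);
            }
            index++;
        }
        return found;
    }

    public static void main(String[] args) {
        Iterator<String> source = Arrays.asList("a,1", "b,2", "c,3", "d,4").iterator();
        // Two records are fetched with one pass over the input.
        System.out.println(loadLines(source, Arrays.asList(3, 1)));
    }
}
```

Loading the same two lines one at a time would scan the input twice from the beginning, which is the cost the note refers to.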
-
record
public List<Writable> record(URI uri, DataInputStream dataInputStream) throws IOException
Description copied from interface: RecordReader
Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
Specified by:
record in interface RecordReader
Overrides:
record in class LineRecordReader
Throws:
IOException - if an error occurs during reading from the input stream
-
reset
public void reset()
Description copied from interface: RecordReader
Reset record reader iterator
Specified by:
reset in interface RecordReader
Overrides:
reset in class LineRecordReader
-
onLocationOpen
protected void onLocationOpen(URI location)
- Overrides:
onLocationOpen in class LineRecordReader
-
-