Class RegexLineRecordReader
- java.lang.Object
-
- org.datavec.api.records.reader.BaseRecordReader
-
- org.datavec.api.records.reader.impl.LineRecordReader
-
- org.datavec.api.records.reader.impl.regex.RegexLineRecordReader
-
- All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, Configurable, RecordReader
public class RegexLineRecordReader extends LineRecordReader
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type, Field:
- static String SKIP_NUM_LINES
-
Fields inherited from class org.datavec.api.records.reader.impl.LineRecordReader
charset, conf, initialized, lineIndex, locations, splitIndex
-
Fields inherited from class org.datavec.api.records.reader.BaseRecordReader
inputSplit, listeners, streamCreatorFn
-
Fields inherited from interface org.datavec.api.records.reader.RecordReader
APPEND_LABEL, LABELS, NAME_SPACE
-
-
Constructor Summary
Constructors
- RegexLineRecordReader(String regex, int skipNumLines)
-
Method Summary
All Methods / Instance Methods / Concrete Methods
Modifier and Type, Method, Description:
- void initialize(Configuration conf, InputSplit split): Called once at initialization.
- List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas): Load multiple records from the given list of RecordMetaData instances.
- Record loadFromMetaData(RecordMetaData recordMetaData): Load a single record from the given RecordMetaData instance. Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
- List<Writable> next(): Get the next record.
- Record nextRecord(): Similar to RecordReader.next(), but returns a Record object that may include metadata such as the source of the data.
- List<Writable> record(URI uri, DataInputStream dataInputStream): Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
- void reset(): Reset the record reader iterator.
-
Methods inherited from class org.datavec.api.records.reader.impl.LineRecordReader
close, closeIfRequired, getConf, getIterator, getLabels, hasNext, initialize, onLocationOpen, resetSupported, setConf
-
Methods inherited from class org.datavec.api.records.reader.BaseRecordReader
batchesSupported, getListeners, invokeListeners, next, setListeners, setListeners
-
Field Detail
-
SKIP_NUM_LINES
public static final String SKIP_NUM_LINES
-
-
Constructor Detail
-
RegexLineRecordReader
public RegexLineRecordReader(String regex, int skipNumLines)
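The constructor takes a regular expression and a number of leading lines to skip, but this page does not describe how a line becomes a record. The sketch below uses only `java.util.regex` and assumes the usual reading of those parameters: each line after the skipped ones must match the regex in full, and every capture group becomes one field of the record. The class `RegexLineSketch` and the method `parseLines` are hypothetical names for illustration and are not part of DataVec; the real reader returns `List<Writable>` rather than strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexLineSketch {

    /**
     * Hypothetical helper (not a DataVec API): skips the first skipNumLines
     * lines, then splits each remaining line into its regex capture groups.
     */
    public static List<List<String>> parseLines(List<String> lines, String regex, int skipNumLines) {
        Pattern pattern = Pattern.compile(regex);
        List<List<String>> records = new ArrayList<>();
        for (int i = Math.max(0, skipNumLines); i < lines.size(); i++) {
            Matcher m = pattern.matcher(lines.get(i));
            // Assumption: a line that does not match in full is treated as an error.
            if (!m.matches()) {
                throw new IllegalStateException(
                        "Line " + i + " does not match regex: " + lines.get(i));
            }
            // Group 0 is the whole match, so fields start at group 1.
            List<String> fields = new ArrayList<>(m.groupCount());
            for (int g = 1; g <= m.groupCount(); g++) {
                fields.add(m.group(g));
            }
            records.add(fields);
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "timestamp level message",   // header line, skipped
                "2016-01-01 INFO started",
                "2016-01-02 WARN disk-low");
        String regex = "(\\S+) (\\S+) (\\S+)";
        // Prints [[2016-01-01, INFO, started], [2016-01-02, WARN, disk-low]]
        System.out.println(parseLines(lines, regex, 1));
    }
}
```

With the real class, the same regex and skip count would be passed to `new RegexLineRecordReader(regex, 1)`, followed by `initialize` and repeated calls to `next()`.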
-
-
Method Detail
-
initialize
public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException
Description copied from interface: RecordReader
Called once at initialization.
- Specified by:
initialize in interface RecordReader
- Overrides:
initialize in class LineRecordReader
- Parameters:
conf - a configuration for initialization
split - the split that defines the range of records to read
- Throws:
IOException
InterruptedException
-
next
public List<Writable> next()
Description copied from interface: RecordReader
Get the next record.
- Specified by:
next in interface RecordReader
- Overrides:
next in class LineRecordReader
- Returns:
- the next record
-
record
public List<Writable> record(URI uri, DataInputStream dataInputStream) throws IOException
Description copied from interface: RecordReader
Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
- Specified by:
record in interface RecordReader
- Overrides:
record in class LineRecordReader
- Throws:
IOException - if an error occurs during reading from the input stream
-
reset
public void reset()
Description copied from interface: RecordReader
Reset the record reader iterator.
- Specified by:
reset in interface RecordReader
- Overrides:
reset in class LineRecordReader
-
nextRecord
public Record nextRecord()
Description copied from interface: RecordReader
Similar to RecordReader.next(), but returns a Record object that may include metadata such as the source of the data.
- Specified by:
nextRecord in interface RecordReader
- Overrides:
nextRecord in class LineRecordReader
- Returns:
- next record
-
loadFromMetaData
public Record loadFromMetaData(RecordMetaData recordMetaData) throws IOException
Description copied from interface: RecordReader
Load a single record from the given RecordMetaData instance.
Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
- Specified by:
loadFromMetaData in interface RecordReader
- Overrides:
loadFromMetaData in class LineRecordReader
- Parameters:
recordMetaData - Metadata for the record that we want to load from
- Returns:
- Single record for the given RecordMetaData instance
- Throws:
IOException - If an I/O error occurs during loading
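The note about non-splittable data comes down to scan cost: a text source has to be read from the start to reach a given line, so loading N records one at a time can mean N scans, while a batch load can collect all of them in one pass. The sketch below illustrates that idea with plain Java collections. It is not the DataVec implementation; `BatchLoadSketch`, `loadLines`, and the use of integer line indices in place of `RecordMetaData` are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class BatchLoadSketch {

    /**
     * Hypothetical helper (not a DataVec API): fetches all requested line
     * indices in a single sequential pass and returns them in request order.
     */
    public static List<String> loadLines(Iterable<String> source, List<Integer> wanted) {
        TreeSet<Integer> pending = new TreeSet<>(wanted);
        Map<Integer, String> found = new HashMap<>();
        int index = 0;
        for (String line : source) {
            if (pending.isEmpty()) {
                break; // everything requested has been seen; stop scanning early
            }
            if (pending.remove(index)) {
                found.put(index, line);
            }
            index++;
        }
        if (!pending.isEmpty()) {
            throw new IllegalArgumentException("Line indices not found: " + pending);
        }
        // Re-order the results to match the order of the request.
        List<String> out = new ArrayList<>(wanted.size());
        for (int w : wanted) {
            out.add(found.get(w));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d");
        // One pass over the source serves both requests; prints [d, b]
        System.out.println(loadLines(lines, Arrays.asList(3, 1)));
    }
}
```

Calling a single-record loader twice for indices 3 and 1 would scan the source twice, which is the cost the note warns about.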
-
loadFromMetaData
public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException
Description copied from interface: RecordReader
Load multiple records from the given list of RecordMetaData instances.
- Specified by:
loadFromMetaData in interface RecordReader
- Overrides:
loadFromMetaData in class LineRecordReader
- Parameters:
recordMetaDatas - Metadata for the records that we want to load from
- Returns:
- Multiple records for the given RecordMetaData instances
- Throws:
IOException - If an I/O error occurs during loading