com.univocity.parsers.common
Class AbstractParser<T extends CommonParserSettings<?>>

java.lang.Object
  extended by com.univocity.parsers.common.AbstractParser<T>
Type Parameters:
T - The specific parser settings configuration class, which can potentially provide additional configuration options supported by the parser implementation.
Direct Known Subclasses:
CsvParser, FixedWidthParser, TsvParser

public abstract class AbstractParser<T extends CommonParserSettings<?>>
extends Object

The AbstractParser class provides a common ground for all parsers in uniVocity-parsers.

It handles all settings defined by CommonParserSettings and delegates the parsing algorithm implementation to its subclasses through the abstract method parseRecord().

The following (absolutely required) attributes are exposed to subclasses:

Author:
uniVocity Software Pty Ltd - parsers@univocity.com
See Also:
CsvParser, CsvParserSettings, FixedWidthParser, FixedWidthParserSettings, CharInputReader, ParserOutput

Field Summary
protected  char ch
           
protected  Map<Long,String> comments
           
protected  ParsingContext context
           
protected  CharInputReader input
           
protected  String lastComment
           
protected  ParserOutput output
           
protected  Processor processor
           
protected  RecordFactory recordFactory
           
protected  T settings
           
 
Constructor Summary
AbstractParser(T settings)
          All parsers must support, at the very least, the settings provided by CommonParserSettings.
 
Method Summary
 void beginParsing(File file)
          Starts an iterator-style parsing cycle.
 void beginParsing(File file, Charset encoding)
          Starts an iterator-style parsing cycle.
 void beginParsing(File file, String encoding)
          Starts an iterator-style parsing cycle.
 void beginParsing(InputStream input)
          Starts an iterator-style parsing cycle.
 void beginParsing(InputStream input, Charset encoding)
          Starts an iterator-style parsing cycle.
 void beginParsing(InputStream input, String encoding)
          Starts an iterator-style parsing cycle.
 void beginParsing(Reader reader)
          Starts an iterator-style parsing cycle.
protected  boolean consumeValueOnEOF()
          Allows the parser implementation to handle any value that was being consumed when the end of the input was reached.
protected  ParsingContext createParsingContext()
           
 ParsingContext getContext()
          Returns the current parsing context with information about the status of the parser at any given time.
protected  InputAnalysisProcess getInputAnalysisProcess()
          Allows the parser implementation to traverse the input buffer before the parsing process starts, in order to enable automatic configuration and discovery of data formats.
 RecordMetaData getRecordMetadata()
          Returns the metadata associated with Records parsed from the input using parseAllRecords(File) or parseNextRecord().
protected  boolean inComment()
           
protected  void initialize()
           
 void parse(File file)
          Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
 void parse(File file, Charset encoding)
          Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
 void parse(File file, String encoding)
          Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
 void parse(InputStream input)
          Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
 void parse(InputStream input, Charset encoding)
          Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
 void parse(InputStream input, String encoding)
          Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
 void parse(Reader reader)
          Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().
 List<String[]> parseAll(File file)
          Parses all records from a file and returns them in a list.
 List<String[]> parseAll(File file, Charset encoding)
          Parses all records from a file and returns them in a list.
 List<String[]> parseAll(File file, String encoding)
          Parses all records from a file and returns them in a list.
 List<String[]> parseAll(InputStream input)
          Parses all records from an input stream and returns them in a list.
 List<String[]> parseAll(InputStream input, Charset encoding)
          Parses all records from an input stream and returns them in a list.
 List<String[]> parseAll(InputStream input, String encoding)
          Parses all records from an input stream and returns them in a list.
 List<String[]> parseAll(Reader reader)
          Parses all records from the input and returns them in a list.
 List<Record> parseAllRecords(File file)
          Parses all records from a file and returns them in a list.
 List<Record> parseAllRecords(File file, Charset encoding)
          Parses all records from a file and returns them in a list.
 List<Record> parseAllRecords(File file, String encoding)
          Parses all records from a file and returns them in a list.
 List<Record> parseAllRecords(InputStream input)
          Parses all records from an input stream and returns them in a list.
 List<Record> parseAllRecords(InputStream input, Charset encoding)
          Parses all records from an input stream and returns them in a list.
 List<Record> parseAllRecords(InputStream input, String encoding)
          Parses all records from an input stream and returns them in a list.
 List<Record> parseAllRecords(Reader reader)
          Parses all records from the input and returns them in a list.
 String[] parseLine(String line)
          Parses a single line from a String in the format supported by the parser implementation.
 String[] parseNext()
          Parses the next record from the input.
 Record parseNextRecord()
          Parses the next record from the input.
protected abstract  void parseRecord()
          Parser-specific implementation for reading a single record from the input.
 Record parseRecord(String line)
          Parses a single line from a String in the format supported by the parser implementation.
protected  void processComment()
           
protected  void reloadHeaders()
          Reloads headers from settings.
 void stopParsing()
          Stops parsing and closes all open resources.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

settings

protected final T settings

output

protected final ParserOutput output

context

protected ParsingContext context

processor

protected Processor processor

input

protected CharInputReader input

ch

protected char ch

recordFactory

protected RecordFactory recordFactory

comments

protected final Map<Long,String> comments

lastComment

protected String lastComment

Constructor Detail

AbstractParser

public AbstractParser(T settings)
All parsers must support, at the very least, the settings provided by CommonParserSettings. The AbstractParser requires its configuration to be properly initialized.

Parameters:
settings - the parser configuration

Method Detail

processComment

protected void processComment()

parse

public final void parse(Reader reader)
Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().

Parameters:
reader - The input to be parsed.

parseRecord

protected abstract void parseRecord()
Parser-specific implementation for reading a single record from the input.

The AbstractParser handles the initialization and processing of the input until it is ready to be parsed.

It then delegates the input to the parser-specific implementation defined by parseRecord(). In general, an implementation of parseRecord() will perform the following steps:

Once parseRecord() returns, the AbstractParser takes over and handles the parsed information (generally, reorganizing it and passing it on to a RowProcessor).

After the record is processed, the AbstractParser reads the next characters from the input and delegates control back to the parseRecord() implementation to process the next record.

This cycle repeats until the reading process is stopped by the user, the input is exhausted, or an error happens.

In case of errors, the unchecked exception TextParsingException will be thrown and all resources in use will be closed automatically. The exception should contain the cause and more information about where in the input the error happened.
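
The cycle described above follows the template-method pattern. The following is a minimal, self-contained sketch of that pattern only; it is not the library's code. The classes SketchParser, CommaSketchParser and ParseCycleDemo are hypothetical, the sketch assumes one record per line, and IllegalStateException merely stands in for TextParsingException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the base parser: it owns the read loop,
// error handling and resource cleanup, and delegates record parsing.
abstract class SketchParser {
    // Current raw record (assumption of this sketch: one record per line).
    protected String line;
    // Values collected by the subclass for the current record.
    protected final List<String> values = new ArrayList<>();

    // Subclass hook, playing the role of the abstract parseRecord().
    protected abstract void parseRecord();

    // Drives the cycle: read, delegate, collect, repeat until the input is exhausted.
    public final List<String[]> parseAll(Reader reader) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(reader)) {
            while ((line = in.readLine()) != null) {
                values.clear();
                parseRecord();
                rows.add(values.toArray(new String[0]));
            }
        } catch (IOException | RuntimeException e) {
            // Errors surface as a single unchecked exception; by the time this
            // handler runs, try-with-resources has already closed the reader.
            throw new IllegalStateException("Error parsing input: " + e.getMessage(), e);
        }
        return rows;
    }
}

// Hypothetical format-specific subclass: splits each record on commas.
final class CommaSketchParser extends SketchParser {
    @Override
    protected void parseRecord() {
        for (String value : line.split(",", -1)) {
            values.add(value);
        }
    }
}

final class ParseCycleDemo {
    public static void main(String[] args) {
        List<String[]> rows = new CommaSketchParser().parseAll(new StringReader("a,b\n1,2\n"));
        for (String[] row : rows) {
            System.out.println(String.join("|", row));
        }
    }
}
```

In this sketch the base class never needs to know the record format, and the subclass never touches the input lifecycle, which is the division of responsibilities the paragraphs above describe.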

See Also:
CharInputReader, CharAppender, ParserOutput, TextParsingException, RowProcessor

consumeValueOnEOF

protected boolean consumeValueOnEOF()
Allows the parser implementation to handle any value that was being consumed when the end of the input was reached.

Returns:
a flag indicating whether the parser was processing a value when the end of the input was reached.

beginParsing

public final void beginParsing(Reader reader)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing. Parsed records must then be read one at a time by calling parseNext(). The user may invoke stopParsing() to stop reading from the input.

Parameters:
reader - The input to be parsed.

createParsingContext

protected ParsingContext createParsingContext()

initialize

protected void initialize()

getInputAnalysisProcess

protected InputAnalysisProcess getInputAnalysisProcess()
Allows the parser implementation to traverse the input buffer before the parsing process starts, in order to enable automatic configuration and discovery of data formats.

Returns:
a custom implementation of InputAnalysisProcess. By default, null is returned and no special input analysis will be performed.

stopParsing

public final void stopParsing()
Stops parsing and closes all open resources.


parseAll

public final List<String[]> parseAll(Reader reader)
Parses all records from the input and returns them in a list.

Parameters:
reader - the input to be parsed
Returns:
the list of all records parsed from the input.

inComment

protected boolean inComment()

parseNext

public final String[] parseNext()
Parses the next record from the input. Note that beginParsing(Reader) must have been invoked once before calling this method. If the end of the input is reached, then this method will return null. Additionally, all resources will be closed automatically at the end of the input or if any error happens while parsing.

Returns:
The record parsed from the input, or null if there are no more characters to read.
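
The iterator-style contract (begin, read until null, optionally stop early) can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the library's implementation: SketchRowCursor, IteratorStyleDemo and isOpen() are hypothetical, rows are split on commas, and returning null when no cycle is active is a choice made for the sketch because this page does not state what the real parser does in that case.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in that mimics the documented contract.
final class SketchRowCursor {
    private BufferedReader in; // null while no parsing cycle is active

    public void beginParsing(Reader reader) {
        stopParsing(); // release the input of any cycle still in progress
        in = new BufferedReader(reader);
    }

    // Returns the next row, or null once the input is exhausted.
    public String[] parseNext() {
        if (in == null) {
            return null; // sketch-only choice, see the note above
        }
        try {
            String line = in.readLine();
            if (line == null) {
                stopParsing(); // end of input: resources are closed automatically
                return null;
            }
            return line.split(",", -1);
        } catch (IOException e) {
            stopParsing(); // errors also release the input
            throw new UncheckedIOException(e);
        }
    }

    public void stopParsing() {
        if (in != null) {
            try {
                in.close();
            } catch (IOException ignored) {
                // nothing useful to do when closing fails in this sketch
            } finally {
                in = null;
            }
        }
    }

    public boolean isOpen() {
        return in != null;
    }
}

final class IteratorStyleDemo {
    // Reads rows until `limit` rows were seen or the input ends, then stops.
    static int countRows(String text, int limit) {
        SketchRowCursor cursor = new SketchRowCursor();
        cursor.beginParsing(new StringReader(text));
        int count = 0;
        while (count < limit && cursor.parseNext() != null) {
            count++;
        }
        cursor.stopParsing(); // harmless if the cursor already closed itself
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countRows("a,b\n1,2\n3,4\n", 2));   // stops early
        System.out.println(countRows("a,b\n1,2\n3,4\n", 100)); // runs to the end
    }
}
```

The calling pattern is the part that carries over: loop while the returned row is not null, and call stopParsing() only when abandoning the input before its end.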

reloadHeaders

protected final void reloadHeaders()
Reloads headers from settings.


parseRecord

public final Record parseRecord(String line)
Parses a single line from a String in the format supported by the parser implementation.

Parameters:
line - a line of text to be parsed
Returns:
the Record containing the values parsed from the input line

parseLine

public final String[] parseLine(String line)
Parses a single line from a String in the format supported by the parser implementation.

Parameters:
line - a line of text to be parsed
Returns:
the values parsed from the input line

parse

public final void parse(File file)
Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().

Parameters:
file - The file to be parsed.

parse

public final void parse(File file,
                        String encoding)
Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().

Parameters:
file - The file to be parsed.
encoding - the encoding of the file

parse

public final void parse(File file,
                        Charset encoding)
Parses the entirety of a given file and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().

Parameters:
file - The file to be parsed.
encoding - the encoding of the file

parse

public final void parse(InputStream input)
Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().

Parameters:
input - The input to be parsed. The input stream will be closed automatically.

parse

public final void parse(InputStream input,
                        String encoding)
Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().

Parameters:
input - The input to be parsed. The input stream will be closed automatically.
encoding - the encoding of the input stream

parse

public final void parse(InputStream input,
                        Charset encoding)
Parses the entirety of a given input and delegates each parsed row to an instance of RowProcessor, defined by CommonParserSettings.getRowProcessor().

Parameters:
input - The input to be parsed. The input stream will be closed automatically.
encoding - the encoding of the input stream
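
Why the encoding argument matters can be shown with the standard library alone. The sketch below assumes only that a byte stream has to be decoded with some charset before it can be parsed as characters; how the parser performs that decoding internally is not stated on this page. EncodingSketch and decode are hypothetical names used for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingSketch {
    // Decodes the whole stream with the given charset and closes it afterwards.
    static String decode(InputStream input, Charset encoding) {
        try (Reader reader = new InputStreamReader(input, encoding)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9,1";
        byte[] bytes = original.getBytes(StandardCharsets.UTF_8);

        // Same bytes, two charsets: only the matching one recovers the original text.
        String right = decode(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        String wrong = decode(new ByteArrayInputStream(bytes), StandardCharsets.ISO_8859_1);
        System.out.println(right.equals(original)); // true
        System.out.println(wrong.equals(original)); // false

        // Looking a charset up by name yields a Charset equal to the constant.
        System.out.println(Charset.forName("UTF-8").equals(StandardCharsets.UTF_8)); // true
    }
}
```

Passing a Charset constant avoids typos in charset names; the String overloads are convenient when the encoding name comes from configuration.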

beginParsing

public final void beginParsing(File file)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing. Parsed records must then be read one at a time by calling parseNext(). The user may invoke stopParsing() to stop reading from the input.

Parameters:
file - The file to be parsed.

beginParsing

public final void beginParsing(File file,
                               String encoding)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing. Parsed records must then be read one at a time by calling parseNext(). The user may invoke stopParsing() to stop reading from the input.

Parameters:
file - The file to be parsed.
encoding - the encoding of the file

beginParsing

public final void beginParsing(File file,
                               Charset encoding)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing. Parsed records must then be read one at a time by calling parseNext(). The user may invoke stopParsing() to stop reading from the input.

Parameters:
file - The file to be parsed.
encoding - the encoding of the file

beginParsing

public final void beginParsing(InputStream input)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing. Parsed records must then be read one at a time by calling parseNext(). The user may invoke stopParsing() to stop reading from the input.

Parameters:
input - The input to be parsed. The input stream will be closed automatically in case of errors.

beginParsing

public final void beginParsing(InputStream input,
                               String encoding)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing. Parsed records must then be read one at a time by calling parseNext(). The user may invoke stopParsing() to stop reading from the input.

Parameters:
input - The input to be parsed. The input stream will be closed automatically in case of errors.
encoding - the encoding of the input stream

beginParsing

public final void beginParsing(InputStream input,
                               Charset encoding)
Starts an iterator-style parsing cycle. If a RowProcessor is provided in the configuration, it will be used to perform additional processing. Parsed records must then be read one at a time by calling parseNext(). The user may invoke stopParsing() to stop reading from the input.

Parameters:
input - The input to be parsed. The input stream will be closed automatically in case of errors.
encoding - the encoding of the input stream

parseAll

public final List<String[]> parseAll(File file)
Parses all records from a file and returns them in a list.

Parameters:
file - the input file to be parsed
Returns:
the list of all records parsed from the file.

parseAll

public final List<String[]> parseAll(File file,
                                     String encoding)
Parses all records from a file and returns them in a list.

Parameters:
file - the input file to be parsed
encoding - the encoding of the file
Returns:
the list of all records parsed from the file.

parseAll

public final List<String[]> parseAll(File file,
                                     Charset encoding)
Parses all records from a file and returns them in a list.

Parameters:
file - the input file to be parsed
encoding - the encoding of the file
Returns:
the list of all records parsed from the file.

parseAll

public final List<String[]> parseAll(InputStream input)
Parses all records from an input stream and returns them in a list.

Parameters:
input - the input stream to be parsed. The input stream will be closed automatically.
Returns:
the list of all records parsed from the input.

parseAll

public final List<String[]> parseAll(InputStream input,
                                     String encoding)
Parses all records from an input stream and returns them in a list.

Parameters:
input - the input stream to be parsed. The input stream will be closed automatically.
encoding - the encoding of the input stream
Returns:
the list of all records parsed from the input.

parseAll

public final List<String[]> parseAll(InputStream input,
                                     Charset encoding)
Parses all records from an input stream and returns them in a list.

Parameters:
input - the input stream to be parsed. The input stream will be closed automatically.
encoding - the encoding of the input stream
Returns:
the list of all records parsed from the input.

parseAllRecords

public final List<Record> parseAllRecords(File file)
Parses all records from a file and returns them in a list.

Parameters:
file - the input file to be parsed
Returns:
the list of all records parsed from the file.

parseAllRecords

public final List<Record> parseAllRecords(File file,
                                          String encoding)
Parses all records from a file and returns them in a list.

Parameters:
file - the input file to be parsed
encoding - the encoding of the file
Returns:
the list of all records parsed from the file.

parseAllRecords

public final List<Record> parseAllRecords(File file,
                                          Charset encoding)
Parses all records from a file and returns them in a list.

Parameters:
file - the input file to be parsed
encoding - the encoding of the file
Returns:
the list of all records parsed from the file.

parseAllRecords

public final List<Record> parseAllRecords(InputStream input)
Parses all records from an input stream and returns them in a list.

Parameters:
input - the input stream to be parsed. The input stream will be closed automatically.
Returns:
the list of all records parsed from the input.

parseAllRecords

public final List<Record> parseAllRecords(InputStream input,
                                          String encoding)
Parses all records from an input stream and returns them in a list.

Parameters:
input - the input stream to be parsed. The input stream will be closed automatically.
encoding - the encoding of the input stream
Returns:
the list of all records parsed from the input.

parseAllRecords

public final List<Record> parseAllRecords(InputStream input,
                                          Charset encoding)
Parses all records from an input stream and returns them in a list.

Parameters:
input - the input stream to be parsed. The input stream will be closed automatically.
encoding - the encoding of the input stream
Returns:
the list of all records parsed from the input.

parseAllRecords

public final List<Record> parseAllRecords(Reader reader)
Parses all records from the input and returns them in a list.

Parameters:
reader - the input to be parsed
Returns:
the list of all records parsed from the input.

parseNextRecord

public final Record parseNextRecord()
Parses the next record from the input. Note that beginParsing(Reader) must have been invoked once before calling this method. If the end of the input is reached, then this method will return null. Additionally, all resources will be closed automatically at the end of the input or if any error happens while parsing.

Returns:
The record parsed from the input, or null if there are no more characters to read.

getContext

public final ParsingContext getContext()
Returns the current parsing context with information about the status of the parser at any given time.

Returns:
the parsing context

getRecordMetadata

public final RecordMetaData getRecordMetadata()
Returns the metadata associated with Records parsed from the input using parseAllRecords(File) or parseNextRecord().

Returns:
the metadata of Records generated with the current input.


Copyright © 2016 uniVocity Software Pty Ltd. All rights reserved.