com.univocity.parsers.common.input
Class AbstractCharInputReader

java.lang.Object
  extended by com.univocity.parsers.common.input.AbstractCharInputReader
All Implemented Interfaces:
CharInputReader
Direct Known Subclasses:
ConcurrentCharInputReader, DefaultCharInputReader

public abstract class AbstractCharInputReader
extends Object
implements CharInputReader

The base class for implementing different flavours of CharInputReader.

It provides the essential conversion of sequences of newline characters defined by Format.getLineSeparator() into the normalized newline character provided in Format.getNormalizedNewline().

It also provides a default implementation for most of the methods specified by the CharInputReader interface.

Extending classes must essentially read characters from a given Reader and assign them to the public buffer when requested (in the reloadBuffer() method).

Author:
uniVocity Software Pty Ltd - parsers@univocity.com
See Also:
Format, DefaultCharInputReader, ConcurrentCharInputReader

Field Summary
 char[] buffer
          The buffer itself
 int i
          Current position in the buffer
 int length
          Number of characters available in the buffer.
 
Constructor Summary
AbstractCharInputReader(char normalizedLineSeparator)
          Creates a new instance that attempts to detect the newlines used in the input automatically.
AbstractCharInputReader(char[] lineSeparator, char normalizedLineSeparator)
          Creates a new instance with the mandatory characters for handling newlines transparently.
 
Method Summary
 void addInputAnalysisProcess(InputAnalysisProcess inputAnalysisProcess)
          Submits a custom InputAnalysisProcess to analyze the input buffer and potentially discover configuration options such as column separators in CSV, data formats, etc.
 long charCount()
          Returns the number of characters returned by CharInputReader.nextChar() at any given time.
 String currentParsedContent()
          Returns a String with the input character sequence parsed to produce the current record.
 void enableNormalizeLineEndings(boolean normalizeLineEndings)
          Indicates to the input reader that the parser is running in "escape" mode and new lines should be returned as-is to prevent modifying the content of the parsed value.
 char getChar()
          Returns the last character returned by the CharInputReader.nextChar() method.
 char[] getLineSeparator()
          Returns the line separator used by this character input reader.
 String getString(char ch, char stop, boolean trim, String nullValue)
          Attempts to collect a String from the current position until a stop character is found on the input, or a line ending is reached.
 long lineCount()
          Returns the number of newlines read so far.
 void markRecordStart()
          Marks the start of a new record in the input, used internally to calculate the result of CharInputReader.currentParsedContent().
 char nextChar()
          Returns the next character in the input provided by the active Reader.
 String readComment()
          Collects the comment line found on the input.
protected abstract  void reloadBuffer()
          Informs the extending class that the buffer has been read entirely and requests another batch of characters.
protected abstract  void setReader(Reader reader)
          Passes the Reader provided in the start(Reader) method to the extending class so it can begin loading characters from it.
 void skipLines(long lines)
          Skips characters in the input until the given number of lines is discarded.
 char skipWhitespace(char ch, char stopChar1, char stopChar2)
          Skips characters from the current input position until a non-whitespace character or a stop character is found.
 void start(Reader reader)
          Initializes the CharInputReader implementation with a Reader which provides access to the input.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.univocity.parsers.common.input.CharInputReader
stop
 

Field Detail

i

public int i
Current position in the buffer


buffer

public char[] buffer
The buffer itself


length

public int length
Number of characters available in the buffer.

Constructor Detail

AbstractCharInputReader

public AbstractCharInputReader(char normalizedLineSeparator)
Creates a new instance that attempts to detect the newlines used in the input automatically.

Parameters:
normalizedLineSeparator - the normalized newline character (as defined in Format.getNormalizedNewline()) that is used to replace any lineSeparator sequence found in the input.

AbstractCharInputReader

public AbstractCharInputReader(char[] lineSeparator,
                               char normalizedLineSeparator)
Creates a new instance with the mandatory characters for handling newlines transparently.

Parameters:
lineSeparator - the sequence of characters that represent a newline, as defined in Format.getLineSeparator()
normalizedLineSeparator - the normalized newline character (as defined in Format.getNormalizedNewline()) that is used to replace any lineSeparator sequence found in the input.
Method Detail

setReader

protected abstract void setReader(Reader reader)
Passes the Reader provided in the start(Reader) method to the extending class so it can begin loading characters from it.

Parameters:
reader - the Reader provided in start(Reader)

reloadBuffer

protected abstract void reloadBuffer()
Informs the extending class that the buffer has been read entirely and requests another batch of characters. Implementors must assign the new character buffer to the public buffer attribute, and the number of characters available to the public length attribute. To signal that the input has no more characters, length must be set to -1.
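
The contract above can be illustrated with a minimal, self-contained sketch. ReloadBufferSketch is a hypothetical stand-in that only mirrors the documented field names (buffer, length, i) and method names. It does not extend the library class, and resetting i inside nextChar() is an assumption about how a base class might consume the refilled buffer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for the reloadBuffer() contract; not the library's class.
class ReloadBufferSketch {
    char[] buffer = new char[4]; // tiny on purpose, to force several reloads
    int length = 0;              // characters available in buffer; -1 means end of input
    int i = 0;                   // current position in buffer
    private Reader reader;

    void setReader(Reader reader) {
        this.reader = reader;
    }

    // Refills the buffer and publishes the number of characters read.
    void reloadBuffer() {
        try {
            // Reader.read returns -1 at the end of the stream,
            // which matches the "no more characters" signal described above.
            length = reader.read(buffer, 0, buffer.length);
        } catch (IOException e) {
            throw new IllegalStateException("Error reading from input", e);
        }
    }

    // Returns the next character, or '\0' once the input is exhausted.
    char nextChar() {
        if (length == -1) {
            return '\0';
        }
        if (i >= length) {
            reloadBuffer();
            i = 0; // assumption: the consumer restarts at the beginning of the new batch
            if (length == -1) {
                return '\0';
            }
        }
        return buffer[i++];
    }

    // Convenience helper for the sketch: drains the whole input into a String.
    static String readAll(String input) {
        ReloadBufferSketch in = new ReloadBufferSketch();
        in.setReader(new StringReader(input));
        StringBuilder out = new StringBuilder();
        for (char ch = in.nextChar(); ch != '\0'; ch = in.nextChar()) {
            out.append(ch);
        }
        return out.toString();
    }
}
```

With a four-character buffer, ReloadBufferSketch.readAll("hello world") triggers several reloads before length becomes -1.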


start

public final void start(Reader reader)
Description copied from interface: CharInputReader
Initializes the CharInputReader implementation with a Reader which provides access to the input.

Specified by:
start in interface CharInputReader
Parameters:
reader - A Reader that provides access to the input.

addInputAnalysisProcess

public final void addInputAnalysisProcess(InputAnalysisProcess inputAnalysisProcess)
Submits a custom InputAnalysisProcess to analyze the input buffer and potentially discover configuration options such as column separators in CSV, data formats, etc. The process will be executed only once.

Parameters:
inputAnalysisProcess - a custom process to analyze the contents of the input buffer.

nextChar

public final char nextChar()
Description copied from interface: CharInputReader
Returns the next character in the input provided by the active Reader.

If the input contains a sequence of newline characters (defined by Format.getLineSeparator()), this method will automatically convert them to the newline character specified in Format.getNormalizedNewline().

A subsequent call to this method will return the character after the newline sequence.

Specified by:
nextChar in interface CharInputReader
Returns:
the next character in the input. '\0' if there are no more characters in the input or if the CharInputReader was stopped.
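
As a rough illustration of the normalization described above, the following self-contained sketch collapses a one- or two-character line separator into a single normalized character and counts the newlines it sees. NewlineNormalizationSketch and its normalize helper are hypothetical names used only here. The sketch reads from a String rather than a Reader and is not the library's implementation.

```java
// Hypothetical illustration of newline normalization; not the library's code.
class NewlineNormalizationSketch {
    private final String input;
    private final char[] lineSeparator;         // one or two characters, e.g. {'\r', '\n'}
    private final char normalizedLineSeparator; // e.g. '\n'
    private int position = 0;
    private long lineCount = 0;

    NewlineNormalizationSketch(String input, char[] lineSeparator, char normalizedLineSeparator) {
        this.input = input;
        this.lineSeparator = lineSeparator;
        this.normalizedLineSeparator = normalizedLineSeparator;
    }

    // Returns the next character, replacing a full line separator sequence
    // with the normalized newline. Returns '\0' at the end of the input.
    char nextChar() {
        if (position >= input.length()) {
            return '\0';
        }
        char ch = input.charAt(position++);
        if (ch == lineSeparator[0]) {
            boolean single = lineSeparator.length == 1;
            if (single || (position < input.length() && input.charAt(position) == lineSeparator[1])) {
                if (!single) {
                    position++; // consume the second character of the sequence
                }
                lineCount++;
                return normalizedLineSeparator;
            }
        }
        return ch; // includes a lone first separator character, returned unchanged
    }

    long lineCount() {
        return lineCount;
    }

    // Convenience helper for the sketch: normalizes a whole String.
    static String normalize(String input, char[] lineSeparator, char normalized) {
        NewlineNormalizationSketch in = new NewlineNormalizationSketch(input, lineSeparator, normalized);
        StringBuilder out = new StringBuilder();
        for (char ch = in.nextChar(); ch != '\0'; ch = in.nextChar()) {
            out.append(ch);
        }
        return out.toString();
    }
}
```

In this sketch the call after a converted sequence returns the character that follows the whole sequence, which is the behaviour the description above asks for.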

getChar

public final char getChar()
Description copied from interface: CharInputReader
Returns the last character returned by the CharInputReader.nextChar() method.

Specified by:
getChar in interface CharInputReader
Returns:
the last character returned by the CharInputReader.nextChar() method. '\0' if there are no more characters in the input or if the CharInputReader was stopped.

lineCount

public final long lineCount()
Description copied from interface: CharInputReader
Returns the number of newlines read so far.

Specified by:
lineCount in interface CharInputReader
Returns:
the number of newlines read so far.

skipLines

public final void skipLines(long lines)
Description copied from interface: CharInputReader
Skips characters in the input until the given number of lines is discarded.

Specified by:
skipLines in interface CharInputReader
Parameters:
lines - the number of lines to skip from the current location in the input

readComment

public String readComment()
Description copied from interface: CharInputReader
Collects the comment line found on the input.

Specified by:
readComment in interface CharInputReader
Returns:
the text found in the comment from the current position.

charCount

public final long charCount()
Description copied from interface: CharInputReader
Returns the number of characters returned by CharInputReader.nextChar() at any given time.

Specified by:
charCount in interface CharInputReader
Returns:
the number of characters returned by CharInputReader.nextChar()

enableNormalizeLineEndings

public final void enableNormalizeLineEndings(boolean normalizeLineEndings)
Description copied from interface: CharInputReader
Indicates to the input reader that the parser is running in "escape" mode and new lines should be returned as-is to prevent modifying the content of the parsed value.

Specified by:
enableNormalizeLineEndings in interface CharInputReader
Parameters:
normalizeLineEndings - flag indicating that the parser is escaping values and line separators are to be returned as-is.

getLineSeparator

public char[] getLineSeparator()
Description copied from interface: CharInputReader
Returns the line separator used by this character input reader. This could be the line separator defined in the Format.getLineSeparator() configuration, or the line separator sequence identified automatically when CommonParserSettings.isLineSeparatorDetectionEnabled() evaluates to true.

Specified by:
getLineSeparator in interface CharInputReader
Returns:
the line separator in use.

skipWhitespace

public final char skipWhitespace(char ch,
                                 char stopChar1,
                                 char stopChar2)
Description copied from interface: CharInputReader
Skips characters from the current input position until a non-whitespace character or a stop character is found.

Specified by:
skipWhitespace in interface CharInputReader
Parameters:
ch - the current character of the input
stopChar1 - the first stop character (which can be a whitespace)
stopChar2 - the second stop character (which can be a whitespace)
Returns:
the first non-whitespace character (or delimiter) found in the input.
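
The stop-character rule can be sketched as follows. This is a simplified, hypothetical helper that works on a String and returns an index instead of a character. Treating every character less than or equal to ' ' as whitespace is an assumption of the sketch, not something this page states.

```java
// Hypothetical illustration of skipping whitespace with stop characters.
class SkipWhitespaceSketch {
    // Returns the index of the first character at or after 'from' that is either
    // non-whitespace or one of the stop characters; input.length() if none is found.
    static int skipWhitespace(String input, int from, char stopChar1, char stopChar2) {
        int pos = from;
        while (pos < input.length()) {
            char ch = input.charAt(pos);
            // Assumption: whitespace is any character <= ' '.
            // A stop character ends the scan even if it is itself a whitespace.
            if (ch > ' ' || ch == stopChar1 || ch == stopChar2) {
                break;
            }
            pos++;
        }
        return pos;
    }
}
```

For example, with a tab as the first stop character, the scan over " \t\tx" stops at the first tab rather than at 'x', which is why a stop character "can be a whitespace".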

currentParsedContent

public final String currentParsedContent()
Description copied from interface: CharInputReader
Returns a String with the input character sequence parsed to produce the current record.

Specified by:
currentParsedContent in interface CharInputReader
Returns:
the text content parsed for the current input record.

markRecordStart

public final void markRecordStart()
Description copied from interface: CharInputReader
Marks the start of a new record in the input, used internally to calculate the result of CharInputReader.currentParsedContent().

Specified by:
markRecordStart in interface CharInputReader

getString

public String getString(char ch,
                        char stop,
                        boolean trim,
                        String nullValue)
Description copied from interface: CharInputReader
Attempts to collect a String from the current position until a stop character is found on the input, or a line ending is reached. If the String can be obtained, the current position of the parser will be updated to the last consumed character. If the internal buffer needs to be reloaded, this method will return null and the current position of the buffer will remain unchanged.

Specified by:
getString in interface CharInputReader
Parameters:
ch - the current character to be considered. If equal to the stop character the nullValue will be returned
stop - the stop character that identifies the end of the content to be collected
trim - flag indicating whether or not trailing whitespaces should be discarded
nullValue - value to return when the length of the content to be returned is 0.
Returns:
the String found on the input, or null if the buffer needs to be reloaded.
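
The null-on-reload behaviour can be illustrated with a simplified sketch. GetStringSketch is a hypothetical class that holds one already-loaded buffer. It assumes that ch is the character at buffer[i - 1], that a line ending is the single character '\n', and that trailing whitespace means characters less than or equal to ' '. None of these details are confirmed by this page.

```java
// Hypothetical illustration of collecting a String from a loaded buffer.
class GetStringSketch {
    final char[] buffer;
    final int length;
    int i; // position right after the current character

    GetStringSketch(String loaded, int i) {
        this.buffer = loaded.toCharArray();
        this.length = buffer.length;
        this.i = i;
    }

    String getString(char ch, char stop, boolean trim, String nullValue) {
        if (ch == stop) {
            return nullValue; // nothing to collect
        }
        int start = i - 1; // assumption: ch is the character at buffer[i - 1]
        int pos = i;
        while (pos < length && buffer[pos] != stop && buffer[pos] != '\n') {
            pos++;
        }
        if (pos >= length) {
            return null; // buffer exhausted before a stop: caller must reload, i is unchanged
        }
        int end = pos;
        if (trim) {
            while (end > start && buffer[end - 1] <= ' ') {
                end--; // discard trailing whitespace
            }
        }
        i = pos; // advance only when a value was actually collected
        return end == start ? nullValue : new String(buffer, start, end - start);
    }
}
```

A caller would typically fall back to character-by-character parsing with nextChar() whenever this fast path returns null.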


Copyright © 2016 uniVocity Software Pty Ltd. All rights reserved.