com.univocity.parsers.common
Class CommonSettings<F extends Format>

java.lang.Object
  extended by com.univocity.parsers.common.CommonSettings<F>
Type Parameters:
F - the format supported by this settings class.
Direct Known Subclasses:
CommonParserSettings, CommonWriterSettings

public abstract class CommonSettings<F extends Format>
extends Object

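This is the parent class for all configuration classes used by parsers (AbstractParser) and writers (AbstractWriter).

By default, all parsers and writers work with, at least, the following configuration options:

format (defaults to the default format of the concrete settings class): the format of the file to be parsed/written
nullValue (defaults to null): the String representation of a null value
maxCharsPerColumn (defaults to 4096): the maximum number of characters allowed for any given value being written/read
maxColumns (defaults to 512): a hard limit of how many columns a record can have
skipEmptyLines (defaults to true): whether or not empty lines should be ignored
ignoreTrailingWhitespaces (defaults to true): whether or not trailing whitespaces from values being read/written should be skipped
ignoreLeadingWhitespaces (defaults to true): whether or not leading whitespaces from values being read/written should be skipped
headers (defaults to null): the field names in the input/output, in the sequence they occur
errorContentLength (defaults to -1, no limit): the maximum length of contents displayed in exception messages
autoConfigurationEnabled (defaults to true): whether this settings object can automatically derive configuration options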

Author:
uniVocity Software Pty Ltd - parsers@univocity.com
See Also:
CommonParserSettings, CommonWriterSettings, CsvParserSettings, CsvWriterSettings, FixedWidthParserSettings, FixedWidthWriterSettings

Constructor Summary
CommonSettings()
          Creates a new instance of this settings object using the default format specified by the concrete class that inherits from CommonSettings
 
Method Summary
protected  void addConfiguration(Map<String,Object> out)
           
protected abstract  F createDefaultFormat()
          Extending classes must implement this method to return the default format settings for their parser/writer
 FieldSet<Enum> excludeFields(Enum... columns)
          Selects columns which will not be read/written, by their names
 FieldSet<String> excludeFields(String... fieldNames)
          Selects fields which will not be read/written, by their names
 FieldSet<Integer> excludeIndexes(Integer... fieldIndexes)
          Selects columns which will not be read/written, by their positions
 int getErrorContentLength()
          Returns the limit on the length of displayed contents being parsed/written in the exception message when an error occurs.
 F getFormat()
          The format of the file to be parsed/written (returns the format's defaults).
 String[] getHeaders()
          Returns the field names in the input/output, in the sequence they occur (defaults to null).
 boolean getIgnoreLeadingWhitespaces()
          Returns whether or not leading whitespaces from values being read/written should be skipped (defaults to true)
 boolean getIgnoreTrailingWhitespaces()
          Returns whether or not trailing whitespaces from values being read/written should be skipped (defaults to true)
 int getMaxCharsPerColumn()
          The maximum number of characters allowed for any given value being written/read.
 int getMaxColumns()
          Returns the hard limit of how many columns a record can have (defaults to 512).
 String getNullValue()
          Returns the String representation of a null value (defaults to null)
<T extends Context>
ProcessorErrorHandler<T>
getProcessorErrorHandler()
          Returns the custom error handler to be used to capture and handle errors that might happen while processing records with a Processor or a RowWriterProcessor (i.e. non-fatal DataProcessingExceptions).
 RowProcessorErrorHandler getRowProcessorErrorHandler()
          Deprecated. Use the getProcessorErrorHandler() method as it allows format-specific error handlers to be built to work with different implementations of Context. Implementations based on RowProcessorErrorHandler allow only parsers who provide a ParsingContext to be used.
 boolean getSkipEmptyLines()
          Returns whether or not empty lines should be ignored (defaults to true)
 boolean isAutoConfigurationEnabled()
          Indicates whether this settings object can automatically derive configuration options.
 boolean isProcessorErrorHandlerDefined()
          Returns a flag indicating whether or not a ProcessorErrorHandler has been defined through the use of method setProcessorErrorHandler(ProcessorErrorHandler)
 FieldSet<Enum> selectFields(Enum... columns)
          Selects a sequence of fields for reading/writing by their names
 FieldSet<String> selectFields(String... fieldNames)
          Selects a sequence of fields for reading/writing by their names.
 FieldSet<Integer> selectIndexes(Integer... fieldIndexes)
          Selects a sequence of fields for reading/writing by their positions.
 void setAutoConfigurationEnabled(boolean autoConfigurationEnabled)
          Defines whether this settings object can automatically derive configuration options.
 void setErrorContentLength(int errorContentLength)
          Configures the parser/writer to limit the length of displayed contents being parsed/written in the exception message when an error occurs.
 void setFormat(F format)
          Defines the format of the file to be parsed/written.
 void setHeaders(String... headers)
          Defines the field names in the input/output, in the sequence they occur (defaults to null).
 void setIgnoreLeadingWhitespaces(boolean ignoreLeadingWhitespaces)
          Defines whether or not leading whitespaces from values being read/written should be skipped (defaults to true)
 void setIgnoreTrailingWhitespaces(boolean ignoreTrailingWhitespaces)
          Defines whether or not trailing whitespaces from values being read/written should be skipped (defaults to true)
 void setMaxCharsPerColumn(int maxCharsPerColumn)
          Defines the maximum number of characters allowed for any given value being written/read.
 void setMaxColumns(int maxColumns)
          Defines a hard limit of how many columns a record can have (defaults to 512).
 void setNullValue(String emptyValue)
          Sets the String representation of a null value (defaults to null)
 void setProcessorErrorHandler(ProcessorErrorHandler<? extends Context> processorErrorHandler)
          Defines a custom error handler to capture and handle errors that might happen while processing records with a Processor or a RowWriterProcessor (i.e. non-fatal DataProcessingExceptions).
 void setRowProcessorErrorHandler(RowProcessorErrorHandler rowProcessorErrorHandler)
          Deprecated. Use the setProcessorErrorHandler(ProcessorErrorHandler) method as it allows format-specific error handlers to be built to work with different implementations of Context. Implementations based on RowProcessorErrorHandler allow only parsers who provide a ParsingContext to be used.
 void setSkipEmptyLines(boolean skipEmptyLines)
          Defines whether or not empty lines should be ignored (defaults to true)
 String toString()
           
 void trimValues(boolean trim)
          Configures the parser/writer to trim or keep leading and trailing whitespaces around values. This has the same effect as invoking both setIgnoreLeadingWhitespaces(boolean) and setIgnoreTrailingWhitespaces(boolean) with the same value.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

CommonSettings

public CommonSettings()
Creates a new instance of this settings object using the default format specified by the concrete class that inherits from CommonSettings

Method Detail

getNullValue

public String getNullValue()
Returns the String representation of a null value (defaults to null)

When reading, if the parser does not read any character from the input, the nullValue is used instead of an empty string

When writing, if the writer has a null object to write to the output, the nullValue is used instead of an empty string

Returns:
the String representation of a null value

setNullValue

public void setNullValue(String emptyValue)
Sets the String representation of a null value (defaults to null)

When reading, if the parser does not read any character from the input, the nullValue is used instead of an empty string

When writing, if the writer has a null object to write to the output, the nullValue is used instead of an empty string

Parameters:
emptyValue - the String representation of a null value
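
The substitution described above can be sketched in plain Java. This is an illustrative model only: NullValueSketch, onRead and onWrite are hypothetical names, and the code does not call the parser or writer classes.

```java
// Illustrative model only; it does not use univocity-parsers classes.
final class NullValueSketch {

    // Reading: a field from which no characters were read becomes nullValue.
    static String onRead(String parsed, String nullValue) {
        return (parsed == null || parsed.isEmpty()) ? nullValue : parsed;
    }

    // Writing: a null object is written as nullValue.
    static String onWrite(Object value, String nullValue) {
        return value == null ? nullValue : String.valueOf(value);
    }

    public static void main(String[] args) {
        System.out.println(onRead("", "N/A"));    // prints N/A
        System.out.println(onRead("abc", "N/A")); // prints abc
        System.out.println(onWrite(null, "?"));   // prints ?
        System.out.println(onWrite(42, "?"));     // prints 42
    }
}
```

With the default nullValue of null, onRead("", null) yields null rather than an empty string.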

getMaxCharsPerColumn

public int getMaxCharsPerColumn()
The maximum number of characters allowed for any given value being written/read. Used to avoid OutOfMemoryErrors (defaults to 4096).

If set to -1, then the internal array will expand automatically, up to the limit allowed by the JVM.

Returns:
The maximum number of characters allowed for any given value being written/read

setMaxCharsPerColumn

public void setMaxCharsPerColumn(int maxCharsPerColumn)
Defines the maximum number of characters allowed for any given value being written/read. Used to avoid OutOfMemoryErrors (defaults to 4096).

To enable auto-expansion of the internal array, set this property to -1

Parameters:
maxCharsPerColumn - The maximum number of characters allowed for any given value being written/read
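
A minimal sketch of the difference between a fixed limit and auto-expansion (-1) follows. BoundedAppenderSketch is a hypothetical stand-in for the internal array mentioned above; the initial capacity of 16, the doubling strategy and the exception type are assumptions made for illustration, not the library's behavior.

```java
import java.util.Arrays;

// Illustrative model only; it does not use univocity-parsers classes.
final class BoundedAppenderSketch {
    private char[] chars;
    private int length;
    private final boolean autoExpand;

    BoundedAppenderSketch(int maxCharsPerColumn) {
        this.autoExpand = maxCharsPerColumn == -1;
        // Assumption: 16 is an arbitrary initial capacity for the expandable case.
        this.chars = new char[autoExpand ? 16 : maxCharsPerColumn];
    }

    void append(char c) {
        if (length == chars.length) {
            if (!autoExpand) {
                // Hypothetical exception type; how the library reports this is not modeled here.
                throw new IllegalStateException("Value exceeds " + chars.length + " characters");
            }
            // Auto-expansion: grow the array instead of rejecting the value.
            chars = Arrays.copyOf(chars, chars.length * 2);
        }
        chars[length++] = c;
    }

    String value() {
        return new String(chars, 0, length);
    }

    public static void main(String[] args) {
        BoundedAppenderSketch unlimited = new BoundedAppenderSketch(-1);
        for (int i = 0; i < 100; i++) {
            unlimited.append('x');
        }
        System.out.println(unlimited.value().length()); // prints 100

        BoundedAppenderSketch limited = new BoundedAppenderSketch(4);
        try {
            for (char c : "hello".toCharArray()) {
                limited.append(c);
            }
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints Value exceeds 4 characters
        }
    }
}
```

A fixed limit bounds memory use per value, which is why it is the default; -1 trades that protection for the ability to handle arbitrarily long values.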

getSkipEmptyLines

public boolean getSkipEmptyLines()
Returns whether or not empty lines should be ignored (defaults to true)

When reading, if the parser reads a line that is empty, it will be skipped.

When writing, if the writer receives an empty or null row to write to the output, it will be ignored.

Returns:
true if empty lines are configured to be ignored, false otherwise

setSkipEmptyLines

public void setSkipEmptyLines(boolean skipEmptyLines)
Defines whether or not empty lines should be ignored (defaults to true)

When reading, if the parser reads a line that is empty, it will be skipped.

When writing, if the writer receives an empty or null row to write to the output, it will be ignored.

Parameters:
skipEmptyLines - true if empty lines should be ignored, false otherwise
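
The two cases above can be modeled as follows. SkipEmptyLinesSketch and its methods are hypothetical names used only to illustrate the documented behavior; treating a zero-length array as an "empty row" is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; it does not use univocity-parsers classes.
final class SkipEmptyLinesSketch {

    // Reading: empty lines are dropped when skipEmptyLines is true.
    static List<String> read(List<String> lines, boolean skipEmptyLines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (skipEmptyLines && line.isEmpty()) {
                continue;
            }
            out.add(line);
        }
        return out;
    }

    // Writing: a null or empty row is ignored when skipEmptyLines is true.
    static boolean shouldWrite(Object[] row, boolean skipEmptyLines) {
        boolean empty = row == null || row.length == 0;
        return !(skipEmptyLines && empty);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a,b", "", "c,d");
        System.out.println(read(input, true));         // prints [a,b, c,d]
        System.out.println(read(input, false).size()); // prints 3
        System.out.println(shouldWrite(null, true));   // prints false
    }
}
```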

getIgnoreTrailingWhitespaces

public boolean getIgnoreTrailingWhitespaces()
Returns whether or not trailing whitespaces from values being read/written should be skipped (defaults to true)

Returns:
true if trailing whitespaces from values being read/written should be skipped, false otherwise

setIgnoreTrailingWhitespaces

public void setIgnoreTrailingWhitespaces(boolean ignoreTrailingWhitespaces)
Defines whether or not trailing whitespaces from values being read/written should be skipped (defaults to true)

Parameters:
ignoreTrailingWhitespaces - true if trailing whitespaces from values being read/written should be skipped, false otherwise

getIgnoreLeadingWhitespaces

public boolean getIgnoreLeadingWhitespaces()
Returns whether or not leading whitespaces from values being read/written should be skipped (defaults to true)

Returns:
true if leading whitespaces from values being read/written should be skipped, false otherwise

setIgnoreLeadingWhitespaces

public void setIgnoreLeadingWhitespaces(boolean ignoreLeadingWhitespaces)
Defines whether or not leading whitespaces from values being read/written should be skipped (defaults to true)

Parameters:
ignoreLeadingWhitespaces - true if leading whitespaces from values being read/written should be skipped, false otherwise

setHeaders

public void setHeaders(String... headers)
Defines the field names in the input/output, in the sequence they occur (defaults to null).

When reading, the given header names will be used to refer to each column irrespective of whether or not the input contains a header row.

When writing, the given header names will be used to refer to each column and can be used for writing the header row.

Parameters:
headers - the field name sequence associated with each column in the input/output.

getHeaders

public String[] getHeaders()
Returns the field names in the input/output, in the sequence they occur (defaults to null).

When reading, the given header names will be used to refer to each column irrespective of whether or not the input contains a header row.

When writing, the given header names will be used to refer to each column and can be used for writing the header row.

Returns:
the field name sequence associated with each column in the input/output.

getMaxColumns

public int getMaxColumns()
Returns the hard limit of how many columns a record can have (defaults to 512). You need this to avoid OutOfMemory errors in case of inputs that might be inconsistent with the format you are dealing with.

Returns:
The maximum number of columns a record can have.

setMaxColumns

public void setMaxColumns(int maxColumns)
Defines a hard limit of how many columns a record can have (defaults to 512). You need this to avoid OutOfMemory errors in case of inputs that might be inconsistent with the format you are dealing with.

Parameters:
maxColumns - The maximum number of columns a record can have.

getFormat

public F getFormat()
The format of the file to be parsed/written (returns the format's defaults).

Returns:
The format of the file to be parsed/written

setFormat

public void setFormat(F format)
Defines the format of the file to be parsed/written.

Parameters:
format - The format of the file to be parsed/written

selectFields

public FieldSet<String> selectFields(String... fieldNames)
Selects a sequence of fields for reading/writing by their names.

When reading, only the values of the selected columns will be parsed, and the content of the other columns ignored. The resulting rows will be returned with the selected columns only, in the order specified. If you want to obtain the original row format, with all columns included and nulls in the fields that have not been selected, call CommonParserSettings.setColumnReorderingEnabled(boolean) with false.

When writing, the sequence provided represents the expected format of the input rows. For example, headers can be "H1,H2,H3", but the input data is coming with values for two columns and in a different order, such as "V_H3, V_H1". Selecting fields "H3" and "H1" will allow the writer to write values in the expected locations. Using the given example, the output row will be generated as: "V_H1,null,V_H3"

Parameters:
fieldNames - The field names to read/write
Returns:
the (modifiable) set of selected fields
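
The writing example above (headers "H1,H2,H3", selection "H3" and "H1", input "V_H3, V_H1") can be worked through in plain Java. FieldSelectionWriteSketch and arrange are hypothetical names; this is a model of the documented mapping, not the writer's implementation.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only; it does not use univocity-parsers classes.
final class FieldSelectionWriteSketch {

    // Places each input value under its selected header; unselected headers stay null.
    static String[] arrange(List<String> headers, List<String> selected, String[] inputRow) {
        String[] out = new String[headers.size()];
        for (int i = 0; i < selected.size() && i < inputRow.length; i++) {
            int position = headers.indexOf(selected.get(i));
            if (position < 0) {
                throw new IllegalArgumentException("Unknown field: " + selected.get(i));
            }
            out[position] = inputRow[i];
        }
        return out;
    }

    public static void main(String[] args) {
        String[] row = arrange(
                Arrays.asList("H1", "H2", "H3"),
                Arrays.asList("H3", "H1"),
                new String[] {"V_H3", "V_H1"});
        System.out.println(Arrays.toString(row)); // prints [V_H1, null, V_H3]
    }
}
```

The selection order describes the layout of the incoming rows, while the header order decides where each value lands in the output.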

excludeFields

public FieldSet<String> excludeFields(String... fieldNames)
Selects fields which will not be read/written, by their names

When reading, the excluded columns will not be parsed, and only the values of the remaining columns will be returned. The resulting rows will contain the non-excluded columns only. If you want to obtain the original row format, with all columns included and nulls in the fields that have been excluded, call CommonParserSettings.setColumnReorderingEnabled(boolean) with false.

When writing, the sequence of non-excluded fields represents the expected format of the input rows. For example, headers can be "H1,H2,H3", but the input data is coming with values for two columns only, such as "V_H1, V_H3". Excluding field "H2" will allow the writer to write values in the expected locations. Using the given example, the output row will be generated as: "V_H1,null,V_H3"

Parameters:
fieldNames - The field names to exclude from the parsing/writing process
Returns:
the (modifiable) set of ignored fields

selectIndexes

public FieldSet<Integer> selectIndexes(Integer... fieldIndexes)
Selects a sequence of fields for reading/writing by their positions.

When reading, only the values of the selected columns will be parsed, and the content of the other columns ignored. The resulting rows will be returned with the selected columns only, in the order specified. If you want to obtain the original row format, with all columns included and nulls in the fields that have not been selected, call CommonParserSettings.setColumnReorderingEnabled(boolean) with false.

When writing, the sequence provided represents the expected format of the input rows. For example, headers can be "H1,H2,H3", but the input data is coming with values for two columns and in a different order, such as "V_H3, V_H1". Selecting indexes "2" and "0" will allow the writer to write values in the expected locations. Using the given example, the output row will be generated as: "V_H1,null,V_H3"

Parameters:
fieldIndexes - The indexes to read/write
Returns:
the (modifiable) set of selected fields
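
The reading behavior described above, with and without column reordering, can be modeled as follows. IndexSelectionReadSketch and select are hypothetical names, and leaving out-of-range indexes as null is an assumption of this sketch.

```java
import java.util.Arrays;

// Illustrative model only; it does not use univocity-parsers classes.
final class IndexSelectionReadSketch {

    // columnReordering == true: only the selected columns, in the order specified.
    // columnReordering == false: original row layout, nulls in unselected positions.
    static String[] select(String[] parsedRow, int[] selectedIndexes, boolean columnReordering) {
        String[] out = new String[columnReordering ? selectedIndexes.length : parsedRow.length];
        for (int i = 0; i < selectedIndexes.length; i++) {
            int index = selectedIndexes[i];
            if (index < 0 || index >= parsedRow.length) {
                continue; // assumption: a missing column is left as null
            }
            out[columnReordering ? i : index] = parsedRow[index];
        }
        return out;
    }

    public static void main(String[] args) {
        String[] row = {"a", "b", "c"};
        int[] selection = {2, 0};
        System.out.println(Arrays.toString(select(row, selection, true)));  // prints [c, a]
        System.out.println(Arrays.toString(select(row, selection, false))); // prints [a, null, c]
    }
}
```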

excludeIndexes

public FieldSet<Integer> excludeIndexes(Integer... fieldIndexes)
Selects columns which will not be read/written, by their positions

When reading, the excluded columns will not be parsed, and only the values of the remaining columns will be returned. The resulting rows will contain the non-excluded columns only. If you want to obtain the original row format, with all columns included and nulls in the fields that have been excluded, call CommonParserSettings.setColumnReorderingEnabled(boolean) with false.

When writing, the sequence of non-excluded fields represents the expected format of the input rows. For example, headers can be "H1,H2,H3", but the input data is coming with values for two columns only, such as "V_H1, V_H3". Excluding index "1" will allow the writer to write values in the expected locations. Using the given example, the output row will be generated as: "V_H1,null,V_H3"

Parameters:
fieldIndexes - indexes of columns to exclude from the parsing/writing process
Returns:
the (modifiable) set of ignored fields

selectFields

public FieldSet<Enum> selectFields(Enum... columns)
Selects a sequence of fields for reading/writing by their names

When reading, only the values of the selected columns will be parsed, and the content of the other columns ignored. The resulting rows will be returned with the selected columns only, in the order specified. If you want to obtain the original row format, with all columns included and nulls in the fields that have not been selected, call CommonParserSettings.setColumnReorderingEnabled(boolean) with false.

When writing, the sequence provided represents the expected format of the input rows. For example, headers can be "H1,H2,H3", but the input data is coming with values for two columns and in a different order, such as "V_H3, V_H1". Selecting fields "H3" and "H1" will allow the writer to write values in the expected locations. Using the given example, the output row will be generated as: "V_H1,null,V_H3"

Parameters:
columns - The columns to read/write
Returns:
the (modifiable) set of selected fields

excludeFields

public FieldSet<Enum> excludeFields(Enum... columns)
Selects columns which will not be read/written, by their names

When reading, the excluded columns will not be parsed, and only the values of the remaining columns will be returned. The resulting rows will contain the non-excluded columns only. If you want to obtain the original row format, with all columns included and nulls in the fields that have been excluded, call CommonParserSettings.setColumnReorderingEnabled(boolean) with false.

When writing, the sequence of non-excluded fields represents the expected format of the input rows. For example, headers can be "H1,H2,H3", but the input data is coming with values for two columns only, such as "V_H1, V_H3". Excluding field "H2" will allow the writer to write values in the expected locations. Using the given example, the output row will be generated as: "V_H1,null,V_H3"

Parameters:
columns - The columns to exclude from the parsing/writing process
Returns:
the (modifiable) set of ignored fields

isAutoConfigurationEnabled

public final boolean isAutoConfigurationEnabled()
Indicates whether this settings object can automatically derive configuration options. This is used, for example, to define the headers when the user provides a BeanWriterProcessor where the bean class contains a Headers annotation, or to enable header extraction when the bean class of a BeanProcessor has attributes mapping to header names.

Defaults to true

Returns:
true if the automatic configuration feature is enabled, false otherwise

setAutoConfigurationEnabled

public final void setAutoConfigurationEnabled(boolean autoConfigurationEnabled)
Defines whether this settings object can automatically derive configuration options. This is used, for example, to define the headers when the user provides a BeanWriterProcessor where the bean class contains a Headers annotation, or to enable header extraction when the bean class of a BeanProcessor has attributes mapping to header names.

Parameters:
autoConfigurationEnabled - a flag to turn the automatic configuration feature on/off.

getRowProcessorErrorHandler

@Deprecated
public RowProcessorErrorHandler getRowProcessorErrorHandler()
Deprecated. Use the getProcessorErrorHandler() method as it allows format-specific error handlers to be built to work with different implementations of Context. Implementations based on RowProcessorErrorHandler allow only parsers who provide a ParsingContext to be used.

Returns the custom error handler to be used to capture and handle errors that might happen while processing records with a RowProcessor or a RowWriterProcessor (i.e. non-fatal DataProcessingExceptions).

The parsing/writing process won't stop (unless the error handler rethrows the DataProcessingException or manually stops the process).

Returns:
the callback error handler with custom code to manage occurrences of DataProcessingException.

setRowProcessorErrorHandler

@Deprecated
public void setRowProcessorErrorHandler(RowProcessorErrorHandler rowProcessorErrorHandler)
Deprecated. Use the setProcessorErrorHandler(ProcessorErrorHandler) method as it allows format-specific error handlers to be built to work with different implementations of Context. Implementations based on RowProcessorErrorHandler allow only parsers who provide a ParsingContext to be used.

Defines a custom error handler to capture and handle errors that might happen while processing records with a RowProcessor or a RowWriterProcessor (i.e. non-fatal DataProcessingExceptions).

The parsing/writing process won't stop (unless the error handler rethrows the DataProcessingException or manually stops the process).

Parameters:
rowProcessorErrorHandler - the callback error handler with custom code to manage occurrences of DataProcessingException.

getProcessorErrorHandler

public <T extends Context> ProcessorErrorHandler<T> getProcessorErrorHandler()
Returns the custom error handler to be used to capture and handle errors that might happen while processing records with a Processor or a RowWriterProcessor (i.e. non-fatal DataProcessingExceptions).

The parsing/writing process won't stop (unless the error handler rethrows the DataProcessingException or manually stops the process).

Type Parameters:
T - the Context type provided by the parser implementation.
Returns:
the callback error handler with custom code to manage occurrences of DataProcessingException.

setProcessorErrorHandler

public void setProcessorErrorHandler(ProcessorErrorHandler<? extends Context> processorErrorHandler)
Defines a custom error handler to capture and handle errors that might happen while processing records with a Processor or a RowWriterProcessor (i.e. non-fatal DataProcessingExceptions).

The parsing/writing process won't stop (unless the error handler rethrows the DataProcessingException or manually stops the process).

Parameters:
processorErrorHandler - the callback error handler with custom code to manage occurrences of DataProcessingException.
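
The control flow described above (errors are handed to the callback and processing continues unless the callback rethrows) can be sketched in plain Java. RowError and Handler are hypothetical stand-ins for DataProcessingException and ProcessorErrorHandler; the real handler also receives a Context argument, which is omitted here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; it does not use univocity-parsers classes.
final class ErrorHandlerSketch {

    // Hypothetical stand-in for a non-fatal DataProcessingException.
    static final class RowError extends RuntimeException {
        RowError(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical stand-in for ProcessorErrorHandler (context argument omitted).
    interface Handler {
        void handleError(RowError error, String[] row);
    }

    // Converts the first value of each row to an int; conversion failures go to the handler.
    static List<Integer> process(List<String[]> rows, Handler handler) {
        List<Integer> converted = new ArrayList<>();
        for (String[] row : rows) {
            try {
                converted.add(Integer.parseInt(row[0]));
            } catch (NumberFormatException e) {
                // Processing continues with the next row unless the handler throws.
                handler.handleError(new RowError("Cannot convert '" + row[0] + "'", e), row);
            }
        }
        return converted;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
                new String[] {"1"}, new String[] {"x"}, new String[] {"3"});
        List<String> errors = new ArrayList<>();
        List<Integer> ok = process(rows, (error, row) -> errors.add(error.getMessage()));
        System.out.println(ok);     // prints [1, 3]
        System.out.println(errors); // prints [Cannot convert 'x']
    }
}
```

A handler that rethrows the error it receives stops the loop at the failing row, which mirrors the "unless the error handler rethrows" clause above.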

isProcessorErrorHandlerDefined

public boolean isProcessorErrorHandlerDefined()
Returns a flag indicating whether or not a ProcessorErrorHandler has been defined through the use of method setProcessorErrorHandler(ProcessorErrorHandler)

Returns:
true if the parser/writer is configured to use a ProcessorErrorHandler

createDefaultFormat

protected abstract F createDefaultFormat()
Extending classes must implement this method to return the default format settings for their parser/writer

Returns:
Default format configuration for the given parser/writer settings.

trimValues

public final void trimValues(boolean trim)
Configures the parser/writer to trim or keep leading and trailing whitespaces around values. This has the same effect as invoking both setIgnoreLeadingWhitespaces(boolean) and setIgnoreTrailingWhitespaces(boolean) with the same value.

Parameters:
trim - a flag indicating whether whitespaces around values parsed/written should be removed.
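
The relationship between trimValues and the two whitespace flags can be illustrated with a small model. TrimSketch is a hypothetical name, and treating every character up to and including the space character as whitespace (as String.trim() does) is an assumption of this sketch.

```java
// Illustrative model only; it does not use univocity-parsers classes.
final class TrimSketch {

    // Applies the leading and trailing flags independently.
    static String apply(String value, boolean ignoreLeading, boolean ignoreTrailing) {
        int start = 0;
        int end = value.length();
        if (ignoreLeading) {
            while (start < end && value.charAt(start) <= ' ') {
                start++;
            }
        }
        if (ignoreTrailing) {
            while (end > start && value.charAt(end - 1) <= ' ') {
                end--;
            }
        }
        return value.substring(start, end);
    }

    // Same effect as setting both flags to the same value.
    static String trimValues(String value, boolean trim) {
        return apply(value, trim, trim);
    }

    public static void main(String[] args) {
        System.out.println("[" + apply("  a b  ", true, false) + "]"); // prints [a b  ]
        System.out.println("[" + apply("  a b  ", false, true) + "]"); // prints [  a b]
        System.out.println("[" + trimValues("  a b  ", true) + "]");   // prints [a b]
        System.out.println("[" + trimValues("  a b  ", false) + "]");  // prints [  a b  ]
    }
}
```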

getErrorContentLength

public int getErrorContentLength()
Returns the limit on the length of displayed contents being parsed/written in the exception message when an error occurs.

If set to 0, then no exceptions will include the content being manipulated in their attributes, and the "<omitted>" string will appear in error messages as the parsed/written content.

Defaults to -1 (no limit).

Returns:
the maximum length of contents displayed in exception messages in case of errors while parsing/writing.

setErrorContentLength

public void setErrorContentLength(int errorContentLength)
Configures the parser/writer to limit the length of displayed contents being parsed/written in the exception message when an error occurs.

If set to 0, then no exceptions will include the content being manipulated in their attributes, and the "<omitted>" string will appear in error messages as the parsed/written content.

Defaults to -1 (no limit).

Parameters:
errorContentLength - maximum length of contents displayed in exception messages in case of errors while parsing/writing.
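
The three cases (0, -1 and a positive limit) can be sketched as below. ErrorContentSketch and restrict are hypothetical names. The text above only specifies the 0 and -1 cases; keeping the trailing characters and prefixing "..." for a positive limit is an assumption of this sketch.

```java
// Illustrative model only; it does not use univocity-parsers classes.
final class ErrorContentSketch {

    static String restrict(String content, int errorContentLength) {
        if (errorContentLength == 0) {
            return "<omitted>"; // content never appears in the message
        }
        if (errorContentLength < 0 || content.length() <= errorContentLength) {
            return content; // -1 means no limit; short content is shown in full
        }
        // Assumption: the trailing part of the content is kept, marked with "...".
        return "..." + content.substring(content.length() - errorContentLength);
    }

    public static void main(String[] args) {
        System.out.println(restrict("abcdefgh", 0));  // prints <omitted>
        System.out.println(restrict("abcdefgh", -1)); // prints abcdefgh
        System.out.println(restrict("abcdefgh", 3));  // prints ...fgh
    }
}
```

Setting the limit to 0 is the option to reach for when the data being processed is sensitive and must not leak into logs through exception messages.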

toString

public final String toString()
Overrides:
toString in class Object

addConfiguration

protected void addConfiguration(Map<String,Object> out)


Copyright © 2016 uniVocity Software Pty Ltd. All rights reserved.