Package com.clickhouse.data
Class ClickHouseDataProcessor
java.lang.Object
com.clickhouse.data.ClickHouseDataProcessor
This defines a data processor for dealing with serialization and
deserialization of one or more ClickHouseFormat. Unlike
ClickHouseDeserializer and ClickHouseSerializer, which are for a
specific column or data type, a data processor is a combination of both, and it
can handle more scenarios, such as separators between columns and rows.
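The split of responsibilities described above can be illustrated with a minimal sketch that does not use the real ClickHouse classes. `ColumnSerializer`, `SeparatorAwareProcessor`, `ProcessorSketch` and the tab/newline separators are hypothetical stand-ins chosen for this example. The per-column serializers only encode one value, while the processor combines them and emits the separators between columns and rows.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a per-column serializer: it encodes one value only.
interface ColumnSerializer {
    void serialize(Object value, Writer out) throws IOException;
}

// Hypothetical stand-in for a data processor: it owns one serializer per column
// and is responsible for the separators between columns and rows.
class SeparatorAwareProcessor {
    private final ColumnSerializer[] serializers;
    private final Writer out;
    private int writePosition; // column index targeted by the next write()

    SeparatorAwareProcessor(ColumnSerializer[] serializers, Writer out) {
        this.serializers = serializers;
        this.out = out;
    }

    void write(Object value) throws IOException {
        serializers[writePosition].serialize(value, out);
        writePosition++;
        if (writePosition == serializers.length) {
            out.write('\n'); // row separator after the last column (assumed)
            writePosition = 0;
        } else {
            out.write('\t'); // column separator (assumed)
        }
    }
}

class ProcessorSketch {
    // Serializes rows of (number, text) pairs and returns the produced text.
    static String render(List<Object[]> rows) {
        ColumnSerializer plain = (value, out) -> out.write(String.valueOf(value));
        ColumnSerializer quoted = (value, out) -> out.write("'" + value + "'");
        StringWriter buffer = new StringWriter();
        SeparatorAwareProcessor processor =
                new SeparatorAwareProcessor(new ColumnSerializer[] { plain, quoted }, buffer);
        try {
            for (Object[] row : rows) {
                for (Object value : row) {
                    processor.write(value);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(new Object[] { 1, "a" }, new Object[] { 2, "b" });
        System.out.print(render(rows));
    }
}
```

The callers never write a separator themselves; that is the part of the job a data processor adds on top of the per-column serializers.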
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
protected static final class
-
Field Summary
Fields
Modifier and Type | Field | Description
protected final ClickHouseColumn[] columns
protected final ClickHouseDataConfig config
protected final ClickHouseRecord currentRecord
static final List<ClickHouseColumn> DEFAULT_COLUMNS
protected final ClickHouseDeserializer[] deserializers
protected static final String ERROR_FAILED_TO_READ
protected static final String ERROR_FAILED_TO_WRITE
protected static final String ERROR_REACHED_END_OF_STREAM
protected static final String ERROR_UNKNOWN_DATA_TYPE
protected final ClickHouseInputStream input
protected final ClickHouseOutputStream output
protected int readPosition
protected final Iterator<ClickHouseRecord> records
protected final ClickHouseSerializer[] serializers
protected final ClickHouseValue[] templates
protected final Iterator<ClickHouseValue> values
protected int writePosition
    Column index shared by write(ClickHouseValue).
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected ClickHouseDataProcessor(ClickHouseDataConfig config, ClickHouseInputStream input, ClickHouseOutputStream output, List<ClickHouseColumn> columns, Map<String, Serializable> settings)
    Default constructor.
-
Method Summary
Modifier and Type | Method | Description
protected ClickHouseDeserializer[] buildDeserializeSteps(ClickHouseColumn column)
    Builds list of steps to deserialize value for the given column.
protected ClickHouseSerializer[] buildSerializeSteps(ClickHouseColumn column)
    Builds list of steps to serialize value for the given column.
protected abstract ClickHouseRecord createRecord()
    Factory method to create a record.
final List<ClickHouseColumn> getColumns()
    Gets list of columns to process.
abstract ClickHouseDeserializer getDeserializer(ClickHouseDataConfig config, ClickHouseColumn column)
final ClickHouseDeserializer[] getDeserializers(ClickHouseDataConfig config, List<ClickHouseColumn> columns)
abstract ClickHouseSerializer getSerializer(ClickHouseDataConfig config, ClickHouseColumn column)
final ClickHouseSerializer[] getSerializers(ClickHouseDataConfig config, List<ClickHouseColumn> columns)
protected boolean hasMoreToRead()
    Checks whether there's more to read from input stream.
protected Iterator<ClickHouseRecord> initRecords()
    Initializes iterator of ClickHouseRecord for reading values record by record.
protected Iterator<ClickHouseValue> initValues()
    Initializes iterator of ClickHouseValue for reading values one by one.
ClickHouseValue read(ClickHouseValue value)
    Reads deserialized value of next column (at readPosition) directly from input stream.
protected void readAndFill(ClickHouseRecord r)
    Reads columns (starting from readPosition) from input stream and fills deserialized data into the given record.
protected void readAndFill(ClickHouseValue value)
    Reads next column (at readPosition) from input stream and fills deserialized data into the given value object.
protected abstract List<ClickHouseColumn> readColumns()
    Reads columns from input stream.
final Iterable<ClickHouseRecord> records()
    Returns an iterable collection of records which can be walked through in a foreach-loop.
final Iterable<ClickHouseValue> values()
    Returns an iterable collection of values which can be walked through in a foreach-loop.
void write(ClickHouseValue value)
    Writes serialized value of next column (at writePosition) to output stream.
-
Field Details
-
DEFAULT_COLUMNS
-
ERROR_FAILED_TO_READ
-
ERROR_FAILED_TO_WRITE
-
ERROR_REACHED_END_OF_STREAM
-
ERROR_UNKNOWN_DATA_TYPE
-
config
-
input
-
output
-
columns
-
currentRecord
-
templates
-
settings
-
records
-
values
-
deserializers
-
serializers
-
readPosition
protected int readPosition
-
writePosition
protected int writePosition
Column index shared by write(ClickHouseValue).
-
-
Constructor Details
-
ClickHouseDataProcessor
protected ClickHouseDataProcessor(ClickHouseDataConfig config, ClickHouseInputStream input, ClickHouseOutputStream output, List<ClickHouseColumn> columns, Map<String, Serializable> settings) throws IOException
Default constructor.
- Parameters:
config - non-null configuration containing information such as the format
input - input stream for deserialization, can be null when output is available
output - output stream for serialization, can be null when input is available
columns - nullable columns
settings - nullable settings
- Throws:
IOException - when failed to read columns from input stream
-
-
Method Details
-
hasMoreToRead
Checks whether there's more to read from input stream.
- Returns:
- true if there's more; false otherwise
- Throws:
UncheckedIOException - when failed to read data from input stream
-
buildDeserializeSteps
Builds list of steps to deserialize value for the given column.
- Parameters:
column - non-null column
- Returns:
- non-null list of steps for deserialization
-
buildSerializeSteps
Builds list of steps to serialize value for the given column.
- Parameters:
column - non-null column
- Returns:
- non-null list of steps for serialization
-
createRecord
Factory method to create a record.
- Returns:
- new record
-
initRecords
Initializes iterator of ClickHouseRecord for reading values record by record. Usually this should be called only once, during instantiation.
- Returns:
- non-null iterator of ClickHouseRecord
-
initValues
Initializes iterator of ClickHouseValue for reading values one by one. Usually this should be called only once, during instantiation.
- Returns:
- non-null iterator of ClickHouseValue
-
readAndFill
Reads columns (starting from readPosition) from input stream and fills deserialized data into the given record. This method is only used when iterating through records().
- Parameters:
r - non-null record to fill
- Throws:
IOException - when failed to read columns from input stream
-
readAndFill
Reads next column (at readPosition) from input stream and fills deserialized data into the given value object. This method is mainly used when iterating through values(). In the default implementation, it's also used in readAndFill(ClickHouseRecord) for simplicity.
- Parameters:
value - non-null value object to fill
- Throws:
IOException - when failed to read column from input stream
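The relationship between the two readAndFill overloads can be sketched as follows. This is not the library's implementation: `Cell`, `RowFiller` and `FillSketch` are hypothetical stand-ins, and resetting the position to 0 after a full record is an assumption made for the sketch. It only shows the record-level method delegating to the value-level method, column by column, starting from the current position.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for a reusable value object.
class Cell {
    String text;
}

// Hypothetical stand-in for a processor that fills values and records.
class RowFiller {
    private final Iterator<String> input;
    private final int columnCount;
    private int readPosition; // index of the next column to read

    RowFiller(Iterator<String> input, int columnCount) {
        this.input = input;
        this.columnCount = columnCount;
    }

    // Value-level fill: reads exactly one column into the given value object.
    void readAndFill(Cell value) {
        value.text = input.next();
    }

    // Record-level fill: delegates to the value-level method for each column.
    void readAndFill(Cell[] record) {
        for (int i = readPosition; i < columnCount; i++) {
            readPosition = i;
            readAndFill(record[i]);
        }
        readPosition = 0; // assumption: start over for the next record
    }
}

class FillSketch {
    static String demo() {
        RowFiller filler = new RowFiller(Arrays.asList("1", "a", "2", "b").iterator(), 2);
        Cell[] record = { new Cell(), new Cell() };
        filler.readAndFill(record);
        String first = record[0].text + "|" + record[1].text;
        filler.readAndFill(record); // the same record object is refilled
        return first + ";" + record[0].text + "|" + record[1].text;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```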
-
readColumns
Reads columns from input stream. Usually this will be only called once during instantiation.
- Returns:
- non-null list of columns
- Throws:
IOException - when failed to read columns from input stream
-
getDeserializer
public abstract ClickHouseDeserializer getDeserializer(ClickHouseDataConfig config, ClickHouseColumn column) -
getDeserializers
public final ClickHouseDeserializer[] getDeserializers(ClickHouseDataConfig config, List<ClickHouseColumn> columns) -
getSerializer
public abstract ClickHouseSerializer getSerializer(ClickHouseDataConfig config, ClickHouseColumn column) -
getSerializers
public final ClickHouseSerializer[] getSerializers(ClickHouseDataConfig config, List<ClickHouseColumn> columns) -
getColumns
Gets list of columns to process.
- Returns:
- list of columns to process
-
records
Returns an iterable collection of records which can be walked through in a foreach-loop. Please note that: 1) UncheckedIOException might be thrown when iterating through the collection; and 2) it's not supposed to be called more than once, because the input stream will be closed at the end of reading.
- Returns:
- non-null iterable records
- Throws:
UncheckedIOException - when failed to access the input stream
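The two caveats for records() can be illustrated with a minimal sketch that does not depend on the ClickHouse classes. `LineRecords` and `RecordsSketch` are hypothetical stand-ins that treat each tab-separated line of a `Reader` as one record. I/O failures surface as UncheckedIOException during iteration, and the reader is closed once the last record has been consumed, so a second pass yields nothing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical single-pass iterable: one tab-separated line is one record.
class LineRecords implements Iterable<String[]> {
    private final BufferedReader reader;
    private boolean closed;

    LineRecords(Reader input) {
        this.reader = new BufferedReader(input);
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public Iterator<String[]> iterator() {
        return new Iterator<String[]>() {
            private String pending = advance();

            // Reads the next line; closes the reader when the input is exhausted.
            private String advance() {
                if (closed) {
                    return null;
                }
                try {
                    String line = reader.readLine();
                    if (line == null) {
                        reader.close();
                        closed = true;
                    }
                    return line;
                } catch (IOException e) {
                    // Iterator methods cannot throw checked exceptions.
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return pending != null;
            }

            @Override
            public String[] next() {
                if (pending == null) {
                    throw new NoSuchElementException();
                }
                String[] record = pending.split("\t");
                pending = advance();
                return record;
            }
        };
    }
}

class RecordsSketch {
    // Collects the first column of every record in a foreach-loop.
    static List<String> firstColumns(LineRecords records) {
        List<String> result = new ArrayList<>();
        for (String[] record : records) {
            result.add(record[0]);
        }
        return result;
    }

    public static void main(String[] args) {
        LineRecords records = new LineRecords(new StringReader("1\ta\n2\tb\n"));
        System.out.println(firstColumns(records));
        System.out.println(firstColumns(records)); // second pass: input already closed
    }
}
```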
-
values
Returns an iterable collection of values which can be walked through in a foreach-loop. In general, this is slower than records(), because the latter reads data in bulk. However, it's particularly useful when you're reading large values with limited memory, e.g. a binary field of a few GB. Similarly, the input stream will be closed at the end of reading.
- Returns:
- non-null iterable values
- Throws:
UncheckedIOException - when failed to access the input stream
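Why value-by-value iteration helps with large values can be sketched without the ClickHouse classes. `LengthPrefixedValues` and `ValuesSketch` are hypothetical, and the wire format (a 4-byte length followed by the bytes) is invented for this example rather than taken from any ClickHouse format. The iterator buffers a single value at a time instead of materializing a whole record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical iterable over length-prefixed binary values (invented format).
class LengthPrefixedValues implements Iterable<byte[]> {
    private final DataInputStream in;

    LengthPrefixedValues(InputStream input) {
        this.in = new DataInputStream(input);
    }

    @Override
    public Iterator<byte[]> iterator() {
        return new Iterator<byte[]>() {
            private byte[] pending = advance();

            // Reads exactly one value; closes the stream at end of input.
            private byte[] advance() {
                try {
                    int length;
                    try {
                        length = in.readInt();
                    } catch (EOFException endOfStream) {
                        in.close();
                        return null;
                    }
                    byte[] value = new byte[length];
                    in.readFully(value);
                    return value;
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return pending != null;
            }

            @Override
            public byte[] next() {
                if (pending == null) {
                    throw new NoSuchElementException();
                }
                byte[] value = pending;
                pending = advance();
                return value;
            }
        };
    }
}

class ValuesSketch {
    // Encodes values in the invented length-prefixed format.
    static byte[] encode(byte[]... values) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (byte[] value : values) {
                out.writeInt(value.length);
                out.write(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Walks the values one by one and records only their lengths.
    static List<Integer> lengths(byte[] encoded) {
        List<Integer> result = new ArrayList<>();
        for (byte[] value : new LengthPrefixedValues(new ByteArrayInputStream(encoded))) {
            result.add(value.length);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(lengths(encode(new byte[3], new byte[0], new byte[5])));
    }
}
```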
-
read
Reads deserialized value of next column (at readPosition) directly from input stream. Unlike records(), which reads multiple values at a time, this method will only read one for each call.
- Parameters:
value - value to update, could be null
- Returns:
- updated value or a new ClickHouseValue when it is null
- Throws:
IOException - when failed to read data from input stream
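The "update or create" contract of read(ClickHouseValue) can be sketched with stand-in types. `TextValue`, `ColumnCursor` and `ReadSketch` are hypothetical, and wrapping the position back to 0 after the last column is an assumption of this sketch rather than documented behavior. Passing null yields a new value object, while passing an existing one updates and returns that same object.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical reusable value holder.
class TextValue {
    String text;
}

// Hypothetical reader that hands out one column value per call.
class ColumnCursor {
    private final Iterator<String> tokens;
    private final int columnCount;
    private int readPosition; // column index of the next value to read

    ColumnCursor(Iterator<String> tokens, int columnCount) {
        this.tokens = tokens;
        this.columnCount = columnCount;
    }

    int getReadPosition() {
        return readPosition;
    }

    TextValue read(TextValue value) throws IOException {
        if (!tokens.hasNext()) {
            throw new EOFException("Reached end of stream");
        }
        // A null argument means the caller wants a fresh value object.
        TextValue target = value != null ? value : new TextValue();
        target.text = tokens.next();
        // Assumption: the position wraps to 0 after the last column of a row.
        readPosition = (readPosition + 1) % columnCount;
        return target;
    }
}

class ReadSketch {
    static String demo() {
        ColumnCursor cursor = new ColumnCursor(Arrays.asList("1", "a").iterator(), 2);
        try {
            TextValue first = cursor.read(null);   // null: a new object is created
            String firstText = first.text;
            TextValue second = cursor.read(first); // non-null: the same object is updated
            boolean reused = second == first;
            return firstText + "," + second.text + ",reused=" + reused
                    + ",pos=" + cursor.getReadPosition();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Reusing one value object across calls avoids an allocation per column, which is presumably why the parameter is allowed to be non-null.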
-
write
Writes serialized value of next column (at writePosition) to output stream.
- Parameters:
value - non-null value to be serialized
- Throws:
IOException - when failed to write data to output stream
-