public abstract class AbstractColumnReader<V extends org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector> extends Object implements org.apache.flink.formats.parquet.vector.reader.ColumnReader<V>
Abstract ColumnReader.
See org.apache.parquet.column.impl.ColumnReaderImpl;
part of the code is adapted from Apache Spark and Apache Parquet.
Note: this class follows AbstractColumnReader from Flink release 1.11.2;
it is reproduced here because some of the methods involved are package-scoped.
| Modifier and Type | Field and Description |
|---|---|
| protected org.apache.parquet.column.ColumnDescriptor | `descriptor` |
| protected org.apache.parquet.column.Dictionary | `dictionary`: The dictionary, if this column has dictionary encoding. |
| protected int | `maxDefLevel`: Maximum definition level for this column. |
| protected org.apache.hudi.table.format.cow.vector.reader.RunLengthDecoder | `runLenDecoder`: Run length decoder for data and dictionary. |
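The role of `maxDefLevel` can be illustrated with a minimal sketch. It assumes a flat optional column, where a slot holds a value only when its definition level equals the maximum; the class and method names below are hypothetical and are not part of Hudi, Flink, or Parquet.

```java
// Hypothetical, simplified sketch; not the Hudi or Parquet implementation.
final class DefinitionLevelSketch {

    /**
     * Builds a null mask from definition levels.
     * A slot is non-null only when its definition level equals maxDefLevel.
     */
    static boolean[] nullMask(int[] definitionLevels, int maxDefLevel) {
        boolean[] isNull = new boolean[definitionLevels.length];
        for (int i = 0; i < definitionLevels.length; i++) {
            isNull[i] = definitionLevels[i] != maxDefLevel;
        }
        return isNull;
    }

    public static void main(String[] args) {
        // Levels 1, 0, 1, 1, 0 with maxDefLevel = 1: the second and fifth slots are null.
        boolean[] mask = nullMask(new int[] {1, 0, 1, 1, 0}, 1);
        System.out.println(java.util.Arrays.toString(mask));
        // prints [false, true, false, false, true]
    }
}
```

In the real reader the levels would come from a decoder such as `runLenDecoder` rather than from a plain array.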
| Constructor and Description |
|---|
| AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor, org.apache.parquet.column.page.PageReader pageReader) |
| Modifier and Type | Method and Description |
|---|---|
| protected void | `afterReadPage()`: After a page is read, some initialization may be needed. |
| protected void | `checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName)` |
| protected abstract void | `readBatch(int rowId, int num, V column)`: Read a batch from runLenDecoder and dataInputStream. |
| protected abstract void | `readBatchFromDictionaryIds(int rowId, int num, V column, org.apache.flink.table.data.columnar.vector.writable.WritableIntVector dictionaryIds)`: Decode dictionary ids to data. |
| void | `readToVector(int readNumber, V vector)`: Reads `readNumber` values from this column reader into `vector`. |
| protected boolean | `supportLazyDecode()`: Whether lazy decoding of dictionary ids is supported. |
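The split between a final `readToVector` and the overridable hooks suggests a template-method design. The sketch below shows that shape under loud assumptions: `IntVectorSketch`, `ReaderSketch`, and `IntReaderSketch` are hypothetical stand-ins written for this example, page handling and `readBatch` are omitted, and the default of `supportLazyDecode()` is assumed rather than taken from the source.

```java
// Illustrative stand-ins only: none of these types are the Flink, Hudi, or Parquet classes.
final class IntVectorSketch {
    final int[] values;     // eagerly decoded values
    int[] dictionary;       // set only when decoding is deferred
    int[] dictionaryIds;    // set only when decoding is deferred

    IntVectorSketch(int capacity) {
        this.values = new int[capacity];
    }

    /** Resolves the value at index i, going through the dictionary if ids were kept. */
    int get(int i) {
        return dictionaryIds != null ? dictionary[dictionaryIds[i]] : values[i];
    }
}

abstract class ReaderSketch {
    protected final int[] dictionary;
    private final int[] pageIds;   // dictionary ids of the current page, already unpacked

    ReaderSketch(int[] dictionary, int[] pageIds) {
        this.dictionary = dictionary;
        this.pageIds = pageIds;
    }

    /** Fixed driver: subclasses customize only the hooks below. */
    final void readToVector(int readNumber, IntVectorSketch vector) {
        if (readNumber > pageIds.length || readNumber > vector.values.length) {
            throw new IllegalArgumentException("readNumber exceeds available values");
        }
        int[] ids = java.util.Arrays.copyOf(pageIds, readNumber);
        if (supportLazyDecode()) {
            // Keep the ids; values are looked up on access.
            vector.dictionary = dictionary;
            vector.dictionaryIds = ids;
        } else {
            // Decode everything up front.
            readBatchFromDictionaryIds(0, readNumber, vector, ids);
        }
    }

    /** Assumed default for this sketch: defer decoding. */
    protected boolean supportLazyDecode() {
        return true;
    }

    protected abstract void readBatchFromDictionaryIds(
            int rowId, int num, IntVectorSketch column, int[] dictionaryIds);
}

final class IntReaderSketch extends ReaderSketch {
    private final boolean lazy;

    IntReaderSketch(int[] dictionary, int[] pageIds, boolean lazy) {
        super(dictionary, pageIds);
        this.lazy = lazy;
    }

    @Override
    protected boolean supportLazyDecode() {
        return lazy;
    }

    @Override
    protected void readBatchFromDictionaryIds(
            int rowId, int num, IntVectorSketch column, int[] dictionaryIds) {
        for (int i = rowId; i < rowId + num; i++) {
            column.values[i] = dictionary[dictionaryIds[i]];
        }
    }

    public static void main(String[] args) {
        int[] dict = {10, 20, 30};
        int[] ids = {2, 0, 1};
        IntVectorSketch eager = new IntVectorSketch(3);
        new IntReaderSketch(dict, ids, false).readToVector(3, eager);
        IntVectorSketch lazy = new IntVectorSketch(3);
        new IntReaderSketch(dict, ids, true).readToVector(3, lazy);
        // Both paths resolve index 0 to dict[2].
        System.out.println(eager.get(0) + " " + lazy.get(0)); // prints "30 30"
    }
}
```

Keeping the driver final means every subclass shares the same control flow, while the lazy path avoids decoding values that are never accessed.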
protected final org.apache.parquet.column.Dictionary dictionary
protected final int maxDefLevel
protected final org.apache.parquet.column.ColumnDescriptor descriptor
protected org.apache.hudi.table.format.cow.vector.reader.RunLengthDecoder runLenDecoder
public AbstractColumnReader(org.apache.parquet.column.ColumnDescriptor descriptor,
org.apache.parquet.column.page.PageReader pageReader)
throws IOException
Throws: IOException

protected void checkTypeName(org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName expectedName)

public final void readToVector(int readNumber,
V vector)
throws IOException
Specified by: readToVector in interface org.apache.flink.formats.parquet.vector.reader.ColumnReader<V extends org.apache.flink.table.data.columnar.vector.writable.WritableColumnVector>
Throws: IOException

protected void afterReadPage()

protected boolean supportLazyDecode()
See ParquetDictionary.
If this returns false, all the data is decoded first.

protected abstract void readBatch(int rowId,
int num,
V column)
Read a batch from runLenDecoder and dataInputStream.

protected abstract void readBatchFromDictionaryIds(int rowId,
int num,
V column,
org.apache.flink.table.data.columnar.vector.writable.WritableIntVector dictionaryIds)
Decode dictionary ids to data, from runLenDecoder and dictionaryIdsDecoder.

Copyright © 2023 The Apache Software Foundation. All rights reserved.