public class HoodieParquetDataBlock extends HoodieDataBlock
Nested classes inherited from class `HoodieLogBlock`: `HoodieLogBlock.FooterMetadataType`, `HoodieLogBlock.HeaderMetadataType`, `HoodieLogBlock.HoodieLogBlockContentLocation`, `HoodieLogBlock.HoodieLogBlockType`

Fields inherited from class `HoodieDataBlock`: `internalSchema`, `readerSchema`

Fields inherited from class `HoodieLogBlock`: `readBlockLazily`, `version`

| Constructor and Description |
|---|
| `HoodieParquetDataBlock(org.apache.hadoop.fs.FSDataInputStream inputStream, Option<byte[]> content, boolean readBlockLazily, HoodieLogBlock.HoodieLogBlockContentLocation logBlockContentLocation, Option<org.apache.avro.Schema> readerSchema, Map<HoodieLogBlock.HeaderMetadataType,String> header, Map<HoodieLogBlock.HeaderMetadataType,String> footer, String keyField)` |
| `HoodieParquetDataBlock(List<org.apache.avro.generic.IndexedRecord> records, Map<HoodieLogBlock.HeaderMetadataType,String> header, String keyField, org.apache.parquet.hadoop.metadata.CompressionCodecName compressionCodecName)` |
| Modifier and Type | Method and Description |
|---|---|
| `protected ClosableIterator<org.apache.avro.generic.IndexedRecord>` | `deserializeRecords(byte[] content)` |
| `HoodieLogBlock.HoodieLogBlockType` | `getBlockType()` |
| `static ClosableIterator<org.apache.avro.generic.IndexedRecord>` | `getProjectedParquetRecordsIterator(org.apache.hadoop.conf.Configuration conf, org.apache.avro.Schema readerSchema, org.apache.parquet.io.InputFile inputFile)` |
| `protected ClosableIterator<org.apache.avro.generic.IndexedRecord>` | `readRecordsFromBlockPayload()` NOTE: Overrides the whole reading sequence to make sure the requested reader's schema is properly respected, fetching only the columns that the caller explicitly requested (by providing a projected reader's schema). |
| `protected byte[]` | `serializeRecords(List<org.apache.avro.generic.IndexedRecord> records)` |
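Several of the methods above return a `ClosableIterator`, which must be closed after iteration so the block can release its underlying resources. The following is a minimal, self-contained sketch of the try-with-resources consumption pattern; the `ClosableIterator` interface defined here is a simplified local stand-in (the assumption being that Hudi's version extends `Iterator` and `AutoCloseable`), not the Hudi type itself.

```java
import java.util.Iterator;
import java.util.List;

// Simplified stand-in for Hudi's ClosableIterator (assumption: it combines
// Iterator and AutoCloseable). Used only to illustrate the consumption
// pattern for iterators such as those returned by deserializeRecords or
// readRecordsFromBlockPayload.
interface ClosableIterator<T> extends Iterator<T>, AutoCloseable {
    @Override
    void close(); // no checked exception, keeps try-with-resources clean
}

public class IteratorUsageSketch {
    // Wrap a plain iterator with a close callback; a real implementation
    // would release file handles or decompression buffers in close().
    static <T> ClosableIterator<T> wrap(Iterator<T> it, Runnable onClose) {
        return new ClosableIterator<T>() {
            public boolean hasNext() { return it.hasNext(); }
            public T next() { return it.next(); }
            public void close() { onClose.run(); }
        };
    }

    public static void main(String[] args) {
        // try-with-resources guarantees close() runs even if iteration throws
        try (ClosableIterator<String> it =
                 wrap(List.of("rec1", "rec2").iterator(),
                      () -> System.out.println("closed"))) {
            while (it.hasNext()) {
                System.out.println(it.next());
            }
        }
    }
}
```

The same pattern applies regardless of which method produced the iterator: open it in the try header, drain it in the body, and let the runtime close it.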
Methods inherited from class `HoodieDataBlock`: `getContentBytes`, `getKeyField`, `getRecordIterator`, `getRecordIterator`, `getRecordKey`, `getSchema`, `getWriterSchema`, `lookupRecords`

Methods inherited from class `HoodieLogBlock`: `deflate`, `getBlockContentLocation`, `getContent`, `getLogBlockFooter`, `getLogBlockHeader`, `getLogBlockLength`, `getLogMetadata`, `getLogMetadataBytes`, `getMagic`, `inflate`, `tryReadContent`

public HoodieParquetDataBlock(org.apache.hadoop.fs.FSDataInputStream inputStream,
                              Option<byte[]> content,
                              boolean readBlockLazily,
                              HoodieLogBlock.HoodieLogBlockContentLocation logBlockContentLocation,
                              Option<org.apache.avro.Schema> readerSchema,
                              Map<HoodieLogBlock.HeaderMetadataType,String> header,
                              Map<HoodieLogBlock.HeaderMetadataType,String> footer,
                              String keyField)
public HoodieLogBlock.HoodieLogBlockType getBlockType()

Overrides: `getBlockType` in class `HoodieDataBlock`

protected byte[] serializeRecords(List<org.apache.avro.generic.IndexedRecord> records)
                           throws IOException

Overrides: `serializeRecords` in class `HoodieDataBlock`
Throws: `IOException`

public static ClosableIterator<org.apache.avro.generic.IndexedRecord> getProjectedParquetRecordsIterator(org.apache.hadoop.conf.Configuration conf,
                                                                                                         org.apache.avro.Schema readerSchema,
                                                                                                         org.apache.parquet.io.InputFile inputFile)
                                                                                                  throws IOException
Throws: `IOException`

protected ClosableIterator<org.apache.avro.generic.IndexedRecord> readRecordsFromBlockPayload()
                                                                                       throws IOException

Overrides: `readRecordsFromBlockPayload` in class `HoodieDataBlock`
Throws: `IOException`

protected ClosableIterator<org.apache.avro.generic.IndexedRecord> deserializeRecords(byte[] content)
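The note on `readRecordsFromBlockPayload` describes column projection: only the fields present in the caller-provided (projected) reader schema are materialized. The sketch below illustrates that idea in isolation, modeling records as plain `Map`s and the reader schema as a set of field names; the `project` helper and `ProjectionSketch` class are hypothetical and are not part of the Hudi or Avro APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Conceptual sketch of the column-projection idea behind
// readRecordsFromBlockPayload: given full records, return copies that
// carry only the fields named in the projected reader schema.
// Records are modeled as plain Maps; this is NOT the Hudi/Avro API.
public class ProjectionSketch {
    static List<Map<String, Object>> project(List<Map<String, Object>> records,
                                             Set<String> readerFields) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> record : records) {
            Map<String, Object> projected = new LinkedHashMap<>();
            for (String field : readerFields) {
                // keep only the columns the caller explicitly requested
                if (record.containsKey(field)) {
                    projected.put(field, record.get(field));
                }
            }
            out.add(projected);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> rec = new LinkedHashMap<>();
        rec.put("key", "k1");
        rec.put("ts", 42L);
        rec.put("payload", "large-blob"); // dropped by projection
        System.out.println(project(List.of(rec), Set.of("key", "ts")));
    }
}
```

In the real implementation the projection happens at the Parquet reader level (via `getProjectedParquetRecordsIterator` and the projected Avro schema), so unrequested columns are never read from storage at all, rather than being read and then dropped as in this simplified sketch.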
Overrides: `deserializeRecords` in class `HoodieDataBlock`

Copyright © 2022 The Apache Software Foundation. All rights reserved.