public class HoodieMergedLogRecordScanner extends AbstractHoodieLogRecordReader implements Iterable<HoodieRecord<? extends HoodieRecordPayload>>
NOTE: If readBlocksLazily is turned on, the scanner does not merge while reading; instead it keeps reading log blocks and merges everything at once. This is an optimization that avoids seeking back and forth during the merge: a forward seek() to read each new block, plus reverse and forward seek() calls to lazily read the content of blocks already seen.

| I/O Pass 1 | I/O Pass 2 |
|---|---|
| Read Block 1 Metadata | Read Block 1 Data |
| Read Block 2 Metadata | Read Block 2 Data |
| ..................... | ................. |
| Read Block N Metadata | Read Block N Data |

This results in two I/O passes over the log file.
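
The two-pass idea in the note can be illustrated with a small, self-contained sketch. It is not Hudi code: the class `TwoPassLogScanSketch`, the `BlockMeta` type, and the in-memory "log" of `key=value` entries are all hypothetical stand-ins, and the "later entry wins" merge rule is an assumption made only to keep the example short. Pass 1 walks the block sizes and records where each block's content starts; pass 2 reads the content in order and merges by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in Hudi.
public class TwoPassLogScanSketch {

    /** Hypothetical block metadata: position and size of a block's content. */
    static final class BlockMeta {
        final int contentOffset;
        final int recordCount;

        BlockMeta(int contentOffset, int recordCount) {
            this.contentOffset = contentOffset;
            this.recordCount = recordCount;
        }
    }

    /** Pass 1: walk the block sizes front to back, noting where each block's content starts. */
    static List<BlockMeta> readAllMetadata(int[] blockSizes) {
        List<BlockMeta> metas = new ArrayList<>();
        int offset = 0;
        for (int size : blockSizes) {
            metas.add(new BlockMeta(offset, size));
            offset += size; // forward-only movement; the content itself is skipped
        }
        return metas;
    }

    /** Pass 2: read each block's content in order and merge by key (assumed rule: later wins). */
    static Map<String, String> readAllDataAndMerge(List<String> entries, List<BlockMeta> metas) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (BlockMeta meta : metas) {
            for (int i = 0; i < meta.recordCount; i++) {
                String[] kv = entries.get(meta.contentOffset + i).split("=", 2);
                merged.put(kv[0], kv[1]);
            }
        }
        return merged;
    }

    /** Runs both passes back to back. */
    static Map<String, String> scan(List<String> entries, int[] blockSizes) {
        return readAllDataAndMerge(entries, readAllMetadata(blockSizes));
    }

    public static void main(String[] args) {
        // Three blocks holding 2, 1 and 1 entries respectively.
        List<String> entries = Arrays.asList("k1=a", "k2=b", "k1=c", "k3=d");
        System.out.println(scan(entries, new int[] {2, 1, 1})); // prints {k1=c, k2=b, k3=d}
    }
}
```

Because all metadata is collected before any content is read, each pass moves through the file in one direction, which is the property the note describes.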
| Modifier and Type | Class and Description |
|---|---|
| static class | HoodieMergedLogRecordScanner.Builder - Builder used to build HoodieMergedLogRecordScanner. |


Nested classes/interfaces inherited from class AbstractHoodieLogRecordReader: AbstractHoodieLogRecordReader.KeySpec

| Modifier and Type | Field and Description |
|---|---|
| protected ExternalSpillableMap<String,HoodieRecord<? extends HoodieRecordPayload>> | records |
| HoodieTimer | timer |


Fields inherited from class AbstractHoodieLogRecordReader: forceFullScan, logFilePaths, readerSchema

| Modifier | Constructor and Description |
|---|---|
| protected | HoodieMergedLogRecordScanner(org.apache.hadoop.fs.FileSystem fs, String basePath, List<String> logFilePaths, org.apache.avro.Schema readerSchema, String latestInstantTime, Long maxMemorySizeInBytes, boolean readBlocksLazily, boolean reverseReader, int bufferSize, String spillableMapBasePath, Option<InstantRange> instantRange, ExternalSpillableMap.DiskMapType diskMapType, boolean isBitCaskDiskMapCompressionEnabled, boolean withOperationField, boolean forceFullScan, Option<String> partitionName, InternalSchema internalSchema) |


| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| long | getNumMergedRecordsInLog() |
| Map<String,HoodieRecord<? extends HoodieRecordPayload>> | getRecords() |
| long | getTotalTimeTakenToReadAndMergeBlocks() |
| Iterator<HoodieRecord<? extends HoodieRecordPayload>> | iterator() |
| static HoodieMergedLogRecordScanner.Builder | newBuilder() - Returns the builder for HoodieMergedLogRecordScanner. |
| protected void | performScan() |
| protected void | processNextDeletedRecord(DeleteRecord deleteRecord) - Process next deleted record. |
| protected void | processNextRecord(HoodieRecord<? extends HoodieRecordPayload> hoodieRecord) - Process next record. |


Methods inherited from class AbstractHoodieLogRecordReader: createHoodieRecord, getKeyField, getPartitionName, getPayloadClassFQN, getProgress, getTotalCorruptBlocks, getTotalLogBlocks, getTotalLogFiles, getTotalLogRecords, getTotalRollbacks, isWithOperationField, scan, scan, scanInternal

Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface java.lang.Iterable: forEach, spliterator

public final HoodieTimer timer
protected final ExternalSpillableMap<String,HoodieRecord<? extends HoodieRecordPayload>> records
protected HoodieMergedLogRecordScanner(org.apache.hadoop.fs.FileSystem fs,
String basePath,
List<String> logFilePaths,
org.apache.avro.Schema readerSchema,
String latestInstantTime,
Long maxMemorySizeInBytes,
boolean readBlocksLazily,
boolean reverseReader,
int bufferSize,
String spillableMapBasePath,
Option<InstantRange> instantRange,
ExternalSpillableMap.DiskMapType diskMapType,
boolean isBitCaskDiskMapCompressionEnabled,
boolean withOperationField,
boolean forceFullScan,
Option<String> partitionName,
InternalSchema internalSchema)
protected void performScan()
public Iterator<HoodieRecord<? extends HoodieRecordPayload>> iterator()
Specified by: iterator in interface Iterable<HoodieRecord<? extends HoodieRecordPayload>>

public Map<String,HoodieRecord<? extends HoodieRecordPayload>> getRecords()
public long getNumMergedRecordsInLog()
public static HoodieMergedLogRecordScanner.Builder newBuilder()
Returns the builder for HoodieMergedLogRecordScanner.

protected void processNextRecord(HoodieRecord<? extends HoodieRecordPayload> hoodieRecord) throws IOException
Description copied from class: AbstractHoodieLogRecordReader
Process next record.
Specified by: processNextRecord in class AbstractHoodieLogRecordReader
Parameters: hoodieRecord - Hoodie Record to process
Throws: IOException

protected void processNextDeletedRecord(DeleteRecord deleteRecord)
Description copied from class: AbstractHoodieLogRecordReader
Process next deleted record.
Specified by: processNextDeletedRecord in class AbstractHoodieLogRecordReader
Parameters: deleteRecord - Deleted record (hoodie key and ordering value)

public long getTotalTimeTakenToReadAndMergeBlocks()
public void close()
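
As a rough illustration of how a keyed map of merged records could be maintained by callbacks such as processNextRecord and processNextDeletedRecord, here is a minimal, self-contained sketch. It does not use or reproduce the Hudi implementation: the class `MergeByKeySketch`, the `Rec` type, the simplified method signatures, and the rule "an incoming record or delete is applied unless the stored record has a strictly higher ordering value" are all assumptions chosen for the example. The only detail taken from this page is that a deleted record carries a key and an ordering value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; the real scanner works on HoodieRecord and DeleteRecord.
public class MergeByKeySketch {

    /** Hypothetical record: a value plus an ordering value used to resolve conflicts. */
    static final class Rec {
        final String value;
        final long ordering;

        Rec(String value, long ordering) {
            this.value = value;
            this.ordering = ordering;
        }
    }

    private final Map<String, Rec> records = new HashMap<>();

    /** Assumed rule: keep the incoming record unless the stored one has a strictly higher ordering value. */
    void processNextRecord(String key, Rec incoming) {
        Rec existing = records.get(key);
        if (existing == null || incoming.ordering >= existing.ordering) {
            records.put(key, incoming);
        }
    }

    /** Assumed rule: apply the delete unless the stored record has a strictly higher ordering value. */
    void processNextDeletedRecord(String key, long deleteOrdering) {
        Rec existing = records.get(key);
        if (existing != null && deleteOrdering >= existing.ordering) {
            records.remove(key);
        }
    }

    Map<String, Rec> getRecords() {
        return records;
    }

    public static void main(String[] args) {
        MergeByKeySketch scanner = new MergeByKeySketch();
        scanner.processNextRecord("k1", new Rec("v1", 1));
        scanner.processNextRecord("k1", new Rec("v2", 2)); // newer ordering value replaces v1
        scanner.processNextRecord("k2", new Rec("x", 5));
        scanner.processNextDeletedRecord("k2", 3); // older delete: ignored
        scanner.processNextDeletedRecord("k1", 2); // delete at the same ordering value: applied
        System.out.println(scanner.getRecords().keySet()); // prints [k2]
    }
}
```

In this sketch a delete simply removes the key; a production reader may instead need to keep a marker for the deleted key so the delete can later be applied against a base file.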
Copyright © 2022 The Apache Software Foundation. All rights reserved.