public class VectorizedOrcAcidRowBatchReader extends Object implements org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,VectorizedRowBatch>
| Modifier and Type | Class and Description |
|---|---|
| protected static interface | VectorizedOrcAcidRowBatchReader.DeleteEventRegistry: An interface that can determine which rows have been deleted from a given vectorized row batch. |
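The concrete registries are internal to this reader, but the role described above can be pictured with a minimal sketch like the one below. The interface name, method names, and signatures here are illustrative assumptions for the example, not the actual Hive API.

```java
import java.io.IOException;
import java.util.BitSet;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

// Illustrative sketch only: names and signatures are assumed for this example,
// not copied from Hive's DeleteEventRegistry.
interface DeleteEventRegistrySketch extends AutoCloseable {
  // Inspect the current batch and clear the bit of every row that has a matching
  // delete event; rows whose bits remain set survive the scan.
  void markDeletedRows(VectorizedRowBatch batch, BitSet selectedRows) throws IOException;

  // True when the split has no delete events, so the per-batch lookup can be skipped.
  boolean isEmpty();
}
```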
| Modifier and Type | Field and Description |
|---|---|
| protected Object[] | partitionValues |
| protected float | progress |
| Constructor and Description |
|---|
| VectorizedOrcAcidRowBatchReader(OrcSplit inputSplit, org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.Reporter reporter, org.apache.hadoop.mapred.RecordReader&lt;org.apache.hadoop.io.NullWritable,VectorizedRowBatch&gt; baseReader, VectorizedRowBatchCtx rbCtx, boolean isFlatPayload, MapWork mapWork): LLAP IO constructor |
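This reader is typically constructed by the ORC input format or the LLAP IO layer rather than by user code. A minimal sketch of wiring it up directly, assuming the caller already has the split, job configuration, base reader, batch context, and map work in hand, and assuming the usual Hive packages shown in the imports:

```java
import java.io.IOException;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx;
import org.apache.hadoop.hive.ql.io.orc.OrcSplit;
import org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader;
import org.apache.hadoop.hive.ql.plan.MapWork;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

class AcidReaderFactory {
  // Sketch: wrap an already-open vectorized base reader with ACID handling.
  static VectorizedOrcAcidRowBatchReader wrap(
      OrcSplit split,
      JobConf conf,
      RecordReader<NullWritable, VectorizedRowBatch> baseReader,
      VectorizedRowBatchCtx rbCtx,
      MapWork mapWork) throws IOException {
    // isFlatPayload = true is an assumption here; it indicates the base reader
    // delivers user columns at the top level rather than nested under the ACID struct.
    return new VectorizedOrcAcidRowBatchReader(
        split, conf, Reporter.NULL, baseReader, rbCtx, /* isFlatPayload */ true, mapWork);
  }
}
```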
| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| org.apache.hadoop.io.NullWritable | createKey() |
| VectorizedRowBatch | createValue() |
| long | getPos() |
| float | getProgress() |
| boolean | includeAcidColumns() |
| boolean | next(org.apache.hadoop.io.NullWritable key, VectorizedRowBatch value): There are two types of schema from the baseReader that this handles. |
| void | setBaseAndInnerReader(org.apache.hadoop.mapred.RecordReader&lt;org.apache.hadoop.io.NullWritable,VectorizedRowBatch&gt; baseReader) |
protected float progress
protected Object[] partitionValues
public VectorizedOrcAcidRowBatchReader(OrcSplit inputSplit, org.apache.hadoop.mapred.JobConf conf, org.apache.hadoop.mapred.Reporter reporter, org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,VectorizedRowBatch> baseReader, VectorizedRowBatchCtx rbCtx, boolean isFlatPayload, MapWork mapWork) throws IOException
LLAP IO constructor.
Throws:
IOException

public boolean includeAcidColumns()
public void setBaseAndInnerReader(org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,VectorizedRowBatch> baseReader)
public boolean next(org.apache.hadoop.io.NullWritable key,
VectorizedRowBatch value)
throws IOException
There are two types of schema from the baseReader that this handles. In the first case, the
data was written to a transactional table from the start, so every row is decorated with
transaction-related info and looks like <op, owid, writerId, rowid, cwid, <f1, ... fn>>.
In the other case, the data was written to a non-transactional table and thus carries only the
user data: <f1, ... fn>. The table was later converted to a transactional table, but the data
files are not rewritten until major compaction; these are the "original" files.
In this case we may need to decorate the outgoing data with transactional column values at
read time. (It's done somewhat out of band via VectorizedRowBatchCtx - ask Teddy Choi.)
The "owid, writerId, rowid" columns represent a RecordIdentifier. They are assigned
each time the table is read in a way that needs to project VirtualColumn.ROWID, and
major compaction will attach these values to each row permanently.
It's critical that these generated column values are assigned in exactly the same way by each
read of the same row and by the Compactor.
See CompactorMR and OrcRawRecordMerger.OriginalReaderPairToCompact for the Compactor read path.
(Longer term, the compactor should use this class.)
This only decorates original rows with metadata if something above is requesting these values
or if there are Delete events to apply.
Specified by:
next in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,VectorizedRowBatch>
Parameters:
value - is empty
Throws:
IOException
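Regardless of which of the two schemas the baseReader produces, callers drive this reader through the standard mapred RecordReader contract summarized above. A minimal sketch of the read loop, assuming the reader has already been obtained from the input format:

```java
import java.io.IOException;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapred.RecordReader;

class BatchScan {
  // Sketch: consume all batches from the reader. Delete events are applied by the
  // reader before next() returns, so only surviving rows are reflected in the batch.
  static long countRows(RecordReader<NullWritable, VectorizedRowBatch> reader) throws IOException {
    NullWritable key = reader.createKey();
    VectorizedRowBatch batch = reader.createValue();
    long rows = 0;
    try {
      while (reader.next(key, batch)) {
        // batch.size is the number of rows in this batch; when selectedInUse is set,
        // only the first batch.size entries of batch.selected index valid rows.
        rows += batch.size;
      }
    } finally {
      reader.close();
    }
    return rows;
  }
}
```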
public org.apache.hadoop.io.NullWritable createKey()
Specified by:
createKey in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,VectorizedRowBatch>

public VectorizedRowBatch createValue()
Specified by:
createValue in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,VectorizedRowBatch>

public long getPos()
throws IOException
Specified by:
getPos in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,VectorizedRowBatch>
Throws:
IOException

public void close()
throws IOException
Specified by:
close in interface Closeable
close in interface AutoCloseable
close in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,VectorizedRowBatch>
Throws:
IOException

public float getProgress()
throws IOException
Specified by:
getProgress in interface org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.NullWritable,VectorizedRowBatch>
Throws:
IOException