Package io.trino.parquet.reader.flat
Class FilteredRowRangesIterator
java.lang.Object
io.trino.parquet.reader.flat.FilteredRowRangesIterator
- All Implemented Interfaces:
RowRangesIterator
When filtering with column indexes, the reader may skip some pages of some columns. Because page boundaries are
not aligned across columns, it may also be necessary to skip individual values inside pages that are read. The values
(and their repetition and definition levels, rl and dl) are skipped based on the iterator of the required row indexes
and the first row index of each page.
For example:
 rows   col1   col2   col3
      ┌──────┬──────┬──────┐
   0  │  p0  │      │      │
      ╞══════╡  p0  │  p0  │
  20  │ p1(X)│------│------│
      ╞══════╪══════╡      │
  40  │ p2(X)│      │------│
      ╞══════╡ p1(X)╞══════╡
  60  │ p3(X)│      │------│
      ╞══════╪══════╡      │
  80  │  p4  │      │  p1  │
      ╞══════╡  p2  │      │
 100  │  p5  │      │      │
      └──────┴──────┴──────┘
Pages 1, 2, and 3 of col1 are skipped, so rows [20, 79] must be skipped. Because page 1 of col2 contains values
only for rows [40, 79], that entire page is skipped as well. To keep row reading synchronized, the values (and the
related rl and dl) for rows [20, 39] at the end of page 0 of col2 must also be skipped. Similarly, values must be
skipped while reading page 0 and page 1 of col3.
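The arithmetic behind the example can be sketched as follows. This is an illustrative stand-alone calculation, not the class's actual implementation: the class name PageSkipSketch, the helper rowsToRead, and the representation of row ranges as inclusive {start, end} pairs are all assumptions made for this sketch. It also assumes flat columns (one value per row) and a total of 120 rows, as read from the diagram.

```java
// Hypothetical sketch: counts how many values of a page fall inside the selected row ranges.
class PageSkipSketch
{
    // selected: inclusive {start, end} row ranges, sorted and non-overlapping (an assumption of this sketch)
    // firstRow: first row index of the page; valueCount: number of values in the page
    static long rowsToRead(long[][] selected, long firstRow, long valueCount)
    {
        long lastRow = firstRow + valueCount - 1;
        long count = 0;
        for (long[] range : selected) {
            // Intersect the page's row span with the selected range
            long from = Math.max(range[0], firstRow);
            long to = Math.min(range[1], lastRow);
            if (from <= to) {
                count += to - from + 1;
            }
        }
        return count;
    }

    public static void main(String[] args)
    {
        // Rows that survive the filter in the example: [0, 19] and [80, 119]
        long[][] selected = {{0, 19}, {80, 119}};

        // {firstRow, valueCount} for each page of col2 and col3, read off the diagram
        String[] names = {"col2 p0", "col2 p1", "col2 p2", "col3 p0", "col3 p1"};
        long[][] pages = {{0, 40}, {40, 40}, {80, 40}, {0, 60}, {60, 60}};

        for (int i = 0; i < pages.length; i++) {
            long read = rowsToRead(selected, pages[i][0], pages[i][1]);
            long skip = pages[i][1] - read;
            System.out.println(names[i] + ": read " + read + ", skip " + skip);
        }
    }
}
```

A page whose read count is zero, such as page 1 of col2, can be skipped entirely; a page with a non-zero skip count has to be read with some of its values discarded.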
Nested Class Summary
Nested classes/interfaces inherited from interface io.trino.parquet.reader.flat.RowRangesIterator
RowRangesIterator.AllRowRangesIterator -
Field Summary
Fields inherited from interface io.trino.parquet.reader.flat.RowRangesIterator
ALL_ROW_RANGES_ITERATOR -
Constructor Summary
Constructors -
Method Summary
Modifier and Type   Method                                        Description
int                 advanceRange(int chunkSize)
int                 getRowsLeftInCurrentRange()
boolean             isPageFullyConsumed(int pageValueCount)       Returns whether the current page with the provided value count is fully contained within the current row range.
void                resetForNewPage(OptionalLong firstRowIndex)   Must be called at the beginning of reading a new page.
int                 seekForward(int chunkSize)                    Seek forward in the page by chunkSize.
long                skipToRangeStart()
-
Constructor Details
-
FilteredRowRangesIterator
-
-
Method Details
-
getRowsLeftInCurrentRange
public int getRowsLeftInCurrentRange()
- Specified by:
getRowsLeftInCurrentRange in interface RowRangesIterator
- Returns:
Size of the next read within current range, bounded by chunkSize.
-
advanceRange
public int advanceRange(int chunkSize)
- Specified by:
advanceRange in interface RowRangesIterator
- Returns:
Size of the next read within current range, bounded by chunkSize. When all the rows of the current range have been read, advance to the next range.
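The documented contract of advanceRange (return a read size bounded by chunkSize, and move to the next range once the current one is exhausted) can be illustrated with a toy model. The class ChunkedRangeSketch below is hypothetical and is not the Trino implementation: it ignores page boundaries entirely, and its representation of ranges as inclusive {start, end} pairs is an assumption of this sketch.

```java
// Hypothetical toy model of chunked reads over row ranges; page handling is deliberately omitted.
class ChunkedRangeSketch
{
    private final long[][] ranges; // inclusive {start, end} pairs, sorted (assumed)
    private int rangeIndex;        // index of the current range
    private long nextRow;          // next unread row in the current range

    ChunkedRangeSketch(long[][] ranges)
    {
        this.ranges = ranges;
        this.nextRow = ranges.length > 0 ? ranges[0][0] : 0;
    }

    boolean hasMore()
    {
        return rangeIndex < ranges.length;
    }

    // Returns the size of the next read: at most chunkSize, and at most the rows left
    // in the current range. Moves to the next range once the current one is fully read.
    int advanceRange(int chunkSize)
    {
        long rowsLeft = ranges[rangeIndex][1] - nextRow + 1;
        int readSize = (int) Math.min(chunkSize, rowsLeft);
        nextRow += readSize;
        if (nextRow > ranges[rangeIndex][1]) {
            rangeIndex++;
            if (rangeIndex < ranges.length) {
                nextRow = ranges[rangeIndex][0];
            }
        }
        return readSize;
    }

    public static void main(String[] args)
    {
        // Row ranges from the class-level example: [0, 19] and [80, 119]
        ChunkedRangeSketch sketch = new ChunkedRangeSketch(new long[][] {{0, 19}, {80, 119}});
        while (sketch.hasMore()) {
            System.out.println("read " + sketch.advanceRange(16) + " rows");
        }
    }
}
```

In this model a read never crosses a range boundary, so a caller has a natural point between reads at which to skip the values that lie in the gap between two ranges.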
-
seekForward
public int seekForward(int chunkSize)
Seek forward in the page by chunkSize. Advance rowRanges if we seek beyond currentRange.
- Specified by:
seekForward in interface RowRangesIterator
- Returns:
Number of values skipped within rowRanges
-
skipToRangeStart
public long skipToRangeStart()
- Specified by:
skipToRangeStart in interface RowRangesIterator
- Returns:
Count of values to be skipped when current range start is after current position in the page
-
resetForNewPage
public void resetForNewPage(OptionalLong firstRowIndex)
Must be called at the beginning of reading a new page. Advances rowRanges if current range has no overlap with the new page.
- Specified by:
resetForNewPage in interface RowRangesIterator
-
isPageFullyConsumed
public boolean isPageFullyConsumed(int pageValueCount)
Returns whether the current page with the provided value count is fully contained within the current row range.
- Specified by:
isPageFullyConsumed in interface RowRangesIterator
-