Class FilteredRowRangesIterator

java.lang.Object
io.trino.parquet.reader.flat.FilteredRowRangesIterator
All Implemented Interfaces:
RowRangesIterator

public class FilteredRowRangesIterator extends Object implements RowRangesIterator
When filtering with column indexes, the reader may skip some pages of a column entirely. Because page boundaries are not aligned across columns, it may also have to skip individual values inside the pages it does read. The values (and the related repetition and definition levels, rl and dl) are skipped based on the iterator of the required row indexes and the first row index of each page. For example:
 rows   col1   col2   col3
      ┌──────┬──────┬──────┐
   0  │  p0  │      │      │
      ╞══════╡  p0  │  p0  │
  20  │ p1(X)│------│------│
      ╞══════╪══════╡      │
  40  │ p2(X)│      │------│
      ╞══════╡ p1(X)╞══════╡
  60  │ p3(X)│      │------│
      ╞══════╪══════╡      │
  80  │  p4  │      │  p1  │
      ╞══════╡  p2  │      │
 100  │  p5  │      │      │
      └──────┴──────┴──────┘
 
Pages 1, 2, and 3 in col1 are skipped, so rows [20, 79] must be skipped in every column. Because page 1 in col2 contains values only for rows [40, 79], that entire page is skipped as well. To keep row reading synchronized, the values (and the related rl and dl) for rows [20, 39] at the end of page 0 in col2 must also be skipped. Similarly, values must be skipped while reading page 0 and page 1 of col3.
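The arithmetic behind this example can be sketched with a small standalone model. This is not Trino's implementation: the class RowSkipSketch, the record Range, and the methods rowsToRead and rowsToSkip are hypothetical names introduced for illustration. The sketch assumes flat columns (one value per row), a total of 120 rows, page boundaries as read off the diagram, and required row ranges that are disjoint.

```java
import java.util.List;

// Toy model of the diagram above; hypothetical names, not Trino's code.
final class RowSkipSketch
{
    // Inclusive row range, e.g. [0, 19].
    record Range(long start, long end)
    {
        long size()
        {
            return end - start + 1;
        }

        // Number of rows shared with another inclusive range.
        long overlap(Range other)
        {
            long from = Math.max(start, other.start);
            long to = Math.min(end, other.end);
            return Math.max(0, to - from + 1);
        }
    }

    // Rows of the page that fall inside any required range (ranges must be disjoint).
    static long rowsToRead(Range page, List<Range> required)
    {
        long total = 0;
        for (Range range : required) {
            total += page.overlap(range);
        }
        return total;
    }

    // Rows of the page whose values (and rl/dl) must be skipped.
    static long rowsToSkip(Range page, List<Range> required)
    {
        return page.size() - rowsToRead(page, required);
    }

    public static void main(String[] args)
    {
        // Rows that survive the filter on col1: pages p0, p4 and p5 (assumed 120 rows total).
        List<Range> required = List.of(new Range(0, 19), new Range(80, 119));

        System.out.println(rowsToSkip(new Range(0, 39), required));   // col2 p0: 20 (rows 20-39)
        System.out.println(rowsToRead(new Range(40, 79), required));  // col2 p1: 0, whole page skipped
        System.out.println(rowsToSkip(new Range(0, 59), required));   // col3 p0: 40 (rows 20-59)
        System.out.println(rowsToSkip(new Range(60, 119), required)); // col3 p1: 20 (rows 60-79)
    }
}
```

A page whose rowsToRead is 0 can be dropped without decoding it. Any other page needs some of its values skipped whenever rowsToSkip is greater than 0.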
  • Constructor Details

    • FilteredRowRangesIterator

      public FilteredRowRangesIterator(FilteredRowRanges rowRanges)
  • Method Details

    • getRowsLeftInCurrentRange

      public int getRowsLeftInCurrentRange()
      Specified by:
      getRowsLeftInCurrentRange in interface RowRangesIterator
      Returns:
      Number of rows left to read in the current range.
    • advanceRange

      public int advanceRange(int chunkSize)
      Specified by:
      advanceRange in interface RowRangesIterator
      Returns:
      Size of the next read within the current range, bounded by chunkSize. When all rows of the current range have been read, the iterator advances to the next range.
    • seekForward

      public int seekForward(int chunkSize)
      Seek forward in the page by chunkSize. Advance rowRanges if we seek beyond currentRange.
      Specified by:
      seekForward in interface RowRangesIterator
      Returns:
      Number of values skipped within rowRanges.
    • skipToRangeStart

      public long skipToRangeStart()
      Specified by:
      skipToRangeStart in interface RowRangesIterator
      Returns:
      Count of values to skip when the current range starts after the current position in the page.
    • resetForNewPage

      public void resetForNewPage(OptionalLong firstRowIndex)
      Must be called at the beginning of reading a new page. Advances rowRanges if the current range has no overlap with the new page.
      Specified by:
      resetForNewPage in interface RowRangesIterator
    • isPageFullyConsumed

      public boolean isPageFullyConsumed(int pageValueCount)
      Returns whether the current page with the provided value count is fully contained within the current row range.
      Specified by:
      isPageFullyConsumed in interface RowRangesIterator
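
The chunked-read contract described for advanceRange and getRowsLeftInCurrentRange can be illustrated with a simplified model. The class ToyRangeCursor below is hypothetical and is not the Trino class: it ignores page boundaries (so it has no counterpart to resetForNewPage, skipToRangeStart, or seekForward), it assumes sorted, disjoint, inclusive row ranges, and the hasMore and readSizes helpers are additions made only for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical model of reading row ranges in bounded chunks.
final class ToyRangeCursor
{
    private final long[][] ranges; // inclusive {start, end} pairs, sorted and disjoint
    private int rangeIndex;        // index of the current range
    private long position;         // next row to read within the current range

    ToyRangeCursor(long[][] ranges)
    {
        this.ranges = ranges;
        this.position = ranges.length > 0 ? ranges[0][0] : 0;
    }

    // Hypothetical helper: true while at least one range still has unread rows.
    boolean hasMore()
    {
        return rangeIndex < ranges.length;
    }

    // Rows not yet read in the current range.
    int getRowsLeftInCurrentRange()
    {
        return Math.toIntExact(ranges[rangeIndex][1] - position + 1);
    }

    // Size of the next read, bounded by chunkSize; moves to the next range once the current one is exhausted.
    int advanceRange(int chunkSize)
    {
        int size = Math.min(chunkSize, getRowsLeftInCurrentRange());
        position += size;
        if (position > ranges[rangeIndex][1]) {
            rangeIndex++;
            if (rangeIndex < ranges.length) {
                position = ranges[rangeIndex][0];
            }
        }
        return size;
    }

    // Hypothetical helper: collects the size of every read until all ranges are consumed.
    static List<Integer> readSizes(long[][] ranges, int chunkSize)
    {
        ToyRangeCursor cursor = new ToyRangeCursor(ranges);
        List<Integer> sizes = new ArrayList<>();
        while (cursor.hasMore()) {
            sizes.add(cursor.advanceRange(chunkSize));
        }
        return sizes;
    }

    public static void main(String[] args)
    {
        // Ranges [0, 19] and [80, 119] read in chunks of at most 16 rows.
        System.out.println(readSizes(new long[][] {{0, 19}, {80, 119}}, 16)); // [16, 4, 16, 16, 8]
    }
}
```

In this model a read never crosses a range boundary, so a chunk that reaches the end of a range is cut short and the next read starts at the beginning of the following range.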