Class ScanResultValueFramesIterable

  • All Implemented Interfaces:
    Iterable<FrameSignaturePair>

    public class ScanResultValueFramesIterable
    extends Object
    implements Iterable<FrameSignaturePair>
    A thread-unsafe iterable that converts a sequence of ScanResultValue to an iterable of FrameSignaturePair. The ScanResultValues can have heterogeneous row signatures, and the returned iterable batches them into frames accordingly.

    The batching process greedily merges rows from consecutive scan result values that have the same signature. It keeps frame sizes manageable: the frame size is determined by the memory allocator, and rows are split across frames whenever necessary.

    The ScanResultValues must not be collected and stored somewhere (like a List) during this processing, since that could exhaust the heap without limit. The conversion has to be done online: as the scan result values get materialized, we produce frames. A few ScanResultValues might still be held, however (if a frame got cut off in the middle).

    Assuming that we have a sequence of scan result values like:

    ScanResultValue1 - RowSignatureA - 3 rows
    ScanResultValue2 - RowSignatureB - 2 rows
    ScanResultValue3 - RowSignatureA - 1 row
    ScanResultValue4 - RowSignatureA - 4 rows
    ScanResultValue5 - RowSignatureB - 3 rows

    Also, assume that each individual frame can hold two rows (in practice, the capacity is determined by the row size and the memory block allocated by the memory allocator factory).

    The output would be a sequence like:

    Frame1 - RowSignatureA - rows 1-2 from ScanResultValue1
    Frame2 - RowSignatureA - row 3 from ScanResultValue1
    Frame3 - RowSignatureB - rows 1-2 from ScanResultValue2
    Frame4 - RowSignatureA - row 1 from ScanResultValue3, row 1 from ScanResultValue4
    Frame5 - RowSignatureA - rows 2-3 from ScanResultValue4
    Frame6 - RowSignatureA - row 4 from ScanResultValue4
    Frame7 - RowSignatureB - rows 1-2 from ScanResultValue5
    Frame8 - RowSignatureB - row 3 from ScanResultValue5
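
    The batching described above can be sketched as a lazy iterator. This is a simplified model, not the class's actual implementation: Value and Frame are hypothetical stand-ins for ScanResultValue and FrameSignaturePair, signatures are plain strings, and a fixed rowsPerFrame replaces the capacity that the memory allocator would determine in practice.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class FrameBatchingSketch {
  /** Hypothetical stand-in for a ScanResultValue: a name, a signature label and a row count. */
  static final class Value {
    final String name;
    final String signature;
    final int rowCount;

    Value(String name, String signature, int rowCount) {
      this.name = name;
      this.signature = signature;
      this.rowCount = rowCount;
    }
  }

  /** Hypothetical stand-in for a FrameSignaturePair: a signature label and the rows in the frame. */
  static final class Frame {
    final String signature;
    final List<String> rows;

    Frame(String signature, List<String> rows) {
      this.signature = signature;
      this.rows = rows;
    }

    @Override
    public String toString() {
      return signature + rows;
    }
  }

  /** Lazily batches values into frames of at most rowsPerFrame rows (an assumed fixed capacity). */
  static Iterator<Frame> frames(Iterator<Value> values, int rowsPerFrame) {
    if (rowsPerFrame < 1) {
      throw new IllegalArgumentException("rowsPerFrame must be positive");
    }
    return new Iterator<Frame>() {
      private Value current = null; // the only value held in memory; may be partially consumed
      private int nextRow = 1;      // 1-based index of the next unread row in current

      /** Pulls values from the source until one with unread rows is found; false if none is left. */
      private boolean advance() {
        while ((current == null || nextRow > current.rowCount) && values.hasNext()) {
          current = values.next();
          nextRow = 1;
        }
        return current != null && nextRow <= current.rowCount;
      }

      @Override
      public boolean hasNext() {
        return advance();
      }

      @Override
      public Frame next() {
        if (!advance()) {
          throw new NoSuchElementException();
        }
        String signature = current.signature;
        List<String> rows = new ArrayList<>();
        // Greedily take rows until the frame is full, the input ends, or the signature changes.
        // A value with a different signature stays in current and starts the next frame.
        while (rows.size() < rowsPerFrame && advance() && current.signature.equals(signature)) {
          rows.add(current.name + "#" + nextRow);
          nextRow++;
        }
        return new Frame(signature, rows);
      }
    };
  }
}
```

    Calling FrameBatchingSketch.frames(values.iterator(), 2) on the five example values yields eight frames in the order listed above. At most one input value is held at any time, which mirrors the requirement that the values are never collected into a list.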