Class FixedBitIntReaderWriterV2

    • Method Summary

      Modifier and Type   Method and Description
      void                close()
      int                 readInt(int index)
                          Read the dictionaryId for a particular docId.
      void                readInt(int startDocId, int length, int[] buffer)
                          Array-based API to read dictionaryIds for a contiguous range of docIds, starting at startDocId and spanning length docIds.
      void                readValues(int[] docIds, int docIdStartIndex, int docIdLength, int[] values, int valuesStartIndex)
                          Array-based API to read dictionaryIds for an array of docIds that are monotonically increasing but not necessarily contiguous.
      void                writeInt(int index, int value)
      void                writeInt(int startIndex, int length, int[] values)
    • Constructor Detail

      • FixedBitIntReaderWriterV2

        public FixedBitIntReaderWriterV2(PinotDataBuffer dataBuffer,
                                         int numValues,
                                         int numBitsPerValue)
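
        The constructor takes a value count and a fixed bit width, which together determine how much packed storage is needed. The sketch below is only an illustration of that arithmetic. The class and method names (FixedBitSizingSketch, bitsPerValue, bufferSizeBytes) are hypothetical and not part of this API, and the real implementation may pad or align the buffer differently.

```java
// Hypothetical helper, not part of the documented class: shows how numValues
// and numBitsPerValue relate to the size of a fixed-bit packed buffer.
final class FixedBitSizingSketch {

  // Smallest bit width that can hold every dictionaryId in [0, cardinality - 1].
  static int bitsPerValue(int cardinality) {
    if (cardinality <= 1) {
      return 1; // still spend one bit per value
    }
    return 32 - Integer.numberOfLeadingZeros(cardinality - 1);
  }

  // Bytes needed to store numValues values of numBitsPerValue bits each,
  // rounded up to a whole byte. Assumption: no extra padding or alignment.
  static long bufferSizeBytes(int numValues, int numBitsPerValue) {
    long totalBits = (long) numValues * numBitsPerValue;
    return (totalBits + 7) / 8;
  }
}
```

        For example, under these assumptions a column with 257 distinct dictionaryIds needs 9 bits per value, and 10 values at 3 bits each fit in 4 bytes.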
    • Method Detail

      • readInt

        public int readInt(int index)
        Read dictionaryId for a particular docId
        Parameters:
        index - docId to get the dictionaryId for
        Returns:
        dictionaryId
      • readInt

        public void readInt(int startDocId,
                            int length,
                            int[] buffer)
        Array-based API to read dictionaryIds for a contiguous range of docIds, starting at startDocId and spanning length docIds.
        Parameters:
        startDocId - docId range start
        length - length of contiguous docId range
        buffer - out buffer to read dictionaryIds into
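
        To make the single-value and contiguous-range reads concrete, here is a minimal, self-contained sketch of fixed-bit packing. It is not the Pinot implementation: the class PackedIntsSketch is hypothetical, it packs into a long[] with a little-endian bit layout chosen for simplicity, and its range read is a plain loop rather than an unrolled or vectorized bulk read. It assumes 1 <= numBitsPerValue <= 31.

```java
// Hypothetical stand-in for a fixed-bit packed store; layout is an assumption.
final class PackedIntsSketch {
  private final long[] words;
  private final int numBitsPerValue;
  private final long mask;

  PackedIntsSketch(int numValues, int numBitsPerValue) {
    this.numBitsPerValue = numBitsPerValue;
    this.mask = (1L << numBitsPerValue) - 1;
    long totalBits = (long) numValues * numBitsPerValue;
    this.words = new long[(int) ((totalBits + 63) / 64)];
  }

  // Store value at the given index, clearing any bits previously stored there.
  void writeInt(int index, int value) {
    long bitOffset = (long) index * numBitsPerValue;
    int word = (int) (bitOffset >>> 6);
    int shift = (int) (bitOffset & 63);
    long v = value & mask;
    words[word] = (words[word] & ~(mask << shift)) | (v << shift);
    int spill = shift + numBitsPerValue - 64; // bits that cross into the next word
    if (spill > 0) {
      int low = numBitsPerValue - spill;      // bits that fit in the first word
      long highMask = (1L << spill) - 1;
      words[word + 1] = (words[word + 1] & ~highMask) | (v >>> low);
    }
  }

  // Read the value stored for a single docId.
  int readInt(int index) {
    long bitOffset = (long) index * numBitsPerValue;
    int word = (int) (bitOffset >>> 6);
    int shift = (int) (bitOffset & 63);
    long v = words[word] >>> shift;
    int spill = shift + numBitsPerValue - 64;
    if (spill > 0) {
      v |= words[word + 1] << (numBitsPerValue - spill);
    }
    return (int) (v & mask);
  }

  // Read values for docIds [startDocId, startDocId + length) into buffer[0..length).
  void readInt(int startDocId, int length, int[] buffer) {
    for (int i = 0; i < length; i++) {
      buffer[i] = readInt(startDocId + i);
    }
  }
}
```

        A caller would typically allocate the out buffer once, with at least length elements, and reuse it across calls.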
      • readValues

        public void readValues(int[] docIds,
                               int docIdStartIndex,
                               int docIdLength,
                               int[] values,
                               int valuesStartIndex)
        Array-based API to read dictionaryIds for an array of docIds that are monotonically increasing but not necessarily contiguous.

        Unlike the other array-based API, readInt(int, int, int[]), this method is given an explicit array of docIds. The docIds in docIds[] are monotonically increasing, but they may have gaps.

        PinotDataBitSetV2 implements an efficient bulk contiguous API, PinotDataBitSetV2.readInt(long, int, int[]), to read dictionaryIds for a contiguous range of docIds represented by startDocId and length. Although this method works on docIds with gaps, it still tries to use the underlying bulk contiguous API as much as possible to get the benefits of vectorization.

        For a given docIds[] array, we decide whether to use the bulk contiguous API by checking whether the length of the array is >= 50% of the actual docId range (lastDocId - firstDocId + 1). This gives only a rough idea of the gaps in the docIds. A bulk contiguous read pays off when the gaps are narrow, because fewer of the dictIds unpacked by the contiguous read have to be thrown away. If the gaps are wide, more dictIds are unpacked and discarded before the out array is constructed.

        This check is inaccurate because it depends solely on the first and last docId. Getting an exact picture of the gaps in docIds[] would require a pass through the array to compute the difference between consecutive docIds and then take the mean/stddev of those differences. That pre-processing would be expensive. To increase the probability of using the bulk contiguous API, we make this decision separately for every fixed-size chunk of the docIds[] array.
        Parameters:
        docIds - array of docIds to read the dictionaryIds for
        docIdStartIndex - start index in docIds array
        docIdLength - length to process in docIds array
        values - out array to store the dictionaryIds into
        valuesStartIndex - start index in values array
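
        The per-chunk density check described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: ChunkedReadSketch, useBulkRead, and the CHUNK_SIZE of 32 are hypothetical, and a plain int[] named column stands in for the packed store, with System.arraycopy playing the role of the bulk contiguous read.

```java
// Hypothetical sketch of the "length >= 50% of docId range" decision per chunk.
final class ChunkedReadSketch {
  // Assumed chunk size; the actual constant is not given in the description.
  static final int CHUNK_SIZE = 32;

  // True when the chunk's docIds cover at least half of [firstDocId, lastDocId].
  static boolean useBulkRead(int[] docIds, int start, int length) {
    long firstDocId = docIds[start];
    long lastDocId = docIds[start + length - 1];
    long docIdRange = lastDocId - firstDocId + 1;
    return 2L * length >= docIdRange;
  }

  // Fills values[] for the requested docIds and returns how many chunks took
  // the bulk path (returned only so the decision is observable in this sketch).
  static int readValues(int[] column, int[] docIds, int docIdStartIndex, int docIdLength,
      int[] values, int valuesStartIndex) {
    int bulkChunks = 0;
    int end = docIdStartIndex + docIdLength;
    int out = valuesStartIndex;
    for (int chunkStart = docIdStartIndex; chunkStart < end; chunkStart += CHUNK_SIZE) {
      int chunkLength = Math.min(CHUNK_SIZE, end - chunkStart);
      if (useBulkRead(docIds, chunkStart, chunkLength)) {
        // Dense chunk: unpack the whole range once, then pick the wanted docIds.
        int firstDocId = docIds[chunkStart];
        int range = docIds[chunkStart + chunkLength - 1] - firstDocId + 1;
        int[] unpacked = new int[range];
        System.arraycopy(column, firstDocId, unpacked, 0, range); // stand-in for bulk read
        for (int i = 0; i < chunkLength; i++) {
          values[out++] = unpacked[docIds[chunkStart + i] - firstDocId];
        }
        bulkChunks++;
      } else {
        // Sparse chunk: too many unpacked values would be discarded, so read one by one.
        for (int i = 0; i < chunkLength; i++) {
          values[out++] = column[docIds[chunkStart + i]];
        }
      }
    }
    return bulkChunks;
  }
}
```

        Deciding per chunk means that one wide gap only pushes its own chunk onto the single-read path, while dense chunks elsewhere in the same docIds[] array can still use the bulk read.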
      • writeInt

        public void writeInt(int index,
                             int value)
      • writeInt

        public void writeInt(int startIndex,
                             int length,
                             int[] values)