Class OnHeapStringDictionary

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Dictionary

    public class OnHeapStringDictionary
    extends BaseImmutableDictionary
    Implementation of String dictionary that caches all values on-heap.

    This is useful for String columns that:

    • Have a low-cardinality string dictionary whose on-heap memory footprint is acceptably small
    • Are heavily queried

    This helps avoid creating a String from a byte[], which is both expensive and creates garbage.
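
    The caching idea described above can be illustrated with a minimal sketch. It is not Pinot's implementation: the class name OnHeapCacheSketch is hypothetical, a plain java.nio.ByteBuffer stands in for PinotDataBuffer, and the layout (fixed-width entries, right-padded with the padding byte) is an assumption made for the example. Each entry is decoded once in the constructor, so get only returns an already-built String.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Pinot's code: decode every fixed-width, padded
// entry once and keep the resulting Strings on-heap.
final class OnHeapCacheSketch {
    private final String[] values; // decoded once, reused by every get()

    OnHeapCacheSketch(ByteBuffer data, int length, int numBytesPerValue, byte paddingByte) {
        values = new String[length];
        byte[] scratch = new byte[numBytesPerValue];
        for (int dictId = 0; dictId < length; dictId++) {
            // Absolute reads, so the caller's buffer position is left untouched.
            for (int i = 0; i < numBytesPerValue; i++) {
                scratch[i] = data.get(dictId * numBytesPerValue + i);
            }
            // Strip trailing padding bytes to recover the unpadded value.
            int end = numBytesPerValue;
            while (end > 0 && scratch[end - 1] == paddingByte) {
                end--;
            }
            values[dictId] = new String(scratch, 0, end, StandardCharsets.UTF_8);
        }
    }

    // No byte[] to String conversion here: the cached instance is returned.
    String get(int dictId) {
        return values[dictId];
    }

    public static void main(String[] args) {
        // Three 4-byte entries padded with the zero byte: "ab", "abcd", "b".
        byte[] raw = "ab\0\0abcdb\0\0\0".getBytes(StandardCharsets.UTF_8);
        OnHeapCacheSketch dict = new OnHeapCacheSketch(ByteBuffer.wrap(raw), 3, 4, (byte) 0);
        for (int dictId = 0; dictId < 3; dictId++) {
            System.out.println(dictId + " -> " + dict.get(dictId));
        }
    }
}
```

    The trade-off is the one named in the list above: every distinct value stays resident on the heap, so the approach only pays off when cardinality is low and lookups are frequent.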

    • Constructor Detail

      • OnHeapStringDictionary

        public OnHeapStringDictionary(PinotDataBuffer dataBuffer,
                                      int length,
                                      int numBytesPerValue,
                                      byte paddingByte)
    • Method Detail

      • insertionIndexOf

        public int insertionIndexOf(String stringValue)
        WARNING: With a non-zero padding byte, the binary search result might not reflect the real insertion index of the value. For example, with padding byte 'b', if the unpadded value "aa" is in the dictionary and stored as "aab", then the unpadded value "a" is mis-positioned after "aa", and a lookup of the unpadded value "aab" returns a non-negative index even though "aab" is not in the dictionary.

        TODO: Clean up the segments with the legacy non-zero padding byte, and remove support for non-zero padding bytes.
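
        The pitfall in the warning above can be reproduced with a small, self-contained sketch. Everything here is hypothetical and for illustration only: PaddingPitfallSketch, pad, and paddedSearch are not Pinot APIs, values are assumed to be ASCII, and the search is assumed to follow the java.util.Arrays.binarySearch convention (index if found, otherwise -(insertionPoint) - 1).

```java
import java.util.Arrays;

// Hypothetical demo of the non-zero padding pitfall; not Pinot's code.
final class PaddingPitfallSketch {
    // Right-pad an ASCII value to a fixed width with the padding character.
    static String pad(String value, int width, char paddingChar) {
        StringBuilder sb = new StringBuilder(value);
        while (sb.length() < width) {
            sb.append(paddingChar);
        }
        return sb.toString();
    }

    // Binary search over the padded, sorted entries using the padded key.
    static int paddedSearch(String[] sortedPadded, String value, int width, char paddingChar) {
        return Arrays.binarySearch(sortedPadded, pad(value, width, paddingChar));
    }

    public static void main(String[] args) {
        // Unpadded dictionary values "aa" and "ac", stored as "aab" and "acb".
        String[] stored = {pad("aa", 3, 'b'), pad("ac", 3, 'b')};

        // "a" is padded to "abb", which sorts after "aab", so the reported
        // insertion point is 1 although "a" really belongs at index 0.
        System.out.println(paddedSearch(stored, "a", 3, 'b'));

        // "aab" equals the stored form of "aa", so the search reports a
        // match even though "aab" is not a dictionary value.
        System.out.println(paddedSearch(stored, "aab", 3, 'b'));

        // Reference result on the unpadded values for comparison.
        System.out.println(Arrays.binarySearch(new String[] {"aa", "ac"}, "a"));
    }
}
```

        With a zero padding byte neither problem arises for values that do not themselves contain the zero byte, because padding never sorts above a real character.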
      • get

        public String get(int dictId)
      • getIntValue

        public int getIntValue(int dictId)
      • getLongValue

        public long getLongValue(int dictId)
      • getFloatValue

        public float getFloatValue(int dictId)
      • getDoubleValue

        public double getDoubleValue(int dictId)
      • getBigDecimalValue

        public BigDecimal getBigDecimalValue(int dictId)
      • getStringValue

        public String getStringValue(int dictId)
      • getBytesValue

        public byte[] getBytesValue(int dictId)