Class OffHeapJsonIndexCreator

  • All Implemented Interfaces:
    Closeable, AutoCloseable, JsonIndexCreator, IndexCreator

    public class OffHeapJsonIndexCreator
    extends BaseJsonIndexCreator
    Implementation of JsonIndexCreator that uses off-heap memory.

    The posting lists (a map from each value to the doc ids that contain it) are first accumulated in a TreeMap and flushed to a file after every 100,000 documents (unflattened records) added. After all documents have been added, the flushed posting lists are read back from the file and merged with a priority queue to compute the final posting lists. The string dictionary and inverted index are then generated from the final posting lists, and the JSON index is built on top of them.
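
    The flush-and-merge flow described above can be illustrated with a minimal sketch. It is not the actual implementation of this class: the class name PostingListMergeSketch, its methods add and seal, and the Cursor helper are hypothetical, each flushed run is kept in memory as a sorted map instead of being written to a file, and the flush threshold is a constructor parameter so that a small value can be used in place of 100,000.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Hypothetical illustration only; not part of the Pinot API.
final class PostingListMergeSketch {
  private final int flushThreshold; // assumption: stands in for the 100,000-document limit
  private final TreeMap<String, List<Integer>> buffer = new TreeMap<>();
  // Assumption: each "run" stands in for one sorted segment flushed to the file.
  private final List<TreeMap<String, List<Integer>>> runs = new ArrayList<>();
  private int docsInBuffer = 0;
  private int nextDocId = 0;

  PostingListMergeSketch(int flushThreshold) {
    this.flushThreshold = flushThreshold;
  }

  // Adds one document, given the values extracted from it.
  void add(List<String> values) {
    int docId = nextDocId++;
    for (String value : values) {
      List<Integer> docIds = buffer.computeIfAbsent(value, k -> new ArrayList<>());
      // Skip the doc id if this value already appeared in the same document.
      if (docIds.isEmpty() || docIds.get(docIds.size() - 1) != docId) {
        docIds.add(docId);
      }
    }
    if (++docsInBuffer >= flushThreshold) {
      flush();
    }
  }

  // Moves the buffered posting lists into a new sorted run and resets the buffer.
  private void flush() {
    if (!buffer.isEmpty()) {
      runs.add(new TreeMap<>(buffer));
      buffer.clear();
    }
    docsInBuffer = 0;
  }

  // Merges all runs with a priority queue (k-way merge) into the final posting lists.
  TreeMap<String, List<Integer>> seal() {
    flush();
    // Order by value first, then by run index so doc ids stay in ascending order.
    PriorityQueue<Cursor> heap = new PriorityQueue<>((a, b) -> {
      int c = a.current.getKey().compareTo(b.current.getKey());
      return c != 0 ? c : Integer.compare(a.runIndex, b.runIndex);
    });
    for (int i = 0; i < runs.size(); i++) {
      Cursor cursor = new Cursor(i, runs.get(i).entrySet().iterator());
      if (cursor.advance()) {
        heap.add(cursor);
      }
    }
    TreeMap<String, List<Integer>> merged = new TreeMap<>();
    while (!heap.isEmpty()) {
      Cursor cursor = heap.poll();
      merged.computeIfAbsent(cursor.current.getKey(), k -> new ArrayList<>())
          .addAll(cursor.current.getValue());
      if (cursor.advance()) {
        heap.add(cursor);
      }
    }
    return merged;
  }

  // Tracks the current position within one sorted run.
  private static final class Cursor {
    final int runIndex;
    final Iterator<Map.Entry<String, List<Integer>>> iterator;
    Map.Entry<String, List<Integer>> current;

    Cursor(int runIndex, Iterator<Map.Entry<String, List<Integer>>> iterator) {
      this.runIndex = runIndex;
      this.iterator = iterator;
    }

    boolean advance() {
      if (!iterator.hasNext()) {
        return false;
      }
      current = iterator.next();
      return true;
    }
  }
}
```

    In this sketch, breaking ties on the run index keeps the doc ids of each merged posting list in ascending order, because earlier runs always hold smaller doc ids than later ones.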

    The off-heap creator uses less heap memory, but it is more computationally expensive and must flush data to disk, which can slow down index creation because of I/O latency. Use the off-heap creator in environments where heap memory is limited or garbage collection can cause performance issues (e.g. index creation at load time on a Pinot Server).