Class RangeIndexCreator

  • All Implemented Interfaces:
    Closeable, AutoCloseable, CombinedInvertedIndexCreator, DictionaryBasedInvertedIndexCreator, InvertedIndexCreator, RawValueBasedInvertedIndexCreator

    public final class RangeIndexCreator
    extends Object
    implements CombinedInvertedIndexCreator
    Range index creator that uses off-heap memory.

    The range index is created in two passes.

    • In the first pass (adding values phase), each call to add() stores the raw values in valueBuffer (the values of a multi-valued column are flattened). The corresponding docId is stored in docIdBuffer, which is sorted in the next phase by the values in valueBuffer.
    • In the second pass (processing values phase), seal() sorts docIdBuffer by the values in valueBuffer, then iterates over the sorted docIdBuffer and creates ranges such that each range comprises _numDocsPerRange entries.
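
The two passes can be illustrated with a small on-heap sketch. This is not the class's actual code: the class name `RangeIndexSketch` and the helpers `sortDocIdsByValue` and `buildRanges` are hypothetical, the real creator works on off-heap buffers, and the rule for closing a range (only at a value boundary, once the range holds at least `numValuesPerRange` values) is an assumption that happens to be consistent with the sample output documented for seal().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical.
class RangeIndexSketch {

  // Pass 2, step 1: order docIds by the value each document holds.
  // For a single-valued column, position i in the value buffer belongs to docId i.
  static Integer[] sortDocIdsByValue(int[] values) {
    Integer[] docIds = new Integer[values.length];
    for (int i = 0; i < docIds.length; i++) {
      docIds[i] = i;
    }
    Arrays.sort(docIds, Comparator.comparingInt(docId -> values[docId]));
    return docIds;
  }

  // Pass 2, step 2: walk the sorted values and emit inclusive [start, end] offsets.
  // Assumption: a range is closed only where the value changes and only once it
  // already holds at least numValuesPerRange values.
  static List<int[]> buildRanges(int[] sortedValues, int numValuesPerRange) {
    List<int[]> ranges = new ArrayList<>();
    if (sortedValues.length == 0) {
      return ranges;
    }
    int start = 0;
    for (int i = 1; i < sortedValues.length; i++) {
      boolean enoughValues = i - start >= numValuesPerRange;
      boolean valueChanged = sortedValues[i] != sortedValues[i - 1];
      if (enoughValues && valueChanged) {
        ranges.add(new int[] {start, i - 1});
        start = i;
      }
    }
    ranges.add(new int[] {start, sortedValues.length - 1});
    return ranges;
  }

  public static void main(String[] args) {
    // Pass 1: the values as they were added, one per docId (same data as the seal() sample).
    int[] values = {3, 0, 0, 0, 3, 1, 3, 0, 2, 4, 4, 2, 4, 3, 2, 1, 0, 2, 0, 3};

    Integer[] sortedDocIds = sortDocIdsByValue(values);
    int[] sortedValues = new int[values.length];
    for (int i = 0; i < sortedValues.length; i++) {
      sortedValues[i] = values[sortedDocIds[i]];
    }

    // numValuesPerRange = 2 is an arbitrary choice for this example.
    for (int[] range : buildRanges(sortedValues, 2)) {
      System.out.println("(" + range[0] + "," + range[1] + ") values "
          + sortedValues[range[0]] + ".." + sortedValues[range[1]]);
    }
  }
}
```

Closing a range only at a value boundary keeps all documents with the same value in one range, so a lookup for a single value never has to consult two ranges. The order of docIds within a run of equal values depends on the sort used and may differ from the sample log.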
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static int VERSION  
    • Method Summary

      Modifier and Type Method Description
      void add(double value)  
      void add(double[] values, int length)  
      void add(float value)  
      void add(float[] values, int length)  
      void add(int value)  
      void add(int[] values, int length)  
      void add(long value)  
      void add(long[] values, int length)  
      void close()  
      int getNumValuesPerRange()  
      void seal()
      Generates the range index file.

      Sample output from running RangeIndexCreatorTest with TRACE=true and the log level in log4.xml (in core) changed to info:

        15:18:47.330 RangeIndexCreator - Before sorting
        15:18:47.333 RangeIndexCreator - DocIdBuffer [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ]
        15:18:47.333 RangeIndexCreator - ValueBuffer [ 3, 0, 0, 0, 3, 1, 3, 0, 2, 4, 4, 2, 4, 3, 2, 1, 0, 2, 0, 3, ]
        15:18:47.371 RangeIndexCreator - After sorting
        15:18:47.371 RangeIndexCreator - DocIdBuffer [ 16, 3, 1, 2, 7, 18, 15, 5, 14, 8, 17, 11, 0, 4, 6, 13, 19, 10, 9, 12, ]
        15:18:47.371 RangeIndexCreator - ValueBuffer [ 0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, ]
        15:18:47.372 RangeIndexCreator - rangeOffsets = [ (0,5) ,(6,7) ,(8,11) ,(12,16) ,(17,19) , ]
        15:18:47.372 RangeIndexCreator - rangeValues = [ (0,0) ,(1,1) ,(2,2) ,(3,3) ,(4,4) , ]
    • Constructor Detail

      • RangeIndexCreator

        public RangeIndexCreator(File indexDir,
                                 FieldSpec fieldSpec,
                                 FieldSpec.DataType valueType,
                                 int numRanges,
                                 int numValuesPerRange,
                                 int numDocs,
                                 int numValues)
                          throws IOException
        Parameters:
        indexDir - destination of the range index file
        fieldSpec - FieldSpec of the column for which the range index is generated
        valueType - DataType of the column: INT if dictionary-encoded; INT, FLOAT, LONG, or DOUBLE if raw-encoded
        numRanges - number of ranges; DEFAULT_NUM_RANGES is used if not configured (<= 0)
        numValuesPerRange - number of values per range; calculated from numRanges if not configured (<= 0)
        numDocs - total number of documents
        numValues - total number of values; relevant for multi-value columns (for single-value columns, numDocs == numValues)
        Throws:
        IOException
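
The fallback rules for numRanges and numValuesPerRange described above can be sketched as follows. This is only an illustration: the class `RangeParamsSketch`, the method `resolveNumValuesPerRange`, the placeholder value 20 for DEFAULT_NUM_RANGES, and the ceiling-division formula are all assumptions, since this page states only that the value is "calculated from numRanges".

```java
// Illustrative sketch only; names, the default value, and the formula are assumptions.
class RangeParamsSketch {

  // Placeholder; this page does not state the real DEFAULT_NUM_RANGES value.
  static final int DEFAULT_NUM_RANGES = 20;

  // Resolves the effective number of values per range from the constructor arguments.
  static int resolveNumValuesPerRange(int numRanges, int numValuesPerRange, int numValues) {
    if (numValuesPerRange > 0) {
      // An explicitly configured value takes precedence.
      return numValuesPerRange;
    }
    // numRanges <= 0 means "not configured", so fall back to the default.
    int effectiveNumRanges = numRanges > 0 ? numRanges : DEFAULT_NUM_RANGES;
    // Assumed formula: spread the values evenly, rounding up, with at least one per range.
    int perRange = (numValues + effectiveNumRanges - 1) / effectiveNumRanges;
    return Math.max(1, perRange);
  }

  public static void main(String[] args) {
    System.out.println(resolveNumValuesPerRange(0, 0, 100)); // default ranges, derived size
    System.out.println(resolveNumValuesPerRange(4, 0, 10));  // configured ranges, derived size
    System.out.println(resolveNumValuesPerRange(4, 7, 10));  // configured size wins
  }
}
```

For a single-value column, numValues passed here would equal numDocs; for a multi-value column it is the total count of flattened values.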