Class LuceneDocIdCollector

  • All Implemented Interfaces:
    org.apache.lucene.search.Collector

    public class LuceneDocIdCollector
    extends Object
    implements org.apache.lucene.search.Collector
    A simple collector that bypasses the heap-heavy result collection Lucene performs by default. Lucene normally creates a TopScoreDocCollector, a TopDocsCollector subclass that maintains the top results in a PriorityQueue. Our heap usage experiments (see the design doc) showed that this contributes substantially to heap usage, even though we currently need neither scoring nor top-document collection. Each time Lucene finds a document matching the text search query, it invokes a callback on this collector, which simply records the matching document's docID. The docIDs are stored in a bitmap that is traversed later, for example during doc ID iteration.
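
The collect-into-a-bitmap idea described above can be sketched without any Lucene or RoaringBitmap dependency. This is an illustrative sketch only, not the Pinot implementation: `HitCallback` and `DocIdCollectorSketch` are hypothetical names, `java.util.BitSet` stands in for `MutableRoaringBitmap`, and an `IntUnaryOperator` stands in for `DocIdTranslator`, on the assumption that the translator maps a Lucene doc ID to the doc ID stored in the bitmap.

```java
import java.util.BitSet;
import java.util.function.IntUnaryOperator;

/** Hypothetical stand-in for a per-hit callback; this is NOT Lucene's LeafCollector. */
interface HitCallback {
    void collect(int luceneDocId);
}

/** Sketch: records each matching doc ID in a bitmap, with no scoring and no priority queue. */
final class DocIdCollectorSketch implements HitCallback {
    private final BitSet docIds;               // stand-in for MutableRoaringBitmap
    private final IntUnaryOperator translator; // stand-in for DocIdTranslator (assumed mapping)

    DocIdCollectorSketch(BitSet docIds, IntUnaryOperator translator) {
        this.docIds = docIds;
        this.translator = translator;
    }

    @Override
    public void collect(int luceneDocId) {
        // One bit per matching document; duplicates and arrival order do not matter.
        docIds.set(translator.applyAsInt(luceneDocId));
    }

    /** Simulates a search that reports hits 7, 2, 7 and 5, translated by a made-up +100 offset. */
    static BitSet demo() {
        BitSet matches = new BitSet();
        DocIdCollectorSketch collector = new DocIdCollectorSketch(matches, id -> id + 100);
        for (int hit : new int[] {7, 2, 7, 5}) {
            collector.collect(hit);
        }
        return matches;
    }
}
```

Calling `DocIdCollectorSketch.demo()` yields a bitmap containing 102, 105 and 107. The caller owns the bitmap and iterates it after the search completes, which mirrors how the constructor below receives the bitmap as an argument rather than creating it.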
    • Constructor Summary

      Constructors 
      Constructor Description
      LuceneDocIdCollector(org.roaringbitmap.buffer.MutableRoaringBitmap docIds, org.apache.pinot.segment.local.segment.index.readers.text.LuceneTextIndexReader.DocIdTranslator docIdTranslator)
    • Constructor Detail

      • LuceneDocIdCollector

        public LuceneDocIdCollector(org.roaringbitmap.buffer.MutableRoaringBitmap docIds,
                                    org.apache.pinot.segment.local.segment.index.readers.text.LuceneTextIndexReader.DocIdTranslator docIdTranslator)
    • Method Detail

      • scoreMode

        public org.apache.lucene.search.ScoreMode scoreMode()
        Specified by:
        scoreMode in interface org.apache.lucene.search.Collector
      • getLeafCollector

        public org.apache.lucene.search.LeafCollector getLeafCollector(org.apache.lucene.index.LeafReaderContext context)
        Specified by:
        getLeafCollector in interface org.apache.lucene.search.Collector