Class EarlyTerminatingSortingCollector
- java.lang.Object
  - org.apache.lucene.search.Collector
    - org.apache.lucene.index.sorter.EarlyTerminatingSortingCollector
public class EarlyTerminatingSortingCollector extends Collector
A Collector that early terminates collection of documents on a per-segment basis, if the segment was sorted according to the given Sorter.
NOTE: the Collector detects sorted segments according to SortingMergePolicy, so it's best used in conjunction with it. Also, it collects up to a specified number of docs from each segment, and is therefore mostly suitable for use in conjunction with collectors such as TopDocsCollector, and not e.g. TotalHitCountCollector.
NOTE: If you wrap a TopDocsCollector that sorts in the same order as the index order, the returned TopDocsCollector.topDocs() will be correct. However, the total hit count will be underestimated since not all matching documents will have been collected.
NOTE: This Collector uses Sorter.getID() to detect whether a segment was sorted with the same Sorter as the one given in EarlyTerminatingSortingCollector(Collector, Sorter, int). This has two implications:
- if Sorter.getID() is not implemented correctly and returns different identifiers for equivalent Sorters, this collector will not detect sorted segments,
- if you suddenly change the IndexWriter's SortingMergePolicy to sort according to another criterion and if both the old and the new Sorters have the same identifier, this Collector will incorrectly detect sorted segments.
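The trade-off described in the notes above (correct top hits, underestimated hit count, detection by sorter identifier) can be illustrated with a small stand-alone model. This is only a sketch and does not use any Lucene classes: segments are plain int arrays in index order, every document is assumed to match, and `EarlyTerminationModel`, `SegmentModel`, `Result`, and `search` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Simplified model of per-segment early termination; NOT Lucene code. */
class EarlyTerminationModel {

    /** Hypothetical segment: values in index order, plus the ID of the sorter that ordered it (null if unsorted). */
    static final class SegmentModel {
        final int[] values;
        final String sorterId;

        SegmentModel(int[] values, String sorterId) {
            this.values = values;
            this.sorterId = sorterId;
        }
    }

    /** The smallest collected values and the number of documents actually collected. */
    static final class Result {
        final List<Integer> top;
        final int collected;

        Result(List<Integer> top, int collected) {
            this.top = top;
            this.collected = collected;
        }
    }

    static Result search(List<SegmentModel> segments, String sorterId, int numDocsToCollect) {
        List<Integer> all = new ArrayList<>();
        int collected = 0;
        for (SegmentModel seg : segments) {
            // Early termination is only applied when the segment's sorter ID
            // matches the ID of the sorter passed to the search.
            boolean sorted = sorterId.equals(seg.sorterId);
            int inSegment = 0;
            for (int value : seg.values) {
                all.add(value);
                collected++;
                inSegment++;
                if (sorted && inSegment >= numDocsToCollect) {
                    break; // skip the rest of this sorted segment
                }
            }
        }
        Collections.sort(all);
        List<Integer> top = new ArrayList<>(all.subList(0, Math.min(numDocsToCollect, all.size())));
        return new Result(top, collected);
    }

    public static void main(String[] args) {
        List<SegmentModel> segments = Arrays.asList(
                new SegmentModel(new int[] {1, 4, 7, 9}, "by-value"),
                new SegmentModel(new int[] {2, 3, 8}, "by-value"),
                new SegmentModel(new int[] {6, 5, 0}, null)); // unsorted segment
        Result r = search(segments, "by-value", 2);
        System.out.println("top=" + r.top + " collected=" + r.collected + " of 10 matching docs");
    }
}
```

In this model the two sorted segments stop after two documents each, while the unsorted segment is read in full, so the two smallest values are still found but only 7 of the 10 matching documents are counted. Passing a sorter ID that matches no segment disables early termination entirely, which mirrors the first implication listed above.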
Constructor Summary
Constructors
- EarlyTerminatingSortingCollector(Collector in, Sorter sorter, int numDocsToCollect): Create a new EarlyTerminatingSortingCollector instance.
Method Summary
Modifier and Type, Method: Description
- boolean acceptsDocsOutOfOrder(): Return true if this collector does not require the matching docIDs to be delivered in int sort order (smallest to largest) to Collector.collect(int).
- void collect(int doc): Called once for every document matching a query, with the unbased document number.
- void setNextReader(AtomicReaderContext context): Called before collecting from each AtomicReaderContext.
- void setScorer(Scorer scorer): Called before successive calls to Collector.collect(int).
Constructor Detail
EarlyTerminatingSortingCollector
public EarlyTerminatingSortingCollector(Collector in, Sorter sorter, int numDocsToCollect)
Create a new EarlyTerminatingSortingCollector instance.
Parameters:
in - the collector to wrap
sorter - the same sorter as the one which is used by IndexWriter's SortingMergePolicy
numDocsToCollect - the number of documents to collect on each segment. When wrapping a TopDocsCollector, this number should be the number of hits.
Method Detail
setScorer
public void setScorer(Scorer scorer) throws IOException
Description copied from class: Collector
Called before successive calls to Collector.collect(int). Implementations that need the score of the current document (passed in to Collector.collect(int)) should save the passed-in Scorer and call scorer.score() when needed.
Specified by:
setScorer in class Collector
Throws:
IOException
collect
public void collect(int doc) throws IOException
Description copied from class: Collector
Called once for every document matching a query, with the unbased document number.
Note: The collection of the current segment can be terminated by throwing a CollectionTerminatedException. In this case, the last docs of the current AtomicReaderContext will be skipped and IndexSearcher will swallow the exception and continue collection with the next leaf.
Note: This is called in an inner search loop. For good search performance, implementations of this method should not call IndexSearcher.doc(int) or IndexReader.document(int) on every hit. Doing so can slow searches by an order of magnitude or more.
Specified by:
collect in class Collector
Throws:
IOException
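The exception-based control flow described in the first note can be sketched without Lucene as follows. Everything here is a hypothetical stand-in: `SegmentTerminated` plays the role of CollectionTerminatedException, `LeafCollector` is a reduced collector interface, and `search` models only the part of the search loop that swallows the exception and moves on to the next segment.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of terminating collection per segment; NOT Lucene code. */
class TerminationFlowModel {

    /** Hypothetical stand-in for CollectionTerminatedException. */
    static final class SegmentTerminated extends RuntimeException {
    }

    /** Hypothetical, reduced collector interface. */
    interface LeafCollector {
        void nextSegment(int segmentIndex);

        void collect(int doc);
    }

    /** Records at most maxPerSegment documents per segment, then signals termination. */
    static final class LimitingCollector implements LeafCollector {
        final List<String> seen = new ArrayList<>();
        private final int maxPerSegment;
        private int segment;
        private int countInSegment;

        LimitingCollector(int maxPerSegment) {
            this.maxPerSegment = maxPerSegment;
        }

        @Override
        public void nextSegment(int segmentIndex) {
            segment = segmentIndex;
            countInSegment = 0; // the limit applies to each segment separately
        }

        @Override
        public void collect(int doc) {
            seen.add(segment + ":" + doc);
            if (++countInSegment >= maxPerSegment) {
                throw new SegmentTerminated();
            }
        }
    }

    /** Models the search loop: each segment has docs 0..size-1, all of which match. */
    static void search(int[] docsPerSegment, LeafCollector collector) {
        for (int seg = 0; seg < docsPerSegment.length; seg++) {
            collector.nextSegment(seg);
            try {
                for (int doc = 0; doc < docsPerSegment[seg]; doc++) {
                    collector.collect(doc);
                }
            } catch (SegmentTerminated e) {
                // Swallow the exception: the remaining docs of this segment
                // are skipped and collection continues with the next segment.
            }
        }
    }

    public static void main(String[] args) {
        LimitingCollector collector = new LimitingCollector(2);
        search(new int[] {5, 1, 3}, collector);
        System.out.println(collector.seen);
    }
}
```

With segment sizes 5, 1, and 3 and a limit of 2, the model records two documents from the first and third segments and the single document of the second, which never reaches the limit and so never throws.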
setNextReader
public void setNextReader(AtomicReaderContext context) throws IOException
Description copied from class: Collector
Called before collecting from each AtomicReaderContext. All doc ids in Collector.collect(int) will correspond to IndexReaderContext.reader(). Add AtomicReaderContext.docBase to the current IndexReaderContext.reader()'s internal document id to re-base ids in Collector.collect(int).
Specified by:
setNextReader in class Collector
Parameters:
context - next atomic reader context
Throws:
IOException
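The re-basing arithmetic mentioned above is a single addition. The sketch below models it with plain ints rather than Lucene readers; `DocBaseModel`, `docBases`, and `globalDocId` are hypothetical names, and the model assumes each segment's base is the total size of the segments before it.

```java
/** Simplified model of re-basing segment-local doc ids; NOT Lucene code. */
class DocBaseModel {

    /** Assumed layout: the base of a segment is the sum of the sizes of all preceding segments. */
    static int[] docBases(int[] segmentSizes) {
        int[] bases = new int[segmentSizes.length];
        int next = 0;
        for (int i = 0; i < segmentSizes.length; i++) {
            bases[i] = next;
            next += segmentSizes[i];
        }
        return bases;
    }

    /** Re-bases a segment-local doc id into an index-wide doc id. */
    static int globalDocId(int docBase, int segmentDoc) {
        return docBase + segmentDoc;
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {4, 3, 5});
        // Segment-local doc 1 in the third segment (base 7).
        System.out.println(globalDocId(bases[2], 1));
    }
}
```

A collector that needs index-wide ids would typically remember the base it receives in setNextReader and add it to every id passed to collect.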
acceptsDocsOutOfOrder
public boolean acceptsDocsOutOfOrder()
Description copied from class: Collector
Return true if this collector does not require the matching docIDs to be delivered in int sort order (smallest to largest) to Collector.collect(int).
Most Lucene Query implementations will visit matching docIDs in order. However, some queries (currently limited to certain cases of BooleanQuery) can achieve faster searching if the Collector allows them to deliver the docIDs out of order.
Many collectors don't mind getting docIDs out of order, so it's important to return true here.
Specified by:
acceptsDocsOutOfOrder in class Collector