public class EarlyTerminatingSortingCollector extends Collector
A Collector that terminates collection of documents early on a
per-segment basis if the segment was sorted according to the given
Sorter.
NOTE: the Collector detects sorted segments according to
SortingMergePolicy, so it's best used in conjunction with it. Also,
it collects up to a specified number of documents from each segment, and therefore is
mostly suitable for use in conjunction with collectors such as
TopDocsCollector, and not e.g. TotalHitCountCollector.
NOTE: If you wrap a TopDocsCollector that sorts in the same
order as the index order, the returned TopDocsCollector.topDocs()
will be correct. However, the total hit count will be underestimated, since not all matching documents will have
been collected.
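As a rough illustration of the two notes above, the following plain-Java sketch (not Lucene code) models why the top hits stay correct while the hit count is underestimated. The class EarlyTerminationSketch, its Result type and the search method are hypothetical names introduced here, and each int[] is assumed to stand in for a segment whose matching documents are already stored in the desired sort order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model, not Lucene code: each int[] is one "segment" whose
// matching documents are already in the desired order (smallest value first).
class EarlyTerminationSketch {

    /** Outcome of the simulated search. */
    static final class Result {
        final List<Integer> top;   // best values across all segments
        final int collectedHits;   // documents actually collected
        Result(List<Integer> top, int collectedHits) {
            this.top = top;
            this.collectedHits = collectedHits;
        }
    }

    /** Collects at most numDocsToCollect entries per segment, then merges them. */
    static Result search(int[][] segments, int numDocsToCollect) {
        List<Integer> collected = new ArrayList<>();
        for (int[] segment : segments) {
            // Early termination: stop after the first numDocsToCollect docs of this segment.
            int limit = Math.min(numDocsToCollect, segment.length);
            for (int i = 0; i < limit; i++) {
                collected.add(segment[i]);
            }
        }
        int collectedHits = collected.size();
        // Merge step, playing the role of the wrapped top-docs collector.
        Collections.sort(collected);
        int n = Math.min(numDocsToCollect, collected.size());
        return new Result(new ArrayList<>(collected.subList(0, n)), collectedHits);
    }

    public static void main(String[] args) {
        int[][] segments = { {1, 4, 7, 9}, {2, 3, 8}, {5, 6} }; // 9 matching docs in total
        Result r = search(segments, 2);
        System.out.println("top = " + r.top);
        System.out.println("collected = " + r.collectedHits);
    }
}
```

With the sample data in main, the sketch still finds the two smallest values overall, but it collects only 6 of the 9 matching documents, which is the underestimation the note describes.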
NOTE: This Collector uses Sorter.getID() to detect
whether a segment was sorted with the same Sorter as the one given in
EarlyTerminatingSortingCollector(Collector, Sorter, int). This has
two implications:
- if Sorter.getID() is not implemented correctly and returns
different identifiers for equivalent Sorters, this collector will not
detect sorted segments,
- if you change the IndexWriter's
SortingMergePolicy to sort according to another criterion and if both
the old and the new Sorters have the same identifier, this
Collector will incorrectly detect sorted segments.

| Constructor and Description |
|---|
| EarlyTerminatingSortingCollector(Collector in, Sorter sorter, int numDocsToCollect) Create a new EarlyTerminatingSortingCollector instance. |

| Modifier and Type | Method and Description |
|---|---|
| boolean | acceptsDocsOutOfOrder() Return true if this collector does not require the matching docIDs to be delivered in int sort order (smallest to largest) to Collector.collect(int). |
| void | collect(int doc) Called once for every document matching a query, with the unbased document number. |
| void | setNextReader(AtomicReaderContext context) Called before collecting from each AtomicReaderContext. |
| void | setScorer(Scorer scorer) Called before successive calls to Collector.collect(int). |

public EarlyTerminatingSortingCollector(Collector in, Sorter sorter, int numDocsToCollect)
Create a new EarlyTerminatingSortingCollector instance.
Parameters:
in - the collector to wrap
sorter - the same sorter as the one which is used by IndexWriter's SortingMergePolicy
numDocsToCollect - the number of documents to collect on each segment. When wrapping
a TopDocsCollector, this number should be the number of hits.

public void setScorer(Scorer scorer) throws IOException
Description copied from class: Collector
Called before successive calls to Collector.collect(int). Implementations
that need the score of the current document (passed in to
Collector.collect(int)) should save the passed-in Scorer and call
scorer.score() when needed.
Specified by:
setScorer in class Collector
Throws:
IOException

public void collect(int doc) throws IOException
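Description copied from class: Collector
Called once for every document matching a query, with the unbased document number.
Note: The collection of the current segment can be terminated by throwing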
a CollectionTerminatedException. In this case, the last docs of the
current AtomicReaderContext will be skipped and IndexSearcher
will swallow the exception and continue collection with the next leaf.
Note: This is called in an inner search loop. For good search performance,
implementations of this method should not call IndexSearcher.doc(int) or
IndexReader.document(int) on every hit.
Doing so can slow searches by an order of magnitude or more.
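The termination note above can be sketched in plain Java without Lucene. Everything in this block is a hypothetical stand-in: SegmentTerminated plays the role of CollectionTerminatedException, CappedCollector plays the role of a collector that stops after a fixed number of documents per segment, and search plays the role of the searcher loop. It assumes every document in every segment matches the query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT the Lucene classes.
class TerminationFlowSketch {

    /** Signals "stop collecting the current segment only". */
    static final class SegmentTerminated extends RuntimeException {}

    /** Collects at most maxPerSegment docs per segment, then signals termination. */
    static final class CappedCollector {
        final int maxPerSegment;
        final List<Integer> globalDocIds = new ArrayList<>();
        private int docBase;
        private int collectedInSegment;

        CappedCollector(int maxPerSegment) {
            this.maxPerSegment = maxPerSegment;
        }

        // Analogous to setNextReader: reset per-segment state, remember the doc base.
        void startSegment(int docBase) {
            this.docBase = docBase;
            this.collectedInSegment = 0;
        }

        // Analogous to collect(int): doc is segment-local ("unbased").
        void collect(int doc) {
            globalDocIds.add(docBase + doc); // re-base to an index-wide id
            if (++collectedInSegment >= maxPerSegment) {
                throw new SegmentTerminated();
            }
        }
    }

    /** Searcher-like loop: swallow the exception and continue with the next segment. */
    static List<Integer> search(int[] segmentSizes, int maxPerSegment) {
        CappedCollector collector = new CappedCollector(maxPerSegment);
        int docBase = 0;
        for (int size : segmentSizes) {
            collector.startSegment(docBase);
            try {
                for (int doc = 0; doc < size; doc++) {
                    collector.collect(doc);
                }
            } catch (SegmentTerminated e) {
                // The remaining docs of this segment are skipped on purpose.
            }
            docBase += size;
        }
        return collector.globalDocIds;
    }

    public static void main(String[] args) {
        // Three segments with 5, 1 and 4 matching docs, at most 2 collected per segment.
        System.out.println(search(new int[] {5, 1, 4}, 2));
    }
}
```

In this sketch the exception ends only the current segment: the second segment is still visited even though the first one was cut short, and a segment with fewer documents than the cap simply runs to completion.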
Specified by:
collect in class Collector
Throws:
IOException

public void setNextReader(AtomicReaderContext context) throws IOException
Description copied from class: Collector
Called before collecting from each AtomicReaderContext. All doc ids in
Collector.collect(int) will correspond to IndexReaderContext.reader().
Add AtomicReaderContext.docBase to the current IndexReaderContext.reader()'s
internal document id to re-base ids in Collector.collect(int).
Specified by:
setNextReader in class Collector
Parameters:
context - next atomic reader context
Throws:
IOException

public boolean acceptsDocsOutOfOrder()
Description copied from class: Collector
Return true if this collector does not
require the matching docIDs to be delivered in int sort
order (smallest to largest) to Collector.collect(int).
Most Lucene Query implementations will visit
matching docIDs in order. However, some queries
(currently limited to certain cases of BooleanQuery) can achieve faster searching if the
Collector allows them to deliver the
docIDs out of order.
Many collectors don't mind getting docIDs out of
order, so it's important to return true
here.
Specified by:
acceptsDocsOutOfOrder in class Collector

Copyright © 2010 - 2020 Adobe. All Rights Reserved