java.lang.Object
it.unimi.dsi.fastutil.ints.AbstractIntIterator
it.unimi.dsi.mg4j.index.AbstractIndexIterator
it.unimi.dsi.mg4j.index.BitStreamIndexReader.BitStreamIndexReaderIndexIterator
protected static final class BitStreamIndexReader.BitStreamIndexReaderIndexIterator
| Field Summary | |
|---|---|
| protected int | b - The parameter b for Golomb coding of pointers. |
| protected int | count - The current count (if this index contains counts). |
| protected CompressionFlags.Coding | countCoding - The cached copy of index.countCoding. |
| protected int | currentDocument - The last document pointer we read from the current list, -1 if we just read the frequency, DocumentIterator.END_OF_LIST if we are beyond the end of the list. |
| protected int | currentTerm - The current term. |
| protected int | frequency - The current frequency. |
| protected boolean | hasCounts - The cached copy of index.hasCounts. |
| protected boolean | hasPayloads - The cached copy of index.hasPayloads. |
| protected boolean | hasPointers - Whether the current term has pointers at all (this happens when the frequency is smaller than the number of documents). |
| protected boolean | hasPositions - The cached copy of index.hasPositions. |
| protected boolean | hasSkips - Whether the underlying index has skips. |
| int | height - The parameter h (the maximum height of a skip tower). |
| protected InputBitStream | ibs - The underlying input bit stream. |
| protected BitStreamIndex | index - The reference index. |
| protected int | log2b - The parameter log2b for Golomb coding of pointers; it is the most significant bit of b. |
| protected int | numberOfDocumentRecord - The number of the document record we are going to read inside the current inverted list. |
| protected Payload | payload - The payload, in case the index of this reader has payloads, or null. |
| protected CompressionFlags.Coding | pointerCoding - The cached copy of index.pointerCoding. |
| protected int[] | positionCache - The cached position array. |
| protected CompressionFlags.Coding | positionCoding - The cached copy of index.positionCoding. |
| int | quantum - The quantum. |
| int | quantumDivisionShift - The shift giving the result of the division by quantum. |
| int | quantumModuloMask - The bit mask giving the remainder of the division by quantum. |
| protected int | state - This variable tracks the current state of the reader. |

| Fields inherited from class it.unimi.dsi.mg4j.index.AbstractIndexIterator |
|---|
| id, term, weight |

| Fields inherited from interface it.unimi.dsi.mg4j.search.DocumentIterator |
|---|
| END_OF_LIST |

| Constructor Summary | |
|---|---|
| BitStreamIndexReader.BitStreamIndexReaderIndexIterator(BitStreamIndexReader parent, InputBitStream ibs) | |

| Method Summary | |
|---|---|
| protected IndexIterator | advance() |
| int | count() - Returns the count, that is, the number of occurrences of the term in the current document. |
| void | dispose() - Disposes this document iterator, releasing all resources. |
| int | document() - Returns the last document returned by DocumentIterator.nextDocument(). |
| int | frequency() - Returns the frequency, that is, the number of documents that will be returned by this iterator. |
| boolean | hasNext() |
| Index | index() - Returns the index over which this iterator is built. |
| ReferenceSet<Index> | indices() - Returns the set of indices over which this iterator is built. |
| IntervalIterator | intervalIterator() - Returns the interval iterator of this document iterator for single-index queries. |
| IntervalIterator | intervalIterator(Index index) - Returns the interval iterator of this document iterator for the given index. |
| Reference2ReferenceMap<Index,IntervalIterator> | intervalIterators() - Returns an unmodifiable map from indices to interval iterators. |
| int | nextDocument() - Returns the next document provided by this document iterator, or -1 if no more documents are available. |
| int | nextInt() - Returns the next document. |
| Payload | payload() - Returns the payload, if any, associated with the current document. |
| protected void | position(int term) - Positions the index on the inverted list of a given term. |
| int[] | positionArray() - Returns the positions at which the term appears in the current document in an array. |
| IntIterator | positions() - Returns the positions at which the term appears in the current document. |
| int | positions(int[] position) - Stores the positions at which the term appears in the current document in a given array. |
| int | skipTo(int p) - Skips all documents smaller than n. |
| int | termNumber() - Returns the number of the term whose inverted list is returned by this index iterator. |
| String | toString() |
| protected void | updatePositionCache() - Reads positions, assuming state <= BEFORE_POSITIONS. |

| Methods inherited from class it.unimi.dsi.mg4j.index.AbstractIndexIterator |
|---|
| accept, acceptOnTruePaths, id, id, iterator, term, term, weight, weight |

| Methods inherited from class it.unimi.dsi.fastutil.ints.AbstractIntIterator |
|---|
| next, remove, skip |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |

| Methods inherited from interface it.unimi.dsi.mg4j.index.IndexIterator |
|---|
| id, id, term, term, weight |

| Methods inherited from interface it.unimi.dsi.mg4j.search.DocumentIterator |
|---|
| accept, acceptOnTruePaths, iterator, weight |

| Methods inherited from interface it.unimi.dsi.fastutil.ints.IntIterator |
|---|
| skip |

| Methods inherited from interface java.util.Iterator |
|---|
| next, remove |

| Field Detail |
|---|
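protected final BitStreamIndex index
The reference index.

protected final InputBitStream ibs
The underlying input bit stream.

protected final boolean hasPositions
The cached copy of index.hasPositions.

protected final boolean hasCounts
The cached copy of index.hasCounts.

protected final boolean hasPayloads
The cached copy of index.hasPayloads.

protected final boolean hasSkips
Whether the underlying index has skips.

protected final CompressionFlags.Coding pointerCoding
The cached copy of index.pointerCoding.

protected final CompressionFlags.Coding countCoding
The cached copy of index.countCoding.

protected final CompressionFlags.Coding positionCoding
The cached copy of index.positionCoding.

protected final Payload payload
The payload, in case the index of this reader has payloads, or null.

protected int b
The parameter b for Golomb coding of pointers.

protected int log2b
The parameter log2b for Golomb coding of pointers; it is the most significant bit of b.

protected int currentTerm
The current term.

protected int frequency
The current frequency.

protected boolean hasPointers
Whether the current term has pointers at all (this happens when the frequency is smaller than the number of documents).

protected int count
The current count (if this index contains counts).

protected int currentDocument
The last document pointer we read from the current list, -1 if we just read the frequency, DocumentIterator.END_OF_LIST if we are beyond the end of the list.

protected int numberOfDocumentRecord
The number of the document record we are going to read inside the current inverted list.

protected int state
This variable tracks the current state of the reader.

public final int height
The parameter h (the maximum height of a skip tower).

public int quantum
The quantum.

public int quantumModuloMask
The bit mask giving the remainder of the division by quantum.

public int quantumDivisionShift
The shift giving the result of the division by quantum.

protected int[] positionCache
The cached position array.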
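
The descriptions of quantumDivisionShift, quantumModuloMask and log2b suggest that division and remainder by the quantum are done with bit operations. The following is a minimal sketch of that idea, assuming the quantum is a power of two (the page does not state this explicitly). The class QuantumArithmetic and its methods are hypothetical names introduced for illustration and are not part of MG4J.

```java
// Hypothetical sketch: none of these names come from MG4J itself.
final class QuantumArithmetic {
    final int quantum;              // assumed to be a positive power of two
    final int quantumDivisionShift; // log2(quantum)
    final int quantumModuloMask;    // quantum - 1

    QuantumArithmetic(int quantum) {
        if (quantum <= 0 || (quantum & (quantum - 1)) != 0)
            throw new IllegalArgumentException("quantum must be a positive power of two: " + quantum);
        this.quantum = quantum;
        this.quantumDivisionShift = Integer.numberOfTrailingZeros(quantum);
        this.quantumModuloMask = quantum - 1;
    }

    /** Equivalent to x / quantum for non-negative x. */
    int divide(int x) {
        return x >>> quantumDivisionShift;
    }

    /** Equivalent to x % quantum for non-negative x. */
    int remainder(int x) {
        return x & quantumModuloMask;
    }

    /** Index of the most significant set bit of b, in the spirit of log2b (b must be positive). */
    static int mostSignificantBit(int b) {
        if (b <= 0) throw new IllegalArgumentException("b must be positive: " + b);
        return 31 - Integer.numberOfLeadingZeros(b);
    }
}
```

With a quantum of 16, for example, the shift is 4 and the mask is 15, so both operations reduce to a single instruction each.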
| Constructor Detail |
|---|
public BitStreamIndexReader.BitStreamIndexReaderIndexIterator(BitStreamIndexReader parent,
InputBitStream ibs)
| Method Detail |
|---|
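protected void position(int term)
throws IOException
Positions the index on the inverted list of a given term.
This method can be called at any time. Note that it is always possible to call this method with argument 0, even if offsets have not been loaded.
Parameters: term - a term.
Throws: IOException

public int termNumber()
Description copied from interface: IndexIterator
Returns the number of the term whose inverted list is returned by this index iterator.
Usually, the term number is automatically set by IndexReader.documents(CharSequence) or IndexReader.documents(int).
Specified by: termNumber in interface IndexIterator
See Also: IndexIterator.term()
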
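protected IndexIterator advance()
throws IOException
Throws: IOException

public Index index()
Description copied from interface: IndexIterator
Returns the index over which this iterator is built.
Specified by: index in interface IndexIterator

public int frequency()
Description copied from interface: IndexIterator
Returns the frequency, that is, the number of documents that will be returned by this iterator.
Specified by: frequency in interface IndexIterator

public int document()
Description copied from interface: DocumentIterator
Returns the last document returned by DocumentIterator.nextDocument().
Specified by: document in interface DocumentIterator
Returns: the last document returned by DocumentIterator.nextDocument(), -1 if no document has been returned yet, and DocumentIterator.END_OF_LIST if the list of results has been exhausted.
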
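public Payload payload()
throws IOException
Description copied from interface: IndexIterator
Returns the payload, if any, associated with the current document.
Specified by: payload in interface IndexIterator
Throws: IOException

public int count()
throws IOException
Description copied from interface: IndexIterator
Returns the count, that is, the number of occurrences of the term in the current document.
Specified by: count in interface IndexIterator
Throws: IOException

protected void updatePositionCache()
throws IOException
Reads positions, assuming state <= BEFORE_POSITIONS.
Throws: IOException

public IntIterator positions()
throws IOException
Description copied from interface: IndexIterator
Returns the positions at which the term appears in the current document.
Specified by: positions in interface IndexIterator
Throws: IOException

public int[] positionArray()
throws IOException
Description copied from interface: IndexIterator
Returns the positions at which the term appears in the current document in an array.
Implementations are allowed to return the same array across different calls to this method.
Specified by: positionArray in interface IndexIterator
Throws: IOException

public int positions(int[] position)
throws IOException
Description copied from interface: IndexIterator
Stores the positions at which the term appears in the current document in a given array.
If the array is not large enough (i.e., it cannot hold IndexIterator.count() elements), this method will return a negative number (the opposite of the count).
Specified by: positions in interface IndexIterator
Parameters: position - an array that will be used to store positions.
Returns: the count, negated if the array cannot hold all positions.
Throws: IOException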
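
The negative return value of positions(int[]) lets a caller reuse one buffer and grow it only when needed. The following is a minimal sketch of that caller-side pattern. PositionBufferDemo and its static positions(int[], int[]) method are hypothetical stand-ins that mimic the documented contract; they do not call MG4J.

```java
import java.util.Arrays;

// Hypothetical sketch of the "negative count" contract described above.
final class PositionBufferDemo {
    /**
     * Stand-in for positions(int[]): copies all of source into target and returns
     * the count, or returns the opposite of the count if target is too small.
     */
    static int positions(int[] source, int[] target) {
        if (target.length < source.length) return -source.length;
        System.arraycopy(source, 0, target, 0, source.length);
        return source.length;
    }

    /** Caller-side pattern: retry once with a buffer of the right size. */
    static int[] readAll(int[] source, int[] buffer) {
        int n = positions(source, buffer);
        if (n < 0) {
            buffer = new int[-n]; // -n is the actual count
            n = positions(source, buffer);
        }
        return Arrays.copyOf(buffer, n);
    }
}
```

In real code the enlarged buffer would typically be kept for later documents, so that reallocation happens only when a document has more positions than any seen before.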
public int nextDocument()
throws IOException
Description copied from interface: DocumentIterator
Returns the next document provided by this document iterator, or -1 if no more documents are available.
Warning: the specification of this method has significantly changed as of MG4J 1.2.
The special return value -1 is used to mark the end of iteration (a NoSuchElementException
would have been thrown before in that case, so no harm should be caused by this change). The reason
for this change is providing fully lazy iteration over documents. Fully lazy iteration
does not provide a hasNext() method: you have to actually ask for the next
element and check the return value. Fully lazy iteration is much lighter on method calls (half) and
in most (if not all) MG4J classes leads to much simpler logic. Moreover, DocumentIterator.nextDocument()
can be specified as throwing an IOException, which avoids the pernicious proliferation
of try/catch blocks in very short, low-level methods (it was having a detectable impact on performance).
Specified by: nextDocument in interface DocumentIterator
Throws: IOException
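
The fully lazy iteration described above amounts to a single loop that calls nextDocument() and checks for -1. The following is a minimal sketch of that loop. LazyDocuments and LazyIterationDemo are hypothetical names standing in for the relevant part of DocumentIterator; the sketch does not depend on MG4J.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the nextDocument() part of DocumentIterator. */
interface LazyDocuments {
    int nextDocument() throws IOException; // -1 marks the end of iteration
}

final class LazyIterationDemo {
    /** A toy iterator over an array of document pointers. */
    static LazyDocuments over(int... docs) {
        return new LazyDocuments() {
            private int i = 0;

            @Override
            public int nextDocument() {
                return i < docs.length ? docs[i++] : -1;
            }
        };
    }

    /** Collects all documents: one call per element, no hasNext(). */
    static List<Integer> drain(LazyDocuments it) {
        List<Integer> result = new ArrayList<>();
        try {
            int d;
            while ((d = it.nextDocument()) != -1) result.add(d);
        } catch (IOException e) {
            // A single try/catch at the outer level, rather than one per low-level call.
            throw new UncheckedIOException(e);
        }
        return result;
    }
}
```

Because the exception is declared on nextDocument() itself, only the outermost caller needs a try/catch block.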
public int skipTo(int p)
throws IOException
Description copied from interface: DocumentIterator
Skips all documents smaller than n. If Iterator.hasNext() has been called returning
true but DocumentIterator.nextDocument() has not been called afterwards, then a call
to DocumentIterator.skipTo(int) will be implicitly preceded by
a call to DocumentIterator.nextDocument() (the only consequence is that skipping to the current
document after a call to Iterator.hasNext() will return the next document).
Define the current document k associated with this document iterator
as follows:
- -1, if DocumentIterator.nextDocument() and this method have never been called;
- DocumentIterator.END_OF_LIST, if a call to this method returned DocumentIterator.END_OF_LIST, or DocumentIterator.nextDocument() returned -1;
- the last value returned by a call to DocumentIterator.nextDocument() or this method, otherwise.
If k is larger than or equal to n, then
this method does nothing and returns k. Otherwise, a
call to this method is equivalent to
while( ( k = nextDocument() ) < n && k != -1 ); return k == -1 ? END_OF_LIST : k;
Thus, when a result k ≠ DocumentIterator.END_OF_LIST
is returned, the state of this iterator
will be exactly the same as after a call to DocumentIterator.nextDocument()
that returned k.
In particular, the first document larger than or equal to n (when returned
by this method) will not be returned by the next call to
DocumentIterator.nextDocument().
Specified by: skipTo in interface DocumentIterator
Parameters: p - a document pointer.
Returns: a document pointer larger than or equal to n if available, DocumentIterator.END_OF_LIST
otherwise.
Throws: IOException
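
The skipTo contract above can be made concrete with a small array-backed iterator. The following is a sketch under stated assumptions: ArrayDocIterator is a hypothetical class, the value chosen for END_OF_LIST is an assumption made for this example, and the interaction with hasNext() is not modelled.

```java
// Hypothetical array-backed iterator illustrating the skipTo contract.
final class ArrayDocIterator {
    static final int END_OF_LIST = Integer.MAX_VALUE; // value assumed for this sketch

    private final int[] docs; // document pointers in strictly increasing order
    private int next = 0;     // index of the next pointer to return
    private int current = -1; // the "current document" k defined above

    ArrayDocIterator(int... docs) {
        this.docs = docs.clone();
    }

    /** Returns the next document, or -1 if no more documents are available. */
    int nextDocument() {
        if (next < docs.length) return current = docs[next++];
        current = END_OF_LIST;
        return -1;
    }

    /** Skips all documents smaller than n, following the equivalence stated above. */
    int skipTo(int n) {
        if (current >= n) return current; // nothing to do
        int k;
        while ((k = nextDocument()) < n && k != -1) {
            // keep advancing
        }
        return k == -1 ? END_OF_LIST : k;
    }
}
```

For the list 1, 3, 7, 9, skipTo(5) returns 7, and the following nextDocument() returns 9 rather than 7, as the last paragraph of the description requires. A real implementation would use skip towers instead of a linear scan when the index has them.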
public void dispose()
throws IOException
Description copied from interface: DocumentIterator
Disposes this document iterator, releasing all resources.
This method should propagate down to the underlying index iterators, where it should release resources such as open files and network connections. If you're doing your own resource tracking and pooling, then you do not need to call this method.
Specified by: dispose in interface DocumentIterator
Throws: IOException

public boolean hasNext()
Specified by: hasNext in interface Iterator<Integer>

public int nextInt()
Description copied from interface: DocumentIterator
Returns the next document.
Specified by: nextInt in interface IntIterator
Specified by: nextInt in interface DocumentIterator
Overrides: nextInt in class AbstractIntIterator
See Also: DocumentIterator.nextDocument()

public String toString()
Overrides: toString in class Object

public Reference2ReferenceMap<Index,IntervalIterator> intervalIterators()
throws IOException
Description copied from interface: DocumentIterator
Returns an unmodifiable map from indices to interval iterators.
After a call to DocumentIterator.nextDocument(), this map
can be used to retrieve the intervals in the current document. An invocation of Map.get(java.lang.Object)
on this map with argument index yields the same result as
intervalIterator(index).
Specified by: intervalIterators in interface DocumentIterator
Throws: IOException
See Also: DocumentIterator.intervalIterator(Index)

public IntervalIterator intervalIterator()
throws IOException
Description copied from interface: DocumentIterator
Returns the interval iterator of this document iterator for single-index queries.
This is a convenience method that can be used only for queries built over a single index.
Specified by: intervalIterator in interface DocumentIterator
Throws: IOException
See Also: DocumentIterator.intervalIterator(Index)

public IntervalIterator intervalIterator(Index index)
throws IOException
Description copied from interface: DocumentIterator
Returns the interval iterator of this document iterator for the given index.
After a call to DocumentIterator.nextDocument(), this iterator
can be used to retrieve the intervals in the current document (the
one returned by DocumentIterator.nextDocument()) for
the index index.
Note that if all indices have positions, it is guaranteed that at least one index will return an interval. However, for disjunctive queries it cannot be guaranteed that all indices will return an interval.
Indices without positions always return IntervalIterators.TRUE.
Thus, in the presence of indices without positions it is possible that no
intervals at all are available.
Specified by: intervalIterator in interface DocumentIterator
Parameters: index - an index (must be one over which the query was built).
Returns: the interval iterator of this document iterator for index.
Throws: IOException

public ReferenceSet<Index> indices()
Description copied from interface: DocumentIterator
Returns the set of indices over which this iterator is built.
Specified by: indices in interface DocumentIterator