Class TermVectorsFields

  • All Implemented Interfaces:
    Iterable<String>

    public final class TermVectorsFields
    extends Fields
    This class represents the result of a TermVectorsRequest. It works exactly like the Fields class except for one thing: it can return offsets and payloads even if positions are not present. You must still call nextPosition() to advance the counter, although this method only returns -1 if no positions were returned by the TermVectorsRequest.

    The data is stored in two byte arrays (headerRef and termVectors, both BytesRef) that have the following format:

    headerRef: Stores offsets per field in the termVectors array and some header information as BytesRef. Format is

    • String : "TV"
    • vint: version (=-1)
    • boolean: hasTermStatistics (are the term statistics stored?)
    • boolean: hasFieldStatistics (are the field statistics stored?)
    • vint: number of fields
      • String: field name 1
      • vint: offset in termVectors for field 1
      • ...
      • String: field name last field
      • vint: offset in termVectors for last field
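
    The header layout above can be sketched as a small reader. This is an illustration of the documented format only, not the library's actual parsing code: the class HeaderSketch and its methods are hypothetical, and the sketch assumes that a String is stored as a vint length followed by that many UTF-8 bytes and that a boolean is a single byte (0 or 1). The real stream encoding may differ.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Elasticsearch or Lucene.
final class HeaderSketch {
    int version;
    boolean hasTermStatistics;
    boolean hasFieldStatistics;
    // Field name -> offset of that field's block in termVectors, in header order.
    final Map<String, Integer> fieldOffsets = new LinkedHashMap<>();

    // vint: 7 data bits per byte, low-order group first;
    // the high bit is set on every byte except the last.
    static int readVInt(ByteBuffer in) {
        int value = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            byte b = in.get();
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalStateException("vint longer than 5 bytes");
    }

    static void writeVInt(ByteBuffer out, int value) {
        while ((value & ~0x7F) != 0) {
            out.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.put((byte) value);
    }

    // ASSUMPTION: strings are a vint byte length followed by UTF-8 bytes.
    static String readString(ByteBuffer in) {
        byte[] bytes = new byte[readVInt(in)];
        in.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    static void writeString(ByteBuffer out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        writeVInt(out, bytes.length);
        out.put(bytes);
    }

    // Reads the fields in the order listed in the format description.
    static HeaderSketch parse(ByteBuffer in) {
        HeaderSketch h = new HeaderSketch();
        String magic = readString(in);
        if (!"TV".equals(magic)) {
            throw new IllegalArgumentException("unexpected header string: " + magic);
        }
        h.version = readVInt(in);
        h.hasTermStatistics = in.get() != 0;
        h.hasFieldStatistics = in.get() != 0;
        int numFields = readVInt(in);
        for (int i = 0; i < numFields; i++) {
            String name = readString(in);
            h.fieldOffsets.put(name, readVInt(in));
        }
        return h;
    }

    // Builds a made-up header with two fields, for demonstration only.
    static ByteBuffer sample() {
        ByteBuffer out = ByteBuffer.allocate(64);
        writeString(out, "TV");
        writeVInt(out, -1);   // version
        out.put((byte) 1);    // hasTermStatistics
        out.put((byte) 0);    // hasFieldStatistics
        writeVInt(out, 2);    // number of fields
        writeString(out, "title");
        writeVInt(out, 0);
        writeString(out, "body");
        writeVInt(out, 300);
        out.flip();
        return out;
    }
}
```

    Calling HeaderSketch.parse(HeaderSketch.sample()) yields a map from field name to block offset, which is the lookup needed to locate a field's block in termVectors. Note that a negative vint such as the version value always occupies five bytes in this encoding.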

    termVectors: Stores the actual term vectors as a BytesRef.

    Term vectors are stored in blocks, one for each field. The offsets in headerRef are used to find where the block for a field starts. Each block begins with a

    • vint: number of terms
    • boolean: positions (has it positions stored?)
    • boolean: offsets (has it offsets stored?)
    • boolean: payloads (has it payloads stored?)
    If the field statistics were requested (hasFieldStatistics is true, see headerRef), the following numbers are stored:
    • vlong: sum of total term frequencies of the field (sumTotalTermFreq)
    • vlong: sum of document frequencies for each term (sumDocFreq)
    • vint: number of documents in the shard that have an entry for this field (docCount)

    After that, for each term it stores

    • vint: term length
    • BytesRef: term name

    If term statistics are requested (hasTermStatistics is true, see headerRef):

    • vint: document frequency (the number of documents in which this term appears)
    • vlong: total term frequency (the total number of occurrences of this term in this field)
    After that
    • vint: frequency (always returned)
      • vint: position_1 (if positions)
      • vint: startOffset_1 (if offset)
      • vint: endOffset_1 (if offset)
      • BytesRef: payload_1 (if payloads)
      • ...
      • vint: endOffset_frequency (if offset)
      • BytesRef: payload_frequency (if payloads)
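
    The per-field block layout described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: FieldBlockSketch, Term and Occurrence are hypothetical names, the term bytes are assumed to be UTF-8, and each payload is assumed to be stored as a vint length followed by that many bytes. Where positions or offsets are not stored, the sketch records -1, mirroring the behaviour of nextPosition() described at the top of this page.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for one field block, following the documented layout only.
final class FieldBlockSketch {
    // One occurrence of a term; values that were not stored are -1 (or null).
    static final class Occurrence {
        final int position, startOffset, endOffset;
        final byte[] payload;
        Occurrence(int position, int startOffset, int endOffset, byte[] payload) {
            this.position = position;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.payload = payload;
        }
    }

    static final class Term {
        final String text;
        final int docFreq;          // -1 if term statistics are absent
        final long totalTermFreq;   // -1 if term statistics are absent
        final List<Occurrence> occurrences = new ArrayList<>();
        Term(String text, int docFreq, long totalTermFreq) {
            this.text = text;
            this.docFreq = docFreq;
            this.totalTermFreq = totalTermFreq;
        }
    }

    // vlong: 7 data bits per byte, low-order group first, high bit = "more bytes follow".
    static long readVLong(ByteBuffer in) {
        long value = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = in.get();
            value |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return value;
            }
        }
        throw new IllegalStateException("vlong too long");
    }

    // A vint uses the same encoding restricted to 32 bits.
    static int readVInt(ByteBuffer in) {
        return (int) readVLong(in);
    }

    static byte[] readBytes(ByteBuffer in, int length) {
        byte[] bytes = new byte[length];
        in.get(bytes);
        return bytes;
    }

    // The two flags come from the header (see headerRef).
    static List<Term> parse(ByteBuffer in, boolean hasTermStatistics, boolean hasFieldStatistics) {
        int numTerms = readVInt(in);
        boolean positions = in.get() != 0;
        boolean offsets = in.get() != 0;
        boolean payloads = in.get() != 0;
        if (hasFieldStatistics) {
            readVLong(in); // sumTotalTermFreq (skipped in this sketch)
            readVLong(in); // sumDocFreq
            readVInt(in);  // docCount
        }
        List<Term> terms = new ArrayList<>();
        for (int t = 0; t < numTerms; t++) {
            int termLength = readVInt(in);
            String text = new String(readBytes(in, termLength), StandardCharsets.UTF_8);
            int docFreq = hasTermStatistics ? readVInt(in) : -1;
            long totalTermFreq = hasTermStatistics ? readVLong(in) : -1;
            Term term = new Term(text, docFreq, totalTermFreq);
            int freq = readVInt(in); // always present
            for (int i = 0; i < freq; i++) {
                int position = positions ? readVInt(in) : -1;
                int start = offsets ? readVInt(in) : -1;
                int end = offsets ? readVInt(in) : -1;
                // ASSUMPTION: payload is a vint length followed by the payload bytes.
                byte[] payload = payloads ? readBytes(in, readVInt(in)) : null;
                term.occurrences.add(new Occurrence(position, start, end, payload));
            }
            terms.add(term);
        }
        return terms;
    }

    // Made-up block: one term "fox", offsets only, two occurrences.
    static ByteBuffer sample() {
        byte[] bytes = {
            1,                 // number of terms
            0, 1, 0,           // positions=false, offsets=true, payloads=false
            3, 'f', 'o', 'x',  // term length, term bytes
            2,                 // frequency
            4, 7,              // startOffset_1, endOffset_1
            20, 23             // startOffset_2, endOffset_2
        };
        return ByteBuffer.wrap(bytes);
    }
}
```

    Parsing the sample with FieldBlockSketch.parse(FieldBlockSketch.sample(), false, false) gives one term with two occurrences whose offsets are populated while every position is -1, which is the case this class exists to support.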
    • Field Detail

      • hasScores

        public final boolean hasScores
    • Method Detail

      • size

        public int size()
        Description copied from class: Fields
        Returns the number of fields or -1 if the number of distinct field names is unknown. If >= 0, Fields.iterator() will return as many field names.
        Specified by:
        size in class Fields