Class BlendedTermQuery


  • public abstract class BlendedTermQuery
    extends Query
    BlendedTermQuery can be used to unify term statistics across one or more fields in the index. A common problem with structured documents is that a term that is significant in one field might not be significant in other fields, for example when documents represent users with a "first_name" and a "second_name" field. When someone searches for "simon", they will very likely get "paul simon" first, since "simon" is an uncommon last name, i.e. it has a low document frequency. This query tries to "lie" about global statistics such as document frequency and total term frequency, so that ranking is based on the estimated statistics.

    While aggregating the total term frequency is trivial, since it can simply be summed up, not every Similarity makes use of this statistic. The document frequency, which is used in the ClassicSimilarity, can only be estimated as a lower bound since it is a document-based statistic. For the document frequency, the maximum frequency across all fields per term is used, which is the minimum number of documents the term occurs in.
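
The blending rule described above can be illustrated with a minimal, self-contained sketch. It does not use the Lucene or Elasticsearch APIs; the class name `BlendedStatsSketch`, its method names, and the sample numbers are hypothetical and chosen only for illustration. The real implementation may apply further adjustments that are not described here.

```java
// Hypothetical sketch: blends per-field statistics for one term text.
// Not part of BlendedTermQuery; names and numbers are illustrative only.
final class BlendedStatsSketch {

    // Document frequency: the maximum across fields, which is a lower
    // bound on the number of documents containing the term in any field.
    static int blendedDocFreq(int[] docFreqPerField) {
        int max = 0;
        for (int df : docFreqPerField) {
            max = Math.max(max, df);
        }
        return max;
    }

    // Total term frequency: occurrences can simply be summed across fields.
    static long blendedTotalTermFreq(long[] totalTermFreqPerField) {
        long sum = 0L;
        for (long ttf : totalTermFreqPerField) {
            sum += ttf;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed statistics for "simon": index 0 = first_name, index 1 = second_name.
        int[] docFreqs = {40, 2};
        long[] totalTermFreqs = {45L, 2L};

        System.out.println("blended docFreq = " + blendedDocFreq(docFreqs));
        System.out.println("blended totalTermFreq = " + blendedTotalTermFreq(totalTermFreqs));
    }
}
```

With these assumed numbers, both fields would be scored as if "simon" occurred in 40 documents, so a match in "second_name" no longer receives a much larger rarity bonus than a match in "first_name".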

    • Constructor Detail

      • BlendedTermQuery

        public BlendedTermQuery(Term[] terms,
                                float[] boosts)
    • Method Detail

      • rewrite

        public Query rewrite(IndexReader reader)
                      throws IOException
        Description copied from class: Query
        Expert: called to re-write queries into primitive queries. For example, a PrefixQuery will be rewritten into a BooleanQuery that consists of TermQuerys.
        Overrides:
        rewrite in class Query
        Throws:
        IOException
      • topLevelQuery

        protected abstract Query topLevelQuery(Term[] terms,
                                               TermStates[] ctx,
                                               int[] docFreqs,
                                               int maxDoc)
      • getTerms

        public List<Term> getTerms()
      • toString

        public String toString(String field)
        Description copied from class: Query
        Prints a query to a string, with field assumed to be the default field and omitted.
        Specified by:
        toString in class Query
      • equals

        public boolean equals(Object o)
        Description copied from class: Query
        Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly. Typically a query will be equal to another only if it is an instance of the same class and its document-filtering properties are identical to those of the other instance. Utility methods are provided for certain repetitive code.
        Specified by:
        equals in class Query
        See Also:
        Query.sameClassAs(Object), Query.classHash()
      • hashCode

        public int hashCode()
        Description copied from class: Query
        Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
        Specified by:
        hashCode in class Query
        See Also:
        Query.equals(Object)
      • commonTermsBlendedQuery

        @Deprecated
        public static BlendedTermQuery commonTermsBlendedQuery(Term[] terms,
                                                               float[] boosts,
                                                               float maxTermFrequency)
        Deprecated.
        Since the max_score optimization landed in 7.0, a normal MultiMatchQuery achieves the same result without any configuration.
      • dismaxBlendedQuery

        public static BlendedTermQuery dismaxBlendedQuery(Term[] terms,
                                                          float tieBreakerMultiplier)
      • dismaxBlendedQuery

        public static BlendedTermQuery dismaxBlendedQuery(Term[] terms,
                                                          float[] boosts,
                                                          float tieBreakerMultiplier)
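
This page does not document what `tieBreakerMultiplier` does. Assuming it has the same meaning as in Lucene's DisjunctionMaxQuery, the best-scoring field contributes its full score and every other matching field contributes its score scaled by the multiplier. The following standalone sketch shows that arithmetic only; the class `DisMaxScoreSketch` and its `combine` method are hypothetical and are not part of this API.

```java
// Hypothetical sketch of dismax-style score combination.
// Assumes non-negative per-field scores, as produced by typical similarities.
final class DisMaxScoreSketch {

    // Returns max(score) + tieBreakerMultiplier * (sum of all other scores).
    static float combine(float[] perFieldScores, float tieBreakerMultiplier) {
        float max = 0f;
        float sum = 0f;
        for (float score : perFieldScores) {
            max = Math.max(max, score);
            sum += score;
        }
        return max + tieBreakerMultiplier * (sum - max);
    }

    public static void main(String[] args) {
        float[] scores = {2.0f, 1.0f, 0.5f};
        // 0 keeps only the best field; 1 behaves like a plain sum of all fields.
        System.out.println("tie = 0.0 -> " + combine(scores, 0.0f));
        System.out.println("tie = 1.0 -> " + combine(scores, 1.0f));
    }
}
```

Under this assumption, a multiplier of 0 ranks purely by the best-matching field, while values closer to 1 increasingly reward terms that match in several fields.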