Class CosineHashFunctionFamily

  • All Implemented Interfaces:
    LocalitySensitiveHashFunctionFamily<elki.data.NumberVector>

    @Reference(authors="M. S. Charikar",title="Similarity estimation techniques from rounding algorithms",booktitle="Proc. 34th ACM Symposium on Theory of Computing, STOC\'02",url="https://doi.org/10.1145/509907.509965",bibkey="DBLP:conf/stoc/Charikar02")
    @Reference(authors="M. Henzinger",title="Finding near-duplicate web pages: a large-scale evaluation of algorithms",booktitle="Proc. 29th ACM Conf. Research and Development in Information Retrieval (SIGIR 2006)",url="https://doi.org/10.1145/1148170.1148222",bibkey="DBLP:conf/sigir/Henzinger06")
    public class CosineHashFunctionFamily
    extends java.lang.Object
    implements LocalitySensitiveHashFunctionFamily<elki.data.NumberVector>
    Hash function family for use with cosine distance. It uses simplified hash functions in which the projection coefficients are drawn only from {+1, -1} rather than from a Gaussian distribution.
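
    The idea described above can be illustrated with a minimal, self-contained sketch. It is not the ELKI implementation (which delegates to a RandomProjectionFamily, see the proj field); the class name SignProjectionHashSketch, the use of java.util.Random, and the packing of the k sign bits into a single int are assumptions made purely for illustration.

```java
import java.util.Random;

/**
 * Illustrative sketch only (hypothetical class, not part of ELKI):
 * a sign-of-projection hash whose coefficients are drawn from {+1, -1}.
 */
public class SignProjectionHashSketch {
  /** k rows of +1/-1 coefficients, one row per projection. */
  private final double[][] signs;

  public SignProjectionHashSketch(int k, int dim, Random rnd) {
    if (k < 1 || k > 32) {
      throw new IllegalArgumentException("k must be in 1..32 to fit into an int");
    }
    signs = new double[k][dim];
    for (int i = 0; i < k; i++) {
      for (int j = 0; j < dim; j++) {
        // Simplified projection: each coefficient is +1 or -1 with equal probability.
        signs[i][j] = rnd.nextBoolean() ? 1.0 : -1.0;
      }
    }
  }

  /** Combine the signs of the k projections into one integer (assumed packing). */
  public int hash(double[] v) {
    if (v.length != signs[0].length) {
      throw new IllegalArgumentException("dimensionality mismatch");
    }
    int h = 0;
    for (double[] row : signs) {
      double dot = 0.0;
      for (int j = 0; j < v.length; j++) {
        dot += row[j] * v[j];
      }
      // One bit per projection: which side of the hyperplane the vector lies on.
      h = (h << 1) | (dot >= 0 ? 1 : 0);
    }
    return h;
  }
}
```

    Because only the sign of each projection is kept, scaling a vector by a positive factor leaves its hash unchanged, which matches the scale invariance of cosine distance. For example, new SignProjectionHashSketch(8, 4, new Random(42L)) produces the same value for {1, -2, 0.5, 3} and {2, -4, 1, 6}.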

    References:

    M. S. Charikar
    Similarity estimation techniques from rounding algorithms
    Proc. 34th ACM Symposium on Theory of Computing, STOC'02

    M. Henzinger
    Finding near-duplicate web pages: a large-scale evaluation of algorithms
    Proc. 29th ACM Conf. Research and Development in Information Retrieval (SIGIR 2006)

    Since:
    0.7.0
    Author:
    Evgeniy Faerman
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private int k
      The number of projections to use for each hash function.
      private elki.data.projection.random.RandomProjectionFamily proj
      Projection family to use.
    • Constructor Summary

      Constructors 
      Constructor Description
      CosineHashFunctionFamily(int k, elki.utilities.random.RandomFactory random)
      Constructor.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      java.util.ArrayList<? extends LocalitySensitiveHashFunction<? super elki.data.NumberVector>> generateHashFunctions(elki.database.relation.Relation<? extends elki.data.NumberVector> relation, int l)
      Generate hash functions for the given relation.
      elki.data.type.TypeInformation getInputTypeRestriction()
      Get the input type information.
      boolean isCompatible(elki.distance.Distance<?> df)
      Check whether the given distance function can be accelerated using this hash family.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • proj

        private elki.data.projection.random.RandomProjectionFamily proj
        Projection family to use.
      • k

        private int k
        The number of projections to use for each hash function.
    • Constructor Detail

      • CosineHashFunctionFamily

        public CosineHashFunctionFamily(int k,
                                        elki.utilities.random.RandomFactory random)
        Constructor.
        Parameters:
        k - Number of projections to use for each hash function.
        random - Random factory.