Class MultipleProjectionsLocalitySensitiveHashFunction

  • All Implemented Interfaces:
    LocalitySensitiveHashFunction<elki.data.NumberVector>

    @Reference(authors="M. Datar, N. Immorlica, P. Indyk, V. S. Mirrokni",
               title="Locality-sensitive hashing scheme based on p-stable distributions",
               booktitle="Proc. 20th Annual Symposium on Computational Geometry",
               url="https://doi.org/10.1145/997817.997857",
               bibkey="DBLP:conf/compgeom/DatarIIM04")
    public class MultipleProjectionsLocalitySensitiveHashFunction
    extends java.lang.Object
    implements LocalitySensitiveHashFunction<elki.data.NumberVector>
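    Locality-sensitive hash (LSH) function for vector space data, based on p-stable distributions. Depending on the distribution the random projection vectors are drawn from, it is appropriate for Manhattan distance (Cauchy-distributed vectors) or Euclidean distance (Gaussian-distributed vectors).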

    Reference:

    M. Datar, N. Immorlica, P. Indyk, V. S. Mirrokni
    Locality-sensitive hashing scheme based on p-stable distributions
    Proc. 20th Annual Symposium on Computational Geometry

    Since:
    0.6.0
    Author:
    Erich Schubert
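
    As an illustration of the scheme in the cited paper: a single hash function projects a vector v onto a random direction a, adds an offset b, and divides by a bin width w, giving h(v) = floor((a . v + b) / w). The following is a minimal standalone sketch of that idea and not the ELKI implementation. The class and method names (PStableHashSketch, gaussianVector, cauchyVector, bucket) are hypothetical and chosen only for this example.

```java
import java.util.Random;

// Hypothetical sketch of a single p-stable LSH function; not ELKI code.
final class PStableHashSketch {
  /** Gaussian (2-stable) random vector, suited to Euclidean distance. */
  static double[] gaussianVector(int dim, Random rnd) {
    double[] a = new double[dim];
    for (int i = 0; i < dim; i++) {
      a[i] = rnd.nextGaussian();
    }
    return a;
  }

  /** Cauchy (1-stable) random vector, suited to Manhattan distance. */
  static double[] cauchyVector(int dim, Random rnd) {
    double[] a = new double[dim];
    for (int i = 0; i < dim; i++) {
      // Inverse CDF of the standard Cauchy distribution.
      a[i] = Math.tan(Math.PI * (rnd.nextDouble() - 0.5));
    }
    return a;
  }

  /** Bucket index floor((a . v + b) / w) of one hash function. */
  static int bucket(double[] a, double[] v, double b, double w) {
    double dot = 0.0;
    for (int i = 0; i < v.length; i++) {
      dot += a[i] * v[i];
    }
    return (int) Math.floor((dot + b) / w);
  }
}
```

    Vectors that are close under the matching distance tend to receive the same bucket index. A larger w merges more vectors into one bucket.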
    • Field Summary

      Fields 
      Modifier and Type Field Description
      (package private) double iwidth
      Scaling factor: inverse of the bin width.
      private static long MASK32
      Bit mask for signed int to unsigned long conversion.
      (package private) elki.data.projection.random.RandomProjectionFamily.Projection projection
      Projection matrix.
      (package private) int[] randoms1
      Random numbers for mixing the hash codes of the individual functions.
      (package private) double[] shift
      Shift offset.
    • Method Summary

      Modifier and Type Method Description
      static int fastModPrime(long data)
      Fast modulo operation for the largest prime that fits in an unsigned 32-bit integer, 2^32 - 5.
      int getNumberOfProjections()
      Get the number of projections performed.
      int hashObject(elki.data.NumberVector vec)
      Compute the hash value of an object.
      int hashObject(elki.data.NumberVector vec, double[] buf)
      Compute the hash value of an object (faster version).
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • projection

        elki.data.projection.random.RandomProjectionFamily.Projection projection
        Projection matrix.
      • shift

        double[] shift
        Shift offset.
      • iwidth

        double iwidth
        Scaling factor: inverse of the bin width.
      • randoms1

        int[] randoms1
        Random numbers for mixing the hash codes of the individual functions.
      • MASK32

        private static final long MASK32
        Bit mask for signed int to unsigned long conversion.
        See Also:
        Constant Field Values
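
        The signed-to-unsigned conversion that MASK32 is described as supporting can be sketched as below. The mask value 0xFFFFFFFFL is an assumption (the constant's value is not shown on this page), and the class and method names are hypothetical.

```java
// Hypothetical sketch; the actual value of MASK32 is assumed, not confirmed.
final class UnsignedConversionSketch {
  static final long MASK32 = 0xFFFFFFFFL;

  /** Reads the 32 bits of a signed int as an unsigned value in a long. */
  static long toUnsignedLong(int x) {
    // The int is sign-extended to long first; the mask clears the upper 32 bits.
    return x & MASK32;
  }
}
```

        The result agrees with the standard library method Integer.toUnsignedLong(int).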
    • Constructor Detail

      • MultipleProjectionsLocalitySensitiveHashFunction

        public MultipleProjectionsLocalitySensitiveHashFunction(elki.data.projection.random.RandomProjectionFamily.Projection projection,
                                                                double width,
                                                                java.util.Random rnd)
        Constructs the hash function from a set of projection vectors, a bin width, and a random number generator.
        Parameters:
        projection - Projection vectors
        width - Width of bins
        rnd - Random number generator
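
        Given the fields listed above (iwidth, shift, randoms1), a constructor of this kind plausibly precomputes one offset and one mixing constant per projection. The sketch below is an assumption about that setup, not the library's code: the class name LshParametersSketch, the numProjections parameter, and the choice of offsets uniform in [0, width) are all hypothetical.

```java
import java.util.Random;

// Hypothetical sketch of per-projection parameter setup; not ELKI code.
final class LshParametersSketch {
  final double iwidth;   // inverse of the bin width
  final double[] shift;  // one random offset per projection
  final int[] randoms1;  // one random mixing constant per projection

  LshParametersSketch(int numProjections, double width, Random rnd) {
    this.iwidth = 1.0 / width;
    this.shift = new double[numProjections];
    this.randoms1 = new int[numProjections];
    for (int i = 0; i < numProjections; i++) {
      shift[i] = rnd.nextDouble() * width; // assumed: uniform in [0, width)
      randoms1[i] = rnd.nextInt();
    }
  }
}
```

        Storing the inverse width lets the hash computation multiply instead of divide for every projection.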
    • Method Detail

      • hashObject

        public int hashObject(elki.data.NumberVector vec,
                              double[] buf)
        Description copied from interface: LocalitySensitiveHashFunction
        Compute the hash value of an object (faster version).
        Specified by:
        hashObject in interface LocalitySensitiveHashFunction<elki.data.NumberVector>
        Parameters:
        vec - Object to hash
        buf - Buffer, sized according to the number of projections.
        Returns:
        Hash value
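
        One way such a method could combine several projections into a single hash value is sketched here. This is a standalone illustration under stated assumptions, not the ELKI source: the projection is a plain double[][] matrix, the per-projection bucket indexes are mixed with a random weighted sum, and the reduction uses Math.floorMod rather than a fast modulo routine. All names (MultiProjectionHashSketch, proj) are hypothetical. The caller supplies buf so that no array is allocated per call.

```java
// Hypothetical sketch of hashing with multiple projections; not ELKI code.
final class MultiProjectionHashSketch {
  static final long PRIME = 4294967291L; // 2^32 - 5

  static int hashObject(double[][] proj, double[] vec, double[] buf,
      double[] shift, double iwidth, int[] randoms1) {
    // Project the vector into the reusable buffer, one entry per projection.
    for (int i = 0; i < proj.length; i++) {
      double dot = 0.0;
      for (int j = 0; j < vec.length; j++) {
        dot += proj[i][j] * vec[j];
      }
      buf[i] = dot;
    }
    // Mix the bucket indexes into one value modulo the prime.
    long sum = 0L;
    for (int i = 0; i < buf.length; i++) {
      int bucket = (int) Math.floor((buf[i] + shift[i]) * iwidth);
      sum = Math.floorMod(sum + (long) randoms1[i] * bucket, PRIME);
    }
    return (int) sum; // the unsigned 32-bit result, stored in a signed int
  }
}
```

        Two vectors collide only if the weighted sum of their bucket indexes agrees modulo the prime, which in practice means all bucket indexes agree.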
      • fastModPrime

        public static int fastModPrime(long data)
        Fast modulo operation for the largest prime that fits in an unsigned 32-bit integer, 2^32 - 5.
        Parameters:
        data - Long input
        Returns:
        data % (2^32 - 5).
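
        A reduction modulo 2^32 - 5 can avoid division because 2^32 is congruent to 5 modulo this prime, so the upper 32 bits can be folded into the lower 32 bits with a multiplication by 5. The sketch below shows one such folding scheme. It is not the library's implementation: the names are hypothetical, the input is read as an unsigned 64-bit value, and the result is returned as a long to keep it non-negative.

```java
// Hypothetical sketch of a division-free reduction modulo 2^32 - 5.
final class FastModPrimeSketch {
  static final long MASK32 = 0xFFFFFFFFL;
  static final long PRIME = 4294967291L; // 2^32 - 5

  /** Reduces data, read as an unsigned 64-bit value, modulo 2^32 - 5. */
  static long modPrime(long data) {
    // hi * 2^32 + lo is congruent to 5 * hi + lo (mod 2^32 - 5).
    long x = 5L * (data >>> 32) + (data & MASK32); // at most 6 * (2^32 - 1)
    x = 5L * (x >>> 32) + (x & MASK32);            // at most 2^32 + 24
    // One conditional subtraction brings x into [0, PRIME).
    return x >= PRIME ? x - PRIME : x;
  }
}
```

        For non-negative inputs the result equals data % (2^32 - 5); for all inputs it matches Long.remainderUnsigned(data, 4294967291L).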