Class NNDescent<O>

  • Type Parameters:
    O - Object type
    All Implemented Interfaces:
    elki.index.Index, elki.index.KNNIndex<O>

    @Reference(authors="W. Dong, C. Moses, K. Li",
               title="Efficient k-nearest neighbor graph construction for generic similarity measures",
               booktitle="Proc. 20th Int. Conf. on World Wide Web (WWW\'11)",
               url="https://doi.org/10.1145/1963405.1963487",
               bibkey="DBLP:conf/www/DongCL11")
    public class NNDescent<O>
    extends AbstractMaterializeKNNPreprocessor<O>
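    NN-descent (also known as KNNGraph) is an approximate nearest neighbor search algorithm. It begins with a random sample of neighbors for each object, then iteratively refines this sample until few enough updates occur in an iteration (early termination) or the maximum number of iterations is reached.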
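
    The refinement loop can be illustrated with a minimal, self-contained sketch. This is not the ELKI implementation: it works on plain double[] points, omits the rho sampling and the new/old bookkeeping, and the names NNDescentSketch, build and dist are hypothetical. The stopping rule (updates <= delta * k * n) is an assumption modelled on the early termination parameter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Simplified NN-descent on plain points (illustrative only, not ELKI code). */
class NNDescentSketch {
  /** Squared Euclidean distance between two points. */
  static double dist(double[] a, double[] b) {
    double s = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      s += d * d;
    }
    return s;
  }

  /** Try to put cand into the neighbor list of cur; true if the list changed. */
  static boolean add(double[][] data, int[][] nn, int cur, int cand) {
    if (cur == cand) return false;
    int[] list = nn[cur];
    int worst = 0;
    for (int i = 0; i < list.length; i++) {
      if (list[i] == cand) return false; // already a neighbor
      if (dist(data[cur], data[list[i]]) > dist(data[cur], data[list[worst]])) worst = i;
    }
    if (dist(data[cur], data[cand]) >= dist(data[cur], data[list[worst]])) return false;
    list[worst] = cand; // replace the current farthest neighbor
    return true;
  }

  /** Build an approximate kNN graph; requires data.length > k. */
  static int[][] build(double[][] data, int k, double delta, int iterations, long seed) {
    int n = data.length;
    Random rnd = new Random(seed);
    int[][] nn = new int[n][k];
    // Random start: k distinct neighbors per object, excluding the object itself.
    for (int i = 0; i < n; i++) {
      Set<Integer> picked = new HashSet<>();
      while (picked.size() < k) {
        int c = rnd.nextInt(n);
        if (c != i) picked.add(c);
      }
      int j = 0;
      for (int c : picked) nn[i][j++] = c;
    }
    for (int it = 0; it < iterations; it++) {
      // Neighborhood of v = forward neighbors plus reverse neighbors.
      List<Set<Integer>> hood = new ArrayList<>();
      for (int i = 0; i < n; i++) hood.add(new HashSet<>());
      for (int i = 0; i < n; i++) {
        for (int c : nn[i]) {
          hood.get(i).add(c);
          hood.get(c).add(i);
        }
      }
      // Local join: neighbors of a common object are candidates for each other.
      int updates = 0;
      for (int v = 0; v < n; v++) {
        Integer[] h = hood.get(v).toArray(new Integer[0]);
        for (int a = 0; a < h.length; a++) {
          for (int b = a + 1; b < h.length; b++) {
            if (add(data, nn, h[a], h[b])) updates++;
            if (add(data, nn, h[b], h[a])) updates++;
          }
        }
      }
      if (updates <= delta * k * n) break; // assumed early termination rule
    }
    return nn;
  }

  public static void main(String[] args) {
    double[][] data = new double[30][1];
    for (int i = 0; i < data.length; i++) data[i][0] = i; // points on a line
    int[][] nn = build(data, 3, 0.001, 20, 1L);
    System.out.println("neighbors of point 0: " + Arrays.toString(nn[0]));
  }
}
```

    The result is approximate: a list may miss a true nearest neighbor, which is the trade-off the algorithm accepts to avoid computing all pairwise distances.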

    Reference:

    W. Dong and C. Moses and K. Li
    Efficient k-nearest neighbor graph construction for generic similarity measures
    Proc. 20th Int. Conf. on World Wide Web (WWW'11)

    TODO: collect and log some query statistics.

    Since:
    0.7.5
    Author:
    Evelyn Kirner
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  NNDescent.Factory<O>
      Index factory.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private double delta
      early termination parameter
      private int iterations
      maximum number of iterations
      private static elki.logging.Logging LOG
      Logger
      private boolean noInitialNeighbors
      Do not use initial neighbors
      private java.lang.String prefix
      Log prefix.
      private double rho
      sample rate
      private elki.utilities.random.RandomFactory rnd
      Random generator
      private elki.database.datastore.WritableDataStore<elki.database.ids.KNNHeap> store
      store for neighbors
    • Constructor Summary

      Constructors 
      Constructor Description
      NNDescent​(elki.database.relation.Relation<O> relation, elki.distance.Distance<? super O> distance, int k, elki.utilities.random.RandomFactory rnd, double delta, double rho, boolean noInitialNeighbors, int iterations)
      Constructor.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      private boolean add​(elki.database.ids.DBIDRef cur, elki.database.ids.DBIDRef cand, double distance)
      Add cand to cur's neighbor heap with the given distance.
      private void addpair​(elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> newNeighbors, elki.database.ids.DBIDRef o1, elki.database.ids.DBIDRef o2)  
      private void boundSize​(elki.database.ids.HashSetModifiableDBIDs set, int items)
      Bound the size of a set by random sampling.
      private void clearAll​(elki.database.ids.DBIDs ids, elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> sets)
      Clear (but reuse) all sets in the given storage.
      protected elki.logging.Logging getLogger()
      Get the class's static logger.
      elki.database.query.knn.KNNSearcher<O> kNNByObject​(elki.database.query.distance.DistanceQuery<O> distanceQuery, int maxk, int flags)  
      protected void preprocess()
      Perform the preprocessing step.
      private int processNewNeighbors​(elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> flag, elki.database.ids.HashSetModifiableDBIDs newFwd, elki.database.ids.HashSetModifiableDBIDs oldFwd, elki.database.ids.HashSetModifiableDBIDs newRev, elki.database.ids.HashSetModifiableDBIDs oldRev)
      Process new neighbors.
      private void reverse​(elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> sampleNewHash, elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> newReverseNeighbors, elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> oldReverseNeighbors)
      Calculates the new and old reverse neighbors for the database.
      private int sampleNew​(elki.database.ids.DBIDs ids, elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> sampleNewNeighbors, elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> newNeighborHash, int items)
      Samples new neighbors for every object.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
      • Methods inherited from interface elki.index.Index

        logStatistics
    • Field Detail

      • LOG

        private static final elki.logging.Logging LOG
        Logger
      • prefix

        private java.lang.String prefix
        Log prefix.
      • rnd

        private final elki.utilities.random.RandomFactory rnd
        Random generator
      • delta

        private double delta
        early termination parameter
      • rho

        private double rho
        sample rate
      • iterations

        private int iterations
        maximum number of iterations
      • noInitialNeighbors

        private boolean noInitialNeighbors
        Do not use initial neighbors
      • store

        private elki.database.datastore.WritableDataStore<elki.database.ids.KNNHeap> store
        store for neighbors
    • Constructor Detail

      • NNDescent

        public NNDescent​(elki.database.relation.Relation<O> relation,
                         elki.distance.Distance<? super O> distance,
                         int k,
                         elki.utilities.random.RandomFactory rnd,
                         double delta,
                         double rho,
                         boolean noInitialNeighbors,
                         int iterations)
        Constructor.
        Parameters:
        relation - Relation to index
        distance - distance function
        k - Number of nearest neighbors
        rnd - Random generator
        delta - Early termination threshold
        rho - Sample rate
        noInitialNeighbors - Do not use initial neighbors
        iterations - Maximum number of iterations
    • Method Detail

      • clearAll

        private void clearAll​(elki.database.ids.DBIDs ids,
                              elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> sets)
        Clear (but reuse) all sets in the given storage.
        Parameters:
        ids - Ids to process
        sets - Sets to clear
      • boundSize

        private void boundSize​(elki.database.ids.HashSetModifiableDBIDs set,
                               int items)
        Bound the size of a set by random sampling.
        Parameters:
        set - Set to process
        items - Maximum size
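
        Bounding a set by random sampling can be sketched as follows. This is a minimal illustration on java.util.Set rather than ELKI's HashSetModifiableDBIDs; the class name BoundSizeSketch and the explicit Random argument are assumptions, and the shuffle-then-drop strategy is one possible approach, not necessarily the one ELKI uses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Illustrative only: shrink a set to a maximum size by random sampling. */
class BoundSizeSketch {
  /** Keep at most items elements of set, removing the rest at random. */
  static void boundSize(Set<Integer> set, int items, Random rnd) {
    if (set.size() <= items) return; // already within the bound
    List<Integer> all = new ArrayList<>(set);
    Collections.shuffle(all, rnd);
    // Everything after the first `items` shuffled entries is discarded.
    for (Integer drop : all.subList(items, all.size())) set.remove(drop);
  }

  public static void main(String[] args) {
    Set<Integer> set = new HashSet<>();
    for (int i = 0; i < 10; i++) set.add(i);
    boundSize(set, 4, new Random(0L));
    System.out.println("size after bounding: " + set.size());
  }
}
```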
      • processNewNeighbors

        private int processNewNeighbors​(elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> flag,
                                        elki.database.ids.HashSetModifiableDBIDs newFwd,
                                        elki.database.ids.HashSetModifiableDBIDs oldFwd,
                                        elki.database.ids.HashSetModifiableDBIDs newRev,
                                        elki.database.ids.HashSetModifiableDBIDs oldRev)
        Process new neighbors. This is a complex join: old neighbors do not need to be joined with old neighbors, and there are both forward and reverse neighbors to consider.
        Parameters:
        flag - Flags to mark new neighbors.
        newFwd - New forward neighbors
        oldFwd - Old forward neighbors
        newRev - New reverse neighbors
        oldRev - Old reverse neighbors
        Returns:
        Number of new neighbors
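
        The shape of this join can be sketched by enumerating candidate pairs. The sketch below is a simplification under stated assumptions: it merges forward and reverse sets before joining and only lists the pairs, whereas the real method works on the four sets and updates the neighbor heaps. The names LocalJoinSketch and candidatePairs are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: which pairs a new/old local join has to examine. */
class LocalJoinSketch {
  static List<int[]> candidatePairs(Set<Integer> newFwd, Set<Integer> oldFwd,
      Set<Integer> newRev, Set<Integer> oldRev) {
    Set<Integer> fresh = new LinkedHashSet<>(newFwd);
    fresh.addAll(newRev);
    Set<Integer> old = new LinkedHashSet<>(oldFwd);
    old.addAll(oldRev);
    old.removeAll(fresh); // assumption: an id marked new on either side counts as new
    List<Integer> f = new ArrayList<>(fresh);
    List<int[]> pairs = new ArrayList<>();
    for (int i = 0; i < f.size(); i++) {
      // new x new, each unordered pair once
      for (int j = i + 1; j < f.size(); j++) pairs.add(new int[] { f.get(i), f.get(j) });
      // new x old
      for (int o : old) pairs.add(new int[] { f.get(i), o });
    }
    return pairs; // old x old is skipped: those pairs were compared earlier
  }

  public static void main(String[] args) {
    List<int[]> pairs = candidatePairs(
        new HashSet<>(Arrays.asList(1)), new HashSet<>(Arrays.asList(3)),
        new HashSet<>(Arrays.asList(2)), new HashSet<>(Arrays.asList(4)));
    for (int[] p : pairs) System.out.println(p[0] + " - " + p[1]);
  }
}
```

        Skipping old-with-old pairs is what keeps later iterations cheap, because most neighbors are old once the graph starts to stabilize.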
      • add

        private boolean add​(elki.database.ids.DBIDRef cur,
                            elki.database.ids.DBIDRef cand,
                            double distance)
        Add cand to cur's neighbor heap with the given distance.
        Parameters:
        cur - Current object
        cand - Neighbor candidate
        distance - Distance
        Returns:
        true if it was a new neighbor.
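
        A bounded neighbor heap with this return convention can be sketched with java.util.PriorityQueue. This is an assumption-laden stand-in for ELKI's KNNHeap, not its actual behavior: the class BoundedHeapSketch, its Entry type, and the duplicate check by linear scan are all hypothetical.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Illustrative only: keeps the k closest candidates seen so far. */
class BoundedHeapSketch {
  static final class Entry {
    final int id;
    final double dist;

    Entry(int id, double dist) {
      this.id = id;
      this.dist = dist;
    }
  }

  final int k;
  // Max-heap on distance, so the farthest kept neighbor is at the top.
  final PriorityQueue<Entry> heap =
      new PriorityQueue<>(Comparator.comparingDouble((Entry e) -> e.dist).reversed());

  BoundedHeapSketch(int k) {
    this.k = k;
  }

  /** Returns true if cand was inserted as a new neighbor. */
  boolean add(int cand, double distance) {
    for (Entry e : heap) {
      if (e.id == cand) return false; // already a neighbor
    }
    if (heap.size() < k) {
      heap.add(new Entry(cand, distance));
      return true;
    }
    if (distance >= heap.peek().dist) return false; // not closer than the current worst
    heap.poll(); // evict the farthest neighbor
    heap.add(new Entry(cand, distance));
    return true;
  }

  public static void main(String[] args) {
    BoundedHeapSketch h = new BoundedHeapSketch(2);
    System.out.println(h.add(1, 5.0));
    System.out.println(h.add(2, 3.0));
    System.out.println(h.add(3, 4.0));
    System.out.println(h.add(4, 9.0));
  }
}
```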
      • addpair

        private void addpair​(elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> newNeighbors,
                             elki.database.ids.DBIDRef o1,
                             elki.database.ids.DBIDRef o2)
      • sampleNew

        private int sampleNew​(elki.database.ids.DBIDs ids,
                              elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> sampleNewNeighbors,
                              elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> newNeighborHash,
                              int items)
        Samples new neighbors for every object.
        Parameters:
        ids - All ids
        sampleNewNeighbors - Output of sampled new neighbors
        newNeighborHash - New neighbors for every object
        items - Number of items to collect
        Returns:
        Number of new neighbors
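
        Per-object sampling with a total count as return value might look like the sketch below. It uses java.util maps in place of ELKI data stores; the name SampleNewSketch, the explicit Random argument, and the shuffle-based selection are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

/** Illustrative only: sample up to `items` new neighbors per object. */
class SampleNewSketch {
  static int sampleNew(Map<Integer, Set<Integer>> newNeighbors,
      Map<Integer, Set<Integer>> sampled, int items, Random rnd) {
    int total = 0;
    for (Map.Entry<Integer, Set<Integer>> e : newNeighbors.entrySet()) {
      List<Integer> pool = new ArrayList<>(e.getValue());
      Collections.shuffle(pool, rnd);
      // Keep the first `items` shuffled entries (or all, if there are fewer).
      Set<Integer> pick = new HashSet<>(pool.subList(0, Math.min(items, pool.size())));
      sampled.put(e.getKey(), pick);
      total += pick.size();
    }
    return total; // number of sampled new neighbors over all objects
  }

  public static void main(String[] args) {
    Map<Integer, Set<Integer>> fresh = new HashMap<>();
    fresh.put(0, new HashSet<>(Arrays.asList(1, 2, 3)));
    fresh.put(1, new HashSet<>(Arrays.asList(0)));
    Map<Integer, Set<Integer>> sampled = new HashMap<>();
    System.out.println("sampled: " + sampleNew(fresh, sampled, 2, new Random(0L)));
  }
}
```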
      • reverse

        private void reverse​(elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> sampleNewHash,
                             elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> newReverseNeighbors,
                             elki.database.datastore.WritableDataStore<elki.database.ids.HashSetModifiableDBIDs> oldReverseNeighbors)
        Calculates the new and old reverse neighbors for the database.
        Parameters:
        sampleNewHash - new neighbors for every object
        newReverseNeighbors - new reverse neighbors
        oldReverseNeighbors - old reverse neighbors
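
        The notion of reverse neighbors can be made concrete with a short sketch: if b is a forward neighbor of a, then a is a reverse neighbor of b. The code uses java.util maps instead of ELKI data stores and does not separate new from old neighbors; ReverseSketch and its reverse method are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative only: invert a forward neighbor map. */
class ReverseSketch {
  static Map<Integer, Set<Integer>> reverse(Map<Integer, Set<Integer>> forward) {
    Map<Integer, Set<Integer>> rev = new HashMap<>();
    for (Map.Entry<Integer, Set<Integer>> e : forward.entrySet()) {
      for (int nb : e.getValue()) {
        // e.getKey() lists nb as a neighbor, so it is a reverse neighbor of nb.
        rev.computeIfAbsent(nb, x -> new HashSet<>()).add(e.getKey());
      }
    }
    return rev;
  }

  public static void main(String[] args) {
    Map<Integer, Set<Integer>> fwd = new HashMap<>();
    fwd.put(0, new HashSet<>(Arrays.asList(1, 2)));
    fwd.put(1, new HashSet<>(Arrays.asList(2)));
    System.out.println(reverse(fwd));
  }
}
```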
      • kNNByObject

        public elki.database.query.knn.KNNSearcher<O> kNNByObject​(elki.database.query.distance.DistanceQuery<O> distanceQuery,
                                                                  int maxk,
                                                                  int flags)
        Specified by:
        kNNByObject in interface elki.index.KNNIndex<O>
        Overrides:
        kNNByObject in class AbstractMaterializeKNNPreprocessor<O>