Class CLARANS<O>

  • Type Parameters:
    O - Input data type
    All Implemented Interfaces:
    elki.Algorithm, ClusteringAlgorithm<Clustering<MedoidModel>>, KMedoidsClustering<O>
    Direct Known Subclasses:
    FastCLARANS

    @Reference(authors="R. T. Ng, J. Han",
               title="CLARANS: a method for clustering objects for spatial data mining",
               booktitle="IEEE Transactions on Knowledge and Data Engineering 14(5)",
               url="https://doi.org/10.1109/TKDE.2002.1033770",
               bibkey="DBLP:journals/tkde/NgH02")
    public class CLARANS<O>
    extends java.lang.Object
    implements KMedoidsClustering<O>
    CLARANS: a method for clustering objects for spatial data mining is inspired by PAM (partitioning around medoids, PAM) and CLARA and also based on sampling.

    This implementation tries to balance memory and computation time. By caching the distances to the two nearest medoids, we usually only need O(n) instead of O(nk) distance computations for one iteration, at the cost of needing O(2n) memory to store them.

    The implementation is fairly ugly, because we have three solutions (the best found so far, the current solution, and a neighbor candidate); and for each point in each solution we need the best and second best assignments. But with Java 11, we may be able to switch to value types that would clean this code significantly, without the overhead of O(n) objects.

    Reference:

    R. T. Ng, J. Han
    CLARANS: a method for clustering objects for spatial data mining
    IEEE Transactions on Knowledge and Data Engineering 14(5)

    Since:
    0.7.5
    Author:
    Erich Schubert
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      protected static class  CLARANS.Assignment
      Assignment state.
      static class  CLARANS.Par<V>
      Parameterization class.
      • Nested classes/interfaces inherited from interface elki.Algorithm

        elki.Algorithm.Utils
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected elki.distance.Distance<? super O> distance
      Distance function used.
      protected int k
      Number of clusters to find.
      private static elki.logging.Logging LOG
      Class logger.
      protected double maxneighbor
      Sampling rate.
      protected int numlocal
      Number of samples to draw (i.e. restarts).
      protected elki.utilities.random.RandomFactory random
      Random factory for initialization.
    • Constructor Summary

      Constructors 
      Constructor Description
      CLARANS​(elki.distance.Distance<? super O> distance, int k, int numlocal, double maxneighbor, elki.utilities.random.RandomFactory random)
      Constructor.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      elki.data.type.TypeInformation[] getInputTypeRestriction()  
      Clustering<MedoidModel> run​(elki.database.relation.Relation<O> relation)
      Run CLARANS clustering.
      Clustering<MedoidModel> run​(elki.database.relation.Relation<O> relation, int k, elki.database.query.distance.DistanceQuery<? super O> distQ)
      Run k-medoids clustering with a given distance query.
      Not a very elegant API, but needed for some types of nested k-medoids.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • LOG

        private static final elki.logging.Logging LOG
        Class logger.
      • distance

        protected elki.distance.Distance<? super O> distance
        Distance function used.
      • k

        protected int k
        Number of clusters to find.
      • numlocal

        protected int numlocal
        Number of samples to draw (i.e. restarts).
      • maxneighbor

        protected double maxneighbor
        Sampling rate. If less than 1, it is considered to be a relative value.
      • random

        protected elki.utilities.random.RandomFactory random
        Random factory for initialization.
    • Constructor Detail

      • CLARANS

        public CLARANS​(elki.distance.Distance<? super O> distance,
                       int k,
                       int numlocal,
                       double maxneighbor,
                       elki.utilities.random.RandomFactory random)
        Constructor.
        Parameters:
        distance - Distance function to use
        k - Number of clusters to produce
        numlocal - Number of samples (restarts)
        maxneighbor - Neighbor sampling rate (absolute or relative)
        random - Random generator
    • Method Detail

      • run

        public Clustering<MedoidModel> run​(elki.database.relation.Relation<O> relation)
        Run CLARANS clustering.
        Specified by:
        run in interface KMedoidsClustering<O>
        Parameters:
        relation - Data relation
        Returns:
        Clustering
      • run

        public Clustering<MedoidModel> run​(elki.database.relation.Relation<O> relation,
                                           int k,
                                           elki.database.query.distance.DistanceQuery<? super O> distQ)
        Description copied from interface: KMedoidsClustering
        Run k-medoids clustering with a given distance query.
        Not a very elegant API, but needed for some types of nested k-medoids.
        Specified by:
        run in interface KMedoidsClustering<O>
        Parameters:
        relation - relation to use
        k - Number of clusters
        distQ - Distance query to use
        Returns:
        result
      • getInputTypeRestriction

        public elki.data.type.TypeInformation[] getInputTypeRestriction()
        Specified by:
        getInputTypeRestriction in interface elki.Algorithm