Class SampleKMeans<V extends elki.data.NumberVector>

  • Type Parameters:
    V - Vector type
    All Implemented Interfaces:
    KMeansInitialization

    @Reference(authors="P. S. Bradley, U. M. Fayyad",
               title="Refining Initial Points for K-Means Clustering",
               booktitle="Proc. 15th Int. Conf. on Machine Learning (ICML 1998)",
               bibkey="DBLP:conf/icml/BradleyF98")
    public class SampleKMeans<V extends elki.data.NumberVector>
    extends AbstractKMeansInitialization
    Initialize k-means by running k-means on a random sample of the data set only, and using the resulting cluster centers as the initial means.

    Reference:

    The idea of finding centers on a sample can be found in:

    P. S. Bradley, U. M. Fayyad
    Refining Initial Points for K-Means Clustering
    Proc. 15th Int. Conf. on Machine Learning (ICML 1998)

    However, Bradley and Fayyad also suggest repeating this multiple times. This implementation uses only a single attempt.
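    The idea described above can be illustrated with a minimal, stand-alone sketch. It does not use the ELKI API and is not the ELKI implementation: the class SampleKMeansSketch, its method names, the interpretation of rate as a fraction of the data set, the seeding from the first k sampled points, and the plain Lloyd iteration used as the inner k-means are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch: initial means from k-means on a sample. Not ELKI code. */
class SampleKMeansSketch {
  /** Draw a random sample, run Lloyd k-means on it, and return the k centers. */
  static double[][] chooseInitialMeans(double[][] data, int k, double rate, Random rnd) {
    if (k < 1 || k > data.length) {
      throw new IllegalArgumentException("k must be between 1 and the data set size");
    }
    // Assumption: rate is a fraction of the data set; use at least k points.
    int n = Math.max(k, Math.min(data.length, (int) Math.round(rate * data.length)));
    List<double[]> shuffled = new ArrayList<>(Arrays.asList(data));
    Collections.shuffle(shuffled, rnd);
    List<double[]> sample = shuffled.subList(0, n);
    // Assumption: seed the inner k-means with the first k sampled points.
    double[][] means = new double[k][];
    for (int i = 0; i < k; i++) {
      means[i] = sample.get(i).clone();
    }
    return lloyd(sample, means, 100);
  }

  /** Plain Lloyd iterations: assign to nearest mean, then recompute the means. */
  static double[][] lloyd(List<double[]> points, double[][] means, int maxIter) {
    int k = means.length, dim = means[0].length;
    int[] assignment = new int[points.size()];
    Arrays.fill(assignment, -1);
    for (int iter = 0; iter < maxIter; iter++) {
      boolean changed = false;
      double[][] sums = new double[k][dim];
      int[] counts = new int[k];
      for (int p = 0; p < points.size(); p++) {
        double[] x = points.get(p);
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < k; c++) {
          double d = squaredDistance(x, means[c]);
          if (d < bestDist) {
            bestDist = d;
            best = c;
          }
        }
        changed |= assignment[p] != best;
        assignment[p] = best;
        counts[best]++;
        for (int j = 0; j < dim; j++) {
          sums[best][j] += x[j];
        }
      }
      if (!changed) {
        break; // assignments are stable, so the means are already up to date
      }
      for (int c = 0; c < k; c++) {
        if (counts[c] == 0) {
          continue; // keep the previous mean of an empty cluster
        }
        for (int j = 0; j < dim; j++) {
          means[c][j] = sums[c][j] / counts[c];
        }
      }
    }
    return means;
  }

  static double squaredDistance(double[] a, double[] b) {
    double sum = 0;
    for (int j = 0; j < a.length; j++) {
      double diff = a[j] - b[j];
      sum += diff * diff;
    }
    return sum;
  }

  public static void main(String[] args) {
    double[][] data = { {0, 0}, {1, 0}, {0, 1}, {1, 1}, {10, 10}, {11, 10}, {10, 11}, {11, 11} };
    double[][] means = chooseInitialMeans(data, 2, 0.5, new Random(42));
    System.out.println(Arrays.deepToString(means));
  }
}
```

    The returned array would then seed a full k-means run on the complete data set. Because only a single sample is drawn, the quality of the result depends on that one sample; the repeated variant suggested by Bradley and Fayyad is not sketched here.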

    Since:
    0.6.0
    Author:
    Erich Schubert
    • Constructor Summary

      Constructors 
      Constructor Description
      SampleKMeans(elki.utilities.random.RandomFactory rnd, KMeans<V,?> innerkMeans, double rate)
      Constructor.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      double[][] chooseInitialMeans(elki.database.relation.Relation<? extends elki.data.NumberVector> relation, int k, elki.distance.NumberVectorDistance<?> distance)
      Choose initial means
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • innerkMeans

        private KMeans<V,?> innerkMeans
        Variant of k-means to use for initialization.
      • rate

        private double rate
        Sampling rate.
    • Constructor Detail

      • SampleKMeans

        public SampleKMeans(elki.utilities.random.RandomFactory rnd,
                            KMeans<V,?> innerkMeans,
                            double rate)
        Constructor.
        Parameters:
        rnd - Random generator.
        innerkMeans - Inner k-means algorithm.
        rate - Sampling rate.
    • Method Detail

      • chooseInitialMeans

        public double[][] chooseInitialMeans(elki.database.relation.Relation<? extends elki.data.NumberVector> relation,
                                             int k,
                                             elki.distance.NumberVectorDistance<?> distance)
        Description copied from interface: KMeansInitialization
        Choose initial means
        Parameters:
        relation - Relation containing the data vectors
        k - Number of means to choose
        distance - Distance function
        Returns:
        Array of chosen means for k-means