Class XMeans<V extends elki.data.NumberVector,​M extends MeanModel>

  • Type Parameters:
    V - Vector type
    M - Model type
    All Implemented Interfaces:
    elki.Algorithm, ClusteringAlgorithm<Clustering<M>>, KMeans<V,​M>
    Direct Known Subclasses:
    GMeans

    @Title("X-means")
    @Reference(authors="D. Pelleg, A. Moore",
               title="X-means: Extending K-means with Efficient Estimation on the Number of Clusters",
               booktitle="Proc. 17th Int. Conf. on Machine Learning (ICML 2000)",
               url="http://www.pelleg.org/shared/hp/download/xmeans.ps",
               bibkey="DBLP:conf/icml/PellegM00")
    public class XMeans<V extends elki.data.NumberVector,​M extends MeanModel>
    extends AbstractKMeans<V,​M>
    X-means: Extending K-means with Efficient Estimation on the Number of Clusters.

    Note: this implementation does currently not use a k-d-tree for acceleration. Also note that kmax is not a hard threshold - the algorithm can return up to 2*kmax clusters!

    Reference:

    D. Pelleg, A. Moore
    X-means: Extending K-means with Efficient Estimation on the Number of Clusters
    Proc. 17th Int. Conf. on Machine Learning (ICML 2000)

    Since:
    0.7.0
    Author:
    Tibor Goldschwendt, Erich Schubert
    • Field Detail

      • LOG

        private static final elki.logging.Logging LOG
        The logger for this class.
      • innerKMeans

        protected KMeans<V extends elki.data.NumberVector,​M extends MeanModel> innerKMeans
        Inner k-means algorithm.
      • k_min

        private int k_min
        Effective number of clusters, minimum and maximum.
      • k_max

        private int k_max
        Effective number of clusters, minimum and maximum.
      • splitInitializer

        Predefined splitInitializer
        Initializer for k-means.
      • informationCriterion

        KMeansQualityMeasure<V extends elki.data.NumberVector> informationCriterion
        Information criterion to choose the better split.
      • rnd

        elki.utilities.random.RandomFactory rnd
        Random factory.
    • Constructor Detail

      • XMeans

        public XMeans​(elki.distance.NumberVectorDistance<? super V> distance,
                      int k_min,
                      int k_max,
                      int maxiter,
                      KMeans<V,​M> innerKMeans,
                      KMeansInitialization initializer,
                      KMeansQualityMeasure<V> informationCriterion,
                      elki.utilities.random.RandomFactory random)
        Constructor.
        Parameters:
        distance - Distance function
        k_min - k_min parameter - minimum number of result clusters
        k_max - k_max parameter - maximum number of result clusters
        maxiter - Maximum number of iterations each.
        innerKMeans - K-Means variant to use inside.
        informationCriterion - The information criterion used for the splitting step
        random - Random factory
    • Method Detail

      • run

        public Clustering<M> run​(elki.database.relation.Relation<V> relation)
        Run the algorithm on a database and relation.
        Parameters:
        relation - Data relation
        Returns:
        Clustering result.
      • splitCluster

        protected java.util.List<Cluster<M>> splitCluster​(Cluster<M> parentCluster,
                                                          elki.database.relation.Relation<V> relation)
        Conditionally splits the clusters based on the information criterion.
        Parameters:
        parentCluster - Cluster to split
        relation - Data relation
        Returns:
        Parent cluster when split decreases clustering quality or child clusters when split improves clustering.
      • splitCentroid

        protected double[][] splitCentroid​(Cluster<? extends MeanModel> parentCluster,
                                           elki.database.relation.Relation<V> relation)
        Split an existing centroid into two initial centers.
        Parameters:
        parentCluster - Existing cluster
        relation - Data relation
        Returns:
        List of new centroids
      • getInputTypeRestriction

        public elki.data.type.TypeInformation[] getInputTypeRestriction()
        Specified by:
        getInputTypeRestriction in interface elki.Algorithm
        Overrides:
        getInputTypeRestriction in class AbstractKMeans<V extends elki.data.NumberVector,​M extends MeanModel>
      • getLogger

        protected elki.logging.Logging getLogger()
        Description copied from class: AbstractKMeans
        Get the (STATIC) logger for this class.
        Specified by:
        getLogger in class AbstractKMeans<V extends elki.data.NumberVector,​M extends MeanModel>
        Returns:
        the static logger