Class CFKPlusPlusTree


  • @Alias("tree")
    @Reference(authors="Andreas Lang and Erich Schubert",
               title="BETULA: Fast Clustering of Large Data with Improved BIRCH CF-Trees",
               booktitle="Information Systems",
               url="https://doi.org/10.1016/j.is.2021.101918",
               bibkey="DBLP:journals/is/LangS22")
    public class CFKPlusPlusTree
    extends AbstractCFKMeansInitialization
    Initializes k-means by following paths down the CF-tree, choosing each branch at random with weight given by its variance contribution. This is the strategy denoted "tree" in the reference.
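    The idea of a weighted descent can be illustrated with a minimal, self-contained sketch. It does not use the ELKI classes: SketchNode, its precomputed weight field, and the helper names are hypothetical stand-ins, and the weight is simply assumed to represent a subtree's variance contribution. At each level one child is drawn with probability proportional to its weight, and the walk stops at a leaf or after maxDepth levels.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for a CF-tree node; not an ELKI class. */
final class SketchNode {
  final String name;
  final double weight; // assumed: variance contribution of this subtree
  final List<SketchNode> children = new ArrayList<>();

  SketchNode(String name, double weight) {
    this.name = name;
    this.weight = weight;
  }

  SketchNode add(SketchNode child) {
    children.add(child);
    return this;
  }
}

final class WeightedDescentSketch {
  /** Draw one child with probability proportional to its weight. */
  static SketchNode pickChild(SketchNode node, Random rnd) {
    double total = 0;
    for (SketchNode c : node.children) {
      total += c.weight;
    }
    double r = rnd.nextDouble() * total;
    for (SketchNode c : node.children) {
      r -= c.weight;
      if (r < 0) {
        return c;
      }
    }
    // Fallback (all weights zero, or rounding error): take the last child.
    return node.children.get(node.children.size() - 1);
  }

  /** Walk down from the root until a leaf is reached or maxDepth levels are used. */
  static SketchNode descend(SketchNode root, int maxDepth, Random rnd) {
    SketchNode cur = root;
    for (int depth = 0; depth < maxDepth && !cur.children.isEmpty(); depth++) {
      cur = pickChild(cur, rnd);
    }
    return cur;
  }

  public static void main(String[] args) {
    SketchNode b = new SketchNode("b", 1.0)
        .add(new SketchNode("b1", 3.0))
        .add(new SketchNode("b2", 0.0));
    SketchNode root = new SketchNode("root", 1.0)
        .add(new SketchNode("a", 0.0))
        .add(b);
    // In this toy tree, b and b1 carry all the weight at their levels.
    System.out.println(descend(root, 10, new Random(42)).name);
  }
}
```

    Limiting the depth makes the walk stop at an inner node, whose summary then stands in for everything below it; how the actual class uses its depth limit is not described on this page.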

    References:

    Andreas Lang and Erich Schubert
    BETULA: Fast Clustering of Large Data with Improved BIRCH CF-Trees
    Information Systems

    Since:
    0.8.0
    Author:
    Andreas Lang
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  CFKPlusPlusTree.Par
      Parameterization class.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      (package private) CFInitWeight dist
      Distance function used to weight candidates when choosing the initial means.
      (package private) boolean firstUniform
      Choose the first center uniformly from the cluster features.
      (package private) int maxdepth
      Maximum tree depth at which a center may be chosen.
    • Constructor Summary

      Constructors 
      Constructor Description
      CFKPlusPlusTree(CFInitWeight dist, boolean firstUniform, int maxdepth, elki.utilities.random.RandomFactory rf)
      Constructor.
    • Field Detail

      • dist

        CFInitWeight dist
        Distance function used to weight candidates when choosing the initial means.
      • firstUniform

        boolean firstUniform
        Choose the first center uniformly from the cluster features.
      • maxdepth

        int maxdepth
        Maximum tree depth at which a center may be chosen.
    • Constructor Detail

      • CFKPlusPlusTree

        public CFKPlusPlusTree(CFInitWeight dist,
                               boolean firstUniform,
                               int maxdepth,
                               elki.utilities.random.RandomFactory rf)
        Constructor.
        Parameters:
        dist - distance function
        firstUniform - choose first center uniformly from the leaves
        maxdepth - maximum depth
        rf - random generator factory
    • Method Detail

      • chooseInitialMeans

        public double[][] chooseInitialMeans(CFTree<?> tree,
                                             java.util.List<? extends ClusterFeature> cfs,
                                             int k)
        Description copied from class: AbstractCFKMeansInitialization
        Build the initial models.
        Specified by:
        chooseInitialMeans in class AbstractCFKMeansInitialization
        Parameters:
        tree - CF tree
        cfs - Cluster features of the tree (may be ignored by tree-based initializations; should be an array list for efficiency)
        k - Number of clusters.
        Returns:
        initial cluster means
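        The shape of this contract (k rows, one mean per row) and the hint about array lists can be sketched without ELKI. Everything below is hypothetical: the NextCenter hook stands in for whatever strategy proposes further centers, and the uniform first pick is only an assumption about what a firstUniform option might do. Indexed access with get(int) is constant-time on an ArrayList but linear on a LinkedList, which is presumably why an array list is recommended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class InitialMeansLoopSketch {
  /** Hypothetical strategy hook: proposes the next center given those chosen so far. */
  interface NextCenter {
    double[] next(List<double[]> chosen, Random rnd);
  }

  /** Collect k initial means: a uniform first pick, then k - 1 strategy picks. */
  static double[][] chooseInitialMeans(List<double[]> leafMeans, int k,
      NextCenter strategy, Random rnd) {
    if (k < 1 || leafMeans.isEmpty()) {
      throw new IllegalArgumentException("need k >= 1 and at least one leaf mean");
    }
    List<double[]> chosen = new ArrayList<>(k);
    // Uniform pick by index; O(1) per access when leafMeans is an ArrayList.
    chosen.add(leafMeans.get(rnd.nextInt(leafMeans.size())).clone());
    while (chosen.size() < k) {
      chosen.add(strategy.next(chosen, rnd).clone());
    }
    // One row per cluster, one column per dimension.
    return chosen.toArray(new double[0][]);
  }

  public static void main(String[] args) {
    List<double[]> leafMeans = new ArrayList<>();
    leafMeans.add(new double[] { 1, 2 });
    leafMeans.add(new double[] { 3, 4 });
    // Dummy strategy for illustration only: always proposes the same point.
    double[][] means = chooseInitialMeans(leafMeans, 3,
        (chosen, rnd) -> new double[] { 9, 9 }, new Random(7));
    System.out.println(means.length + " means of dimension " + means[0].length);
  }
}
```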
      • chooseNextNode

        private AsClusterFeature chooseNextNode(CFNode<?> current,
                                                java.util.List<? extends ClusterFeature> ccs,
                                                java.util.Random rnd)
        Choose a child of the current node.
        Parameters:
        current - Current node
        ccs - Currently chosen cluster centers
        rnd - Random generator
        Returns:
        New cluster center
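        A single selection step of this kind might look as follows. This is a sketch under stated assumptions, not the ELKI implementation: children are represented by plain arrays of means and point counts, and the weight (point count times squared Euclidean distance to the nearest chosen center, in the spirit of k-means++) is only one plausible choice; the actual weight is whatever the configured CFInitWeight computes. The method returns the index of the sampled child rather than a cluster feature.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class NextChildSketch {
  /** Squared Euclidean distance between two points of equal dimension. */
  static double sqDist(double[] a, double[] b) {
    double s = 0;
    for (int d = 0; d < a.length; d++) {
      double diff = a[d] - b[d];
      s += diff * diff;
    }
    return s;
  }

  /** Assumed weight: point count times squared distance to the nearest chosen center. */
  static double weight(double[] mean, int n, List<double[]> chosen) {
    double min = Double.POSITIVE_INFINITY;
    for (double[] c : chosen) {
      min = Math.min(min, sqDist(mean, c));
    }
    return n * min;
  }

  /** Sample the index of one child with probability proportional to its weight. */
  static int chooseNext(double[][] childMeans, int[] childSizes,
      List<double[]> chosen, Random rnd) {
    if (chosen.isEmpty()) {
      throw new IllegalArgumentException("need at least one chosen center");
    }
    double[] w = new double[childMeans.length];
    double total = 0;
    for (int i = 0; i < w.length; i++) {
      w[i] = weight(childMeans[i], childSizes[i], chosen);
      total += w[i];
    }
    double r = rnd.nextDouble() * total;
    for (int i = 0; i < w.length; i++) {
      r -= w[i];
      if (r < 0) {
        return i;
      }
    }
    // Fallback (all weights zero, or rounding error): take the last child.
    return w.length - 1;
  }

  public static void main(String[] args) {
    double[][] means = { { 0, 0 }, { 10, 0 } };
    int[] sizes = { 5, 5 };
    List<double[]> chosen = new ArrayList<>();
    chosen.add(new double[] { 0, 0 });
    // Child 0 coincides with the chosen center (weight 0); child 1 has weight 5 * 100.
    System.out.println(chooseNext(means, sizes, chosen, new Random(1)));
  }
}
```

        Children that already lie close to a chosen center receive little weight, so the next center tends to come from a part of the tree that is not yet covered.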