Class KmeansSamplingFactory<I extends org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance & org.apache.commons.math3.ml.clustering.Clusterable,​D extends org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<I>>

    • Constructor Detail

      • KmeansSamplingFactory

        public KmeansSamplingFactory()
    • Method Detail

      • setPreviousRun

        public void setPreviousRun​(KmeansSampling<I,​D> previousRun)
        Description copied from interface: IRerunnableSamplingAlgorithmFactory
        Set the previous run of the sampling algorithm, if one occurred, so that data can be obtained from it.
        Specified by:
        setPreviousRun in interface IRerunnableSamplingAlgorithmFactory<I extends org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance & org.apache.commons.math3.ml.clustering.Clusterable,​D extends org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<I>>
        Parameters:
        previousRun - Algorithm object of the previous run of the sampling algorithm.
      • setK

        public void setK​(int k)
        Set how many clusters shall be created. The default is the sample size.
        Parameters:
        k - Parameter k of k-means.
      • setClusterSeed

        public void setClusterSeed​(long clusterSeed)
        Set the seed the clustering will use for initialization. By default, no fixed seed is set and the system time is used instead.
        Parameters:
        clusterSeed - Seed used to initialize the clustering.
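
The effect of a fixed seed can be illustrated with plain java.util.Random. This is only a sketch of the general principle, assuming the clustering initialization behaves like any seeded pseudo-random generator; it does not call KmeansSamplingFactory, and the names SeedSketch and draw are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

class SeedSketch {
    /** Draws n pseudo-random values from a generator created with the given seed. */
    static long[] draw(long seed, int n) {
        Random random = new Random(seed);
        long[] values = new long[n];
        for (int i = 0; i < n; i++) {
            values[i] = random.nextLong();
        }
        return values;
    }

    public static void main(String[] args) {
        // Two generators created with the same seed yield the same sequence.
        System.out.println(Arrays.equals(draw(42L, 5), draw(42L, 5))); // prints true
    }
}
```

A seed derived from the system time differs between runs, so repeated runs would generally not start from the same initialization.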
      • setDistanceMeassure

        public void setDistanceMeassure​(org.apache.commons.math3.ml.distance.DistanceMeasure distanceMeassure)
        Set the distance measure for the clustering. Default is the Manhattan distance.
        Parameters:
        distanceMeassure - Distance measure used for the clustering.
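
For reference, the Manhattan distance named as the default is the sum of the absolute coordinate differences between two points. The following is a minimal stand-alone sketch of that formula; ManhattanSketch and manhattan are hypothetical names used for illustration and are not part of this class or of Commons Math.

```java
class ManhattanSketch {
    /** Sum of absolute coordinate differences between two points of equal dimension. */
    static double manhattan(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Points must have the same dimension");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    public static void main(String[] args) {
        // |1 - 4| + |2 - 6| = 3 + 4
        System.out.println(manhattan(new double[] {1, 2}, new double[] {4, 6})); // prints 7.0
    }
}
```

Any other implementation of org.apache.commons.math3.ml.distance.DistanceMeasure can be passed to setDistanceMeassure in place of the default.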
      • getAlgorithm

        public KmeansSampling<I,​D> getAlgorithm​(int sampleSize,
                                                      D inputDataset,
                                                      java.util.Random random)
        Description copied from interface: ISamplingAlgorithmFactory
        After the necessary configuration is done, this method returns a fully configured instance of the sampling algorithm.
        Specified by:
        getAlgorithm in interface ISamplingAlgorithmFactory<I extends org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance & org.apache.commons.math3.ml.clustering.Clusterable,​D extends org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<I>>
        Parameters:
        sampleSize - Desired size of the sample that will be created.
        inputDataset - Dataset where the sample will be drawn from.
        random - Random object to make samples reproducible.
        Returns:
        Configured sampling algorithm object.