Class ClustersWithNoiseExtraction
- java.lang.Object
-
- elki.clustering.hierarchical.extraction.ClustersWithNoiseExtraction
-
- All Implemented Interfaces:
elki.Algorithm, ClusteringAlgorithm<Clustering<Model>>
@Reference(authors="Erich Schubert, Michael Gertz",
    title="Semantic Word Clouds with Background Corpus Normalization and t-distributed Stochastic Neighbor Embedding",
    booktitle="ArXiV preprint, 1708.03569",
    url="http://arxiv.org/abs/1708.03569",
    bibkey="DBLP:journals/corr/abs-1708-03569")
@Priority(206)
public class ClustersWithNoiseExtraction
extends java.lang.Object
implements ClusteringAlgorithm<Clustering<Model>>
Extraction of a given number of clusters with a minimum size, and noise.
This performs the highest cut at which k clusters are retained, each with at least the minimum size, plus noise (single points that would only be merged later). If no such cut can be found, it returns a result with a relaxed k.
You need to specify: A) the minimum size of a cluster (a value of 1 does not make much sense, because the extraction will then simply execute all but the last k merges) and B) the desired number of clusters with at least minSize elements each.
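The cut selection described above can be illustrated with a small, self-contained sketch. This is not the ELKI implementation: the class name `NoiseCutSketch`, the methods `findCut` and `labels`, and the merge encoding (points are ids 0..n-1, merge i creates cluster id n+i) are assumptions made for illustration. The sketch also does not model the relaxed-k fallback and simply returns -1 when no cut yields exactly k sufficiently large clusters.

```java
import java.util.Arrays;

// Illustrative sketch only; names and merge encoding are hypothetical, not ELKI API.
class NoiseCutSketch {
  /**
   * Find the highest cut that leaves exactly k clusters of size >= minSize.
   * merges[i] = {a, b}: ids 0..n-1 are points, id n+i is the cluster made by merge i.
   * Returns the number of merges to execute, or -1 if no such cut exists.
   */
  static int findCut(int n, int[][] merges, int k, int minSize) {
    int[] size = new int[n + merges.length];
    Arrays.fill(size, 0, n, 1);
    // Number of current clusters that are large enough.
    int big = minSize <= 1 ? n : 0;
    int best = (big == k) ? 0 : -1;
    for (int i = 0; i < merges.length; i++) {
      int a = merges[i][0], b = merges[i][1];
      int s = size[a] + size[b];
      size[n + i] = s;
      if (size[a] >= minSize) big--;
      if (size[b] >= minSize) big--;
      if (s >= minSize) big++;
      // Later merges are higher in the hierarchy, so keep the last match.
      if (big == k) best = i + 1;
    }
    return best;
  }

  /** Label each point after executing the first `cut` merges; -1 marks noise. */
  static int[] labels(int n, int[][] merges, int cut, int minSize) {
    int total = n + merges.length;
    int[] parent = new int[total];
    int[] size = new int[total];
    for (int i = 0; i < total; i++) parent[i] = i;
    Arrays.fill(size, 0, n, 1);
    for (int i = 0; i < cut; i++) {
      int a = merges[i][0], b = merges[i][1];
      parent[a] = n + i;
      parent[b] = n + i;
      size[n + i] = size[a] + size[b];
    }
    int[] clusterId = new int[total];
    Arrays.fill(clusterId, -2); // -2 = no label assigned yet
    int[] out = new int[n];
    int next = 0;
    for (int p = 0; p < n; p++) {
      int r = p;
      while (parent[r] != r) r = parent[r]; // walk up to the current root
      if (size[r] < minSize) { out[p] = -1; continue; } // too small: noise
      if (clusterId[r] == -2) clusterId[r] = next++;
      out[p] = clusterId[r];
    }
    return out;
  }

  public static void main(String[] args) {
    // Seven points; point 6 only joins at the very last merge.
    int[][] merges = { {0, 1}, {2, 3}, {7, 4}, {8, 5}, {9, 10}, {11, 6} };
    int cut = findCut(7, merges, 2, 3);
    System.out.println(cut); // prints 4
    System.out.println(Arrays.toString(labels(7, merges, cut, 3)));
    // prints [0, 0, 1, 1, 0, 1, -1]
  }
}
```

In this toy hierarchy, executing the first four merges leaves two clusters of three points each, and the remaining single point is reported as noise rather than being forced into a cluster.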
Reference:
Erich Schubert, Michael Gertz
Semantic Word Clouds with Background Corpus Normalization and t-distributed Stochastic Neighbor Embedding
ArXiV preprint, 1708.03569
TODO: Also provide representatives and last merge height for clusters.
- Since:
- 0.7.5
- Author:
- Erich Schubert
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
protected class | ClustersWithNoiseExtraction.Instance | Instance for a single data set.
static class | ClustersWithNoiseExtraction.Par | Parameterization class.
-
Field Summary
Fields
Modifier and Type | Field | Description
private HierarchicalClusteringAlgorithm | algorithm | Clustering algorithm to run to obtain the hierarchy.
private static elki.logging.Logging | LOG | Class logger.
private int | minClSize | Minimum cluster size.
private int | numCl | Minimum number of clusters.
-
Constructor Summary
Constructors
Constructor | Description
ClustersWithNoiseExtraction(HierarchicalClusteringAlgorithm algorithm, int numCl, int minClSize) | Constructor.
-
Method Summary
All Methods | Instance Methods | Concrete Methods
Modifier and Type | Method | Description
Clustering<Model> | autorun(elki.database.Database database) |
elki.data.type.TypeInformation[] | getInputTypeRestriction() |
Clustering<Model> | run(ClusterMergeHistory merges) | Process an existing result.
-
-
-
Field Detail
-
LOG
private static final elki.logging.Logging LOG
Class logger.
-
numCl
private int numCl
Minimum number of clusters.
-
minClSize
private int minClSize
Minimum cluster size.
-
algorithm
private HierarchicalClusteringAlgorithm algorithm
Clustering algorithm to run to obtain the hierarchy.
-
-
Constructor Detail
-
ClustersWithNoiseExtraction
public ClustersWithNoiseExtraction(HierarchicalClusteringAlgorithm algorithm, int numCl, int minClSize)
Constructor.
- Parameters:
algorithm - Algorithm to run
numCl - Number of clusters
minClSize - Minimum cluster size
-
-
Method Detail
-
autorun
public Clustering<Model> autorun(elki.database.Database database)
- Specified by:
autorun in interface elki.Algorithm
- Specified by:
autorun in interface ClusteringAlgorithm<Clustering<Model>>
-
run
public Clustering<Model> run(ClusterMergeHistory merges)
Process an existing result.
- Parameters:
merges - Existing result in pointer representation.
- Returns:
Clustering
-
getInputTypeRestriction
public elki.data.type.TypeInformation[] getInputTypeRestriction()
- Specified by:
getInputTypeRestriction in interface elki.Algorithm
-
-