public class FarthestFirst extends RandomizableClusterer implements TechnicalInformationHandler
@article{Hochbaum1985,
author = {Hochbaum and Shmoys},
journal = {Mathematics of Operations Research},
number = {2},
pages = {180-184},
title = {A best possible heuristic for the k-center problem},
volume = {10},
year = {1985}
}
@inproceedings{Dasgupta2002,
author = {Sanjoy Dasgupta},
booktitle = {15th Annual Conference on Computational Learning Theory},
pages = {351-363},
publisher = {Springer},
title = {Performance Guarantees for Hierarchical Clustering},
year = {2002}
}
Valid options are:
-N <num> number of clusters. (default = 2).
-S <num> Random number seed. (default 1)
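
The two references above concern farthest-first traversal for the k-center problem. As a rough illustration of that idea (not Weka's source code), the sketch below picks a first center at random and then repeatedly adds the point farthest from every center chosen so far. The class name `FarthestFirstSketch` and the method `chooseCenters` are hypothetical. The helper names mirror `updateMinDistance` and `farthestAway` from the method summary, but their bodies are assumptions, as are the plain `double[][]` data, the Euclidean distance, and the use of the seed only to pick the first center.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative farthest-first traversal on plain arrays; not Weka's implementation. */
public class FarthestFirstSketch {

    /** Plain Euclidean distance between two points of equal length (an assumption). */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Lowers each unselected point's distance to its nearest center, given a new center. */
    static void updateMinDistance(double[] minDistance, boolean[] selected,
                                  double[][] data, double[] center) {
        for (int i = 0; i < data.length; i++) {
            if (!selected[i]) {
                minDistance[i] = Math.min(minDistance[i], distance(center, data[i]));
            }
        }
    }

    /** Index of the unselected point farthest from its nearest center, or -1 if none is left. */
    static int farthestAway(double[] minDistance, boolean[] selected) {
        int best = -1;
        double bestDist = -1.0;
        for (int i = 0; i < minDistance.length; i++) {
            if (!selected[i] && minDistance[i] > bestDist) {
                bestDist = minDistance[i];
                best = i;
            }
        }
        return best;
    }

    /** Returns the row indices of at most k centers; the seed only picks the first one. */
    static int[] chooseCenters(double[][] data, int k, long seed) {
        int n = data.length;
        int count = Math.min(k, n);
        int[] centers = new int[count];
        if (count == 0) {
            return centers;
        }
        boolean[] selected = new boolean[n];
        double[] minDistance = new double[n];
        Arrays.fill(minDistance, Double.MAX_VALUE);

        int current = new Random(seed).nextInt(n);
        for (int c = 0; c < count; c++) {
            centers[c] = current;
            selected[current] = true;
            updateMinDistance(minDistance, selected, data, data[current]);
            current = farthestAway(minDistance, selected);
        }
        return centers;
    }

    public static void main(String[] args) {
        // Three well-separated groups: rows 0-1, rows 2-3, and row 4.
        double[][] data = { {0, 0}, {0, 1}, {10, 0}, {10, 1}, {5, 20} };
        System.out.println(Arrays.toString(chooseCenters(data, 3, 1L)));
    }
}
```

In this sketch each new center costs one pass over the data, so choosing k centers from n points takes on the order of n times k distance computations.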
See Also: RandomizableClusterer, Serialized Form

| Modifier and Type | Field and Description |
|---|---|
| protected Instances | m_ClusterCentroids: holds the cluster centroids |
| protected Instances | m_instances: training instances, not necessary to keep, could be replaced by m_ClusterCentroids where needed for header info |
| protected int | m_NumClusters: number of clusters to generate |
| protected ReplaceMissingValues | m_ReplaceMissingFilter: replace missing values in training instances |
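
Fields inherited from class RandomizableClusterer: m_Seed, m_SeedDefault
Fields inherited from class AbstractClusterer: m_Debug, m_DoNotCheckCapabilities

| Constructor and Description |
|---|
| FarthestFirst() |
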
| Modifier and Type | Method and Description |
|---|---|
| void | buildClusterer(Instances data): Generates a clusterer. |
| int | clusterInstance(Instance instance): Classifies a given instance. |
| protected int | clusterProcessedInstance(Instance instance): clusters an instance that has been through the filters |
| protected double | difference(int index, double val1, double val2): Computes the difference between two given attribute values. |
| protected double | distance(Instance first, Instance second): Calculates the distance between two instances |
| protected int | farthestAway(double[] minDistance, boolean[] selected) |
| Capabilities | getCapabilities(): Returns default capabilities of the clusterer. |
| Instances | getClusterCentroids(): Get the centroids found by FarthestFirst |
| int | getNumClusters(): gets the number of clusters to generate |
| java.lang.String[] | getOptions(): Gets the current settings of FarthestFirst |
| java.lang.String | getRevision(): Returns the revision string. |
| TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| java.lang.String | globalInfo(): Returns a string describing this clusterer |
| protected void | initMinMax(Instances data) |
| java.util.Enumeration<Option> | listOptions(): Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] argv): Main method for testing this class. |
| protected double | norm(double x, int i): Normalizes a given value of a numeric attribute. |
| int | numberOfClusters(): Returns the number of clusters. |
| java.lang.String | numClustersTipText(): Returns the tip text for this property |
| void | setNumClusters(int n): set the number of clusters to generate |
| void | setOptions(java.lang.String[] options): Parses a given list of options. |
| java.lang.String | toString(): return a string describing this clusterer |
| protected void | updateMinDistance(double[] minDistance, boolean[] selected, Instances data, Instance center) |

Methods inherited from class RandomizableClusterer: getSeed, seedTipText, setSeed
Methods inherited from class AbstractClusterer: debugTipText, distributionForInstance, doNotCheckCapabilitiesTipText, forName, getDebug, getDoNotCheckCapabilities, makeCopies, makeCopy, postExecution, preExecution, run, runClusterer, setDebug, setDoNotCheckCapabilities

protected Instances m_instances
protected ReplaceMissingValues m_ReplaceMissingFilter
protected int m_NumClusters
protected Instances m_ClusterCentroids

public java.lang.String globalInfo()

public TechnicalInformation getTechnicalInformation()
getTechnicalInformation in interface TechnicalInformationHandler

public Capabilities getCapabilities()
getCapabilities in interface Clusterer
getCapabilities in interface CapabilitiesHandler
getCapabilities in class AbstractClusterer
See Also: Capabilities

public void buildClusterer(Instances data) throws java.lang.Exception
buildClusterer in interface Clusterer
buildClusterer in class AbstractClusterer
Parameters: data - set of instances serving as training data
Throws: java.lang.Exception - if the clusterer has not been generated successfully

protected void updateMinDistance(double[] minDistance, boolean[] selected, Instances data, Instance center)

protected int farthestAway(double[] minDistance, boolean[] selected)
protected void initMinMax(Instances data)
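
The page gives no description for `initMinMax`, but its name, together with `norm(double x, int i)` ("Normalizes a given value of a numeric attribute"), suggests a min-max scaling step. The following is a minimal sketch of that technique under stated assumptions: the class `MinMaxNormSketch` is hypothetical, data is a plain `double[][]`, `NaN` stands in for a missing value, and a column with no observed range is mapped to 0. Weka's actual handling may differ.

```java
import java.util.Arrays;

/** Illustrative min-max normalization; names and edge-case handling are assumptions. */
public class MinMaxNormSketch {
    private final double[] min;
    private final double[] max;

    /** Scans the data once and records the observed range of every column. */
    MinMaxNormSketch(double[][] data, int numAttributes) {
        min = new double[numAttributes];
        max = new double[numAttributes];
        Arrays.fill(min, Double.NaN); // NaN marks "no value seen yet"
        Arrays.fill(max, Double.NaN);
        for (double[] row : data) {
            for (int i = 0; i < numAttributes; i++) {
                double v = row[i];
                if (Double.isNaN(v)) {
                    continue; // treat NaN as a missing value and skip it
                }
                if (Double.isNaN(min[i])) {
                    min[i] = v;
                    max[i] = v;
                } else {
                    min[i] = Math.min(min[i], v);
                    max[i] = Math.max(max[i], v);
                }
            }
        }
    }

    /** Maps x into [0, 1] relative to column i; returns 0 when the range is empty or unknown. */
    double norm(double x, int i) {
        if (Double.isNaN(min[i]) || max[i] == min[i]) {
            return 0.0;
        }
        return (x - min[i]) / (max[i] - min[i]);
    }
}
```

Scaling every numeric attribute to a common range keeps an attribute with large raw values from dominating the distance computation.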

protected int clusterProcessedInstance(Instance instance)
Parameters: instance - the instance to assign a cluster to

public int clusterInstance(Instance instance) throws java.lang.Exception
clusterInstance in interface Clusterer
clusterInstance in class AbstractClusterer
Parameters: instance - the instance to be assigned to a cluster
Throws: java.lang.Exception - if instance could not be classified successfully

protected double distance(Instance first, Instance second)
Parameters:
first - the first instance
second - the second instance

protected double difference(int index, double val1, double val2)
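
To make the relationship between `distance` and `difference` more concrete, here is a hedged sketch of one common way to combine per-attribute differences into a single distance over mixed nominal and numeric attributes. It is not taken from Weka's source: the class `MixedDistanceSketch` is hypothetical, `difference` takes a `boolean nominal` flag instead of an attribute index, numeric values are assumed to be normalized already, and missing values are assumed to have been replaced beforehand.

```java
/** Illustrative mixed-attribute distance; signatures are simplified assumptions. */
public class MixedDistanceSketch {

    /** Per-attribute difference: 0/1 mismatch for nominal values, signed gap for numeric ones. */
    static double difference(boolean nominal, double val1, double val2) {
        if (nominal) {
            return (val1 == val2) ? 0.0 : 1.0;
        }
        return val1 - val2;
    }

    /** Euclidean-style distance: square root of the summed squared per-attribute differences. */
    static double distance(double[] first, double[] second, boolean[] nominal) {
        double sum = 0.0;
        for (int i = 0; i < first.length; i++) {
            double d = difference(nominal[i], first[i], second[i]);
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        boolean[] nominal = { false, true };
        double[] a = { 0.0, 1.0 };
        double[] b = { 1.0, 2.0 };
        // Numeric gap of 1 plus one nominal mismatch: sqrt(1 + 1).
        System.out.println(distance(a, b, nominal));
    }
}
```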

protected double norm(double x, int i)
Parameters:
x - the value to be normalized
i - the attribute's index

public int numberOfClusters() throws java.lang.Exception
numberOfClusters in interface Clusterer
numberOfClusters in class AbstractClusterer
Throws: java.lang.Exception - if number of clusters could not be returned successfully

public Instances getClusterCentroids()

public java.util.Enumeration<Option> listOptions()
listOptions in interface OptionHandler
listOptions in class RandomizableClusterer

public java.lang.String numClustersTipText()

public void setNumClusters(int n) throws java.lang.Exception
Parameters: n - the number of clusters to generate
Throws: java.lang.Exception - if number of clusters is negative

public int getNumClusters()

public void setOptions(java.lang.String[] options) throws java.lang.Exception
Valid options are:
-N <num> number of clusters. (default = 2).
-S <num> Random number seed. (default 1)
setOptions in interface OptionHandler
setOptions in class RandomizableClusterer
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported

public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class RandomizableClusterer

public java.lang.String toString()
toString in class java.lang.Object

public java.lang.String getRevision()
getRevision in interface RevisionHandler
getRevision in class AbstractClusterer

public static void main(java.lang.String[] argv)
Parameters: argv - should contain the following arguments: -t training file [-N number of clusters]