public class PrincipalComponents extends Filter implements OptionHandler, UnsupervisedFilter
-C Center (rather than standardize) the data and compute PCA using the covariance (rather than the correlation) matrix.
-R <num> Retain enough PC attributes to account for this proportion of variance in the original data. (default: 0.95)
-A <num> Maximum number of attributes to include in transformed attribute names. (-1 = include all, default: 5)
-M <num> Maximum number of PC attributes to retain. (-1 = include all, default: -1)
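
The interaction between `-R` and `-M` can be illustrated with a small standalone sketch. It is not Weka's code: the class `RetentionSketch`, the method `componentsToRetain`, and the stopping rule (stop once the cumulative share of variance reaches the threshold, or once the cap is hit) are assumptions made for illustration, and the eigenvalues are assumed to arrive already sorted in descending order.

```java
/** Illustrative only, not Weka's implementation. */
final class RetentionSketch {

    /**
     * Counts how many leading components to keep.
     *
     * @param eigenvaluesDesc eigenvalues sorted in descending order (assumed input)
     * @param varianceCovered proportion of total variance to cover, as with -R
     * @param maxAttributes   cap on retained components, as with -M (-1 = no cap)
     */
    static int componentsToRetain(double[] eigenvaluesDesc,
                                  double varianceCovered,
                                  int maxAttributes) {
        double total = 0.0;
        for (double ev : eigenvaluesDesc) {
            total += ev;
        }

        // A negative cap means "include all", mirroring the -1 convention.
        int limit = (maxAttributes < 0)
                ? eigenvaluesDesc.length
                : Math.min(maxAttributes, eigenvaluesDesc.length);

        double cumulative = 0.0;
        int kept = 0;
        while (kept < limit) {
            cumulative += eigenvaluesDesc[kept];
            kept++;
            // Assumed rule: stop as soon as the requested share is reached.
            if (cumulative / total >= varianceCovered) {
                break;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] ev = {4.0, 3.0, 2.0, 1.0};
        System.out.println(componentsToRetain(ev, 0.95, -1));
        System.out.println(componentsToRetain(ev, 0.95, 2));
    }
}
```

Under these assumptions the cap only matters when it is smaller than the number of components the variance threshold would otherwise select.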
| Modifier and Type | Field and Description |
|---|---|
| protected Remove | m_AttributeFilter: Filter for removing the class attribute and nominal attributes with 0 or 1 value. |
| protected Center | m_centerFilter: Filter for centering the data. |
| protected int | m_ClassIndex: Class index. |
| protected no.uib.cipr.matrix.UpperSymmDenseMatrix | m_Correlation: Correlation matrix for the original data. |
| protected double | m_CoverVariance: The amount of variance to cover in the original data when retaining the best n PCs. |
| protected double[] | m_Eigenvalues: Eigenvalues for the corresponding eigenvectors. |
| protected double[][] | m_Eigenvectors: Will hold the unordered linear transformations of the (normalized) original data. |
| protected boolean | m_HasClass: Data has a class set. |
| protected int | m_MaxAttributes: Maximum number of attributes in the transformed data (-1 for all). |
| protected int | m_MaxAttrsInName: Maximum number of attributes in the transformed attribute name. |
| protected NominalToBinary | m_NominalToBinaryFilter: Filter for turning nominal values into numeric ones. |
| protected int | m_NumAttribs: Number of attributes. |
| protected int | m_NumInstances: Number of instances. |
| protected int | m_OutputNumAtts: The number of attributes in the PC-transformed data. |
| protected ReplaceMissingValues | m_ReplaceMissingFilter: Filter for replacing missing values. |
| protected int[] | m_SortedEigens: Sorted eigenvalues. |
| protected Standardize | m_standardizeFilter: Filter for standardizing the data. |
| protected double | m_SumOfEigenValues: Sum of the eigenvalues. |
| protected Instances | m_TrainCopy: Keep a copy for the class attribute (if set). |
| protected Instances | m_TrainInstances: The data to analyse/transform. |
| protected Instances | m_TransformedFormat: The header for the transformed data format. |

Fields inherited from class Filter: m_Debug, m_DoNotCheckCapabilities, m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts

| Constructor and Description |
|---|
| PrincipalComponents() |

| Modifier and Type | Method and Description |
|---|---|
| boolean | batchFinished(): Signify that this batch of input to the filter is finished. |
| java.lang.String | centerDataTipText(): Returns the tip text for this property. |
| protected Instance | convertInstance(Instance instance): Transform an instance in original (unnormalized) format. |
| protected Instances | determineOutputFormat(Instances inputFormat): Determines the output format based on the input format and returns this. |
| protected void | fillCovariance() |
| Capabilities | getCapabilities(): Returns the capabilities of this filter. |
| boolean | getCenterData(): Get whether to center (rather than standardize) the data. |
| int | getMaximumAttributeNames(): Gets maximum number of attributes to include in transformed attribute names. |
| int | getMaximumAttributes(): Gets maximum number of PC attributes to retain. |
| java.lang.String[] | getOptions(): Gets the current settings of the filter. |
| java.lang.String | getRevision(): Returns the revision string. |
| double | getVarianceCovered(): Gets the proportion of total variance to account for when retaining principal components. |
| java.lang.String | globalInfo(): Returns a string describing this filter. |
| boolean | input(Instance instance): Input an instance for filtering. |
| java.util.Enumeration<Option> | listOptions(): Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] args): Main method for running this filter. |
| java.lang.String | maximumAttributeNamesTipText(): Returns the tip text for this property. |
| java.lang.String | maximumAttributesTipText(): Returns the tip text for this property. |
| void | setCenterData(boolean center): Set whether to center (rather than standardize) the data. |
| boolean | setInputFormat(Instances instanceInfo): Sets the format of the input instances. |
| void | setMaximumAttributeNames(int value): Sets maximum number of attributes to include in transformed attribute names. |
| void | setMaximumAttributes(int value): Sets maximum number of PC attributes to retain. |
| void | setOptions(java.lang.String[] options): Parses a list of options for this object. |
| protected void | setup(Instances instances): Initializes the filter with the given input data. |
| void | setVarianceCovered(double value): Sets the amount of variance to account for when retaining principal components. |
| java.lang.String | varianceCoveredTipText(): Returns the tip text for this property. |

Methods inherited from class Filter: batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper

protected Instances m_TrainInstances
protected Instances m_TrainCopy
protected Instances m_TransformedFormat
protected boolean m_HasClass
protected int m_ClassIndex
protected int m_NumAttribs
protected int m_NumInstances
protected no.uib.cipr.matrix.UpperSymmDenseMatrix m_Correlation
protected double[][] m_Eigenvectors
protected double[] m_Eigenvalues
protected int[] m_SortedEigens
protected double m_SumOfEigenValues
protected ReplaceMissingValues m_ReplaceMissingFilter
protected NominalToBinary m_NominalToBinaryFilter
protected Remove m_AttributeFilter
protected Standardize m_standardizeFilter
protected Center m_centerFilter
protected int m_OutputNumAtts
protected double m_CoverVariance
protected int m_MaxAttrsInName
protected int m_MaxAttributes

public java.lang.String globalInfo()

public java.util.Enumeration<Option> listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class Filter

public void setOptions(java.lang.String[] options) throws java.lang.Exception
Valid options:
-C Center (rather than standardize) the data and compute PCA using the covariance (rather than the correlation) matrix.
-R <num> Retain enough PC attributes to account for this proportion of variance in the original data. (default: 0.95)
-A <num> Maximum number of attributes to include in transformed attribute names. (-1 = include all, default: 5)
-M <num> Maximum number of PC attributes to retain. (-1 = include all, default: -1)
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class Filter
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported

public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class Filter

public java.lang.String centerDataTipText()

public void setCenterData(boolean center)
Parameters: center - true if the data is to be centered rather than standardized

public boolean getCenterData()
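
The practical difference between centering and standardizing can be seen in a small standalone sketch. PCA on centered data works from the covariance matrix, while PCA on standardized data works from the correlation matrix, which is the covariance rescaled by the standard deviations. The class `ScalingSketch` and its methods are hypothetical helpers written for this illustration, not part of Weka, and they assume at least two instances and no constant attributes.

```java
/** Illustrative only, not Weka's implementation. */
final class ScalingSketch {

    /** Sample covariance matrix; rows are instances, columns are attributes. */
    static double[][] covariance(double[][] data) {
        int n = data.length;
        int d = data[0].length;

        // Column means, used to center each attribute.
        double[] mean = new double[d];
        for (double[] row : data) {
            for (int j = 0; j < d; j++) {
                mean[j] += row[j];
            }
        }
        for (int j = 0; j < d; j++) {
            mean[j] /= n;
        }

        // Accumulate products of centered values, dividing by n - 1.
        double[][] cov = new double[d][d];
        for (double[] row : data) {
            for (int j = 0; j < d; j++) {
                for (int k = 0; k < d; k++) {
                    cov[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (n - 1);
                }
            }
        }
        return cov;
    }

    /** Correlation matrix: covariance divided by the product of standard deviations. */
    static double[][] correlation(double[][] data) {
        double[][] cov = covariance(data);
        int d = cov.length;
        double[][] corr = new double[d][d];
        for (int j = 0; j < d; j++) {
            for (int k = 0; k < d; k++) {
                corr[j][k] = cov[j][k] / Math.sqrt(cov[j][j] * cov[k][k]);
            }
        }
        return corr;
    }
}
```

With the covariance matrix, an attribute measured on a larger scale contributes a larger variance and tends to dominate the leading components; the correlation matrix removes that scale effect, since every diagonal entry becomes 1.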

public java.lang.String varianceCoveredTipText()

public void setVarianceCovered(double value)
Parameters: value - the proportion of total variance to account for

public double getVarianceCovered()

public java.lang.String maximumAttributeNamesTipText()

public void setMaximumAttributeNames(int value)
Parameters: value - the maximum number of attributes

public int getMaximumAttributeNames()
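
To give a feel for what limiting the attributes in a transformed attribute name means, here is a hedged sketch of one possible naming scheme. The exact format Weka produces is not stated on this page, so everything below is an assumption: the class `NameSketch`, the ordering by absolute loading, the three-decimal formatting, and the trailing `...` for omitted terms.

```java
/** Hypothetical naming scheme for illustration, not Weka's implementation. */
final class NameSketch {

    /**
     * Builds a name such as "-0.900b+0.400c..." from one component's loadings.
     *
     * @param loadings  weight of each original attribute in the component
     * @param names     original attribute names, same order as loadings
     * @param maxInName maximum number of terms to show (-1 = show all)
     */
    static String componentName(double[] loadings, String[] names, int maxInName) {
        int d = loadings.length;

        // Order attribute indices by decreasing absolute loading (insertion sort).
        int[] order = new int[d];
        for (int i = 0; i < d; i++) {
            order[i] = i;
        }
        for (int i = 1; i < d; i++) {
            int cur = order[i];
            int j = i - 1;
            while (j >= 0 && Math.abs(loadings[order[j]]) < Math.abs(loadings[cur])) {
                order[j + 1] = order[j];
                j--;
            }
            order[j + 1] = cur;
        }

        int shown = (maxInName < 0) ? d : Math.min(maxInName, d);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < shown; i++) {
            double w = loadings[order[i]];
            if (i > 0 && w >= 0) {
                sb.append('+'); // negative weights carry their own sign
            }
            sb.append(String.format(java.util.Locale.ROOT, "%.3f", w));
            sb.append(names[order[i]]);
        }
        if (shown < d) {
            sb.append("..."); // marks terms left out of the name
        }
        return sb.toString();
    }
}
```

Only the name is shortened in this sketch; the component itself would still be computed from all attributes.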

public java.lang.String maximumAttributesTipText()

public void setMaximumAttributes(int value)
Parameters: value - the maximum number of attributes

public int getMaximumAttributes()

public Capabilities getCapabilities()
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class Filter
See Also: Capabilities

protected Instances determineOutputFormat(Instances inputFormat) throws java.lang.Exception
Parameters: inputFormat - the input format to base the output format on
Throws: java.lang.Exception - in case the determination goes wrong
See Also: batchFinished()

protected void fillCovariance() throws java.lang.Exception
Throws: java.lang.Exception

protected Instance convertInstance(Instance instance) throws java.lang.Exception
Parameters: instance - an instance in the original (unnormalized) format
Throws: java.lang.Exception - if instance can't be transformed

protected void setup(Instances instances) throws java.lang.Exception
Parameters: instances - the data to process
Throws: java.lang.Exception - in case the processing goes wrong
See Also: batchFinished()

public boolean setInputFormat(Instances instanceInfo) throws java.lang.Exception
Overrides: setInputFormat in class Filter
Parameters: instanceInfo - an Instances object containing the input instance structure (any instances contained in the object are ignored; only the structure is required)
Throws: java.lang.Exception - if the input format can't be set successfully

public boolean input(Instance instance) throws java.lang.Exception

public boolean batchFinished() throws java.lang.Exception
Overrides: batchFinished in class Filter
Throws: java.lang.NullPointerException - if no input structure has been defined
Throws: java.lang.Exception - if there was a problem finishing the batch

public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class Filter

public static void main(java.lang.String[] args)
Parameters: args - should contain arguments to the filter: use -h for help
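
As a closing illustration of what transforming an instance involves (the job of convertInstance, using fields such as m_Eigenvectors and m_SortedEigens), the sketch below projects one already preprocessed instance onto the leading components. It is a standalone approximation, not Weka's code: the class `ProjectionSketch`, the column-wise eigenvector layout, and the meaning of `order` (component indices from most to least important) are assumptions.

```java
/** Illustrative only, not Weka's implementation. */
final class ProjectionSketch {

    /**
     * Projects one instance onto the first numComponents principal components.
     *
     * @param x             attribute values, already centered or standardized
     * @param eigenvectors  eigenvectors stored column-wise: eigenvectors[attribute][component]
     * @param order         component indices, most important first (assumed layout)
     * @param numComponents how many components to output
     */
    static double[] project(double[] x, double[][] eigenvectors,
                            int[] order, int numComponents) {
        double[] out = new double[numComponents];
        for (int c = 0; c < numComponents; c++) {
            int col = order[c];
            // Dot product of the instance with the chosen eigenvector.
            for (int a = 0; a < x.length; a++) {
                out[c] += eigenvectors[a][col] * x[a];
            }
        }
        return out;
    }
}
```

Keeping the eigenvectors unordered and indexing them through a separate sorted index array avoids reshuffling the matrix after the eigenvalues are ranked.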