public class MergeInfrequentNominalValues extends SimpleBatchFilter implements UnsupervisedFilter, WeightedAttributesHandler, WeightedInstancesHandler
-N <int> The minimum frequency for a value to remain (default: 2).
-R <range> Sets the list of attributes to act on (or its inverse). 'first' and 'last' are accepted as well. E.g.: first-5,7,9,20-last (default: 1,2)
-V Invert matching sense (i.e. act on all attributes not specified in list)
-S Use short IDs for merged attribute values.
-output-debug-info If set, filter is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, filter capabilities are not checked before filter is built (use with caution).
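The effect of the -N threshold can be illustrated with a small, library-free sketch. This is not Weka's implementation: the class and method names are hypothetical, the "_or_" label format is an assumption made for the illustration, and so is the rule that a lone infrequent value is left unchanged because there is nothing to merge it with.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Weka.
final class MergeInfrequentSketch {

    /**
     * Replaces every value that occurs fewer than minFrequency times with a
     * single merged label. The "_or_" label format is an assumption.
     */
    static List<String> mergeInfrequent(List<String> column, int minFrequency) {
        // Count occurrences, keeping first-seen order of the distinct values.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : column) {
            counts.merge(value, 1, Integer::sum);
        }
        // Collect the values that fall below the threshold.
        List<String> infrequent = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() < minFrequency) {
                infrequent.add(entry.getKey());
            }
        }
        // Assumption: with fewer than two infrequent values nothing is merged.
        if (infrequent.size() < 2) {
            return new ArrayList<>(column);
        }
        String mergedLabel = String.join("_or_", infrequent);
        List<String> result = new ArrayList<>(column.size());
        for (String value : column) {
            result.add(infrequent.contains(value) ? mergedLabel : value);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> colours =
            Arrays.asList("red", "red", "blue", "green", "red", "teal");
        // With a minimum frequency of 2, only "red" is frequent enough to remain.
        System.out.println(mergeInfrequent(colours, 2));
    }
}
```

In this sketch the three values that occur once are replaced by one shared label, while "red" is kept as it is.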
| Modifier and Type | Field and Description |
|---|---|
| protected boolean[] | m_AttToBeModified: Indicators for which attributes need to be changed. |
| protected int | m_MinimumFrequency: The minimum frequency for a value not to be merged. |
| protected int[][] | m_NewValues: The new values. |
| protected Range | m_SelectCols: Stores which attributes to operate on (or not). |
| protected int[] | m_SelectedAttributes: Stores the indexes of the selected attributes in order. |
| protected boolean | m_UseShortIDs: Whether to use short identifiers for merged values. |
m_Debug, m_DoNotCheckCapabilities, m_FirstBatchDone, m_InputRelAtts, m_InputStringAtts, m_NewBatch, m_OutputRelAtts, m_OutputStringAtts

| Constructor and Description |
|---|
| MergeInfrequentNominalValues() |
| Modifier and Type | Method and Description |
|---|---|
| boolean | allowAccessToFullInputFormat(): We need access to the full input data in determineOutputFormat. |
| java.lang.String | attributeIndicesTipText(): Returns the tip text for this property. |
| protected Instances | determineOutputFormat(Instances inputFormat): Determines the output format based on the input format and returns this. |
| java.lang.String | getAttributeIndices(): Get the current range selection. |
| Capabilities | getCapabilities(): Returns the Capabilities of this filter. |
| boolean | getInvertSelection(): Get whether the supplied attributes are to be acted on or all other attributes. |
| int | getMinimumFrequency(): Gets the minimum frequency. |
| java.lang.String[] | getOptions(): Gets the current settings of the filter. |
| java.lang.String | getRevision(): Returns the revision string. |
| boolean | getUseShortIDs(): Get whether short IDs are to be used. |
| java.lang.String | globalInfo(): Returns a string describing this filter. |
| java.lang.String | invertSelectionTipText(): Returns the tip text for this property. |
| java.util.Enumeration<Option> | listOptions(): Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] args): Runs the filter with the given arguments. |
| java.lang.String | minimumFrequencyTipText(): Returns the tip text for this property. |
| protected Instances | process(Instances instances): Processes the given data. |
| void | setAttributeIndices(java.lang.String rangeList): Set which attributes are to be acted on (or not, if invert is true). |
| void | setAttributeIndicesArray(int[] attributes): Set which attributes are to be acted on (or not, if invert is true). |
| void | setInvertSelection(boolean invert): Set whether selected attributes should be acted on or all other attributes. |
| void | setMinimumFrequency(int minF): Sets the minimum frequency. |
| void | setOptions(java.lang.String[] options): Parses a given list of options. |
| void | setUseShortIDs(boolean m_UseShortIDs): Sets whether short IDs are to be used. |
| java.lang.String | useShortIDsTipText(): Returns the tip text for this property. |
batchFinished, hasImmediateOutputFormat, input
reset, setInputFormat
batchFilterFile, bufferInput, copyValues, copyValues, debugTipText, doNotCheckCapabilitiesTipText, filterFile, flushInput, getCapabilities, getCopyOfInputFormat, getDebug, getDoNotCheckCapabilities, getInputFormat, getOutputFormat, initInputLocators, initOutputLocators, inputFormatPeek, isFirstBatchDone, isNewBatch, isOutputFormatDefined, makeCopies, makeCopy, mayRemoveInstanceAfterFirstBatchDone, numPendingOutput, output, outputFormatPeek, outputPeek, postExecution, preExecution, push, push, resetQueue, run, runFilter, setDebug, setDoNotCheckCapabilities, setOutputFormat, testInputFormat, toString, useFilter, wekaStaticWrapper

protected int m_MinimumFrequency
protected Range m_SelectCols
protected int[] m_SelectedAttributes
protected boolean[] m_AttToBeModified
protected int[][] m_NewValues
protected boolean m_UseShortIDs

public java.lang.String globalInfo()
globalInfo in class SimpleFilter

public java.util.Enumeration<Option> listOptions()
listOptions in interface OptionHandler
listOptions in class Filter

public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class Filter

public void setOptions(java.lang.String[] options)
throws java.lang.Exception
-N <int> The minimum frequency for a value to remain (default: 2).
-R <range> Sets the list of attributes to act on (or its inverse). 'first' and 'last' are accepted as well. E.g.: first-5,7,9,20-last (default: 1,2)
-V Invert matching sense (i.e. act on all attributes not specified in list)
-S Use short IDs for merged attribute values.
-output-debug-info If set, filter is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, filter capabilities are not checked before filter is built (use with caution).
setOptions in interface OptionHandler
setOptions in class Filter
Parameters: options - the list of options as an array of strings
Throws: java.lang.Exception - if an option is not supported

public java.lang.String minimumFrequencyTipText()

public int getMinimumFrequency()

public void setMinimumFrequency(int minF)
Parameters: minF - the minimum frequency as an integer

public java.lang.String attributeIndicesTipText()

public java.lang.String getAttributeIndices()

public void setAttributeIndices(java.lang.String rangeList)
Parameters: rangeList - a string representing the list of attributes. Since the string will typically come from a user, attributes are indexed from 1.

public void setAttributeIndicesArray(int[] attributes)
Parameters: attributes - an array containing indexes of attributes to select. Since the array will typically come from a program, attributes are indexed from 0.

public java.lang.String invertSelectionTipText()

public boolean getInvertSelection()

public void setInvertSelection(boolean invert)
Parameters: invert - the new invert setting

public java.lang.String useShortIDsTipText()

public boolean getUseShortIDs()

public void setUseShortIDs(boolean m_UseShortIDs)
Parameters: m_UseShortIDs - if true, short IDs will be used

public boolean allowAccessToFullInputFormat()
allowAccessToFullInputFormat in class SimpleBatchFilter

protected Instances determineOutputFormat(Instances inputFormat)
determineOutputFormat in class SimpleFilter
Parameters: inputFormat - the input format to base the output format on
See Also: SimpleFilter.hasImmediateOutputFormat(), Filter.batchFinished()

public Capabilities getCapabilities()
getCapabilities in interface CapabilitiesHandler
getCapabilities in class Filter
See Also: Capabilities

protected Instances process(Instances instances) throws java.lang.Exception
process in class SimpleFilter
Parameters: instances - the data to process
Throws: java.lang.Exception - in case the processing goes wrong
See Also: Filter.batchFinished()

public java.lang.String getRevision()
getRevision in interface RevisionHandler
getRevision in class Filter

public static void main(java.lang.String[] args)
Parameters: args - the command-line arguments
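
The two index conventions documented above (1-based range strings for -R and setAttributeIndices, 0-based arrays for setAttributeIndicesArray) and the inversion flag are easy to confuse. The following library-free sketch shows one way a range string such as first-5,7,9,20-last could be resolved to 0-based indexes. It does not reproduce Weka's own range handling; the class and method names are hypothetical, and the parsing rules are assumptions inferred from the option description.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Weka.
final class AttributeRangeSketch {

    /** Converts one 1-based token ("first", "last" or a number) to a 0-based index. */
    private static int toIndex(String token, int numAttributes) {
        String t = token.trim();
        int index;
        if (t.equalsIgnoreCase("first")) {
            index = 0;
        } else if (t.equalsIgnoreCase("last")) {
            index = numAttributes - 1;
        } else {
            index = Integer.parseInt(t) - 1; // user-facing indexes start at 1
        }
        if (index < 0 || index >= numAttributes) {
            throw new IllegalArgumentException("Index out of range: " + token);
        }
        return index;
    }

    /**
     * Resolves a comma-separated, 1-based range list to sorted 0-based indexes.
     * If invert is true, all attributes NOT listed are returned instead.
     */
    static int[] select(String rangeList, int numAttributes, boolean invert) {
        TreeSet<Integer> listed = new TreeSet<>();
        for (String part : rangeList.split(",")) {
            String[] ends = part.split("-");
            if (ends.length < 1 || ends.length > 2) {
                throw new IllegalArgumentException("Malformed range: " + part);
            }
            int from = toIndex(ends[0], numAttributes);
            int to = toIndex(ends[ends.length - 1], numAttributes);
            for (int i = Math.min(from, to); i <= Math.max(from, to); i++) {
                listed.add(i);
            }
        }
        TreeSet<Integer> selected = new TreeSet<>();
        for (int i = 0; i < numAttributes; i++) {
            if (listed.contains(i) != invert) {
                selected.add(i);
            }
        }
        return selected.stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        // Eight attributes; the range string uses 1-based positions.
        System.out.println(Arrays.toString(select("first-3,5,7-last", 8, false)));
        // prints [0, 1, 2, 4, 6, 7]
        System.out.println(Arrays.toString(select("first-3,5,7-last", 8, true)));
        // prints [3, 5]
    }
}
```

An array produced this way is in the 0-based form that the setAttributeIndicesArray description asks for, whereas the original string is what a user would pass to -R.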