public class Apriori extends AbstractAssociator implements OptionHandler, AssociationRulesProducer, CARuleMiner, TechnicalInformationHandler
@inproceedings{Agrawal1994,
author = {R. Agrawal and R. Srikant},
booktitle = {20th International Conference on Very Large Data Bases},
pages = {478-499},
publisher = {Morgan Kaufmann, Los Altos, CA},
title = {Fast Algorithms for Mining Association Rules in Large Databases},
year = {1994}
}
@inproceedings{Liu1998,
author = {Bing Liu and Wynne Hsu and Yiming Ma},
booktitle = {Fourth International Conference on Knowledge Discovery and Data Mining},
pages = {80-86},
publisher = {AAAI Press},
title = {Integrating Classification and Association Rule Mining},
year = {1998}
}
Valid options are:
-N <required number of rules output> The required number of rules. (default = 10)
-T <0=confidence | 1=lift | 2=leverage | 3=Conviction> The metric type by which to rank rules. (default = confidence)
-C <minimum metric score of a rule> The minimum confidence of a rule. (default = 0.9)
-D <delta for minimum support> The delta by which the minimum support is decreased in each iteration. (default = 0.05)
-U <upper bound for minimum support> Upper bound for minimum support. (default = 1.0)
-M <lower bound for minimum support> The lower bound for the minimum support. (default = 0.1)
-S <significance level> If used, rules are tested for significance at the given level. Slower. (default = no significance testing)
-I If set the itemsets found are also output. (default = no)
-R Remove columns that contain all missing values (default = no)
-V Report progress iteratively. (default = no)
-A If set class association rules are mined. (default = no)
-Z Treat zero (i.e. first value of nominal attributes) as missing
-B <toString delimiters> If used, two characters to use as rule delimiters in the result of toString: the first to delimit fields, the second to delimit items within fields. (default = traditional toString result)
-c <the class index> The class index. (default = last)
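The flags above are passed as an array of strings, flag first and value second, which is the shape `setOptions(java.lang.String[] options)` takes. A minimal sketch of assembling such an array in plain Java follows. The class `AprioriOptionsSketch` and its `build` method are hypothetical helpers written for illustration and are not part of this class; only the flag names come from the list above.

```java
// Hypothetical helper (not part of the documented class): assembles an option
// array using the flag names from the option list above.
final class AprioriOptionsSketch {

    /** Returns the options as alternating "flag", "value" entries; -I is a bare switch. */
    static String[] build(int numRules, int metricType, double minMetric,
                          double delta, double upperBound, double lowerBound,
                          boolean outputItemSets) {
        java.util.List<String> opts = new java.util.ArrayList<>();
        opts.add("-N"); opts.add(Integer.toString(numRules));   // required number of rules
        opts.add("-T"); opts.add(Integer.toString(metricType)); // 0=confidence, 1=lift, 2=leverage, 3=conviction
        opts.add("-C"); opts.add(Double.toString(minMetric));   // minimum metric score
        opts.add("-D"); opts.add(Double.toString(delta));       // delta for minimum support
        opts.add("-U"); opts.add(Double.toString(upperBound));  // upper bound for minimum support
        opts.add("-M"); opts.add(Double.toString(lowerBound));  // lower bound for minimum support
        if (outputItemSets) {
            opts.add("-I");                                     // also output the itemsets found
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Rank by lift (metric type 1) and ask for 20 rules.
        String[] opts = build(20, 1, 1.1, 0.05, 1.0, 0.1, true);
        System.out.println(String.join(" ", opts));
    }
}
```

With the values used in `main`, the joined array reads `-N 20 -T 1 -C 1.1 -D 0.05 -U 1.0 -M 0.1 -I`.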
| Modifier and Type | Field and Description |
|---|---|
| protected static int | CONFIDENCE - Metric type: Confidence |
| protected static int | CONVICTION - Metric type: Conviction |
| protected static int | LEVERAGE - Metric type: Leverage |
| protected static int | LIFT - Metric type: Lift |
| protected java.util.ArrayList<java.lang.Object>[] | m_allTheRules - The list of all generated rules. |
| protected boolean | m_car - Flag indicating whether class association rules are mined. |
| protected int | m_classIndex - The class index. |
| protected int | m_cycles - Number of cycles used before the required number of rules was reached. |
| protected double | m_delta - Delta by which m_minSupport is decreased in each iteration. |
| protected java.util.ArrayList<java.util.Hashtable<ItemSet,java.lang.Integer>> | m_hashtables - The same information stored in hash tables. |
| protected Instances | m_instances - The instances (transactions) to be used for generating the association rules. |
| protected double | m_lowerBoundMinSupport - The lower bound for the minimum support. |
| protected java.util.ArrayList<java.util.ArrayList<java.lang.Object>> | m_Ls - The set of all sets of itemsets L. |
| protected int | m_metricType - The selected metric type. |
| protected double | m_minMetric - The minimum metric score. |
| protected double | m_minSupport - The minimum support. |
| protected int | m_numRules - The maximum number of rules that are output. |
| protected Instances | m_onlyClass - Only the class attribute of all Instances. |
| protected boolean | m_outputItemSets - Whether the itemsets found are output as well. |
| protected boolean | m_removeMissingCols - Remove columns with all missing values. |
| protected double | m_significanceLevel - Significance level for optional significance test. |
| protected java.lang.String | m_toStringDelimiters - toString delimiters, if any. |
| protected boolean | m_treatZeroAsMissing - Treat zeros as missing (rather than a value in their own right). |
| protected double | m_upperBoundMinSupport - The upper bound on the support. |
| protected boolean | m_verbose - Report progress iteratively. |
| static Tag[] | TAGS_SELECTION - Metric types. |
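The four metric-type constants correspond to standard rule-interestingness measures. As a rough illustration, the sketch below computes each one from raw transaction counts using the usual textbook definitions. The class `RuleMetricsSketch` and its parameter names are assumptions made for this example, and the arithmetic is not necessarily identical to what this class does internally.

```java
// Hypothetical illustration (not part of the documented class): textbook
// definitions of the four metrics for a rule A => B, computed from counts.
//   n  = total number of transactions
//   a  = transactions containing the premise A
//   b  = transactions containing the consequence B
//   ab = transactions containing both A and B
final class RuleMetricsSketch {

    /** confidence = P(B | A) */
    static double confidence(int ab, int a) {
        return (double) ab / a;
    }

    /** lift = confidence / P(B) */
    static double lift(int ab, int a, int b, int n) {
        return confidence(ab, a) / ((double) b / n);
    }

    /** leverage = P(A and B) - P(A) * P(B) */
    static double leverage(int ab, int a, int b, int n) {
        return (double) ab / n - ((double) a / n) * ((double) b / n);
    }

    /** conviction = P(A) * P(not B) / P(A and not B); infinite if the rule never fails */
    static double conviction(int ab, int a, int b, int n) {
        double pA = (double) a / n;
        double pNotB = 1.0 - (double) b / n;
        double pAandNotB = (double) (a - ab) / n;
        return pAandNotB == 0.0 ? Double.POSITIVE_INFINITY : pA * pNotB / pAandNotB;
    }

    public static void main(String[] args) {
        int n = 100, a = 50, b = 40, ab = 30;
        System.out.printf("confidence=%.3f lift=%.3f leverage=%.3f conviction=%.3f%n",
                confidence(ab, a), lift(ab, a, b, n),
                leverage(ab, a, b, n), conviction(ab, a, b, n));
    }
}
```

For the counts used in `main` (100 transactions, 50 with A, 40 with B, 30 with both), the values work out to roughly 0.6, 1.5, 0.1 and 1.5 respectively.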
Fields inherited from class AbstractAssociator: m_DoNotCheckCapabilities

| Constructor and Description |
|---|
| Apriori() - Constructor that sets default values for the minimum confidence and the maximum number of rules. |
| Modifier and Type | Method and Description |
|---|---|
| void | buildAssociations(Instances instances) - Method that generates all large itemsets with a minimum support, and from these all association rules with a minimum confidence. |
| boolean | canProduceRules() - Returns true if this AssociationRulesProducer can actually produce rules. |
| java.lang.String | carTipText() - Returns the tip text for this property. |
| java.lang.String | classIndexTipText() - Returns the tip text for this property. |
| java.lang.String | deltaTipText() - Returns the tip text for this property. |
| java.util.ArrayList<java.lang.Object>[] | getAllTheRules() - Returns all the rules. |
| AssociationRules | getAssociationRules() - Gets the list of mined association rules. |
| Capabilities | getCapabilities() - Returns default capabilities of the classifier. |
| boolean | getCar() - Gets whether class association rules are mined. |
| int | getClassIndex() - Gets the class index. |
| double | getDelta() - Get the value of delta. |
| Instances | getInstancesNoClass() - Gets the instances without the class attribute. |
| Instances | getInstancesOnlyClass() - Gets only the class attribute of the instances. |
| double | getLowerBoundMinSupport() - Get the value of lowerBoundMinSupport. |
| SelectedTag | getMetricType() - Get the metric type. |
| double | getMinMetric() - Get the value of minConfidence. |
| int | getNumRules() - Get the value of numRules. |
| java.lang.String[] | getOptions() - Gets the current settings of the Apriori object. |
| boolean | getOutputItemSets() - Gets whether itemsets are output as well. |
| boolean | getRemoveAllMissingCols() - Returns whether columns containing all missing values are to be removed. |
| java.lang.String | getRevision() - Returns the revision string. |
| java.lang.String[] | getRuleMetricNames() - Gets a list of the names of the metrics output for each rule. |
| double | getSignificanceLevel() - Get the value of significanceLevel. |
| TechnicalInformation | getTechnicalInformation() - Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| boolean | getTreatZeroAsMissing() - Gets whether zeros (i.e. the first value of a nominal attribute) are to be treated in the same way as missing values. |
| double | getUpperBoundMinSupport() - Get the value of upperBoundMinSupport. |
| boolean | getVerbose() - Gets whether the algorithm is run in verbose mode. |
| java.lang.String | globalInfo() - Returns a string describing this associator. |
| java.util.Enumeration<Option> | listOptions() - Returns an enumeration describing the available options. |
| java.lang.String | lowerBoundMinSupportTipText() - Returns the tip text for this property. |
| static void | main(java.lang.String[] args) - Main method. |
| java.lang.String | metricString() - Returns the metric string for the chosen metric type. |
| java.lang.String | metricTypeTipText() - Returns the tip text for this property. |
| java.util.ArrayList<java.lang.Object>[] | mineCARs(Instances data) - Method that mines all class association rules with minimum support and with a minimum confidence. |
| java.lang.String | minMetricTipText() - Returns the tip text for this property. |
| java.lang.String | numRulesTipText() - Returns the tip text for this property. |
| java.lang.String | outputItemSetsTipText() - Returns the tip text for this property. |
| java.lang.String | removeAllMissingColsTipText() - Returns the tip text for this property. |
| protected Instances | removeMissingColumns(Instances instances) - Removes columns that are all missing from the data. |
| void | resetOptions() - Resets the options to the default values. |
| void | setCar(boolean flag) - Sets class association rule mining. |
| void | setClassIndex(int index) - Sets the class index. |
| void | setDelta(double v) - Set the value of delta. |
| void | setLowerBoundMinSupport(double v) - Set the value of lowerBoundMinSupport. |
| void | setMetricType(SelectedTag d) - Set the metric type for ranking rules. |
| void | setMinMetric(double v) - Set the value of minConfidence. |
| void | setNumRules(int v) - Set the value of numRules. |
| void | setOptions(java.lang.String[] options) - Parses a given list of options. |
| void | setOutputItemSets(boolean flag) - Sets whether itemsets are output as well. |
| void | setRemoveAllMissingCols(boolean r) - Remove columns containing all missing values. |
| void | setSignificanceLevel(double v) - Set the value of significanceLevel. |
| void | setTreatZeroAsMissing(boolean z) - Sets whether zeros (i.e. the first value of a nominal attribute) should be treated as missing values. |
| void | setUpperBoundMinSupport(double v) - Set the value of upperBoundMinSupport. |
| void | setVerbose(boolean flag) - Sets verbose mode. |
| java.lang.String | significanceLevelTipText() - Returns the tip text for this property. |
| java.lang.String | toString() - Outputs the size of all the generated sets of itemsets and the rules. |
| java.lang.String | treatZeroAsMissingTipText() - Returns the tip text for this property. |
| java.lang.String | upperBoundMinSupportTipText() - Returns the tip text for this property. |
| java.lang.String | verboseTipText() - Returns the tip text for this property. |
Methods inherited from class AbstractAssociator: doNotCheckCapabilitiesTipText, forName, getDoNotCheckCapabilities, makeCopies, makeCopy, postExecution, preExecution, run, runAssociator, setDoNotCheckCapabilities

protected double m_minSupport
protected double m_upperBoundMinSupport
protected double m_lowerBoundMinSupport
protected static final int CONFIDENCE
protected static final int LIFT
protected static final int LEVERAGE
protected static final int CONVICTION
public static final Tag[] TAGS_SELECTION
protected int m_metricType
protected double m_minMetric
protected int m_numRules
protected double m_delta
protected double m_significanceLevel
protected int m_cycles
protected java.util.ArrayList<java.util.ArrayList<java.lang.Object>> m_Ls
protected java.util.ArrayList<java.util.Hashtable<ItemSet,java.lang.Integer>> m_hashtables
protected java.util.ArrayList<java.lang.Object>[] m_allTheRules
protected Instances m_instances
protected boolean m_outputItemSets
protected boolean m_removeMissingCols
protected boolean m_verbose
protected Instances m_onlyClass
protected int m_classIndex
protected boolean m_car
protected boolean m_treatZeroAsMissing
protected java.lang.String m_toStringDelimiters
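The fields m_minSupport, m_upperBoundMinSupport, m_lowerBoundMinSupport, m_delta, m_numRules and m_cycles together describe an iterative search: the minimum support is lowered by delta each cycle until enough rules are found or the lower bound is reached. The sketch below illustrates that idea in plain Java. It is a simplified reading of the field descriptions, not the class's actual loop. The class `SupportScheduleSketch`, the `search` method and the rule-counting callback are hypothetical, and starting the search exactly at the upper bound is an assumption.

```java
// Hypothetical illustration (not part of the documented class): lowers the
// minimum support step by step until enough rules are found.
final class SupportScheduleSketch {

    /**
     * @param upper    upper bound for minimum support (assumed starting point)
     * @param lower    lower bound for minimum support
     * @param delta    amount subtracted from the minimum support each cycle
     * @param numRules required number of rules
     * @param miner    stand-in for the rule miner: maps a minimum support to
     *                 the number of rules found at that support
     * @return {final minimum support, number of cycles used}
     */
    static double[] search(double upper, double lower, double delta, int numRules,
                           java.util.function.DoubleToIntFunction miner) {
        final double eps = 1e-9; // tolerance for floating-point comparisons
        double minSupport = upper;
        int cycles = 0;
        while (true) {
            cycles++;
            int found = miner.applyAsInt(minSupport);
            boolean enoughRules = found >= numRules;
            boolean atLowerBound = minSupport - delta < lower - eps;
            if (enoughRules || atLowerBound) {
                break;
            }
            minSupport -= delta;
        }
        return new double[] {minSupport, cycles};
    }

    public static void main(String[] args) {
        // Toy miner: pretend 12 rules appear once the support drops to 0.5 or below.
        double[] result = search(1.0, 0.25, 0.25, 10, s -> s <= 0.5 ? 12 : 0);
        System.out.println("minSupport=" + result[0] + " cycles=" + (int) result[1]);
    }
}
```

A smaller delta makes the search finer-grained at the cost of more cycles, which is the trade-off the -D, -U and -M options expose.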
public Apriori()
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation in interface TechnicalInformationHandler
public void resetOptions()
protected Instances removeMissingColumns(Instances instances) throws java.lang.Exception
instances - the instances
java.lang.Exception - if something goes wrong
public Capabilities getCapabilities()
getCapabilities in interface Associator
getCapabilities in interface CapabilitiesHandler
getCapabilities in class AbstractAssociator
Capabilities
public void buildAssociations(Instances instances) throws java.lang.Exception
buildAssociations in interface Associator
instances - the instances to be used for generating the associations
java.lang.Exception - if rules can't be built successfully
public java.util.ArrayList<java.lang.Object>[] mineCARs(Instances data) throws java.lang.Exception
mineCARs in interface CARuleMiner
data - the instances for which class association rules should be mined
java.lang.Exception - if rules can't be built successfully
public Instances getInstancesNoClass()
getInstancesNoClass in interface CARuleMiner
public Instances getInstancesOnlyClass()
getInstancesOnlyClass in interface CARuleMiner
public java.util.Enumeration<Option> listOptions()
listOptions in interface OptionHandler
listOptions in class AbstractAssociator
public void setOptions(java.lang.String[] options) throws java.lang.Exception
setOptions in interface OptionHandler
setOptions in class AbstractAssociator
options - the list of options as an array of strings
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class AbstractAssociator
public java.lang.String toString()
toString in class java.lang.Object
public java.lang.String metricString()
metricString in interface CARuleMiner
public java.lang.String removeAllMissingColsTipText()
public void setRemoveAllMissingCols(boolean r)
r - true if columns are to be removed.
public boolean getRemoveAllMissingCols()
public java.lang.String upperBoundMinSupportTipText()
public double getUpperBoundMinSupport()
public void setUpperBoundMinSupport(double v)
v - Value to assign to upperBoundMinSupport.
public void setClassIndex(int index)
setClassIndex in interface CARuleMiner
index - the class index
public int getClassIndex()
public java.lang.String classIndexTipText()
public void setCar(boolean flag)
flag - true if class association rules are mined, false otherwise
public boolean getCar()
public java.lang.String carTipText()
public java.lang.String lowerBoundMinSupportTipText()
public double getLowerBoundMinSupport()
public void setLowerBoundMinSupport(double v)
v - Value to assign to lowerBoundMinSupport.
public SelectedTag getMetricType()
public java.lang.String metricTypeTipText()
public void setMetricType(SelectedTag d)
d - the type of metric
public java.lang.String minMetricTipText()
public double getMinMetric()
public void setMinMetric(double v)
v - Value to assign to minConfidence.
public java.lang.String numRulesTipText()
public int getNumRules()
public void setNumRules(int v)
v - Value to assign to numRules.
public java.lang.String deltaTipText()
public double getDelta()
public void setDelta(double v)
v - Value to assign to delta.
public java.lang.String significanceLevelTipText()
public double getSignificanceLevel()
public void setSignificanceLevel(double v)
v - Value to assign to significanceLevel.
public void setOutputItemSets(boolean flag)
flag - true if itemsets are to be output as well
public boolean getOutputItemSets()
public java.lang.String outputItemSetsTipText()
public void setVerbose(boolean flag)
flag - true if algorithm should be run in verbose mode
public boolean getVerbose()
public java.lang.String verboseTipText()
public java.lang.String treatZeroAsMissingTipText()
public void setTreatZeroAsMissing(boolean z)
z - true if zeros should be treated as missing values.
public boolean getTreatZeroAsMissing()
public java.util.ArrayList<java.lang.Object>[] getAllTheRules()
m_allTheRules
public AssociationRules getAssociationRules()
AssociationRulesProducer
getAssociationRules in interface AssociationRulesProducer
public java.lang.String[] getRuleMetricNames()
getRuleMetricNames in interface AssociationRulesProducer
public boolean canProduceRules()
canProduceRules in interface AssociationRulesProducer
public java.lang.String getRevision()
getRevision in interface RevisionHandler
getRevision in class AbstractAssociator
public static void main(java.lang.String[] args)
args - the commandline options