public class HoeffdingTree extends AbstractClassifier implements UpdateableClassifier, WeightedInstancesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler, Drawable, java.io.Serializable
@inproceedings{Hulten2001,
author = {Geoff Hulten and Laurie Spencer and Pedro Domingos},
booktitle = {ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining},
pages = {97-106},
publisher = {ACM Press},
title = {Mining time-changing data streams},
year = {2001}
}
Valid options are:
-L The leaf prediction strategy to use. 0 = majority class, 1 = naive Bayes, 2 = naive Bayes adaptive. (default = 0)
-S The splitting criterion to use. 0 = Gini, 1 = Info gain (default = 0)
-E The allowable error in a split decision - values closer to zero will take longer to decide (default = 1e-7)
-H Threshold below which a split will be forced to break ties (default = 0.05)
-M Minimum fraction of weight required down at least two branches for info gain splitting (default = 0.01)
-G Grace period - the number of instances a leaf should observe between split attempts (default = 200)
-N The number of instances (weight) a leaf should observe before allowing naive Bayes to make predictions (NB or NB adaptive only) (default = 0)
-P Print leaf models when using naive Bayes at the leaves.
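The way -G, -E and -H interact can be sketched in a few lines. This is a minimal illustration under the usual Hoeffding-tree (VFDT) assumptions, not Weka's actual code: the class `SplitDecisionSketch` and its method names are hypothetical, and the bound is passed in as a plain number (it is derived from the -E value and the weight seen at the leaf).

```java
/** Hypothetical helper, not part of Weka: illustrates the checks that -G, -E and -H control. */
final class SplitDecisionSketch {

    /** -G: a leaf only re-evaluates splits once enough new weight has arrived since the last attempt. */
    static boolean graceElapsed(double weightSeen, double weightAtLastEval, double gracePeriod) {
        return weightSeen - weightAtLastEval > gracePeriod;
    }

    /**
     * Split when the best candidate beats the runner-up by more than the bound,
     * or when the bound has shrunk below the tie threshold (-H), which forces a tie-break.
     * The bound itself depends on the allowable error (-E) and the weight observed.
     */
    static boolean shouldSplit(double bestMerit, double secondBestMerit,
                               double bound, double tieThreshold) {
        return (bestMerit - secondBestMerit > bound) || (bound < tieThreshold);
    }

    public static void main(String[] args) {
        double gracePeriod = 200.0;  // default -G
        double tieThreshold = 0.05;  // default -H

        // Not enough new weight yet, so no split attempt is made.
        System.out.println(graceElapsed(150.0, 0.0, gracePeriod));

        // Clear winner: the merit gap (0.30) exceeds the bound (0.20).
        System.out.println(shouldSplit(0.40, 0.10, 0.20, tieThreshold));

        // Near tie, bound still large: keep collecting instances.
        System.out.println(shouldSplit(0.40, 0.39, 0.20, tieThreshold));

        // Near tie, but the bound (0.04) is below -H: the split is forced.
        System.out.println(shouldSplit(0.40, 0.39, 0.04, tieThreshold));
    }
}
```

Under these assumptions, a smaller -E makes the bound larger for the same amount of data, which is why values closer to zero take longer to reach a decision.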
| Modifier and Type | Field and Description |
|---|---|
| static int | GINI_SPLIT |
| static int | INFO_GAIN_SPLIT |
| static int | LEAF_MAJ_CLASS |
| static int | LEAF_NB |
| static int | LEAF_NB_ADAPTIVE |
| protected int | m_activeLeafCount |
| protected int | m_decisionNodeCount |
| protected double | m_gracePeriod: The number of instances a leaf should observe between split attempts |
| protected Instances | m_header |
| protected double | m_hoeffdingTieThreshold: Threshold below which a split will be forced to break ties |
| protected int | m_inactiveLeafCount |
| protected int | m_leafStrategy: The leaf prediction strategy to use |
| protected double | m_minFracWeightForTwoBranchesGain: The minimum fraction of weight required down at least two branches for info gain splitting |
| protected double | m_nbThreshold: The number of instances (total weight) a leaf should observe before allowing naive Bayes to make predictions |
| protected boolean | m_printLeafModels: Print out leaf models in the case of naive Bayes or naive Bayes adaptive leaves |
| protected HNode | m_root |
| protected int | m_selectedSplitMetric: The splitting metric to use |
| protected double | m_splitConfidence: The allowable error in a split decision. |
| protected SplitMetric | m_splitMetric |
| static Tag[] | TAGS_SELECTION |
| static Tag[] | TAGS_SELECTION2 |

Fields inherited from class AbstractClassifier: BATCH_SIZE_DEFAULT, m_BatchSize, m_Debug, m_DoNotCheckCapabilities, m_numDecimalPlaces, NUM_DECIMAL_PLACES_DEFAULT
Fields inherited from interface Drawable: BayesNet, Newick, NOT_DRAWABLE, TREE

| Constructor and Description |
|---|
| HoeffdingTree() |

| Modifier and Type | Method and Description |
|---|---|
| protected void | activateNode(InactiveHNode toActivate, SplitNode parent, java.lang.String parentBranch): Activate (allow growth of) the supplied node |
| void | buildClassifier(Instances data): Builds the classifier. |
| protected static double | computeHoeffdingBound(double max, double confidence, double weight) |
| protected void | deactivateNode(ActiveHNode toDeactivate, SplitNode parent, java.lang.String parentBranch): Deactivate (prevent growth of) the supplied node |
| double[] | distributionForInstance(Instance inst): Returns class probabilities for an instance. |
| Capabilities | getCapabilities(): Returns default capabilities of the classifier. |
| double | getGracePeriod(): Get the number of instances (or total weight of instances) a leaf should observe between split attempts |
| double | getHoeffdingTieThreshold(): Get the threshold below which a split will be forced to break ties |
| SelectedTag | getLeafPredictionStrategy(): Get the leaf prediction strategy to use (majority class, naive Bayes or naive Bayes adaptive) |
| double | getMinimumFractionOfWeightInfoGain(): Get the minimum fraction of weight required down at least two branches for info gain splitting |
| double | getNaiveBayesPredictionThreshold(): Get the number of instances (weight) a leaf should observe before allowing naive Bayes to make predictions |
| java.lang.String[] | getOptions(): Gets the current settings of the Classifier. |
| boolean | getPrintLeafModels() |
| java.lang.String | getRevision(): Returns the revision string. |
| double | getSplitConfidence(): Get the allowable error in a split decision. |
| SelectedTag | getSplitCriterion(): Get the split criterion to use (either Gini or info gain). |
| TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| java.lang.String | globalInfo(): Returns a string describing the classifier |
| java.lang.String | gracePeriodTipText(): Returns the tip text for this property |
| java.lang.String | graph(): Returns a string that describes a graph representing the object. |
| int | graphType(): Returns the type of graph representing the object. |
| java.lang.String | hoeffdingTieThresholdTipText(): Returns the tip text for this property |
| java.lang.String | leafPredictionStrategyTipText(): Returns the tip text for this property |
| java.util.Enumeration<Option> | listOptions(): Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] args) |
| java.lang.String | minimumFractionOfWeightInfoGainTipText(): Returns the tip text for this property |
| java.lang.String | naiveBayesPredictionThresholdTipText(): Returns the tip text for this property |
| protected ActiveHNode | newLearningNode(): Create a new learning node (either majority class, naive Bayes or naive Bayes adaptive) |
| java.lang.String | printLeafModelsTipText(): Returns the tip text for this property |
| protected void | reset() |
| void | setGracePeriod(double grace): Set the number of instances (or total weight of instances) a leaf should observe between split attempts |
| void | setHoeffdingTieThreshold(double ht): Set the threshold below which a split will be forced to break ties |
| void | setLeafPredictionStrategy(SelectedTag strat): Set the leaf prediction strategy to use (majority class, naive Bayes or naive Bayes adaptive) |
| void | setMinimumFractionOfWeightInfoGain(double m): Set the minimum fraction of weight required down at least two branches for info gain splitting |
| void | setNaiveBayesPredictionThreshold(double n): Set the number of instances (weight) a leaf should observe before allowing naive Bayes to make predictions |
| void | setOptions(java.lang.String[] options): Parses a given list of options. |
| void | setPrintLeafModels(boolean p) |
| void | setSplitConfidence(double sc): Set the allowable error in a split decision. |
| void | setSplitCriterion(SelectedTag crit): Set the split criterion to use (either Gini or info gain). |
| java.lang.String | splitConfidenceTipText(): Returns the tip text for this property |
| java.lang.String | splitCriterionTipText(): Returns the tip text for this property |
| java.lang.String | toString(): Return a textual description of the model |
| protected void | trySplit(ActiveHNode node, SplitNode parent, java.lang.String parentBranch): Try a split from the supplied node |
| void | updateClassifier(Instance inst): Updates the classifier with the given instance. |

Methods inherited from class AbstractClassifier: batchSizeTipText, classifyInstance, debugTipText, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, implementsMoreEfficientBatchPrediction, makeCopies, makeCopy, numDecimalPlacesTipText, postExecution, preExecution, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces

protected Instances m_header
protected HNode m_root
protected double m_gracePeriod
protected double m_splitConfidence
protected double m_hoeffdingTieThreshold
protected double m_minFracWeightForTwoBranchesGain
protected int m_selectedSplitMetric
protected SplitMetric m_splitMetric
protected int m_leafStrategy
protected double m_nbThreshold
protected int m_activeLeafCount
protected int m_inactiveLeafCount
protected int m_decisionNodeCount
public static final int GINI_SPLIT
public static final int INFO_GAIN_SPLIT
public static final Tag[] TAGS_SELECTION
public static final int LEAF_MAJ_CLASS
public static final int LEAF_NB
public static final int LEAF_NB_ADAPTIVE
public static final Tag[] TAGS_SELECTION2
protected boolean m_printLeafModels
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
protected void reset()
public Capabilities getCapabilities()
Specified by:
getCapabilities in interface Classifier
getCapabilities in interface CapabilitiesHandler
Overrides:
getCapabilities in class AbstractClassifier
See Also:
Capabilities
public java.util.Enumeration<Option> listOptions()
Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class AbstractClassifier
public void setOptions(java.lang.String[] options)
throws java.lang.Exception
-L The leaf prediction strategy to use. 0 = majority class, 1 = naive Bayes, 2 = naive Bayes adaptive. (default = 0)
-S The splitting criterion to use. 0 = Gini, 1 = Info gain (default = 0)
-E The allowable error in a split decision - values closer to zero will take longer to decide (default = 1e-7)
-H Threshold below which a split will be forced to break ties (default = 0.05)
-M Minimum fraction of weight required down at least two branches for info gain splitting (default = 0.01)
-G Grace period - the number of instances a leaf should observe between split attempts (default = 200)
-N The number of instances (weight) a leaf should observe before allowing naive Bayes to make predictions (NB or NB adaptive only) (default = 0)
-P Print leaf models when using naive Bayes at the leaves.
Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class AbstractClassifier
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class AbstractClassifier
public java.lang.String printLeafModelsTipText()
public void setPrintLeafModels(boolean p)
public boolean getPrintLeafModels()
public java.lang.String minimumFractionOfWeightInfoGainTipText()
public void setMinimumFractionOfWeightInfoGain(double m)
Parameters:
m - the minimum fraction of weight
public double getMinimumFractionOfWeightInfoGain()
public java.lang.String gracePeriodTipText()
public void setGracePeriod(double grace)
Parameters:
grace - the grace period
public double getGracePeriod()
public java.lang.String hoeffdingTieThresholdTipText()
public void setHoeffdingTieThreshold(double ht)
Parameters:
ht - the threshold
public double getHoeffdingTieThreshold()
public java.lang.String splitConfidenceTipText()
public void setSplitConfidence(double sc)
Parameters:
sc - the split confidence
public double getSplitConfidence()
public java.lang.String splitCriterionTipText()
public void setSplitCriterion(SelectedTag crit)
Parameters:
crit - the criterion to use
public SelectedTag getSplitCriterion()
public java.lang.String leafPredictionStrategyTipText()
public void setLeafPredictionStrategy(SelectedTag strat)
Parameters:
strat - the strategy to use
public SelectedTag getLeafPredictionStrategy()
public java.lang.String naiveBayesPredictionThresholdTipText()
public void setNaiveBayesPredictionThreshold(double n)
Parameters:
n - the number/weight of instances
public double getNaiveBayesPredictionThreshold()
protected static double computeHoeffdingBound(double max,
double confidence,
double weight)
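The signature above carries no description, so the following is a minimal sketch of the standard Hoeffding bound it presumably computes, ε = sqrt(R² · ln(1/δ) / (2n)). The mapping of `max` to the range R of the split metric, `confidence` to δ (the -E value), and `weight` to n is an assumption, and the class name `HoeffdingBoundSketch` is hypothetical; Weka's own implementation may differ in detail.

```java
/** Hypothetical stand-alone sketch of the standard Hoeffding bound; not Weka's source. */
final class HoeffdingBoundSketch {

    /**
     * @param range  assumed range R of the split metric (e.g. 1.0 for info gain with two classes)
     * @param delta  assumed allowable error in a split decision (the -E option)
     * @param weight total weight (number of instances) observed at the leaf
     * @return epsilon such that the true mean lies within epsilon of the observed
     *         mean with probability at least 1 - delta
     */
    static double hoeffdingBound(double range, double delta, double weight) {
        return Math.sqrt((range * range * Math.log(1.0 / delta)) / (2.0 * weight));
    }

    public static void main(String[] args) {
        double delta = 1e-7; // default -E

        // The bound shrinks as the leaf observes more weight.
        for (double n : new double[] {200.0, 1000.0, 5000.0}) {
            System.out.printf("n = %.0f  epsilon = %.4f%n", n, hoeffdingBound(1.0, delta, n));
        }
    }
}
```

With a range of 1.0 and the default -E of 1e-7, the bound after one default grace period (200 instances) is about 0.20, well above the default tie threshold of 0.05, so early splits require a clear winner.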
public void buildClassifier(Instances data) throws java.lang.Exception
Specified by:
buildClassifier in interface Classifier
Parameters:
data - the data to train with
Throws:
java.lang.Exception - if classifier can't be built successfully
public void updateClassifier(Instance inst) throws java.lang.Exception
Specified by:
updateClassifier in interface UpdateableClassifier
Parameters:
inst - the new training instance to include in the model
Throws:
java.lang.Exception - if the instance could not be incorporated in the model.
public double[] distributionForInstance(Instance inst) throws java.lang.Exception
Specified by:
distributionForInstance in interface Classifier
Overrides:
distributionForInstance in class AbstractClassifier
Parameters:
inst - the instance to compute the distribution for
Throws:
java.lang.Exception - if distribution can't be computed successfully
protected void deactivateNode(ActiveHNode toDeactivate, SplitNode parent, java.lang.String parentBranch)
Parameters:
toDeactivate - the node to deactivate
parent - the node's parent
parentBranch - the branch leading to the node
protected void activateNode(InactiveHNode toActivate, SplitNode parent, java.lang.String parentBranch)
Parameters:
toActivate - the node to activate
parent - the node's parent
parentBranch - the branch leading to the node
protected void trySplit(ActiveHNode node, SplitNode parent, java.lang.String parentBranch) throws java.lang.Exception
Parameters:
node - the node to split
parent - the parent of the node
parentBranch - the branch leading to the node
Throws:
java.lang.Exception - if a problem occurs
protected ActiveHNode newLearningNode() throws java.lang.Exception
Throws:
java.lang.Exception - if a problem occurs
public java.lang.String toString()
Overrides:
toString in class java.lang.Object
public java.lang.String getRevision()
Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class AbstractClassifier
public static void main(java.lang.String[] args)
public int graphType()
Specified by:
graphType in interface Drawable
public java.lang.String graph()
throws java.lang.Exception
Specified by:
graph in interface Drawable