public class LogitBoost extends RandomizableIteratedSingleClassifierEnhancer implements Sourcable, WeightedInstancesHandler, TechnicalInformationHandler, IterativeClassifier, BatchPredictor
@techreport{Friedman1998,
address = {Stanford University},
author = {J. Friedman and T. Hastie and R. Tibshirani},
title = {Additive Logistic Regression: a Statistical View of Boosting},
year = {1998},
PS = {http://www-stat.stanford.edu/\~jhf/ftp/boost.ps}
}
Valid options are:
-Q Use resampling instead of reweighting for boosting.
-use-estimated-priors Use estimated priors rather than uniform ones.
-P <percent> Percentage of weight mass to base training on. (default 100, reduce to around 90 to speed up)
-L <num> Threshold on the improvement of the likelihood. (default -Double.MAX_VALUE)
-H <num> Shrinkage parameter. (default 1)
-Z <num> Z max threshold for responses. (default 3)
-O <int> The size of the thread pool, for example, the number of cores in the CPU. (default 1)
-E <int> The number of threads to use for batch prediction, which should be >= size of thread pool. (default 1)
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-W Full name of base classifier. (default: weka.classifiers.trees.DecisionStump)
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
Options specific to classifier weka.classifiers.trees.DecisionStump:
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
Options after -- are passed to the designated learner.
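The flags above are passed as a flat string array, which is the shape `setOptions(java.lang.String[] options)` accepts. The following is a minimal sketch in plain Java that only assembles such an array; the class name `LogitBoostOptionsSketch`, the helper `buildOptions`, and the chosen values (50 iterations, shrinkage 0.5) are illustrative assumptions, not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: assembles some of the flags listed above into an argument array.
// It does not call Weka; the class and method names are hypothetical.
final class LogitBoostOptionsSketch {

    /** Builds an option array; options for the base learner go after "--". */
    static String[] buildOptions(int iterations, double shrinkage, double zMax,
                                 String baseClassifier, String... baseOptions) {
        List<String> opts = new ArrayList<>();
        opts.add("-I"); opts.add(Integer.toString(iterations)); // number of iterations
        opts.add("-H"); opts.add(Double.toString(shrinkage));   // shrinkage parameter
        opts.add("-Z"); opts.add(Double.toString(zMax));        // Z max threshold
        opts.add("-W"); opts.add(baseClassifier);               // base classifier name
        if (baseOptions.length > 0) {
            opts.add("--");                                     // separator for learner options
            for (String o : baseOptions) {
                opts.add(o);
            }
        }
        return opts.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] opts = buildOptions(50, 0.5, 3.0,
                "weka.classifiers.trees.DecisionStump", "-output-debug-info");
        System.out.println(String.join(" ", opts));
    }
}
```

Everything before `--` is read by the boosting wrapper, and everything after it is handed to the base learner unchanged.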
| Modifier and Type | Field and Description |
|---|---|
| `protected static double` | `DEFAULT_Z_MAX`: A threshold for responses (Friedman suggests between 2 and 4) |
| `protected Attribute` | `m_ClassAttribute`: The actual class attribute (for getting class names) |
| `protected java.util.ArrayList<Classifier[]>` | `m_Classifiers`: ArrayList for storing the generated base classifiers. |
| `protected Instances` | `m_data`: The training data. |
| `protected double[]` | `m_InitialFs`: The initial F scores (0 by default) |
| `protected double` | `m_logLikelihood`: The current log-likelihood. |
| `protected int` | `m_NumClasses`: The number of classes |
| `protected Instances` | `m_NumericClassData`: Dummy dataset with a numeric class |
| `protected int` | `m_NumGenerated`: The number of successfully generated base classifiers. |
| `protected int` | `m_numThreads`: The number of threads to use at prediction time in batch prediction. |
| `protected double` | `m_Offset`: The value by which the actual target value for the true class is offset. |
| `protected int` | `m_poolSize`: The size of the thread pool. |
| `protected double` | `m_Precision`: The threshold on the improvement of the likelihood |
| `protected double[][]` | `m_probs`: The probabilities used during the training process. |
| `protected java.util.Random` | `m_RandomInstance`: The random number generator used |
| `protected double` | `m_Shrinkage`: The value of the shrinkage parameter |
| `protected double` | `m_sumOfWeights`: The total weight of the data. |
| `protected double[][]` | `m_trainFs`: The F scores used during the training process. |
| `protected double[][]` | `m_trainYs`: The y values used during the training process. |
| `protected boolean` | `m_UseEstimatedPriors`: Whether to start with class priors estimated from the training data |
| `protected boolean` | `m_UseResampling`: Whether to use resampling instead of reweighting for boosting |
| `protected int` | `m_WeightThreshold`: Weight thresholding (the percentage of weight mass used for training). |
| `protected Classifier` | `m_ZeroR`: A ZeroR model in case no model can be built from the data |
| `protected double` | `m_zMax`: The Z max value to use |
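The fields `m_trainFs`, `m_probs`, and `m_Shrinkage` correspond to the additive scores, class probabilities, and step-size scaling of multi-class LogitBoost. The sketch below follows the multi-class update described in Friedman et al. (1998), with the shrinkage factor applied to each step; it is an illustration in plain Java under that assumption, not the Weka source, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Sketch only: one row of F scores for a single instance.
// Follows the multi-class update in Friedman et al. (1998); not the Weka source.
final class FScoreSketch {

    /** Converts F scores into class probabilities: p_j = exp(F_j) / sum_k exp(F_k). */
    static double[] probabilities(double[] fs) {
        double max = Double.NEGATIVE_INFINITY;
        for (double f : fs) {
            max = Math.max(max, f);
        }
        double sum = 0.0;
        double[] probs = new double[fs.length];
        for (int j = 0; j < fs.length; j++) {
            probs[j] = Math.exp(fs[j] - max); // subtracting the max avoids overflow
            sum += probs[j];
        }
        for (int j = 0; j < fs.length; j++) {
            probs[j] /= sum;
        }
        return probs;
    }

    /** Adds centred base-learner predictions to the F scores, scaled by the shrinkage. */
    static void update(double[] fs, double[] pred, double shrinkage) {
        int k = fs.length;
        double mean = 0.0;
        for (double p : pred) {
            mean += p / k;
        }
        for (int j = 0; j < k; j++) {
            fs[j] += shrinkage * (k - 1.0) / k * (pred[j] - mean);
        }
    }

    public static void main(String[] args) {
        double[] fs = {0.0, 0.0, 0.0};                   // initial F scores of 0
        System.out.println(Arrays.toString(probabilities(fs)));
        update(fs, new double[] {1.0, -0.5, -0.5}, 0.5); // one boosting step, shrinkage 0.5
        System.out.println(Arrays.toString(probabilities(fs)));
    }
}
```

With all F scores at 0 the probabilities are uniform; a shrinkage below 1 makes each step smaller, which usually calls for more iterations.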
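Fields inherited from class RandomizableIteratedSingleClassifierEnhancer: m_Seed
Fields inherited from class IteratedSingleClassifierEnhancer: m_NumIterations
Fields inherited from class SingleClassifierEnhancer: m_Classifier
Fields inherited from class AbstractClassifier: BATCH_SIZE_DEFAULT, m_BatchSize, m_Debug, m_DoNotCheckCapabilities, m_numDecimalPlaces, NUM_DECIMAL_PLACES_DEFAULT

| Constructor and Description |
|---|
| `LogitBoost()`: Constructor. |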
| Modifier and Type | Method and Description |
|---|---|
| `void` | `buildClassifier(Instances data)`: Method used to build the classifier. |
| `Classifier[][]` | `classifiers()`: Returns the array of classifiers that have been built. |
| `protected java.lang.String` | `defaultClassifierString()`: String describing default classifier. |
| `double[]` | `distributionForInstance(Instance inst)`: Calculates the class membership probabilities for the given test instance. |
| `double[][]` | `distributionsForInstances(Instances insts)`: Calculates the class membership probabilities for the given test instances. |
| `void` | `done()`: Clean up after boosting. |
| `Capabilities` | `getCapabilities()`: Returns default capabilities of the classifier. |
| `double` | `getLikelihoodThreshold()`: Get the value of Precision. |
| `int` | `getNumThreads()`: Gets the number of threads. |
| `java.lang.String[]` | `getOptions()`: Gets the current settings of the Classifier. |
| `int` | `getPoolSize()`: Gets the size of the thread pool. |
| `java.lang.String` | `getRevision()`: Returns the revision string. |
| `double` | `getShrinkage()`: Get the value of Shrinkage. |
| `TechnicalInformation` | `getTechnicalInformation()`: Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
| `boolean` | `getUseEstimatedPriors()`: Get whether priors estimated from the training data are used |
| `boolean` | `getUseResampling()`: Get whether resampling is turned on |
| `int` | `getWeightThreshold()`: Get the degree of weight thresholding |
| `double` | `getZMax()`: Get the Z max threshold on the responses |
| `java.lang.String` | `globalInfo()`: Returns a string describing classifier |
| `boolean` | `implementsMoreEfficientBatchPrediction()`: Performs efficient batch prediction |
| `void` | `initializeClassifier(Instances data)`: Builds the boosted classifier |
| `java.lang.String` | `likelihoodThresholdTipText()`: Returns the tip text for this property |
| `java.util.Enumeration<Option>` | `listOptions()`: Returns an enumeration describing the available options. |
| `static void` | `main(java.lang.String[] argv)`: Main method for testing this class. |
| `boolean` | `next()`: Perform another iteration of boosting. |
| `java.lang.String` | `numThreadsTipText()` |
| `java.lang.String` | `poolSizeTipText()` |
| `protected double[]` | `processInstance(Instance instance)`: Applies models to an instance to get class probabilities. |
| `protected Instances` | `selectWeightQuantile(Instances data, double quantile)`: Select only instances with weights that contribute to the specified quantile of the weight distribution |
| `void` | `setLikelihoodThreshold(double newPrecision)`: Set the value of Precision. |
| `void` | `setNumThreads(int nT)`: Sets the number of threads |
| `void` | `setOptions(java.lang.String[] options)`: Parses a given list of options. |
| `void` | `setPoolSize(int nT)`: Sets the size of the thread pool |
| `void` | `setShrinkage(double newShrinkage)`: Set the value of Shrinkage. |
| `void` | `setUseEstimatedPriors(boolean r)`: Set whether to use priors estimated from the training data |
| `void` | `setUseResampling(boolean r)`: Set resampling mode |
| `void` | `setWeightThreshold(int threshold)`: Set weight thresholding |
| `void` | `setZMax(double zMax)`: Set the Z max threshold on the responses |
| `java.lang.String` | `shrinkageTipText()`: Returns the tip text for this property |
| `java.lang.String` | `toSource(java.lang.String className)`: Returns the boosted model as Java source code. |
| `java.lang.String` | `toString()`: Returns description of the boosted classifier. |
| `java.lang.String` | `useEstimatedPriorsTipText()`: Returns the tip text for this property |
| `java.lang.String` | `useResamplingTipText()`: Returns the tip text for this property |
| `java.lang.String` | `weightThresholdTipText()`: Returns the tip text for this property |
| `java.lang.String` | `ZMaxTipText()`: Returns the tip text for this property |
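Methods inherited from class RandomizableIteratedSingleClassifierEnhancer: getSeed, seedTipText, setSeed
Methods inherited from class IteratedSingleClassifierEnhancer: defaultNumberOfIterations, getM_Classifiers, getNumIterations, numIterationsTipText, setNumIterations
Methods inherited from class SingleClassifierEnhancer: classifierTipText, defaultClassifierOptions, getClassifier, getClassifierSpec, postExecution, preExecution, setClassifier
Methods inherited from class AbstractClassifier: batchSizeTipText, classifyInstance, debugTipText, doNotCheckCapabilitiesTipText, forName, getBatchSize, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, makeCopies, makeCopy, numDecimalPlacesTipText, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Classifier: classifyInstance
Methods inherited from interface BatchPredictor: getBatchSize, setBatchSize

protected java.util.ArrayList<Classifier[]> m_Classifiers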
protected int m_NumClasses
protected int m_NumGenerated
protected int m_WeightThreshold
protected static final double DEFAULT_Z_MAX
protected Instances m_NumericClassData
protected Attribute m_ClassAttribute
protected boolean m_UseResampling
protected double m_Precision
protected double m_Shrinkage
protected boolean m_UseEstimatedPriors
protected java.util.Random m_RandomInstance
protected double m_Offset
protected Classifier m_ZeroR
protected double[] m_InitialFs
protected double m_zMax
protected double[][] m_trainYs
protected double[][] m_trainFs
protected double[][] m_probs
protected double m_logLikelihood
protected double m_sumOfWeights
protected Instances m_data
protected int m_numThreads
protected int m_poolSize
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
Specified by: getTechnicalInformation in interface TechnicalInformationHandler
protected java.lang.String defaultClassifierString()
Overrides: defaultClassifierString in class SingleClassifierEnhancer
protected Instances selectWeightQuantile(Instances data, double quantile)
Parameters:
data - the input instances
quantile - the specified quantile, e.g. 0.9 to select 90% of the weight mass
public java.util.Enumeration<Option> listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class RandomizableIteratedSingleClassifierEnhancer
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-Q Use resampling instead of reweighting for boosting.
-use-estimated-priors Use estimated priors rather than uniform ones.
-P <percent> Percentage of weight mass to base training on. (default 100, reduce to around 90 to speed up)
-L <num> Threshold on the improvement of the likelihood. (default -Double.MAX_VALUE)
-H <num> Shrinkage parameter. (default 1)
-Z <num> Z max threshold for responses. (default 3)
-O <int> The size of the thread pool, for example, the number of cores in the CPU. (default 1)
-E <int> The number of threads to use for batch prediction, which should be >= size of thread pool. (default 1)
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-W Full name of base classifier. (default: weka.classifiers.trees.DecisionStump)
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
Options specific to classifier weka.classifiers.trees.DecisionStump:
-output-debug-info If set, classifier is run in debug mode and may output additional info to the console
-do-not-check-capabilities If set, classifier capabilities are not checked before classifier is built (use with caution).
Options after -- are passed to the designated learner.
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class RandomizableIteratedSingleClassifierEnhancer
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class RandomizableIteratedSingleClassifierEnhancer
public java.lang.String ZMaxTipText()
public void setZMax(double zMax)
Parameters:
zMax - the threshold to use
public double getZMax()
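The Z max threshold bounds the working response that each base regressor is fitted to. As a hedged illustration, the sketch below uses the response from Friedman et al. (1998), z = (y - p) / (p(1 - p)), which reduces to 1/p for the true class and -1/(1 - p) otherwise, and clips it to [-zMax, zMax]. The class `ZMaxSketch` and its method are hypothetical names in plain Java, not the Weka implementation.

```java
// Sketch only: clipping of the working response, after Friedman et al. (1998).
// Class and method names are hypothetical; this is not the Weka source.
final class ZMaxSketch {

    /**
     * Working response for one class of one instance.
     * y is 1.0 if this is the instance's true class and 0.0 otherwise;
     * p is the current probability estimate for this class.
     */
    static double response(double y, double p, double zMax) {
        // (y - p) / (p * (1 - p)) simplifies to the two cases below.
        double z = (y == 1.0) ? 1.0 / p : -1.0 / (1.0 - p);
        // Without clipping, z grows without bound as p approaches 0 or 1.
        return Math.max(-zMax, Math.min(zMax, z));
    }

    public static void main(String[] args) {
        System.out.println(response(1.0, 0.5, 3.0));  // inside the bounds, unchanged
        System.out.println(response(1.0, 0.01, 3.0)); // 1/0.01 = 100, clipped to zMax
        System.out.println(response(0.0, 0.99, 3.0)); // large negative, clipped to -zMax
    }
}
```

Clipping keeps a few badly misclassified instances from dominating the regression targets, which is why a threshold in the range of 2 to 4 is suggested.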
public java.lang.String shrinkageTipText()
public double getShrinkage()
public void setShrinkage(double newShrinkage)
Parameters:
newShrinkage - Value to assign to Shrinkage.
public java.lang.String likelihoodThresholdTipText()
public double getLikelihoodThreshold()
public void setLikelihoodThreshold(double newPrecision)
Parameters:
newPrecision - Value to assign to Precision.
public java.lang.String useResamplingTipText()
public void setUseResampling(boolean r)
Parameters:
r - true if resampling should be done
public boolean getUseResampling()
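With resampling turned on, the base learner sees a sample drawn according to the instance weights rather than the weighted data itself. The sketch below shows generic weighted sampling with replacement in plain Java; it is an assumption about what "resampling" amounts to, not necessarily the sampler Weka uses, and `ResamplingSketch` is a hypothetical name.

```java
import java.util.Random;

// Sketch only: generic weighted sampling with replacement.
// Not necessarily the sampler Weka uses; names are hypothetical.
final class ResamplingSketch {

    /** Draws n indices with replacement, each with probability proportional to its weight. */
    static int[] resample(double[] weights, int n, Random rnd) {
        double[] cumulative = new double[weights.length];
        double total = 0.0;
        for (int i = 0; i < weights.length; i++) {
            total += weights[i];
            cumulative[i] = total; // running sum of weights
        }
        int[] picked = new int[n];
        for (int s = 0; s < n; s++) {
            double r = rnd.nextDouble() * total;
            int idx = 0;
            // Advance to the first index whose cumulative weight exceeds r.
            while (idx < weights.length - 1 && cumulative[idx] <= r) {
                idx++;
            }
            picked[s] = idx;
        }
        return picked;
    }

    public static void main(String[] args) {
        int[] sample = resample(new double[] {0.0, 1.0, 0.0, 3.0}, 8, new Random(1));
        for (int idx : sample) {
            System.out.print(idx + " ");
        }
        System.out.println();
    }
}
```

Instances with zero weight are never drawn, and heavily weighted instances tend to appear several times in the sample.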
public java.lang.String useEstimatedPriorsTipText()
public void setUseEstimatedPriors(boolean r)
Parameters:
r - true if priors estimated from the training data should be used
public boolean getUseEstimatedPriors()
public java.lang.String weightThresholdTipText()
public void setWeightThreshold(int threshold)
Parameters:
threshold - the percentage of weight mass used for training
public int getWeightThreshold()
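Weight thresholding trains each base learner on only the heaviest instances, those that together cover the requested share of the total weight (for example 0.9 of the mass for a threshold of 90). The sketch below illustrates that selection on a plain array of weights. It is a simplified reading of what `selectWeightQuantile` is described as doing, not its actual code, and it ignores details such as how ties are handled; the names are hypothetical.

```java
import java.util.Arrays;

// Sketch only: picks the heaviest instances covering a given share of the weight mass.
// A simplified reading of the description above; names are hypothetical.
final class WeightQuantileSketch {

    /** Returns the indices of the heaviest instances that cover `quantile` of the total weight. */
    static int[] select(double[] weights, double quantile) {
        Integer[] order = new Integer[weights.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Sort indices so that the heaviest instance comes first.
        Arrays.sort(order, (a, b) -> Double.compare(weights[b], weights[a]));

        double total = 0.0;
        for (double w : weights) {
            total += w;
        }
        double target = quantile * total;

        double covered = 0.0;
        int count = 0;
        while (count < order.length && covered < target) {
            covered += weights[order[count]];
            count++;
        }
        int[] kept = new int[count];
        for (int i = 0; i < count; i++) {
            kept[i] = order[i];
        }
        return kept;
    }

    public static void main(String[] args) {
        int[] kept = select(new double[] {0.5, 0.05, 0.3, 0.15}, 0.9);
        System.out.println(Arrays.toString(kept));
    }
}
```

In this example the lightest instance (index 1) is dropped because the other three already cover more than 90% of the weight, which is where the speed-up comes from.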
public java.lang.String numThreadsTipText()
public int getNumThreads()
public void setNumThreads(int nT)
public java.lang.String poolSizeTipText()
public int getPoolSize()
public void setPoolSize(int nT)
public Capabilities getCapabilities()
Specified by: getCapabilities in interface Classifier
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class SingleClassifierEnhancer
See Also: Capabilities
public void buildClassifier(Instances data) throws java.lang.Exception
Specified by: buildClassifier in interface Classifier
Overrides: buildClassifier in class IteratedSingleClassifierEnhancer
Parameters:
data - the training data to be used for generating the boosted classifier.
Throws:
java.lang.Exception - if the classifier could not be built successfully
public void initializeClassifier(Instances data) throws java.lang.Exception
Specified by: initializeClassifier in interface IterativeClassifier
Parameters:
data - the data to train the classifier with
Throws:
java.lang.Exception - if building fails, e.g., can't handle data
public boolean next() throws java.lang.Exception
Specified by: next in interface IterativeClassifier
Throws:
java.lang.Exception - if this iteration fails for unexpected reasons
public void done()
Specified by: done in interface IterativeClassifier
public Classifier[][] classifiers()
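The three methods `initializeClassifier`, `next`, and `done` form an iterative training protocol: initialize once, call `next` until it returns false, then clean up. The sketch below shows that driving loop against a toy stand-in written in plain Java. The `Iterative` interface and `ToyBooster` class are hypothetical, and the stopping rule (stop when the change in log-likelihood falls below the likelihood threshold, or when the iteration budget is used up) is an assumption about how the threshold is applied.

```java
// Sketch only: the initialize / next / done loop with a toy learner.
// Iterative and ToyBooster are hypothetical stand-ins, not Weka types.
final class IterativeLoopSketch {

    /** Hypothetical interface with the same three-step shape as the methods above. */
    interface Iterative {
        void initializeClassifier() throws Exception;
        boolean next() throws Exception;
        void done() throws Exception;
    }

    /** Toy learner whose log-likelihood gain halves on every iteration. */
    static final class ToyBooster implements Iterative {
        private final int maxIterations;
        private final double likelihoodThreshold;
        private double logLikelihood;
        private double gain;
        int generated; // number of iterations actually performed

        ToyBooster(int maxIterations, double likelihoodThreshold) {
            this.maxIterations = maxIterations;
            this.likelihoodThreshold = likelihoodThreshold;
        }

        @Override
        public void initializeClassifier() {
            logLikelihood = -100.0;
            gain = 32.0;
            generated = 0;
        }

        @Override
        public boolean next() {
            if (generated >= maxIterations) {
                return false; // iteration budget used up
            }
            double previous = logLikelihood;
            logLikelihood += gain;
            gain /= 2.0;
            generated++;
            // Assumed rule: continue only while the change reaches the threshold.
            return Math.abs(logLikelihood - previous) >= likelihoodThreshold;
        }

        @Override
        public void done() {
            // A real learner would release training-only state here.
        }
    }

    /** Drives the protocol and returns how many times next() reported progress. */
    static int train(Iterative model) throws Exception {
        model.initializeClassifier();
        int calls = 0;
        while (model.next()) {
            calls++;
        }
        model.done();
        return calls;
    }

    public static void main(String[] args) throws Exception {
        ToyBooster booster = new ToyBooster(10, 1.0);
        train(booster);
        System.out.println("iterations performed: " + booster.generated);
    }
}
```

Under the assumed rule, a threshold of `-Double.MAX_VALUE` never triggers early stopping, so the loop runs for the full number of iterations.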
public boolean implementsMoreEfficientBatchPrediction()
Specified by: implementsMoreEfficientBatchPrediction in interface BatchPredictor
Overrides: implementsMoreEfficientBatchPrediction in class AbstractClassifier
public double[] distributionForInstance(Instance inst) throws java.lang.Exception
Specified by: distributionForInstance in interface Classifier
Overrides: distributionForInstance in class AbstractClassifier
Parameters:
inst - the instance to be classified
Throws:
java.lang.Exception - if instance could not be classified successfully
protected double[] processInstance(Instance instance) throws java.lang.Exception
Throws:
java.lang.Exception
public double[][] distributionsForInstances(Instances insts) throws java.lang.Exception
Specified by: distributionsForInstances in interface BatchPredictor
Overrides: distributionsForInstances in class AbstractClassifier
Parameters:
insts - the instances to be classified
Throws:
java.lang.Exception - if instances could not be classified successfully
public java.lang.String toSource(java.lang.String className) throws java.lang.Exception
public java.lang.String toString()
Overrides: toString in class java.lang.Object
public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler
Overrides: getRevision in class AbstractClassifier
public static void main(java.lang.String[] argv)
Parameters:
argv - the options