public class LogisticBase extends AbstractClassifier implements WeightedInstancesHandler
-D If set, classifier is run in debug mode and may output additional info to the console
| Modifier and Type | Field and Description |
|---|---|
| protected boolean | m_errorOnProbabilities - Use error on probabilities for stopping criterion of LogitBoost? |
| protected int | m_fixedNumIterations - Use fixed number of iterations for LogitBoost? |
| protected int | m_heuristicStop - Use heuristic to stop performing LogitBoost iterations earlier? |
| protected int | m_maxIterations - The maximum number of LogitBoost iterations |
| protected int | m_numClasses - The number of different classes |
| protected Instances | m_numericData - Numeric version of the training data. |
| protected Instances | m_numericDataHeader - Header-only version of the numeric version of the training data |
| protected static int | m_numFoldsBoosting - Number of folds for cross-validating number of LogitBoost iterations |
| protected double | m_numParameters - Effective number of parameters used for AIC / BIC automatic stopping |
| protected int | m_numRegressions - The number of LogitBoost iterations performed. |
| protected SimpleLinearRegression[][] | m_regressions - Array holding the simple regression functions fit by LogitBoost |
| protected Instances | m_train - Training data |
| protected boolean | m_useCrossValidation - Use cross-validation to determine best number of LogitBoost iterations? |
| protected double | m_weightTrimBeta - Threshold for trimming weights. |
| protected static double | Z_MAX - Threshold on the Z-value for LogitBoost |

Fields inherited from class AbstractClassifier: BATCH_SIZE_DEFAULT, m_BatchSize, m_Debug, m_DoNotCheckCapabilities, m_numDecimalPlaces, NUM_DECIMAL_PLACES_DEFAULT

| Constructor and Description |
|---|
| LogisticBase() - Constructor that creates LogisticBase object with standard options. |
| LogisticBase(int numBoostingIterations, boolean useCrossValidation, boolean errorOnProbabilities) - Constructor to create LogisticBase object. |

| Modifier and Type | Method and Description |
|---|---|
| void | buildClassifier(Instances data) - Builds the logistic regression model using LogitBoost. |
| void | cleanup() - Cleanup in order to save memory. |
| protected SimpleLinearRegression[][] | copyRegressions(SimpleLinearRegression[][] a) - Deep copies the given array of simple linear regression functions. |
| double[] | distributionForInstance(Instance instance) - Returns class probabilities for an instance. |
| protected int | getBestIteration(double[] errors, int maxIteration) - Helper function to find the minimum in an array of error values. |
| protected double[][] | getCoefficients() - Returns an array holding the coefficients of the logistic model. |
| protected double | getErrorRate(Instances data) - Returns the misclassification error of the current model on a set of instances. |
| protected double[] | getFs(Instance instance) - Computes the F-values for a single instance. |
| protected double[][] | getFs(Instances data) - Computes the F-values for a set of instances. |
| int | getMaxIterations() - Returns the maxIterations parameter. |
| protected double | getMeanAbsoluteError(Instances data) - Returns the error of the probability estimates for the current model on a set of instances. |
| protected Instances | getNumericData(Instances data) - Converts training data to numeric version. |
| int | getNumRegressions() - The number of LogitBoost iterations performed (= the number of simple regression functions fit). |
| protected double[][] | getProbs(double[][] dataFs) - Computes the p-values (probabilities for the different classes) from the F-values for a set of instances. |
| java.lang.String | getRevision() - Returns the revision string. |
| boolean | getUseAIC() - Get the value of useAIC. |
| int[][] | getUsedAttributes() - Returns an array of the indices of the attributes used in the logistic model. |
| double | getWeightTrimBeta() - Get the value of weightTrimBeta. |
| protected double[][] | getWs(double[][] probs, double[][] dataYs) - Computes the LogitBoost weights from an array of y/p values (actual/estimated class probabilities). |
| protected double[][] | getYs(Instances data) - Computes the Y-values (actual class probabilities) for a set of instances. |
| protected double | getZ(double actual, double p) - Computes the LogitBoost response variable from y/p values (actual/estimated class probabilities). |
| protected double[][] | getZs(double[][] probs, double[][] dataYs) - Computes the LogitBoost response for an array of y/p values (actual/estimated class probabilities). |
| protected SimpleLinearRegression[][] | initRegressions() - Helper function to initialize m_regressions. |
| protected double | negativeLogLikelihood(double[][] dataYs, double[][] probs) - Returns the negative log-likelihood of the Y-values (actual class probabilities) given the p-values (current probability estimates). |
| double | percentAttributesUsed() - Returns the percentage of all attributes in the data that are used in the logistic model. |
| protected void | performBoosting() - Runs LogitBoost using the stopping criterion on the training set. |
| protected int | performBoosting(Instances train, Instances test, double[] error, int maxIterations) - Runs LogitBoost on a training set and monitors the error on a test set. |
| protected void | performBoosting(int numIterations) - Runs LogitBoost with a fixed number of iterations. |
| protected void | performBoostingCV() - Runs LogitBoost, determining the best number of iterations by cross-validation. |
| protected void | performBoostingInfCriterion() - Runs LogitBoost, determining the best number of iterations by an information criterion (currently AIC). |
| protected boolean | performIteration(int iteration, double[][] trainYs, double[][] trainFs, double[][] probs, Instances trainNumeric) - Performs a single iteration of LogitBoost, and updates the model accordingly. |
| protected double[] | probs(double[] Fs) - Computes the p-values (probabilities for the classes) from the F-values of the logistic model. |
| void | setHeuristicStop(int heuristicStop) - Sets the option "heuristicStop". |
| void | setMaxIterations(int maxIterations) - Sets the parameter "maxIterations". |
| void | setUseAIC(boolean c) - Set the value of useAIC. |
| void | setWeightTrimBeta(double w) - Sets the option "weightTrimBeta". |
| java.lang.String | toString() - Returns a description of the logistic model (i.e., attributes and coefficients). |
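The method summary refers to F-values, p-values, and the LogitBoost response and weights without stating formulas. The sketch below shows the standard LogitBoost definitions of these quantities and is not taken from the LogisticBase source. The class name `LogitBoostQuantities`, the helper names, and the clamp value `ASSUMED_Z_MAX = 3.0` are assumptions for illustration; this page does not state the value of `Z_MAX`.

```java
import java.util.Arrays;

final class LogitBoostQuantities {
    // Assumed clamp for illustration; this page does not state the value of Z_MAX.
    static final double ASSUMED_Z_MAX = 3.0;

    /** p-values from F-values via a softmax, shifted by max(F) to avoid overflow. */
    static double[] probs(double[] fs) {
        double max = Arrays.stream(fs).max().orElse(0.0);
        double[] p = new double[fs.length];
        double sum = 0.0;
        for (int j = 0; j < fs.length; j++) {
            p[j] = Math.exp(fs[j] - max);
            sum += p[j];
        }
        for (int j = 0; j < fs.length; j++) {
            p[j] /= sum;
        }
        return p;
    }

    /** Working response z = (y - p) / (p * (1 - p)) for y in {0, 1}, clamped. */
    static double z(double actual, double p) {
        if (actual == 1.0) {
            return Math.min(1.0 / p, ASSUMED_Z_MAX);       // (1 - p) / (p (1 - p))
        }
        return Math.max(-1.0 / (1.0 - p), -ASSUMED_Z_MAX); // -p / (p (1 - p))
    }

    /** Working weight w = p * (1 - p). */
    static double w(double p) {
        return p * (1.0 - p);
    }

    public static void main(String[] args) {
        double[] p = probs(new double[] {0.5, -0.5, 0.0});
        System.out.println("p = " + Arrays.toString(p));
        System.out.println("z = " + z(1.0, p[0]) + ", w = " + w(p[0]));
    }
}
```

Clamping the response keeps instances with estimated probabilities near 0 or 1 from dominating the regression fit in a single iteration.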
Methods inherited from class AbstractClassifier: batchSizeTipText, classifyInstance, debugTipText, distributionsForInstances, doNotCheckCapabilitiesTipText, forName, getBatchSize, getCapabilities, getDebug, getDoNotCheckCapabilities, getNumDecimalPlaces, getOptions, implementsMoreEfficientBatchPrediction, listOptions, makeCopies, makeCopy, numDecimalPlacesTipText, postExecution, preExecution, run, runClassifier, setBatchSize, setDebug, setDoNotCheckCapabilities, setNumDecimalPlaces, setOptions

protected Instances m_numericDataHeader
protected Instances m_numericData
protected Instances m_train
protected boolean m_useCrossValidation
protected boolean m_errorOnProbabilities
protected int m_fixedNumIterations
protected int m_heuristicStop
protected int m_numRegressions
protected int m_maxIterations
protected int m_numClasses
protected SimpleLinearRegression[][] m_regressions
protected static int m_numFoldsBoosting
protected static final double Z_MAX
protected double m_numParameters
protected double m_weightTrimBeta
public LogisticBase()
public LogisticBase(int numBoostingIterations,
boolean useCrossValidation,
boolean errorOnProbabilities)
Parameters:
numBoostingIterations - fixed number of iterations for LogitBoost (if negative, use cross-validation or stopping criterion on the training data).
useCrossValidation - cross-validate number of LogitBoost iterations (if false, use stopping criterion on the training data).
errorOnProbabilities - if true, use error on probabilities instead of misclassification for stopping criterion of LogitBoost

public void buildClassifier(Instances data) throws java.lang.Exception
Specified by:
buildClassifier in interface Classifier
Parameters:
data - the training data
Throws:
java.lang.Exception - if something goes wrong

protected void performBoostingCV()
throws java.lang.Exception
Throws:
java.lang.Exception - if something goes wrong

protected SimpleLinearRegression[][] copyRegressions(SimpleLinearRegression[][] a) throws java.lang.Exception
Parameters:
a - the array to copy
Throws:
java.lang.Exception

protected void performBoostingInfCriterion()
throws java.lang.Exception
Throws:
java.lang.Exception

protected int performBoosting(Instances train, Instances test, double[] error, int maxIterations) throws java.lang.Exception
Parameters:
train - the training set
test - the test set
error - array to hold the logged error values
maxIterations - the maximum number of LogitBoost iterations to run
Throws:
java.lang.Exception - if something goes wrong

protected void performBoosting(int numIterations)
throws java.lang.Exception
Parameters:
numIterations - the number of iterations to run
Throws:
java.lang.Exception - if something goes wrong

protected void performBoosting()
throws java.lang.Exception
Throws:
java.lang.Exception - if something goes wrong

protected double getErrorRate(Instances data) throws java.lang.Exception
Parameters:
data - the set of instances
Throws:
java.lang.Exception - if something goes wrong

protected double getMeanAbsoluteError(Instances data) throws java.lang.Exception
Parameters:
data - the set of instances
Throws:
java.lang.Exception - if something goes wrong

protected int getBestIteration(double[] errors,
int maxIteration)
Parameters:
errors - an array containing errors
maxIteration - the maximum number of iterations

protected boolean performIteration(int iteration,
double[][] trainYs,
double[][] trainFs,
double[][] probs,
Instances trainNumeric)
throws java.lang.Exception
Parameters:
iteration - the current iteration
trainYs - the y-values (see description of LogitBoost) for the model trained so far
trainFs - the F-values (see description of LogitBoost) for the model trained so far
probs - the p-values (see description of LogitBoost) for the model trained so far
trainNumeric - numeric version of the training data
Throws:
java.lang.Exception - if something goes wrong

protected SimpleLinearRegression[][] initRegressions() throws java.lang.Exception
Throws:
java.lang.Exception

protected Instances getNumericData(Instances data) throws java.lang.Exception
Parameters:
data - the data to convert
Throws:
java.lang.Exception - if something goes wrong

protected double getZ(double actual,
double p)
Parameters:
actual - the actual class probability
p - the estimated class probability

protected double[][] getZs(double[][] probs,
double[][] dataYs)
Parameters:
dataYs - the actual class probabilities
probs - the estimated class probabilities

protected double[][] getWs(double[][] probs,
double[][] dataYs)
Parameters:
dataYs - the actual class probabilities
probs - the estimated class probabilities

protected double[] probs(double[] Fs)
Parameters:
Fs - the F-values

protected double[][] getYs(Instances data)
Parameters:
data - the data to compute the Y-values from

protected double[] getFs(Instance instance) throws java.lang.Exception
Parameters:
instance - the instance to compute the F-values for
Throws:
java.lang.Exception - if something goes wrong

protected double[][] getFs(Instances data) throws java.lang.Exception
Parameters:
data - the data to work on
Throws:
java.lang.Exception - if something goes wrong

protected double[][] getProbs(double[][] dataFs)
Parameters:
dataFs - the F-values

protected double negativeLogLikelihood(double[][] dataYs,
double[][] probs)
Parameters:
dataYs - the Y-values
probs - the p-values

public int[][] getUsedAttributes()
public int getNumRegressions()
public double getWeightTrimBeta()
public boolean getUseAIC()
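The `useAIC` option selects stopping by an information criterion, which the summary ties to `negativeLogLikelihood` and the effective number of parameters (`m_numParameters`). The sketch below uses the textbook definitions, AIC = 2 * NLL + 2 * k, and is not taken from the LogisticBase source; the class name `AicSketch` and its helpers are hypothetical, and the library may scale or weight these terms differently.

```java
final class AicSketch {
    /** Negative log-likelihood: -sum over instances i and classes j of y_ij * log(p_ij). */
    static double negativeLogLikelihood(double[][] ys, double[][] probs) {
        double nll = 0.0;
        for (int i = 0; i < ys.length; i++) {
            for (int j = 0; j < ys[i].length; j++) {
                if (ys[i][j] > 0.0) { // skip zero entries so log(0) terms never appear
                    nll -= ys[i][j] * Math.log(probs[i][j]);
                }
            }
        }
        return nll;
    }

    /** Textbook Akaike information criterion; lower is better. */
    static double aic(double nll, double numParameters) {
        return 2.0 * nll + 2.0 * numParameters;
    }

    public static void main(String[] args) {
        double[][] ys = {{1.0, 0.0}, {0.0, 1.0}};
        double[][] probs = {{0.8, 0.2}, {0.3, 0.7}};
        double nll = negativeLogLikelihood(ys, probs);
        System.out.println("NLL = " + nll + ", AIC(k=3) = " + aic(nll, 3.0));
    }
}
```

Under this criterion, an additional boosting iteration is only worthwhile if it lowers the negative log-likelihood by more than the increase in the parameter count.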
public void setMaxIterations(int maxIterations)
Parameters:
maxIterations - the maximum iterations

public void setHeuristicStop(int heuristicStop)
Parameters:
heuristicStop - the heuristic stop to use

public void setWeightTrimBeta(double w)

public void setUseAIC(boolean c)
Parameters:
c - Value to assign to useAIC.

public int getMaxIterations()
protected double[][] getCoefficients()
public double percentAttributesUsed()
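As a rough illustration of how a "percent of attributes used" figure can be derived from per-class attribute indices such as those returned by `getUsedAttributes()`, consider the sketch below. It assumes one index array per class and a denominator equal to the number of non-class attributes; both are assumptions, and the class `AttributeUsageSketch` and its method are hypothetical rather than part of the library.

```java
import java.util.HashSet;
import java.util.Set;

final class AttributeUsageSketch {
    /**
     * Percentage of predictors that appear in at least one class's model.
     * usedPerClass[j] holds the attribute indices used for class j (assumed layout).
     */
    static double percentUsed(int[][] usedPerClass, int numPredictors) {
        if (numPredictors <= 0) {
            throw new IllegalArgumentException("numPredictors must be positive");
        }
        Set<Integer> used = new HashSet<>();
        for (int[] perClass : usedPerClass) {
            for (int attribute : perClass) {
                used.add(attribute); // the set removes duplicates across classes
            }
        }
        return 100.0 * used.size() / numPredictors;
    }

    public static void main(String[] args) {
        int[][] usedPerClass = {{0, 2}, {2, 3}};
        System.out.println(percentUsed(usedPerClass, 8) + " percent of predictors used");
    }
}
```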
public java.lang.String toString()
Overrides:
toString in class java.lang.Object

public double[] distributionForInstance(Instance instance) throws java.lang.Exception
Specified by:
distributionForInstance in interface Classifier
Overrides:
distributionForInstance in class AbstractClassifier
Parameters:
instance - the instance to compute the distribution for
Throws:
java.lang.Exception - if distribution can't be computed successfully

public void cleanup()
public java.lang.String getRevision()
Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class AbstractClassifier
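`getBestIteration(double[] errors, int maxIteration)` is described above only as finding the minimum in an array of error values. A minimal sketch of that kind of selection follows. It assumes the search covers indices 0 through `maxIteration` inclusive and that ties go to the earliest iteration; neither detail is stated on this page, and the class `BestIterationSketch` is hypothetical.

```java
final class BestIterationSketch {
    /** Index of the smallest error in errors[0..maxIteration]; ties go to the earliest index. */
    static int bestIteration(double[] errors, int maxIteration) {
        if (errors.length == 0) {
            throw new IllegalArgumentException("errors must not be empty");
        }
        int last = Math.min(maxIteration, errors.length - 1); // guard against overrun
        int best = 0;
        for (int i = 1; i <= last; i++) {
            if (errors[i] < errors[best]) { // strict comparison keeps the earliest minimum
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] errors = {0.40, 0.30, 0.25, 0.25, 0.31};
        System.out.println("best iteration: " + bestIteration(errors, 4));
    }
}
```

Preferring the earliest minimum yields the smallest model among equally accurate candidates, which is a common choice when the error curve flattens out.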