public class Agrawal extends ClassificationGenerator implements TechnicalInformationHandler
@article{Agrawal1993,
author = {R. Agrawal and T. Imielinski and A. Swami},
journal = {IEEE Transactions on Knowledge and Data Engineering},
note = {Special issue on Learning and Discovery in Knowledge-Based Databases},
number = {6},
pages = {914-925},
title = {Database Mining: A Performance Perspective},
volume = {5},
year = {1993},
URL = {http://www.almaden.ibm.com/software/quest/Publications/ByDate.html},
PDF = {http://www.almaden.ibm.com/software/quest/Publications/papers/tkde93.pdf}
}
Valid options are:
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S <num> The seed for the random function (default 1)
-n <num> The number of examples to generate (default 100)
-F <num> The function to use for generating the data. (default 1)
-B Whether to balance the class.
-P <num> The perturbation factor. (default 0.05)
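The flags above can be assembled into an argument array, which is the form that `setOptions(String[])` and `main(String[])` accept. The following is a minimal, self-contained sketch: `AgrawalArgsSketch` and `buildArgs` are hypothetical names that are not part of this class, and the sketch assumes that `-S` takes an integer value, as its default of 1 suggests.

```java
import java.util.ArrayList;
import java.util.List;

final class AgrawalArgsSketch {
    // Hypothetical helper (not part of the documented API): turns typed
    // settings into the flag/value pairs listed in the option summary.
    static String[] buildArgs(int numExamples, int function, boolean balance,
                              double perturbation, int seed) {
        List<String> args = new ArrayList<>();
        args.add("-n"); args.add(Integer.toString(numExamples));
        args.add("-F"); args.add(Integer.toString(function));
        if (balance) {
            args.add("-B"); // a switch: present or absent, no value
        }
        args.add("-P"); args.add(Double.toString(perturbation));
        args.add("-S"); args.add(Integer.toString(seed)); // assumption: -S takes a value
        return args.toArray(new String[0]);
    }

    public static void main(String[] unused) {
        // prints: -n 1000 -F 2 -B -P 0.05 -S 42
        System.out.println(String.join(" ", buildArgs(1000, 2, true, 0.05, 42)));
    }
}
```

An array built this way would presumably be handed to `setOptions` on a generator instance or to the class's `main` method; the sketch only covers assembling the strings.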
| Modifier and Type | Class and Description |
|---|---|
| protected static interface | Agrawal.ClassFunction: the interface for the class functions |
| Modifier and Type | Field and Description |
|---|---|
| protected static Agrawal.ClassFunction[] | builtInFunctions: the built-in functions, based on the paper (page 924); they correspond to functions pred20 through pred29 in the public C code |
| static int | FUNCTION_1: function 1 |
| static int | FUNCTION_10: function 10 |
| static int | FUNCTION_2: function 2 |
| static int | FUNCTION_3: function 3 |
| static int | FUNCTION_4: function 4 |
| static int | FUNCTION_5: function 5 |
| static int | FUNCTION_6: function 6 |
| static int | FUNCTION_7: function 7 |
| static int | FUNCTION_8: function 8 |
| static int | FUNCTION_9: function 9 |
| static Tag[] | FUNCTION_TAGS: the function tags |
| protected boolean | m_BalanceClass: whether to balance the class |
| protected int | m_Function: the function to use for generating the data |
| protected double | m_lastLabel: the last class label that was generated |
| protected boolean | m_nextClassShouldBeZero: used for balancing the class |
| protected double | m_PerturbationFraction: the perturbation fraction |
Fields inherited from class ClassificationGenerator: m_NumExamples
Fields inherited from class DataGenerator: m_CreatingRelationName, m_DatasetFormat, m_Debug, m_DefaultOutput, m_NumExamplesAct, m_OptionBlacklist, m_Output, m_Random, m_RelationName, m_Seed

| Constructor and Description |
|---|
| Agrawal(): initializes the generator with default values |
| Modifier and Type | Method and Description |
|---|---|
| java.lang.String | balanceClassTipText(): Returns the tip text for this property. |
| protected boolean | defaultBalanceClass(): returns the default for balancing the class |
| protected SelectedTag | defaultFunction(): returns the default function |
| protected double | defaultPerturbationFraction(): returns the default perturbation fraction |
| Instances | defineDataFormat(): Initializes the format for the dataset produced. |
| java.lang.String | functionTipText(): Returns the tip text for this property. |
| Instance | generateExample(): Generates one example of the dataset. |
| Instances | generateExamples(): Generates all examples of the dataset. |
| java.lang.String | generateFinished(): Generates a comment string that documents the data generator. |
| java.lang.String | generateStart(): Generates a comment string that documents the data generator. |
| boolean | getBalanceClass(): Gets whether the class is balanced. |
| SelectedTag | getFunction(): Gets the function for generating the data. |
| java.lang.String[] | getOptions(): Gets the current settings of the data generator. |
| double | getPerturbationFraction(): Gets the perturbation fraction. |
| java.lang.String | getRevision(): Returns the revision string. |
| boolean | getSingleModeFlag(): Returns whether single mode is set for the given data generator; the mode depends on the option settings and/or the generator type. |
| TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., the paper or book this class is based on. |
| java.lang.String | globalInfo(): Returns a string describing this data generator. |
| java.util.Enumeration<Option> | listOptions(): Returns an enumeration describing the available options. |
| static void | main(java.lang.String[] args): Main method for executing this class. |
| java.lang.String | perturbationFractionTipText(): Returns the tip text for this property. |
| protected double | perturbValue(double val, double min, double max): perturbs the given value |
| protected double | perturbValue(double val, double range, double min, double max): perturbs the given value |
| void | setBalanceClass(boolean value): Sets whether the class is balanced. |
| void | setFunction(SelectedTag value): Sets the function for generating the data. |
| void | setOptions(java.lang.String[] options): Parses a list of options for this object. |
| void | setPerturbationFraction(double value): Sets the perturbation fraction. |
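The two `perturbValue` overloads in the method summary are described only as "perturbs the given value". As a rough illustration of what such a perturbation might look like, the sketch below adds uniform noise scaled by the range and the perturbation fraction, then clamps the result to `[min, max]`. This is an assumption about the behavior, not the documented implementation; `PerturbSketch` and the extra `fraction` and `rnd` parameters are hypothetical and stand in for the generator's own state.

```java
import java.util.Random;

final class PerturbSketch {
    // Assumed behavior: shift val by up to +/- (range * fraction), then clamp.
    static double perturbValue(double val, double range, double min, double max,
                               double fraction, Random rnd) {
        double noise = range * (2.0 * (rnd.nextDouble() - 0.5)) * fraction;
        double perturbed = val + noise;
        return Math.max(min, Math.min(max, perturbed));
    }

    // Three-argument form: assumed to use the full span (max - min) as the range.
    static double perturbValue(double val, double min, double max,
                               double fraction, Random rnd) {
        return perturbValue(val, max - min, min, max, fraction, rnd);
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        // With a span of 60 and a fraction of 0.05, the result stays within 50 +/- 3.
        System.out.println(perturbValue(50.0, 20.0, 80.0, 0.05, rnd));
    }
}
```

Under this reading, a fraction of 0 leaves values untouched, and the default of 0.05 moves each value by at most 5% of its range.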
Methods inherited from class ClassificationGenerator: defaultNumExamples, getNumExamples, numExamplesTipText, setNumExamples
Methods inherited from class DataGenerator: addToBlacklist, clearBlacklist, debugTipText, defaultNumExamplesAct, defaultOutput, defaultRelationName, defaultSeed, enumToVector, formatTipText, getDatasetFormat, getDebug, getEpilogue, getNumExamplesAct, getOutput, getPrologue, getRandom, getRelationName, getRelationNameToUse, getSeed, isOnBlacklist, makeData, makeOptionString, numExamplesActTipText, outputTipText, randomTipText, relationNameTipText, removeBlacklist, runDataGenerator, seedTipText, setDatasetFormat, setDebug, setNumExamplesAct, setOutput, setRandom, setRelationName, setSeed, toStringFormat

protected static Agrawal.ClassFunction[] builtInFunctions
public static final int FUNCTION_1
public static final int FUNCTION_2
public static final int FUNCTION_3
public static final int FUNCTION_4
public static final int FUNCTION_5
public static final int FUNCTION_6
public static final int FUNCTION_7
public static final int FUNCTION_8
public static final int FUNCTION_9
public static final int FUNCTION_10
public static final Tag[] FUNCTION_TAGS
protected int m_Function
protected boolean m_BalanceClass
protected double m_PerturbationFraction
protected boolean m_nextClassShouldBeZero
protected double m_lastLabel
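The fields `m_BalanceClass` and `m_nextClassShouldBeZero` suggest that balancing works by alternating the class that the next example is expected to have. This page does not describe the mechanism, so the following is only one plausible reading, sketched as rejection sampling: candidates are drawn until one has the wanted label, and the wanted label then flips. `BalanceSketch`, `balancedLabels`, and the label source are hypothetical names introduced for illustration.

```java
import java.util.Random;
import java.util.function.IntSupplier;

final class BalanceSketch {
    // Draws n labels (0 or 1) from labelSource, rejecting candidates until the
    // label matches the class that is due next. Assumes the source can produce
    // both classes; otherwise the inner loop would never terminate.
    static int[] balancedLabels(int n, IntSupplier labelSource) {
        int[] out = new int[n];
        boolean nextClassShouldBeZero = true; // mirrors the field name above
        for (int i = 0; i < n; i++) {
            int label;
            do {
                label = labelSource.getAsInt();
            } while ((label == 0) != nextClassShouldBeZero);
            out[i] = label;
            nextClassShouldBeZero = !nextClassShouldBeZero;
        }
        return out;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        // A skewed source: about 70% of raw draws are class 0.
        IntSupplier skewed = () -> rnd.nextDouble() < 0.7 ? 0 : 1;
        int zeros = 0;
        for (int label : balancedLabels(100, skewed)) {
            if (label == 0) zeros++;
        }
        // prints: zeros=50 ones=50
        System.out.println("zeros=" + zeros + " ones=" + (100 - zeros));
    }
}
```

The trade-off of this approach is extra work on skewed functions, since rejected candidates are discarded rather than reused.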
public java.lang.String globalInfo()
public TechnicalInformation getTechnicalInformation()
getTechnicalInformation in interface TechnicalInformationHandler
public java.util.Enumeration<Option> listOptions()
listOptions in interface OptionHandler
listOptions in class ClassificationGenerator
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-h Prints this help.
-o <file> The name of the output file, otherwise the generated data is printed to stdout.
-r <name> The name of the relation.
-d Whether to print debug information.
-S <num> The seed for the random function (default 1)
-n <num> The number of examples to generate (default 100)
-F <num> The function to use for generating the data. (default 1)
-B Whether to balance the class.
-P <num> The perturbation factor. (default 0.05)
setOptions in interface OptionHandler
setOptions in class ClassificationGenerator
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions in interface OptionHandler
getOptions in class ClassificationGenerator
See Also:
DataGenerator.removeBlacklist(String[])
protected SelectedTag defaultFunction()
public SelectedTag getFunction()
See Also:
FUNCTION_TAGS
public void setFunction(SelectedTag value)
Parameters:
value - the function.
See Also:
FUNCTION_TAGS
public java.lang.String functionTipText()
protected boolean defaultBalanceClass()
public boolean getBalanceClass()
public void setBalanceClass(boolean value)
Parameters:
value - whether to balance the class.
public java.lang.String balanceClassTipText()
protected double defaultPerturbationFraction()
public double getPerturbationFraction()
public void setPerturbationFraction(double value)
Parameters:
value - the perturbation fraction.
public java.lang.String perturbationFractionTipText()
public boolean getSingleModeFlag() throws java.lang.Exception
getSingleModeFlag in class DataGenerator
Throws:
java.lang.Exception - if mode is not set yet
public Instances defineDataFormat() throws java.lang.Exception
defineDataFormat in class DataGenerator
Throws:
java.lang.Exception - if the generating of the format failed
See Also:
DataGenerator.getSeed()
protected double perturbValue(double val, double min, double max)
Parameters:
val - the value to perturb
min - the minimum
max - the maximum
protected double perturbValue(double val, double range, double min, double max)
Parameters:
val - the value to perturb
range - the range for the perturbation
min - the minimum
max - the maximum
public Instance generateExample() throws java.lang.Exception
generateExample in class DataGenerator
Throws:
java.lang.Exception - if the format of the dataset is not yet defined
java.lang.Exception - if the generator only works with generateExamples, which means in non-single mode
public Instances generateExamples() throws java.lang.Exception
generateExamples in class DataGenerator
Throws:
java.lang.Exception - if the format of the dataset is not yet defined
java.lang.Exception - if the generator only works with generateExample, which means in single mode
See Also:
DataGenerator.getSeed()
public java.lang.String generateStart()
generateStart in class DataGenerator
public java.lang.String generateFinished() throws java.lang.Exception
generateFinished in class DataGenerator
Throws:
java.lang.Exception - if the generating of the documentation fails
public java.lang.String getRevision()
getRevision in interface RevisionHandler
public static void main(java.lang.String[] args)
Parameters:
args - should contain arguments for the data producer: