Package ai.libs.jaicore.ml.cache
Class ReproducibleInstances
- java.lang.Object
- java.util.AbstractCollection<E>
- java.util.AbstractList<weka.core.Instance>
- weka.core.Instances
- ai.libs.jaicore.ml.cache.ReproducibleInstances
- All Implemented Interfaces:
java.io.Serializable, java.lang.Iterable<weka.core.Instance>, java.util.Collection<weka.core.Instance>, java.util.List<weka.core.Instance>, weka.core.RevisionHandler
public class ReproducibleInstances extends weka.core.Instances
New Instances class to track splits and data origin. The origin of the dataset is stored by a LoadDataSetInstruction and changed by FoldBasedSubsetInstructions, saved as a list of instructions. This history of the instances can be converted to JSON and used to reproduce a specific set of instances.
- See Also:
- Serialized Form
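The idea of a replayable history can be illustrated with a minimal, self-contained sketch. It does not use the AILibs API: the class HistorySketch, the nested FoldStep type, and the modulo-based fold rule are all hypothetical stand-ins, chosen only to show how a source plus an ordered list of subset steps deterministically yields the same subset every time.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a replayable dataset history; not the AILibs API. */
final class HistorySketch {

    /** Hypothetical fold-based subset step, kept as plain data so it could be serialized. */
    static final class FoldStep {
        final int numFolds;
        final int foldIndex;

        FoldStep(int numFolds, int foldIndex) {
            this.numFolds = numFolds;
            this.foldIndex = foldIndex;
        }

        /** Keeps the rows whose position modulo numFolds equals foldIndex (an assumed rule). */
        List<Integer> apply(List<Integer> rows) {
            List<Integer> kept = new ArrayList<>();
            for (int i = 0; i < rows.size(); i++) {
                if (i % numFolds == foldIndex) {
                    kept.add(rows.get(i));
                }
            }
            return kept;
        }
    }

    /** Replays the recorded steps, in order, starting from the source rows. */
    static List<Integer> replay(List<Integer> source, List<FoldStep> history) {
        List<Integer> current = new ArrayList<>(source);
        for (FoldStep step : history) {
            current = step.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<Integer> source = List.of(10, 11, 12, 13, 14, 15, 16, 17);
        List<FoldStep> history = List.of(new FoldStep(2, 0), new FoldStep(2, 1));
        System.out.println(replay(source, history)); // prints [12, 16]
    }
}
```

Because the steps are plain data rather than code, the same list could be written out and read back before replaying, which is the property the class description relies on.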
-
-
Constructor Summary
Constructors
- ReproducibleInstances(ReproducibleInstances dataset)
-
Method Summary
Modifier and Type, Method, Description
- static ReproducibleInstances fromARFF(java.io.File arffFile)
- static ReproducibleInstances fromHistory(InstructionGraph history, ai.libs.jaicore.basic.sets.Pair<java.lang.String,java.lang.Integer> outputUnitOfHistory)
  Creates a new ReproducibleInstances object.
- static ReproducibleInstances fromOpenML(int id, java.lang.String apiKey)
  Creates a new ReproducibleInstances object.
- InstructionGraph getInstructions()
- ai.libs.jaicore.basic.sets.Pair<java.lang.String,java.lang.Integer> getOutputUnit()
- boolean isCacheLookup()
  If true signifies that performance on this data should be looked up in cache.
- boolean isCacheStorage()
  If true signifies that performance evaluation should be stored.
- ReproducibleInstances reduceWithInstruction(java.lang.String nameOfRefinementInstruction, Instruction instruction, int outputOfRefinementInstruction)
  Creates a reduced version of the dataset by using an instruction with one input and one output.
- void setCacheLookup(boolean cacheLookup)
  If true signifies that performance on this data should be looked up in cache.
- void setCacheStorage(boolean cacheStorage)
  If set to true, signifies that performance evaluation should be stored.
- void setOutputUnitWithoutRecomputation(ai.libs.jaicore.basic.sets.Pair<java.lang.String,java.lang.Integer> outputUnit)
Methods inherited from class weka.core.Instances
add, add, allAttributeWeightsIdentical, allInstanceWeightsIdentical, attribute, attribute, attributeStats, attributeToDoubleArray, checkForAttributeType, checkForStringAttributes, checkInstance, classAttribute, classIndex, compactify, copyInstances, delete, delete, deleteAttributeAt, deleteAttributeType, deleteStringAttributes, deleteWithMissing, deleteWithMissing, deleteWithMissingClass, enumerateAttributes, enumerateInstances, equalHeaders, equalHeadersMsg, firstInstance, get, getRandomNumberGenerator, getRevision, initialize, insertAttributeAt, instance, instancesAndWeights, kthSmallestValue, kthSmallestValue, lastInstance, main, meanOrMode, meanOrMode, mergeInstances, numAttributes, numClasses, numDistinctValues, numDistinctValues, numInstances, randomize, readInstance, relationName, remove, renameAttribute, renameAttribute, renameAttributeValue, renameAttributeValue, replaceAttributeAt, resample, resampleWithWeights, resampleWithWeights, resampleWithWeights, resampleWithWeights, resampleWithWeights, resampleWithWeights, resampleWithWeights, set, setAttributeWeight, setAttributeWeight, setClass, setClassIndex, setRelationName, size, sort, sort, sortBasedOnNominalAttribute, stableSort, stableSort, stratify, stratStep, stringFreeStructure, stringWithoutHeader, sumOfWeights, swap, test, testCV, toString, toSummaryString, trainCV, trainCV, variance, variance, variances
-
Methods inherited from class java.util.AbstractList
addAll, clear, equals, hashCode, indexOf, iterator, lastIndexOf, listIterator, listIterator, removeRange, subList
-
Methods inherited from class java.util.AbstractCollection
addAll, contains, containsAll, isEmpty, remove, removeAll, retainAll, toArray, toArray
-
-
-
-
Constructor Detail
-
ReproducibleInstances
public ReproducibleInstances(ReproducibleInstances dataset)
-
-
Method Detail
-
fromHistory
public static ReproducibleInstances fromHistory(InstructionGraph history, ai.libs.jaicore.basic.sets.Pair<java.lang.String,java.lang.Integer> outputUnitOfHistory) throws InstructionFailedException, java.lang.InterruptedException
Creates a new ReproducibleInstances object from an instruction history.
- Parameters:
history - the InstructionGraph from which the instances are reproduced
outputUnitOfHistory - the output unit of the history (a pair of a String and an Integer) whose result becomes the instances
- Returns:
- new ReproducibleInstances object
- Throws:
InstructionFailedException
java.lang.InterruptedException
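The signature of fromHistory takes the output unit as a pair of a String and an Integer. One plausible reading, shown in the sketch below, is that the String names a node in the history and the Integer picks one of that node's outputs. This is an assumption, not documented behavior: OutputUnitSketch, Step, split, and select are hypothetical names, and the rule that each node consumes output 0 of the previous node is invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of addressing one output of a named step; not the AILibs API. */
final class OutputUnitSketch {

    /** Hypothetical step that turns one input into several outputs. */
    interface Step {
        List<List<Integer>> run(List<Integer> input);
    }

    /** Hypothetical split step: output 0 holds the first trainSize rows, output 1 the rest. */
    static Step split(int trainSize) {
        return rows -> List.of(rows.subList(0, trainSize), rows.subList(trainSize, rows.size()));
    }

    /** Runs the steps in insertion order and returns the output addressed by (unitName, unitOutput). */
    static List<Integer> select(Map<String, Step> graph, List<Integer> source, String unitName, int unitOutput) {
        List<Integer> input = source;
        for (Map.Entry<String, Step> node : graph.entrySet()) {
            List<List<Integer>> outputs = node.getValue().run(input);
            if (node.getKey().equals(unitName)) {
                return outputs.get(unitOutput);
            }
            input = outputs.get(0); // assumption: the next step consumes output 0
        }
        throw new IllegalArgumentException("Unknown output unit: " + unitName);
    }

    public static void main(String[] args) {
        Map<String, Step> graph = new LinkedHashMap<>();
        graph.put("split", split(3));
        List<Integer> source = List.of(1, 2, 3, 4, 5);
        System.out.println(select(graph, source, "split", 0)); // prints [1, 2, 3]
        System.out.println(select(graph, source, "split", 1)); // prints [4, 5]
    }
}
```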
-
fromOpenML
public static ReproducibleInstances fromOpenML(int id, java.lang.String apiKey) throws InstructionFailedException, java.lang.InterruptedException
Creates a new ReproducibleInstances object. Data is loaded from openml.org.
- Parameters:
id - the id of the OpenML dataset
apiKey - the API key to use
- Returns:
- new ReproducibleInstances object
- Throws:
InstructionFailedException - if something goes wrong while loading the instances from OpenML
java.lang.InterruptedException
-
fromARFF
public static ReproducibleInstances fromARFF(java.io.File arffFile) throws InstructionFailedException, java.lang.InterruptedException
- Throws:
InstructionFailedException
java.lang.InterruptedException
-
getInstructions
public InstructionGraph getInstructions()
- Returns:
- the ordered list of instructions, or null if the cache is not used
-
getOutputUnit
public ai.libs.jaicore.basic.sets.Pair<java.lang.String,java.lang.Integer> getOutputUnit()
-
setOutputUnitWithoutRecomputation
public void setOutputUnitWithoutRecomputation(ai.libs.jaicore.basic.sets.Pair<java.lang.String,java.lang.Integer> outputUnit)
-
isCacheStorage
public boolean isCacheStorage()
If true, signifies that performance evaluation should be stored.
- Returns:
- true if performance should be saved
-
setCacheStorage
public void setCacheStorage(boolean cacheStorage)
If set to true, signifies that performance evaluation should be stored.
- Parameters:
cacheStorage - the cacheStorage to set
-
isCacheLookup
public boolean isCacheLookup()
If true, signifies that performance on this data should be looked up in the cache.
- Returns:
- true if lookup should be performed
-
setCacheLookup
public void setCacheLookup(boolean cacheLookup)
If true, signifies that performance on this data should be looked up in the cache.
- Parameters:
cacheLookup - the cacheLookup to set
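The two flags only signal intent. How an evaluator reacts to them is not described here, so the following is a minimal sketch of one way a consumer might honor them. CacheFlagSketch, its string dataset key, and the in-memory map are hypothetical and stand in for whatever cache the surrounding code actually uses.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/** Hypothetical consumer of lookup and storage flags; not the AILibs API. */
final class CacheFlagSketch {

    private final Map<String, Double> cache = new HashMap<>();

    /** Counts how often the evaluator was actually invoked. */
    int evaluations = 0;

    /**
     * Returns a cached score when lookup is enabled and an entry exists.
     * Otherwise evaluates, and stores the score only when storage is enabled.
     */
    double evaluate(String datasetKey, boolean cacheLookup, boolean cacheStorage,
                    ToDoubleFunction<String> evaluator) {
        if (cacheLookup && cache.containsKey(datasetKey)) {
            return cache.get(datasetKey);
        }
        evaluations++;
        double score = evaluator.applyAsDouble(datasetKey);
        if (cacheStorage) {
            cache.put(datasetKey, score);
        }
        return score;
    }

    public static void main(String[] args) {
        CacheFlagSketch sketch = new CacheFlagSketch();
        sketch.evaluate("fold-0", true, true, key -> 0.5);  // miss: evaluated and stored
        double cached = sketch.evaluate("fold-0", true, true, key -> 0.9); // hit: returns 0.5
        System.out.println(cached + " after " + sketch.evaluations + " evaluation(s)");
    }
}
```

Keeping lookup and storage as separate switches allows, for example, a run that fills the cache without ever reading from it.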
-
reduceWithInstruction
public ReproducibleInstances reduceWithInstruction(java.lang.String nameOfRefinementInstruction, Instruction instruction, int outputOfRefinementInstruction) throws java.lang.ClassNotFoundException, InstructionFailedException, java.lang.InterruptedException
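Creates a reduced version of the dataset by using an instruction with one input and one output.
- Parameters:
nameOfRefinementInstruction - the name of the refinement instruction
instruction - the instruction used to reduce the dataset
outputOfRefinementInstruction - the output of the refinement instruction to use
- Returns:
- the reduced ReproducibleInstances
- Throws:
java.lang.InterruptedException
InstructionFailedException
java.lang.ClassNotFoundException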
-
-