Class ScikitLearnWrapper

  • All Implemented Interfaces:
    IInstancesClassifier, java.io.Serializable, weka.classifiers.Classifier

    public class ScikitLearnWrapper
    extends java.lang.Object
    implements IInstancesClassifier, weka.classifiers.Classifier
    Wraps a Scikit-Learn Python process. A script template is used to start the given classifier in Scikit-Learn.

    Usage: Set the constructInstruction to exactly the command with which the classifier should be instantiated, e.g. "LinearRegression()" or "MLPRegressor(solver = 'lbfgs')". Set the imports to exactly the additional import lines that are necessary to run the construction command. It is up to the user whether fully qualified names or only the class names themselves are used, as long as the import matches the construct call, e.g. (without namespace in the construct call) "from sklearn.linear_model import LinearRegression" or (with namespace) "import sklearn.linear_model".

    createImportStatementFromImportFolder can help to import a custom folder of modules: it initializes the folder so that it can be used as a source of modules. Its keepNamespace flag must be set to match the form of the construct call (as described above).

    Before training, specify whether the given dataset is a classification (categorical) or a regression task (setIsRegression). If the task is a multi-target prediction, setTargets must be used to define which columns of the dataset are the targets. If no targets are defined, only the last column is assumed to be the target vector. The output folder can also be changed from its default (setOutputFolder). After that, buildClassifier can be run.

    If classifyInstances is run on the same ScikitLearnWrapper instance after training, the previously trained model is used for testing. If another model is to be used, or if no training took place before classifyInstances, the model must be set with setModelPath. After a multi-target prediction, the results may be easier to work with in the unflattened representation returned by getRawLastClassificationResults. For debugging purposes, the wrapper can be made verbose with setIsVerbose.
    See Also:
    Serialized Form
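    The rule that the import must match the construct call can be illustrated with a small sketch. The class ConstructCallSketch and its two methods are hypothetical helpers written for this illustration and are not part of the wrapper; they only build the two strings in a consistent namespace style.

```java
/** Hypothetical helper (not part of ScikitLearnWrapper): builds matching import and construct strings. */
final class ConstructCallSketch {

    private ConstructCallSketch() {
    }

    /** Returns the Python import line for the chosen namespace style. */
    static String importLine(String module, String className, boolean keepNamespace) {
        return keepNamespace
                ? "import " + module
                : "from " + module + " import " + className;
    }

    /** Returns the construct call that is consistent with importLine for the same flag. */
    static String constructInstruction(String module, String className, String args, boolean keepNamespace) {
        String name = keepNamespace ? module + "." + className : className;
        return name + "(" + args + ")";
    }
}
```

    With keepNamespace set to true, the module "sklearn.linear_model" and the class "LinearRegression" give "import sklearn.linear_model" and "sklearn.linear_model.LinearRegression()"; with false they give "from sklearn.linear_model import LinearRegression" and "LinearRegression()". Either pair could then be passed as imports and constructInstruction to the constructor.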
    • Constructor Detail

      • ScikitLearnWrapper

        public ScikitLearnWrapper​(java.lang.String constructInstruction,
                                  java.lang.String imports,
                                  boolean withoutModelDump)
                           throws java.io.IOException
        Starts a new wrapper and creates its underlying script with the given parameters.
        Parameters:
        constructInstruction - String that defines what constructor to call for the classifier and with which parameters to call it.
        imports - Import lines that are placed at the beginning of the script. Normally only the imports necessary for the constructor instruction need to be added here.
        Throws:
        java.io.IOException - The script could not be created.
      • ScikitLearnWrapper

        public ScikitLearnWrapper​(java.lang.String constructInstruction,
                                  java.lang.String imports)
                           throws java.io.IOException
        Starts a new wrapper and creates its underlying script with the given parameters.
        Parameters:
        constructInstruction - String that defines what constructor to call for the classifier and with which parameters to call it.
        imports - Import lines that are placed at the beginning of the script. Normally only the imports necessary for the constructor instruction need to be added here.
        Throws:
        java.io.IOException - The script could not be created.
      • ScikitLearnWrapper

        public ScikitLearnWrapper​(java.lang.String constructInstruction,
                                  java.lang.String imports,
                                  java.io.File trainedModelPath)
                           throws java.io.IOException
        Throws:
        java.io.IOException
    • Method Detail

      • buildClassifier

        public void buildClassifier​(weka.core.Instances data)
                             throws java.lang.Exception
        Specified by:
        buildClassifier in interface weka.classifiers.Classifier
        Throws:
        java.lang.Exception
      • classifyInstances

        public double[] classifyInstances​(weka.core.Instances data)
                                   throws java.lang.Exception
        Specified by:
        classifyInstances in interface IInstancesClassifier
        Throws:
        java.lang.Exception
      • classifyInstance

        public double classifyInstance​(weka.core.Instance instance)
                                throws java.lang.Exception
        Specified by:
        classifyInstance in interface weka.classifiers.Classifier
        Throws:
        java.lang.Exception
      • createImportStatementFromImportFolder

        public static java.lang.String createImportStatementFromImportFolder​(java.io.File importsFolder,
                                                                             boolean keepNamespace)
                                                                      throws java.io.IOException
        Makes the given folder usable as a Python module and creates a string that adds the folder to the Python environment and then imports the folder itself as a module.
        Parameters:
        importsFolder - Folder to be added as a module.
        keepNamespace - If true, a class must be referenced by its module's name plus the class name. This only matters if multiple modules are imported and class names are ambiguous. Keep in mind that the constructor call for the classifier must be written accordingly.
        Returns:
        String that can be appended to other imports so that the folder is added as a module.
        Throws:
        java.io.IOException - The __init__.py couldn't be created in the given folder (which is necessary to declare it as a module).
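        The following is a minimal sketch of how such an import string could be produced. It is not the wrapper's actual implementation. The class ImportFolderSketch, the method importStatementFor, and the exact Python lines it emits (one import per .py file in the folder) are assumptions made for illustration. The sketch also assumes that the folder path contains no quote characters.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical re-creation of the described behavior; the real method may emit different lines. */
final class ImportFolderSketch {

    private ImportFolderSketch() {
    }

    static String importStatementFor(File importsFolder, boolean keepNamespace) throws IOException {
        File folder = importsFolder.getAbsoluteFile();
        File parent = folder.getParentFile();
        File[] entries = folder.listFiles();
        if (entries == null || parent == null) {
            throw new IOException("Not a usable module folder: " + folder);
        }

        // An __init__.py marks the folder as a regular Python package.
        Path init = folder.toPath().resolve("__init__.py");
        if (Files.notExists(init)) {
            Files.createFile(init);
        }

        String pkg = folder.getName();
        StringBuilder sb = new StringBuilder("import sys\n");
        // Forward slashes keep the path a valid Python string literal on Windows.
        sb.append("sys.path.append('").append(parent.getPath().replace('\\', '/')).append("')\n");

        Arrays.sort(entries); // deterministic order of the generated imports
        for (File f : entries) {
            String name = f.getName();
            if (!f.isFile() || !name.endsWith(".py") || name.equals("__init__.py")) {
                continue;
            }
            String module = name.substring(0, name.length() - 3);
            // keepNamespace: classes are referenced as module.ClassName in the construct call.
            sb.append(keepNamespace
                    ? "from " + pkg + " import " + module
                    : "from " + pkg + "." + module + " import *").append('\n');
        }
        return sb.toString();
    }
}
```

        In this sketch, a folder containing my_model.py yields a line that imports my_model by name when keepNamespace is true, so the construct call would have to read "my_model.MyModel()"; with keepNamespace set to false the class names are imported directly and "MyModel()" would be used instead. The file and class names here are examples only.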
      • getImportString

        public static java.lang.String getImportString​(java.util.Collection<java.lang.String> imports)
      • getRawLastClassificationResults

        public java.util.List<java.util.List<java.lang.Double>> getRawLastClassificationResults()
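        The relation between a flat prediction array and the unflattened representation mentioned in the class description can be sketched as follows. The class UnflattenSketch and its method are hypothetical, and the layout (one row per instance, with the values of all targets stored consecutively) is an assumption; the wrapper's actual ordering is not documented here.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: regroups a flat prediction array into one list per instance. */
final class UnflattenSketch {

    private UnflattenSketch() {
    }

    static List<List<Double>> unflatten(double[] flat, int numTargets) {
        if (numTargets <= 0 || flat.length % numTargets != 0) {
            throw new IllegalArgumentException("Array length must be a multiple of the number of targets.");
        }
        List<List<Double>> rows = new ArrayList<>(flat.length / numTargets);
        for (int start = 0; start < flat.length; start += numTargets) {
            // Assumed layout: the numTargets values of one instance are stored consecutively.
            List<Double> row = new ArrayList<>(numTargets);
            for (int j = 0; j < numTargets; j++) {
                row.add(flat[start + j]);
            }
            rows.add(row);
        }
        return rows;
    }
}
```

        Under this assumption, six flat values with two targets become three rows of two values each, which is easier to index by instance and by target than the flat array.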
      • setTargets

        public void setTargets​(int... targetColumns)
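        The default described in the class description (only the last column is the target when no targets are defined) can be sketched as below. TargetColumnsSketch and resolveTargets are hypothetical names, and zero-based column indices are an assumption; this is not the wrapper's own code.

```java
/** Hypothetical helper: applies the "last column is the target" default. */
final class TargetColumnsSketch {

    private TargetColumnsSketch() {
    }

    static int[] resolveTargets(int numColumns, int... targetColumns) {
        if (numColumns <= 0) {
            throw new IllegalArgumentException("The dataset needs at least one column.");
        }
        if (targetColumns == null || targetColumns.length == 0) {
            // No explicit targets: fall back to the last column (zero-based index assumed).
            return new int[] {numColumns - 1};
        }
        for (int column : targetColumns) {
            if (column < 0 || column >= numColumns) {
                throw new IllegalArgumentException("Target column out of range: " + column);
            }
        }
        return targetColumns.clone();
    }
}
```

        For a dataset with five columns, calling the helper without targets yields column 4, while passing 3 and 4 marks both columns as targets, analogous to a call such as setTargets(3, 4).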
      • setModelPath

        public void setModelPath​(java.io.File modelFile)
      • getModelPath

        public java.io.File getModelPath()
      • distributionForInstance

        public double[] distributionForInstance​(weka.core.Instance instance)
                                         throws java.lang.Exception
        Specified by:
        distributionForInstance in interface weka.classifiers.Classifier
        Throws:
        java.lang.Exception
      • getCapabilities

        public weka.core.Capabilities getCapabilities()
        Specified by:
        getCapabilities in interface weka.classifiers.Classifier
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object