Class ArffDatasetAdapter

  • All Implemented Interfaces:
    org.api4.java.ai.ml.core.dataset.serialization.IDatasetDeserializer<org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance>>, org.api4.java.common.control.ILoggingCustomizable

    public class ArffDatasetAdapter
    extends java.lang.Object
    implements org.api4.java.ai.ml.core.dataset.serialization.IDatasetDeserializer<org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance>>, org.api4.java.common.control.ILoggingCustomizable
    Handles dataset files in the ARFF format (see https://waikato.github.io/weka-wiki/formats_and_processing/arff/).
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      protected org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> createDataset​(ai.libs.jaicore.basic.kvstore.KVStore relationMetaData, java.util.List<org.api4.java.ai.ml.core.dataset.schema.attribute.IAttribute> attributes)  
      org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> deserializeDataset()  
      org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> deserializeDataset​(org.api4.java.ai.ml.core.dataset.descriptor.IDatasetDescriptor datasetDescriptor)  
      org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> deserializeDataset​(org.api4.java.ai.ml.core.dataset.descriptor.IFileDatasetDescriptor datasetDescriptor, int columnWithClassIndex)  
      org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> deserializeDataset​(org.api4.java.ai.ml.core.dataset.descriptor.IFileDatasetDescriptor datasetFile, java.lang.String nameOfClassAttribute)  
      org.api4.java.ai.ml.core.dataset.schema.attribute.IAttribute getAttributeWithName​(org.api4.java.ai.ml.core.dataset.descriptor.IFileDatasetDescriptor datasetFile, java.lang.String nameOfAttribute)  
      java.lang.String getLoggerName()  
      protected org.api4.java.ai.ml.core.dataset.schema.attribute.IAttribute parseAttribute​(java.lang.String line)
      Parses an attribute definition of an ARFF file.
      protected java.util.List<java.lang.Object> parseInstance​(boolean sparseData, java.util.List<org.api4.java.ai.ml.core.dataset.schema.attribute.IAttribute> attributes, int targetIndex, java.lang.String line)
      Parses a single instance of an ARFF file containing values for each attribute given (attributes parameter).
      protected ai.libs.jaicore.basic.kvstore.KVStore parseRelation​(java.lang.String line)
      Extracts meta data about a relation from a string.
      org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> readDataset​(boolean sparseMode, java.io.File datasetFile)  
      org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> readDataset​(boolean sparseMode, java.io.File datasetFile, int columnWithClassIndex)
      Parses the ARFF dataset from the given file into an ILabeledDataset.
      org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> readDataset​(java.io.File datasetFile)  
      void serializeDataset​(java.io.File arffOutputFile, org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<? extends org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> data)  
      void setLoggerName​(java.lang.String name)  
      java.lang.String[] splitDenseInstanceLine​(java.lang.String line)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • REG_EXP_DATA_LINE

        public static final java.util.regex.Pattern REG_EXP_DATA_LINE
    • Constructor Detail

      • ArffDatasetAdapter

        public ArffDatasetAdapter​(boolean sparseMode,
                                  org.api4.java.ai.ml.core.dataset.descriptor.IDatasetDescriptor datasetDescriptor)
      • ArffDatasetAdapter

        public ArffDatasetAdapter​(boolean sparseMode)
      • ArffDatasetAdapter

        public ArffDatasetAdapter()
    • Method Detail

      • getAttributeWithName

        public org.api4.java.ai.ml.core.dataset.schema.attribute.IAttribute getAttributeWithName​(org.api4.java.ai.ml.core.dataset.descriptor.IFileDatasetDescriptor datasetFile,
                                                                                                 java.lang.String nameOfAttribute)
                                                                                          throws org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
        Throws:
        org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
      • deserializeDataset

        public org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> deserializeDataset​(org.api4.java.ai.ml.core.dataset.descriptor.IFileDatasetDescriptor datasetFile,
                                                                                                                                                            java.lang.String nameOfClassAttribute)
                                                                                                                                                     throws org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
        Throws:
        org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
      • deserializeDataset

        public org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> deserializeDataset​(org.api4.java.ai.ml.core.dataset.descriptor.IFileDatasetDescriptor datasetDescriptor,
                                                                                                                                                            int columnWithClassIndex)
                                                                                                                                                     throws org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
        Throws:
        org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
      • deserializeDataset

        public org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> deserializeDataset​(org.api4.java.ai.ml.core.dataset.descriptor.IDatasetDescriptor datasetDescriptor)
                                                                                                                                                     throws org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException,
                                                                                                                                                            java.lang.InterruptedException
        Specified by:
        deserializeDataset in interface org.api4.java.ai.ml.core.dataset.serialization.IDatasetDeserializer<org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance>>
        Throws:
        org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
        java.lang.InterruptedException
      • deserializeDataset

        public org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> deserializeDataset()
                                                                                                                                                     throws java.lang.InterruptedException,
                                                                                                                                                            org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
        Throws:
        java.lang.InterruptedException
        org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
      • parseRelation

        protected ai.libs.jaicore.basic.kvstore.KVStore parseRelation​(java.lang.String line)
        Extracts meta data about a relation from a string.
        Parameters:
        line - The line which is to be parsed to extract the necessary information from the relation name.
        Returns:
        A KVStore containing the parsed meta data.
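
        To illustrate the kind of line this method receives, here is a minimal, self-contained sketch that pulls the relation name out of an "@relation" header line. It is not the library's implementation: the class RelationLineSketch, the method parseRelationLine, and the map key "relationName" are hypothetical, and a plain Map stands in for KVStore. Which keys the real method stores is not documented on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RelationLineSketch {

    /** Extracts the relation name from an "@relation <name>" header line. */
    static Map<String, String> parseRelationLine(String line) {
        final String keyword = "@relation";
        String trimmed = line.trim();
        // ARFF keywords are case-insensitive, so compare while ignoring case.
        if (!trimmed.regionMatches(true, 0, keyword, 0, keyword.length())) {
            throw new IllegalArgumentException("Not a relation declaration: " + line);
        }
        String name = trimmed.substring(keyword.length()).trim();
        // Names that contain spaces are quoted; strip one pair of matching quotes.
        if (name.length() >= 2
                && (name.charAt(0) == '\'' || name.charAt(0) == '"')
                && name.charAt(name.length() - 1) == name.charAt(0)) {
            name = name.substring(1, name.length() - 1);
        }
        Map<String, String> metaData = new LinkedHashMap<>();
        metaData.put("relationName", name); // hypothetical key, chosen for this sketch
        return metaData;
    }

    public static void main(String[] args) {
        System.out.println(parseRelationLine("@relation 'weather data'"));
    }
}
```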
      • parseAttribute

        protected org.api4.java.ai.ml.core.dataset.schema.attribute.IAttribute parseAttribute​(java.lang.String line)
                                                                                       throws org.api4.java.ai.ml.core.dataset.serialization.UnsupportedAttributeTypeException
        Parses an attribute definition of an ARFF file. General format: @attribute <attribute_name> <attribute_type>
        Parameters:
        line - to be analyzed
        Returns:
        Object of class IAttribute
        Throws:
        org.api4.java.ai.ml.core.dataset.serialization.UnsupportedAttributeTypeException - if the type of the parsed attribute is not supported
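
        The general format above can be sketched as follows. This is a simplified stand-alone parser, not the code behind parseAttribute: AttributeLineSketch and ParsedAttribute are hypothetical names, ParsedAttribute is a stand-in for IAttribute, and quoted attribute names containing spaces are deliberately not handled. The sketch treats numeric, real and integer as numeric, reads a {a, b, c} specification as a nominal attribute, and rejects everything else (for example date) with an UnsupportedOperationException, loosely mirroring the documented UnsupportedAttributeTypeException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

final class AttributeLineSketch {

    /** Hypothetical stand-in for a parsed attribute; not the library's IAttribute. */
    static final class ParsedAttribute {
        final String name;
        final String type;         // "numeric", "string" or "nominal"
        final List<String> labels; // only filled for nominal attributes

        ParsedAttribute(String name, String type, List<String> labels) {
            this.name = name;
            this.type = type;
            this.labels = labels;
        }
    }

    static ParsedAttribute parseAttributeLine(String line) {
        final String keyword = "@attribute";
        String trimmed = line.trim();
        if (!trimmed.regionMatches(true, 0, keyword, 0, keyword.length())) {
            throw new IllegalArgumentException("Not an attribute declaration: " + line);
        }
        // Split the remainder into name and type at the first run of whitespace.
        String[] parts = trimmed.substring(keyword.length()).trim().split("\\s+", 2);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Missing attribute type: " + line);
        }
        String name = parts[0];
        String typeSpec = parts[1].trim();

        // Nominal attributes list their admissible labels in braces.
        if (typeSpec.startsWith("{") && typeSpec.endsWith("}")) {
            List<String> labels = new ArrayList<>();
            for (String label : typeSpec.substring(1, typeSpec.length() - 1).split(",")) {
                labels.add(label.trim());
            }
            return new ParsedAttribute(name, "nominal", labels);
        }

        String lower = typeSpec.toLowerCase(Locale.ROOT);
        if (lower.equals("numeric") || lower.equals("real") || lower.equals("integer")) {
            return new ParsedAttribute(name, "numeric", Collections.<String>emptyList());
        }
        if (lower.equals("string")) {
            return new ParsedAttribute(name, "string", Collections.<String>emptyList());
        }
        throw new UnsupportedOperationException("Unsupported attribute type: " + typeSpec);
    }

    public static void main(String[] args) {
        ParsedAttribute outlook = parseAttributeLine("@attribute outlook {sunny, overcast, rainy}");
        System.out.println(outlook.name + " : " + outlook.type + " " + outlook.labels);
    }
}
```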
      • splitDenseInstanceLine

        public java.lang.String[] splitDenseInstanceLine​(java.lang.String line)
      • parseInstance

        protected java.util.List<java.lang.Object> parseInstance​(boolean sparseData,
                                                                 java.util.List<org.api4.java.ai.ml.core.dataset.schema.attribute.IAttribute> attributes,
                                                                 int targetIndex,
                                                                 java.lang.String line)
        Parses a single instance of an ARFF file containing values for each attribute given (attributes parameter). Syntax: <value_1>, <value_2>, ...
        Parameters:
        sparseData - if true, the data contains ? entries; if false, it does not
        attributes - List of given IAttribute Objects.
        targetIndex -
        line - line of data
        Returns:
        list of IAttribute values
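
        As a rough illustration of the two data-line layouts described in the ARFF format documentation, the sketch below parses a dense line (one comma-separated value per attribute, with ? marking a missing value) and a sparse line ({index value, ...} with 0-based indices, where omitted entries stand for 0). It follows the format description rather than this method's actual behavior: InstanceLineSketch, parseDense and parseSparse are hypothetical names, values are kept as strings instead of being converted through IAttribute, missing values are represented as null, and quoted values containing commas are not handled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class InstanceLineSketch {

    /** Dense line: one comma-separated value per attribute; "?" marks a missing value. */
    static List<String> parseDense(String line, int numAttributes) {
        String[] tokens = line.trim().split(",", -1);
        if (tokens.length != numAttributes) {
            throw new IllegalArgumentException(
                    "Expected " + numAttributes + " values but found " + tokens.length);
        }
        List<String> values = new ArrayList<>();
        for (String token : tokens) {
            String value = token.trim();
            values.add(value.equals("?") ? null : value);
        }
        return values;
    }

    /** Sparse line: "{index value, ...}" with 0-based indices; omitted entries are 0. */
    static List<String> parseSparse(String line, int numAttributes) {
        String trimmed = line.trim();
        if (!trimmed.startsWith("{") || !trimmed.endsWith("}")) {
            throw new IllegalArgumentException("Not a sparse instance line: " + line);
        }
        // Start from an all-zero row and overwrite only the listed positions.
        List<String> values = new ArrayList<>(Collections.nCopies(numAttributes, "0"));
        String body = trimmed.substring(1, trimmed.length() - 1).trim();
        if (body.isEmpty()) {
            return values;
        }
        for (String entry : body.split(",")) {
            String[] indexAndValue = entry.trim().split("\\s+", 2);
            if (indexAndValue.length != 2) {
                throw new IllegalArgumentException("Malformed sparse entry: " + entry);
            }
            int index = Integer.parseInt(indexAndValue[0]);
            String value = indexAndValue[1].trim();
            values.set(index, value.equals("?") ? null : value);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseDense("sunny, 85, ?, no", 4));
        System.out.println(parseSparse("{1 X, 3 Y}", 5));
    }
}
```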
      • createDataset

        protected org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> createDataset​(ai.libs.jaicore.basic.kvstore.KVStore relationMetaData,
                                                                                                                                                          java.util.List<org.api4.java.ai.ml.core.dataset.schema.attribute.IAttribute> attributes)
      • readDataset

        public org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> readDataset​(java.io.File datasetFile)
                                                                                                                                              throws org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
        Throws:
        org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
      • readDataset

        public org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> readDataset​(boolean sparseMode,
                                                                                                                                                     java.io.File datasetFile)
                                                                                                                                              throws org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
        Throws:
        org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
      • readDataset

        public org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> readDataset​(boolean sparseMode,
                                                                                                                                                     java.io.File datasetFile,
                                                                                                                                                     int columnWithClassIndex)
                                                                                                                                              throws org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
        Parses the ARFF dataset from the given file into an ILabeledDataset.
        Parameters:
        sparseMode -
        datasetFile - file to be parsed
        columnWithClassIndex -
        Throws:
        org.api4.java.ai.ml.core.dataset.serialization.DatasetDeserializationFailedException
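
        The overall shape of such a read can be sketched as a single pass over the file: header lines declare the relation and attributes, "@data" switches to instance lines, and the value in the class column is taken as the label. The sketch below is an assumption-laden illustration, not the library's parser: ArffScanSketch, Scan and scan are hypothetical names, the input is already-loaded text rather than a java.io.File so the example stays self-contained, only dense data lines are handled, and classColumn is used as a plain 0-based index (how the real method interprets columnWithClassIndex is not documented on this page).

```java
import java.util.ArrayList;
import java.util.List;

final class ArffScanSketch {

    /** Hypothetical result holder: attribute names from the header, labels from the data. */
    static final class Scan {
        final List<String> attributeNames = new ArrayList<>();
        final List<String> labels = new ArrayList<>();
    }

    static Scan scan(String arffText, int classColumn) {
        Scan result = new Scan();
        boolean inData = false;
        for (String rawLine : arffText.split("\\r?\\n")) {
            String line = rawLine.trim();
            // Blank lines and "%" comment lines carry no data.
            if (line.isEmpty() || line.startsWith("%")) {
                continue;
            }
            if (!inData) {
                if (startsWithIgnoreCase(line, "@attribute")) {
                    // Second whitespace-separated token is the attribute name.
                    String[] tokens = line.split("\\s+", 3);
                    if (tokens.length < 2) {
                        throw new IllegalArgumentException("Malformed attribute line: " + line);
                    }
                    result.attributeNames.add(tokens[1]);
                } else if (startsWithIgnoreCase(line, "@data")) {
                    inData = true;
                }
                continue;
            }
            // Dense instance line: pick the label from the class column.
            String[] values = line.split(",", -1);
            result.labels.add(values[classColumn].trim());
        }
        return result;
    }

    private static boolean startsWithIgnoreCase(String line, String prefix) {
        return line.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    public static void main(String[] args) {
        String arff = "@relation weather\n"
                + "@attribute outlook {sunny, rainy}\n"
                + "@attribute play {yes, no}\n"
                + "@data\n"
                + "sunny, no\n"
                + "rainy, yes\n";
        Scan scan = scan(arff, 1);
        System.out.println(scan.attributeNames + " -> " + scan.labels);
    }
}
```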
      • serializeDataset

        public void serializeDataset​(java.io.File arffOutputFile,
                                     org.api4.java.ai.ml.core.dataset.supervised.ILabeledDataset<? extends org.api4.java.ai.ml.core.dataset.supervised.ILabeledInstance> data)
                              throws java.io.IOException
        Throws:
        java.io.IOException
      • getLoggerName

        public java.lang.String getLoggerName()
        Specified by:
        getLoggerName in interface org.api4.java.common.control.ILoggingCustomizable
      • setLoggerName

        public void setLoggerName​(java.lang.String name)
        Specified by:
        setLoggerName in interface org.api4.java.common.control.ILoggingCustomizable