Class TimeSeriesBagOfFeaturesLearningAlgorithm

  • All Implemented Interfaces:
    java.lang.Iterable<org.api4.java.algorithm.events.IAlgorithmEvent>, java.util.concurrent.Callable<TimeSeriesBagOfFeaturesClassifier>, java.util.Iterator<org.api4.java.algorithm.events.IAlgorithmEvent>, org.api4.java.algorithm.IAlgorithm<ai.libs.jaicore.ml.classification.singlelabel.timeseries.dataset.TimeSeriesDataset2,​TimeSeriesBagOfFeaturesClassifier>, org.api4.java.common.control.ICancelable, org.api4.java.common.control.ILoggingCustomizable, org.api4.java.common.event.IEventEmitter<java.lang.Object>, org.api4.java.common.event.IRelaxedEventEmitter

    public class TimeSeriesBagOfFeaturesLearningAlgorithm
    extends ai.libs.jaicore.ml.classification.singlelabel.timeseries.learner.ASimplifiedTSCLearningAlgorithm<java.lang.Integer,​TimeSeriesBagOfFeaturesClassifier>
    Algorithm to train a Time Series Bag-of-Features (TSBF) classifier as described in Baydogan, Mustafa & Runger, George & Tuv, Eugene. (2013). A Bag-of-Features Framework to Classify Time Series. IEEE Transactions on Pattern Analysis and Machine Intelligence. 35. 2796-802. 10.1109/TPAMI.2013.72.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static boolean USE_BIAS_CORRECTION
      Indicator whether Bessel's correction should be used in feature generation.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      TimeSeriesBagOfFeaturesClassifier call()
      Training procedure constructing a Time Series Bag-of-Features (TSBF) classifier using the given input data.
      static int[][] discretizeProbs​(int numBins, double[][] probs)
      Function discretizing probabilities into bins.
      static ai.libs.jaicore.basic.sets.Pair<int[][][],​int[][]> formHistogramsAndRelativeFreqs​(int[][] discretizedProbs, int numInstances, int numClasses, int numBins)
      Function calculating the histograms as described in the paper's section 2.2 ("Codebook and Learning").
      static double[][][][] generateFeatures​(double[][] data, int[][] subsequences, int[][][] intervals)
      Function generating the features for the internal probability measurement model based on the given subseries and their corresponding intervals.
      static double[][] generateHistogramInstances​(int[][][] histograms, int[][] relativeFreqsOfClasses)
      Generates a matrix consisting of the histogram values for each instance out of the given histograms and the relative frequencies of classes for each instance.
      ai.libs.jaicore.basic.sets.Pair<int[][],​int[][][]> generateSubsequencesAndIntervals​(int r, int d, int lMin, int T)
      Method randomly determining the subsequences and their intervals to be used for feature generation of the instances.
      TimeSeriesBagOfFeaturesLearningAlgorithm.ITimeSeriesBagOfFeaturesConfig getConfig()
      static double[][] measureOOBProbabilitiesUsingCV​(double[][] subSeqValueMatrix, int[] targetMatrix, int numProbInstances, int numFolds, int numClasses, weka.classifiers.trees.RandomForest rf)
      Function measuring the out-of-bag (OOB) probabilities using cross-validation with numFolds folds.
      • Methods inherited from class ai.libs.jaicore.ml.classification.singlelabel.timeseries.learner.ASimplifiedTSCLearningAlgorithm

        cancel, getClassifier, hasNext, iterator, next, nextWithException, registerListener
      • Methods inherited from class ai.libs.jaicore.basic.algorithm.AAlgorithm

        activate, announceTimeoutDetected, avoidReinterruptionOnShutdownOnCurrentThread, checkAndConductTermination, checkTermination, computeTimeoutAware, getActivationTime, getDeadline, getId, getInput, getListeners, getLoggerName, getNumCPUs, getRemainingTimeToDeadline, getState, getTimeout, getTimeoutPrecautionOffset, hasThreadBeenInterruptedDuringShutdown, interruptThreadAsPartOfShutdown, isCanceled, isShutdownInitialized, isStopCriterionSatisfied, isTimeoutDefined, isTimeouted, post, registerActiveThread, resolveShutdownInterruptOnCurrentThread, setConfig, setDeadline, setLoggerName, setMaxNumThreads, setNumCPUs, setState, setTimeout, setTimeout, setTimeoutPrecautionOffset, shutdown, terminate, unregisterActiveThread, unregisterThreadAndShutdown
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
      • Methods inherited from interface java.lang.Iterable

        forEach, spliterator
      • Methods inherited from interface java.util.Iterator

        forEachRemaining, remove
    • Field Detail

      • USE_BIAS_CORRECTION

        public static final boolean USE_BIAS_CORRECTION
        Indicator whether Bessel's correction should be used in feature generation.
        See Also:
        Constant Field Values
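        Bessel's correction divides the sum of squared deviations by n - 1 instead of n when estimating a variance from a sample. The following is a minimal sketch of what such a flag typically toggles; the class VarianceSketch and its method are hypothetical illustrations and not part of this API.

```java
// Hypothetical helper, not part of the library.
class VarianceSketch {

    /** Variance of data[start, end) (end exclusive), optionally with Bessel's correction. */
    static double variance(double[] data, int start, int end, boolean useBiasCorrection) {
        int n = end - start;
        // Assumption: degenerate intervals yield 0 instead of NaN.
        if (n <= 0 || (useBiasCorrection && n < 2)) {
            return 0.0;
        }
        double mean = 0.0;
        for (int i = start; i < end; i++) {
            mean += data[i];
        }
        mean /= n;

        double sumOfSquares = 0.0;
        for (int i = start; i < end; i++) {
            double diff = data[i] - mean;
            sumOfSquares += diff * diff;
        }
        // Bessel's correction: divide by n - 1 instead of n.
        return sumOfSquares / (useBiasCorrection ? n - 1 : n);
    }
}
```

        For the values 1, 2, 3, 4 the sum of squared deviations is 5, so the sketch yields 5/3 with the correction and 1.25 without it.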
    • Method Detail

      • call

        public TimeSeriesBagOfFeaturesClassifier call()
                                               throws org.api4.java.algorithm.exceptions.AlgorithmException
        Training procedure constructing a Time Series Bag-of-Features (TSBF) classifier using the given input data.
        Throws:
        org.api4.java.algorithm.exceptions.AlgorithmException
      • generateSubsequencesAndIntervals

        public ai.libs.jaicore.basic.sets.Pair<int[][],​int[][][]> generateSubsequencesAndIntervals​(int r,
                                                                                                         int d,
                                                                                                         int lMin,
                                                                                                         int T)
        Method randomly determining the subsequences and their intervals to be used for feature generation of the instances. As a result, a pair of each subsequence's start and end index and the intervals' start and end indices is returned.
        Parameters:
        r - The number of possible intervals in a time series
        d - The number of intervals for each subsequence
        lMin - The minimum subsequence length
        T - The length of the time series
        Returns:
        a pair of each subsequence's start and end index and the intervals' start and end indices
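        The shape of the returned index arrays can be illustrated with a small sketch. It is not the library's sampling scheme: the number of subsequences (count), the uniform choice of length and start, and the equal-width split into d intervals are all assumptions made for illustration, and SubsequenceSketch is a hypothetical class. It also assumes 1 <= d <= lMin <= T so that no interval is empty.

```java
import java.util.Random;

// Hypothetical helper, not part of the library.
class SubsequenceSketch {

    /** Draws count subsequences as {start, endExclusive} with length in [lMin, T]. */
    static int[][] randomSubsequences(int count, int lMin, int T, Random rnd) {
        int[][] subsequences = new int[count][2];
        for (int s = 0; s < count; s++) {
            int length = lMin + rnd.nextInt(T - lMin + 1); // lMin..T
            int start = rnd.nextInt(T - length + 1);       // 0..T-length
            subsequences[s][0] = start;
            subsequences[s][1] = start + length;
        }
        return subsequences;
    }

    /** Splits every subsequence into d contiguous intervals {start, endExclusive}. */
    static int[][][] splitIntoIntervals(int[][] subsequences, int d) {
        int[][][] intervals = new int[subsequences.length][d][2];
        for (int s = 0; s < subsequences.length; s++) {
            int start = subsequences[s][0];
            int length = subsequences[s][1] - start;
            for (int i = 0; i < d; i++) {
                intervals[s][i][0] = start + (i * length) / d;
                intervals[s][i][1] = start + ((i + 1) * length) / d;
            }
        }
        return intervals;
    }
}
```

        The two arrays correspond to the two components of the returned pair: one {start, end} entry per subsequence, and d {start, end} entries per subsequence for the intervals.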
      • generateFeatures

        public static double[][][][] generateFeatures​(double[][] data,
                                                      int[][] subsequences,
                                                      int[][][] intervals)
        Function generating the features for the internal probability measurement model based on the given subseries and their corresponding intervals. The features are built using the TimeSeriesFeature implementation. As a result, a tensor consisting of the generated features for each interval in each subsequence for each instance is returned (4 dimensions).
        Parameters:
        data - The data used for feature generation
        subsequences - The subsequences used for feature generation (the start and end [exclusive] index is stored for each subsequence)
        intervals - The intervals of each subsequence used for the feature generation (the start and end [exclusive] index is stored for each interval)
        Returns:
        Returns a tensor consisting of the generated features for each interval in each subsequence for each instance
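        To make the four dimensions concrete, the sketch below builds a tensor indexed as [instance][subsequence][interval][feature]. It is only an illustration: it computes two example features (mean and least-squares slope) per interval, whereas the actual feature set is defined by TimeSeriesFeature, and the class IntervalFeatureSketch is hypothetical. Because the intervals already carry absolute indices in this sketch, the subsequences argument is omitted.

```java
// Hypothetical helper, not part of the library.
class IntervalFeatureSketch {

    /** Mean of series[start, end) (end exclusive); 0 for an empty interval (assumption). */
    static double mean(double[] series, int start, int end) {
        int n = end - start;
        if (n <= 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = start; i < end; i++) {
            sum += series[i];
        }
        return sum / n;
    }

    /** Least-squares slope of series[start, end) against x = 0..n-1. */
    static double slope(double[] series, int start, int end) {
        int n = end - start;
        if (n < 2) {
            return 0.0; // assumption: a single point has slope 0
        }
        double xMean = (n - 1) / 2.0;
        double yMean = mean(series, start, end);
        double numerator = 0.0;
        double denominator = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = i - xMean;
            numerator += dx * (series[start + i] - yMean);
            denominator += dx * dx;
        }
        return numerator / denominator;
    }

    /** Result is indexed as [instance][subsequence][interval][feature]. */
    static double[][][][] features(double[][] data, int[][][] intervals) {
        double[][][][] result = new double[data.length][intervals.length][][];
        for (int inst = 0; inst < data.length; inst++) {
            for (int s = 0; s < intervals.length; s++) {
                result[inst][s] = new double[intervals[s].length][2];
                for (int i = 0; i < intervals[s].length; i++) {
                    int start = intervals[s][i][0];
                    int end = intervals[s][i][1];
                    result[inst][s][i][0] = mean(data[inst], start, end);
                    result[inst][s][i][1] = slope(data[inst], start, end);
                }
            }
        }
        return result;
    }
}
```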
      • generateHistogramInstances

        public static double[][] generateHistogramInstances​(int[][][] histograms,
                                                            int[][] relativeFreqsOfClasses)
        Generates a matrix consisting of the histogram values for each instance out of the given histograms and the relative frequencies of classes for each instance. The histogram values for each instance, class and bin are concatenated. Furthermore, the relative frequencies are also added to the instance's features.
        Parameters:
        histograms - The histograms for each instance (number of instances x (number of classes - 1) x number of bins)
        relativeFreqsOfClasses - The relative frequencies of the classes for each instance (previously extracted from each subseries instance per origin instance; dimensionality is number of instances x number of classes)
        Returns:
        Returns a matrix storing the features for each instance (number of instances x number of features)
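        The concatenation described above can be sketched as follows. The ordering (all histogram values first, then the class frequencies) is an assumption, and HistogramInstanceSketch is a hypothetical class rather than the library's code. Under that assumption each instance ends up with (number of classes - 1) * number of bins + number of classes features.

```java
// Hypothetical helper, not part of the library.
class HistogramInstanceSketch {

    /** Flattens each instance's histograms and appends its class frequencies. */
    static double[][] flatten(int[][][] histograms, int[][] relativeFreqsOfClasses) {
        double[][] result = new double[histograms.length][];
        for (int i = 0; i < histograms.length; i++) {
            int numHistogramValues = 0;
            for (int[] classHistogram : histograms[i]) {
                numHistogramValues += classHistogram.length;
            }
            double[] row = new double[numHistogramValues + relativeFreqsOfClasses[i].length];
            int k = 0;
            // Histogram values, class by class and bin by bin.
            for (int[] classHistogram : histograms[i]) {
                for (int value : classHistogram) {
                    row[k++] = value;
                }
            }
            // Class frequencies are appended at the end (assumed ordering).
            for (int freq : relativeFreqsOfClasses[i]) {
                row[k++] = freq;
            }
            result[i] = row;
        }
        return result;
    }
}
```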
      • measureOOBProbabilitiesUsingCV

        public static double[][] measureOOBProbabilitiesUsingCV​(double[][] subSeqValueMatrix,
                                                                int[] targetMatrix,
                                                                int numProbInstances,
                                                                int numFolds,
                                                                int numClasses,
                                                                weka.classifiers.trees.RandomForest rf)
                                                         throws org.api4.java.ai.ml.core.exception.TrainingException
        Function measuring the out-of-bag (OOB) probabilities using cross-validation with numFolds folds. For each fold, the data given by subSeqValueMatrix is split into a training and a test set. The test set's probabilities are then derived from a Random Forest classifier trained on the training set.
        Parameters:
        subSeqValueMatrix - Input data used to derive the OOB probabilities
        targetMatrix - The target values of the input data
        numProbInstances - Number of instances for which the probabilities should be derived
        numFolds - Number of folds used for the measurement
        numClasses - Number of total classes
        rf - Random Forest classifier which is retrained in each fold
        Returns:
        Returns a matrix storing the probability for each input instance given by subSeqValueMatrix
        Throws:
        org.api4.java.ai.ml.core.exception.TrainingException - Thrown when the classifier rf could not be trained in any fold
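        The fold-wise procedure can be sketched without Weka as shown below. Everything in it is an assumption for illustration: the ProbabilisticModel interface stands in for the Random Forest, PriorModel is a toy model that only predicts class frequencies, and rows are assigned to folds by index modulo numFolds, which may differ from how the library splits the data. The sketch assumes numFolds >= 2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the library.
class CrossValidationSketch {

    /** Hypothetical stand-in for a probabilistic classifier such as a random forest. */
    interface ProbabilisticModel {
        void train(double[][] x, int[] y, int numClasses);
        double[] predictProbabilities(double[] x);
    }

    /** Toy model that predicts the class frequencies seen during training. */
    static class PriorModel implements ProbabilisticModel {
        private double[] priors;

        public void train(double[][] x, int[] y, int numClasses) {
            priors = new double[numClasses];
            for (int label : y) {
                priors[label] += 1.0 / y.length;
            }
        }

        public double[] predictProbabilities(double[] x) {
            return priors.clone();
        }
    }

    /** Every row receives probabilities from a model that did not see it in training. */
    static double[][] heldOutProbabilities(double[][] x, int[] y, int numFolds,
                                           int numClasses, ProbabilisticModel model) {
        double[][] probs = new double[x.length][];
        for (int fold = 0; fold < numFolds; fold++) {
            List<double[]> trainX = new ArrayList<>();
            List<Integer> trainY = new ArrayList<>();
            for (int i = 0; i < x.length; i++) {
                if (i % numFolds != fold) {
                    trainX.add(x[i]);
                    trainY.add(y[i]);
                }
            }
            int[] labels = trainY.stream().mapToInt(Integer::intValue).toArray();
            // The model is retrained from scratch in each fold.
            model.train(trainX.toArray(new double[0][]), labels, numClasses);
            for (int i = fold; i < x.length; i += numFolds) {
                probs[i] = model.predictProbabilities(x[i]);
            }
        }
        return probs;
    }
}
```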
      • formHistogramsAndRelativeFreqs

        public static ai.libs.jaicore.basic.sets.Pair<int[][][],​int[][]> formHistogramsAndRelativeFreqs​(int[][] discretizedProbs,
                                                                                                              int numInstances,
                                                                                                              int numClasses,
                                                                                                              int numBins)
        Function calculating the histograms as described in the paper's section 2.2 ("Codebook and Learning"). All probability rows belonging to one instance are aggregated by evaluating the discretized probabilities discretizedProbs. Furthermore, the relative frequencies of the classes are collected. As a result, a pair of the generated histograms for all instances and the corresponding normalized relative class frequencies is returned.
        Parameters:
        discretizedProbs - The discretized (binned) probabilities of all instances' subseries rows (the number of rows must be divisible by the total number of instances)
        targets - The targets corresponding to the discretized probabilities
        numInstances - The total number of instances (must be <= the number of rows in discretizedProbs)
        numClasses - The total number of classes
        numBins - The number of bins used within the discretization
        Returns:
        Returns a pair of the histograms per instance (numInstances in total) and the corresponding relative frequencies (normalized)
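        One way to picture the aggregation is the sketch below. It rests on several assumptions that the documentation does not confirm: the rows of one instance are stored contiguously, histograms are kept for the first numClasses - 1 classes only, and the class frequencies count, per row, the class with the highest bin (ties going to the lowest class index) without any further normalization. HistogramSketch is a hypothetical class.

```java
// Hypothetical helper, not part of the library.
class HistogramSketch {

    /** Per instance and class, counts how often each bin occurs among the instance's rows. */
    static int[][][] histograms(int[][] discretizedProbs, int numInstances,
                                int numClasses, int numBins) {
        int rowsPerInstance = discretizedProbs.length / numInstances;
        int[][][] result = new int[numInstances][numClasses - 1][numBins];
        for (int row = 0; row < discretizedProbs.length; row++) {
            int instance = row / rowsPerInstance; // assumption: contiguous rows
            for (int c = 0; c < numClasses - 1; c++) {
                result[instance][c][discretizedProbs[row][c]]++;
            }
        }
        return result;
    }

    /** Per instance, counts how often each class has the highest bin in a row. */
    static int[][] classCounts(int[][] discretizedProbs, int numInstances, int numClasses) {
        int rowsPerInstance = discretizedProbs.length / numInstances;
        int[][] counts = new int[numInstances][numClasses];
        for (int row = 0; row < discretizedProbs.length; row++) {
            int best = 0;
            for (int c = 1; c < numClasses; c++) {
                if (discretizedProbs[row][c] > discretizedProbs[row][best]) {
                    best = c;
                }
            }
            counts[row / rowsPerInstance][best]++;
        }
        return counts;
    }
}
```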
      • discretizeProbs

        public static int[][] discretizeProbs​(int numBins,
                                              double[][] probs)
        Function discretizing probabilities into bins. The bins are determined by steps of 1 / numBins. The result is a matrix with the same dimensionality as probs storing the identifier of the corresponding bins.
        Parameters:
        numBins - Number of bins, determines the probability steps for each bin
        probs - Matrix storing the probabilities of each row for each class (columns)
        Returns:
        Returns a matrix sharing the dimensionality of probs with the discrete bin identifier
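        Binning with steps of 1 / numBins can be sketched as follows. The handling of bin boundaries is an assumption: here a probability p falls into bin floor(p * numBins), and p = 1.0 is clamped into the last bin. DiscretizeSketch is a hypothetical class, not the library's implementation.

```java
// Hypothetical helper, not part of the library.
class DiscretizeSketch {

    /** Maps every probability to a bin index in 0..numBins-1. */
    static int[][] discretize(int numBins, double[][] probs) {
        int[][] bins = new int[probs.length][];
        for (int i = 0; i < probs.length; i++) {
            bins[i] = new int[probs[i].length];
            for (int j = 0; j < probs[i].length; j++) {
                int bin = (int) (probs[i][j] * numBins);
                // Clamp so that a probability of exactly 1.0 lands in the last bin.
                bins[i][j] = Math.min(bin, numBins - 1);
            }
        }
        return bins;
    }
}
```

        With numBins = 4, the probabilities 0.0, 0.3, 0.5 and 1.0 map to the bins 0, 1, 2 and 3 under these assumptions.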