Package ai.libs.jaicore.ml.tsc.util
Class TimeSeriesUtil
- java.lang.Object
-
- ai.libs.jaicore.ml.tsc.util.TimeSeriesUtil
-
public class TimeSeriesUtil extends java.lang.Object
Utility class for time series operations.
-
-
Field Summary
Fields
Modifier and Type / Field / Description
static double EPSILON
-
Method Summary
Modifier and Type / Method / Description
static double[] backwardDifferenceDerivate(double[] t)
    Calculates f'(n) = f(n-1) - f(n).
static double[] backwardDifferenceDerivateWithBoundaries(double[] t)
    Calculates f'(n) = f(n-1) - f(n).
static TimeSeriesDataset createDatasetForMatrix(double[][]... valueMatrices)
    Function creating a TimeSeriesDataset object given one or multiple valueMatrices.
static TimeSeriesDataset createDatasetForMatrix(int[] targets, double[][]... valueMatrices)
static double[] createEquidistantTimestamps(double[] timeSeries)
    Creates equidistant timestamps for a time series.
static org.nd4j.linalg.api.ndarray.INDArray createEquidistantTimestamps(org.nd4j.linalg.api.ndarray.INDArray timeSeries)
    Creates equidistant timestamps for a time series.
static double[] forwardDifferenceDerivate(double[] t)
    Calculates f'(n) = f(n+1) - f(n).
static double[] forwardDifferenceDerivateWithBoundaries(double[] t)
    Calculates f'(n) = f(n+1) - f(n).
static java.util.List<java.lang.Integer> getClassesInDataset(TimeSeriesDataset dataset)
    Returns a list storing the unique Integer class values in the given dataset.
static double[] getInterval(double[] timeSeries, int start, int end)
    Function extracting the interval [start, end) (end exclusive) out of the given timeSeries vector.
static <T> T getMaximumKeyByValue(java.util.Map<T,java.lang.Integer> map)
    Returns the key with the maximum integer value.
static int getMode(int[] array)
    Returns the mode of the given array.
static int getNumberOfClasses(TimeSeriesDataset dataset)
    Counts the number of unique classes occurring in the given dataset.
static ai.libs.jaicore.basic.sets.Pair<TimeSeriesDataset,TimeSeriesDataset> getTrainingAndTestDataForFold(int fold, int numFolds, double[][] srcValueMatrix, int[] srcTargetMatrix)
    Function creating two TimeSeriesDataset objects representing the training and test split for the given fold of a cross validation with numFolds many folds.
static double[] gulloDerivate(double[] t)
    Calculates the derivative of a time series as described first by Gullo et al. (2009).
static double[] gulloDerivateWithBoundaries(double[] t)
    Calculates f'(n) = \frac{f(n+1) - f(n-1)}{2}.
static boolean isSameLength(double[] timeSeries1, double[]... timeSeries)
    Checks whether multiple arrays have the same length.
static boolean isSameLength(org.nd4j.linalg.api.ndarray.INDArray timeSeries1, org.nd4j.linalg.api.ndarray.INDArray... timeSeries)
    Checks whether multiple arrays have the same length.
static void isSameLengthOrException(double[] timeSeries1, double[]... timeSeries)
    Checks whether multiple arrays have the same length.
static void isSameLengthOrException(org.nd4j.linalg.api.ndarray.INDArray timeSeries1, org.nd4j.linalg.api.ndarray.INDArray... timeSeries)
    Checks whether multiple arrays have the same length.
static boolean isTimeSeries(int length, double[]... array)
    Checks whether the given arrays are valid time series with a given length.
static boolean isTimeSeries(int length, org.nd4j.linalg.api.ndarray.INDArray... array)
    Checks whether the given INDArrays are valid time series with a given length.
static boolean isTimeSeries(org.nd4j.linalg.api.ndarray.INDArray... array)
    Checks whether the given INDArrays are valid time series.
static void isTimeSeriesOrException(int length, double[]... array)
    Checks whether the given arrays are valid time series with a given length.
static void isTimeSeriesOrException(int length, org.nd4j.linalg.api.ndarray.INDArray... array)
    Checks whether the given INDArrays are valid time series with a given length.
static void isTimeSeriesOrException(org.nd4j.linalg.api.ndarray.INDArray... array)
    Checks whether the given INDArrays are valid time series.
static double[] keoghDerivate(double[] t)
    Calculates the derivative of a time series as described first by Keogh and Pazzani (2001).
static double[] keoghDerivateWithBoundaries(double[] t)
    Calculates the derivative of a time series as described first by Keogh and Pazzani (2001).
static double mean(double[] t)
static double[] normalizeByStandardDeviation(double[] t)
static org.nd4j.linalg.api.ndarray.INDArray normalizeINDArray(org.nd4j.linalg.api.ndarray.INDArray array, boolean inplace)
    Normalizes an INDArray vector object.
static void shuffleTimeSeriesDataset(TimeSeriesDataset dataset, int seed)
    Shuffles the given TimeSeriesDataset object using the given seed.
static java.util.List<java.lang.Integer> sortIndexes(double[] vector, boolean ascending)
    Sorts the indices of the given vector based on the vector's values (argsort).
static double standardDeviation(double[] t)
    Calculates the (population) standard deviation of the values of a time series.
static double sum(double[] t)
static java.lang.String toString(double[] timeSeries)
    Enables printing of time series.
static double variance(double[] t)
    Calculates the (population) variance of the values of a time series.
static double[] zNormalize(double[] dataVector, boolean besselsCorrection)
    Z-normalizes a given dataVector.
static double[] zTransform(double[] t)
-
-
-
Field Detail
-
EPSILON
public static final double EPSILON
- See Also:
- Constant Field Values
-
-
Method Detail
-
isTimeSeries
public static boolean isTimeSeries(org.nd4j.linalg.api.ndarray.INDArray... array)
Checks whether the given INDArrays are valid time series.
- Parameters:
array - The INDArrays to check
- Returns:
- True if all arrays are valid time series. False otherwise.
-
isTimeSeries
public static boolean isTimeSeries(int length, org.nd4j.linalg.api.ndarray.INDArray... array)
Checks whether the given INDArrays are valid time series with a given length.
- Parameters:
length - The expected length of each time series
array - The INDArrays to check
- Returns:
- True if every array is a valid time series of the given length. False otherwise.
-
isTimeSeries
public static boolean isTimeSeries(int length, double[]... array)
Checks whether the given arrays are valid time series with a given length.
- Parameters:
length - The expected length of each time series
array - The arrays to check
- Returns:
- True if every array is a valid time series of the given length. False otherwise.
-
isTimeSeriesOrException
public static void isTimeSeriesOrException(org.nd4j.linalg.api.ndarray.INDArray... array)
Checks whether the given INDArrays are valid time series. Throws an exception otherwise.
- Parameters:
array - The INDArrays to check
- Throws:
java.lang.IllegalArgumentException
-
isTimeSeriesOrException
public static void isTimeSeriesOrException(int length, org.nd4j.linalg.api.ndarray.INDArray... array)
Checks whether the given INDArrays are valid time series with a given length. Throws an exception otherwise.
- Parameters:
length - The expected length of each time series
array - The INDArrays to check
- Throws:
java.lang.IllegalArgumentException
-
isTimeSeriesOrException
public static void isTimeSeriesOrException(int length, double[]... array)
Checks whether the given arrays are valid time series with a given length. Throws an exception otherwise.
- Parameters:
length - The expected length of each time series
array - The arrays to check
- Throws:
java.lang.IllegalArgumentException
-
isSameLength
public static boolean isSameLength(org.nd4j.linalg.api.ndarray.INDArray timeSeries1, org.nd4j.linalg.api.ndarray.INDArray... timeSeries)
Checks whether multiple arrays have the same length.
- Parameters:
timeSeries1 - The first time series, whose length serves as the reference
timeSeries - The further time series to compare against timeSeries1
- Returns:
- True if the arrays have the same length. False otherwise.
-
isSameLength
public static boolean isSameLength(double[] timeSeries1, double[]... timeSeries)
Checks whether multiple arrays have the same length.
- Parameters:
timeSeries1 - The first time series, whose length serves as the reference
timeSeries - The further time series to compare against timeSeries1
- Returns:
- True if the arrays have the same length. False otherwise.
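The length check described for isSameLength can be illustrated with a short sketch. This is a minimal stand-alone version written for illustration, not the library's implementation; the class name SameLengthSketch is hypothetical.

```java
final class SameLengthSketch {

    /** Returns true if every array in others has the same length as first. */
    static boolean isSameLength(double[] first, double[]... others) {
        for (double[] other : others) {
            if (other.length != first.length) {
                return false;
            }
        }
        // Also true when no further arrays are given (an assumption of this sketch).
        return true;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0};
        double[] b = {3.0, 4.0};
        double[] c = {5.0};
        System.out.println(isSameLength(a, b));
        System.out.println(isSameLength(a, b, c));
    }
}
```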
-
isSameLengthOrException
public static void isSameLengthOrException(org.nd4j.linalg.api.ndarray.INDArray timeSeries1, org.nd4j.linalg.api.ndarray.INDArray... timeSeries)
Checks whether multiple arrays have the same length. Throws an exception otherwise.
- Parameters:
timeSeries1 - The first time series, whose length serves as the reference
timeSeries - The further time series to compare against timeSeries1
- Throws:
TimeSeriesLengthException
-
isSameLengthOrException
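public static void isSameLengthOrException(double[] timeSeries1, double[]... timeSeries)
Checks whether multiple arrays have the same length. Throws an exception otherwise.
- Parameters:
timeSeries1 - The first time series, whose length serves as the reference
timeSeries - The further time series to compare against timeSeries1
- Throws: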
TimeSeriesLengthException
-
createEquidistantTimestamps
public static org.nd4j.linalg.api.ndarray.INDArray createEquidistantTimestamps(org.nd4j.linalg.api.ndarray.INDArray timeSeries)
Creates equidistant timestamps for a time series.
- Parameters:
timeSeries - Time series to generate timestamps for. Let n be its length.
- Returns:
- Equidistant timestamps, i.e. {0, 1, ..., n-1}.
-
createEquidistantTimestamps
public static double[] createEquidistantTimestamps(double[] timeSeries)
Creates equidistant timestamps for a time series.
- Parameters:
timeSeries - Time series to generate timestamps for. Let n be its length.
- Returns:
- Equidistant timestamps, i.e. {0, 1, ..., n-1}.
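As an illustration of the returned values {0, 1, ..., n-1}, the following sketch builds such a timestamp vector for a double[] input. It is an assumption-based re-creation of the documented behavior, not the library code; the class name TimestampSketch is hypothetical.

```java
final class TimestampSketch {

    /** Builds the timestamps 0, 1, ..., n-1 for a time series of length n. */
    static double[] createEquidistantTimestamps(double[] timeSeries) {
        double[] timestamps = new double[timeSeries.length];
        for (int i = 0; i < timestamps.length; i++) {
            timestamps[i] = i;
        }
        return timestamps;
    }

    public static void main(String[] args) {
        double[] series = {0.5, 0.7, 0.2};
        System.out.println(java.util.Arrays.toString(createEquidistantTimestamps(series)));
    }
}
```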
-
getInterval
public static double[] getInterval(double[] timeSeries, int start, int end)
Function extracting the interval [start, end) (end exclusive) out of the given timeSeries vector.
- Parameters:
timeSeries - Time series vector source
start - Start index of the interval (inclusive)
end - End index of the interval (exclusive)
- Returns:
- Returns the specified interval as a double array
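The half-open interval semantics can be sketched with java.util.Arrays.copyOfRange, which also treats the end index as exclusive. The bounds check below is an assumption of this sketch (the page does not state how invalid indices are handled), and the class name IntervalSketch is hypothetical.

```java
import java.util.Arrays;

final class IntervalSketch {

    /** Copies the values at indices start, ..., end-1 into a new array. */
    static double[] getInterval(double[] timeSeries, int start, int end) {
        // Assumed validation; the documented method may behave differently here.
        if (start < 0 || end > timeSeries.length || start > end) {
            throw new IllegalArgumentException("Invalid interval [" + start + ", " + end + ")");
        }
        return Arrays.copyOfRange(timeSeries, start, end);
    }

    public static void main(String[] args) {
        double[] series = {1.0, 2.0, 3.0, 4.0, 5.0};
        System.out.println(Arrays.toString(getInterval(series, 1, 3)));
    }
}
```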
-
normalizeINDArray
public static org.nd4j.linalg.api.ndarray.INDArray normalizeINDArray(org.nd4j.linalg.api.ndarray.INDArray array, boolean inplace)
Normalizes an INDArray vector object.
- Parameters:
array - INDArray row vector with single shape dimension
inplace - Indication whether the normalization should be performed in place or on a new array copy
- Returns:
- Returns the view on the transformed INDArray (if inplace) or a normalized copy of the input array (if not inplace)
-
getMode
public static int getMode(int[] array)
Returns the mode of the given array. If there are multiple values with the same frequency, the lower value will be taken.
- Parameters:
array - The array whose mode should be returned
- Returns:
- Returns the mode, i.e. the most frequently occurring int value
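The tie-breaking rule (the lower value wins among equally frequent values) can be made concrete with a small sketch. It is written from the description above and is not the library's implementation; the class name ModeSketch and the handling of an empty array are assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

final class ModeSketch {

    /** Returns the most frequent value; among ties, the smallest value. */
    static int getMode(int[] array) {
        if (array.length == 0) {
            // Assumed behavior; the page does not document the empty case.
            throw new IllegalArgumentException("array must not be empty");
        }
        // A TreeMap iterates its keys in ascending order.
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int value : array) {
            counts.merge(value, 1, Integer::sum);
        }
        int mode = array[0];
        int bestCount = 0;
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            // Strict comparison keeps the lower value when counts are equal.
            if (entry.getValue() > bestCount) {
                bestCount = entry.getValue();
                mode = entry.getKey();
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        System.out.println(getMode(new int[] {3, 1, 3, 1, 2}));
    }
}
```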
-
getMaximumKeyByValue
public static <T> T getMaximumKeyByValue(java.util.Map<T,java.lang.Integer> map)
Returns the key with the maximum integer value. If there are multiple keys with the same maximum value, the lower key with regard to its type will be taken.
- Parameters:
map - The map storing the keys with their corresponding integer values
- Returns:
- Returns the key of type T storing the maximum integer value
-
zNormalize
public static double[] zNormalize(double[] dataVector, boolean besselsCorrection)
Z-normalizes a given dataVector. Uses Bessel's correction (1/(n-1) in the calculation of the standard deviation) if set.
- Parameters:
dataVector - Vector to be z-normalized
besselsCorrection - Indicator whether the std dev correction using n-1 instead of n should be applied
- Returns:
- Z-normalized vector
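The effect of Bessel's correction on z-normalization can be sketched as follows: each value has the mean subtracted and is divided by the standard deviation, which uses the divisor n-1 instead of n when the flag is set. This is a sketch under stated assumptions, not the library's code; in particular, returning zeros for a zero or undefined standard deviation is a choice made here, and the class name ZNormalizeSketch is hypothetical.

```java
final class ZNormalizeSketch {

    /** Z-normalizes dataVector; divides by n-1 instead of n if besselsCorrection is set. */
    static double[] zNormalize(double[] dataVector, boolean besselsCorrection) {
        int n = dataVector.length;
        double mean = 0.0;
        for (double value : dataVector) {
            mean += value;
        }
        mean /= n;

        double squaredDeviations = 0.0;
        for (double value : dataVector) {
            squaredDeviations += (value - mean) * (value - mean);
        }
        double divisor = besselsCorrection ? n - 1 : n;
        double std = Math.sqrt(squaredDeviations / divisor);

        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            // Assumed fallback: constant or too-short vectors map to zeros.
            result[i] = (std == 0.0 || Double.isNaN(std)) ? 0.0 : (dataVector[i] - mean) / std;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0};
        System.out.println(java.util.Arrays.toString(zNormalize(data, true)));
        System.out.println(java.util.Arrays.toString(zNormalize(data, false)));
    }
}
```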
-
sortIndexes
public static java.util.List<java.lang.Integer> sortIndexes(double[] vector, boolean ascending)
Sorts the indices of the given vector based on the vector's values (argsort).
- Parameters:
vector - Vector from which the values are taken
ascending - Indicator whether the indices should be sorted in ascending order of the values
- Returns:
- Returns the list of indices sorted based on the vector's values
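An argsort returns the positions of the values in sorted order rather than the values themselves. The following sketch shows the idea with a comparator over indices; it is an illustration only, and the class name ArgsortSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ArgsortSketch {

    /** Returns the indices of vector ordered by the values they point to. */
    static List<Integer> sortIndexes(double[] vector, boolean ascending) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < vector.length; i++) {
            indices.add(i);
        }
        // Compare two indices by the vector values stored at them.
        Comparator<Integer> byValue = Comparator.comparingDouble(i -> vector[i]);
        indices.sort(ascending ? byValue : byValue.reversed());
        return indices;
    }

    public static void main(String[] args) {
        double[] vector = {3.0, 1.0, 2.0};
        System.out.println(sortIndexes(vector, true));
        System.out.println(sortIndexes(vector, false));
    }
}
```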
-
getNumberOfClasses
public static int getNumberOfClasses(TimeSeriesDataset dataset)
Counts the number of unique classes occurring in the given dataset.
- Parameters:
dataset - Dataset to be evaluated
- Returns:
- Returns the number of unique classes occurring in the target matrix of the given dataset
-
getClassesInDataset
public static java.util.List<java.lang.Integer> getClassesInDataset(TimeSeriesDataset dataset)
Returns a list storing the unique Integer class values in the given dataset.
- Parameters:
dataset - Dataset to be evaluated
- Returns:
- Returns a List object storing the unique Integer class values of the dataset
-
shuffleTimeSeriesDataset
public static void shuffleTimeSeriesDataset(TimeSeriesDataset dataset, int seed)
Shuffles the given TimeSeriesDataset object using the given seed.
- Parameters:
dataset - The dataset to be shuffled
seed - The seed used within the randomized shuffling
-
getTrainingAndTestDataForFold
public static ai.libs.jaicore.basic.sets.Pair<TimeSeriesDataset,TimeSeriesDataset> getTrainingAndTestDataForFold(int fold, int numFolds, double[][] srcValueMatrix, int[] srcTargetMatrix)
Function creating two TimeSeriesDataset objects representing the training and test split for the given fold of a cross validation with numFolds many folds. Data is extracted (and copied) from the given srcValueMatrix and srcTargetMatrix. The function uses the two functions TimeSeriesUtil#selectTrainingDataForFold(int, int, int, int, double[][], int[]) and TimeSeriesUtil#selectTestDataForFold(int, int, int, int, double[][], int[]).
- Parameters:
fold - The current fold for which the datasets should be prepared
numFolds - Total number of folds used within the performed cross validation
srcValueMatrix - Source dataset from which the instances are copied
srcTargetMatrix - Source targets from which the targets are copied
- Returns:
- Returns a pair consisting of the training and test dataset
-
createDatasetForMatrix
public static TimeSeriesDataset createDatasetForMatrix(int[] targets, double[][]... valueMatrices)
Function creating a TimeSeriesDataset object given the targets and one or multiple valueMatrices.
- Parameters:
targets - The target values of the instances
valueMatrices - One or more matrices storing the time series values
- Returns:
- Returns a TimeSeriesDataset object constructed out of the given parameters
-
createDatasetForMatrix
public static TimeSeriesDataset createDatasetForMatrix(double[][]... valueMatrices)
Function creating a TimeSeriesDataset object given one or multiple valueMatrices.
- Parameters:
valueMatrices - One or more matrices storing the time series values
- Returns:
- Returns a TimeSeriesDataset object constructed out of the given parameters
-
toString
public static java.lang.String toString(double[] timeSeries)
Enables printing of time series.
- Parameters:
timeSeries - Time series to print.
- Returns:
- Readable string of the time series, e.g. "{1.0, 2.0, 3.0, 4.0}"
-
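The output format documented for toString, "{1.0, 2.0, 3.0, 4.0}", can be reproduced with java.util.StringJoiner. This is a sketch that matches the documented example format only; how the library formats the individual doubles is an assumption, and the class name ToStringSketch is hypothetical.

```java
import java.util.StringJoiner;

final class ToStringSketch {

    /** Formats a time series as "{v0, v1, ..., vn-1}". */
    static String toString(double[] timeSeries) {
        StringJoiner joiner = new StringJoiner(", ", "{", "}");
        for (double value : timeSeries) {
            joiner.add(Double.toString(value));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toString(new double[] {1.0, 2.0, 3.0, 4.0}));
    }
}
```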
keoghDerivate
public static double[] keoghDerivate(double[] t)
Calculates the derivative of a time series as described first by Keogh and Pazzani (2001).
f'(n) = \frac{(f(n) - f(n-1)) + \frac{f(n+1) - f(n-1)}{2}}{2}
- Parameters:
t - Time series.
- Returns:
- The derivative of the time series.
-
keoghDerivateWithBoundaries
public static double[] keoghDerivateWithBoundaries(double[] t)
Calculates the derivative of a time series as described first by Keogh and Pazzani (2001).
f'(n) = \frac{(f(n) - f(n-1)) + \frac{f(n+1) - f(n-1)}{2}}{2}
- Parameters:
t - Time series.
- Returns:
- The derivative of the time series.
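The Keogh and Pazzani estimate averages the slope to the left neighbor with half the slope between the two neighbors. The sketch below evaluates it only at interior points n = 1, ..., length-2, so the result is two elements shorter than the input. How the library treats the boundary points (in either the plain or the WithBoundaries variant) is not stated on this page and is left out; the class name KeoghDerivativeSketch is hypothetical.

```java
final class KeoghDerivativeSketch {

    /** Evaluates ((f(n) - f(n-1)) + (f(n+1) - f(n-1)) / 2) / 2 at all interior points. */
    static double[] keoghDerivative(double[] t) {
        if (t.length < 3) {
            // Assumed precondition: the formula needs a left and a right neighbor.
            throw new IllegalArgumentException("need at least three points");
        }
        double[] derivative = new double[t.length - 2];
        for (int n = 1; n < t.length - 1; n++) {
            derivative[n - 1] = ((t[n] - t[n - 1]) + (t[n + 1] - t[n - 1]) / 2.0) / 2.0;
        }
        return derivative;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(keoghDerivative(new double[] {1.0, 2.0, 4.0, 7.0})));
    }
}
```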
-
backwardDifferenceDerivate
public static double[] backwardDifferenceDerivate(double[] t)
Calculates f'(n) = f(n-1) - f(n).
- Parameters:
t - Time series.
- Returns:
- The derivative of the time series.
-
backwardDifferenceDerivateWithBoundaries
public static double[] backwardDifferenceDerivateWithBoundaries(double[] t)
Calculates f'(n) = f(n-1) - f(n).
- Parameters:
t - Time series.
- Returns:
- The derivative of the time series.
-
forwardDifferenceDerivate
public static double[] forwardDifferenceDerivate(double[] t)
Calculates f'(n) = f(n+1) - f(n).
- Parameters:
t - Time series.
- Returns:
- The derivative of the time series.
-
forwardDifferenceDerivateWithBoundaries
public static double[] forwardDifferenceDerivateWithBoundaries(double[] t)
Calculates f'(n) = f(n+1) - f(n).
- Parameters:
t - Time series.
- Returns:
- The derivative of the time series.
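The forward difference f'(n) = f(n+1) - f(n) is defined for every point except the last one, so a direct evaluation yields one element fewer than the input. The sketch below shows this direct evaluation; how the WithBoundaries variant fills the missing boundary value is not described on this page and is not modeled. The class name ForwardDifferenceSketch is hypothetical.

```java
final class ForwardDifferenceSketch {

    /** Evaluates f(n+1) - f(n) for n = 0, ..., length-2. */
    static double[] forwardDifference(double[] t) {
        double[] derivative = new double[Math.max(0, t.length - 1)];
        for (int n = 0; n < derivative.length; n++) {
            derivative[n] = t[n + 1] - t[n];
        }
        return derivative;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(forwardDifference(new double[] {1.0, 4.0, 9.0})));
    }
}
```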
-
gulloDerivate
public static double[] gulloDerivate(double[] t)
Calculates the derivative of a time series as described first by Gullo et al. (2009).
- Parameters:
t - Time series.
- Returns:
- The derivative of the time series.
-
gulloDerivateWithBoundaries
public static double[] gulloDerivateWithBoundaries(double[] t)
Calculates f'(n) = \frac{f(n+1) - f(n-1)}{2}.
- Parameters:
t - Time series.
- Returns:
- The derivative of the time series.
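The formula given for gulloDerivateWithBoundaries is a central difference: half the difference between the right and the left neighbor. The sketch below evaluates it at the interior points only. The boundary treatment of the library method is not stated on this page and is therefore omitted; the class name CentralDifferenceSketch is hypothetical.

```java
final class CentralDifferenceSketch {

    /** Evaluates (f(n+1) - f(n-1)) / 2 for n = 1, ..., length-2. */
    static double[] centralDifference(double[] t) {
        if (t.length < 3) {
            // Assumed precondition: the formula needs a left and a right neighbor.
            throw new IllegalArgumentException("need at least three points");
        }
        double[] derivative = new double[t.length - 2];
        for (int n = 1; n < t.length - 1; n++) {
            derivative[n - 1] = (t[n + 1] - t[n - 1]) / 2.0;
        }
        return derivative;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(centralDifference(new double[] {1.0, 2.0, 4.0, 7.0})));
    }
}
```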
-
sum
public static double sum(double[] t)
-
mean
public static double mean(double[] t)
-
variance
public static double variance(double[] t)
Calculates the (population) variance of the values of a time series.
-
standardDeviation
public static double standardDeviation(double[] t)
Calculates the (population) standard deviation of the values of a time series.
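"Population" here means the squared deviations from the mean are divided by n rather than n-1. A minimal sketch of both statistics, written from this definition and not taken from the library, could look as follows; the class name PopulationStatsSketch is hypothetical.

```java
final class PopulationStatsSketch {

    /** Arithmetic mean of the values. */
    static double mean(double[] t) {
        double sum = 0.0;
        for (double value : t) {
            sum += value;
        }
        return sum / t.length;
    }

    /** Population variance: mean squared deviation, divisor n. */
    static double variance(double[] t) {
        double mean = mean(t);
        double squaredDeviations = 0.0;
        for (double value : t) {
            squaredDeviations += (value - mean) * (value - mean);
        }
        return squaredDeviations / t.length;
    }

    /** Population standard deviation: square root of the population variance. */
    static double standardDeviation(double[] t) {
        return Math.sqrt(variance(t));
    }

    public static void main(String[] args) {
        double[] series = {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0};
        System.out.println(variance(series));
        System.out.println(standardDeviation(series));
    }
}
```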
-
zTransform
public static double[] zTransform(double[] t)
-
normalizeByStandardDeviation
public static double[] normalizeByStandardDeviation(double[] t)
-
-