Class SpecTransform

java.lang.Object
org.jamdev.jpamutils.spectrogram.SpecTransform

public class SpecTransform
extends Object
Transforms spectrogram data.
Author:
Jamie Macaulay
  • Constructor Summary

    Constructors 
    Constructor Description
    SpecTransform​(Spectrogram spectrgram)
    Constructor for the spectrogram transform.
  • Method Summary

    Modifier and Type Method Description
    static double[][] blurImage​(double[][] img, double sigma)
    Smooth the input image using a median or Gaussian blur filter.
    static double[][] clamp​(double[][] array, double minVal, double maxVal)
    Clamp a spectrogram between two values.
    SpecTransform clamp​(double minVal, double maxVal)
    Clamp the current spectrogram between two values.
    static double[][] copyArr​(double[][] array)
    Hard copy of an array.
    SpecTransform dBSpec()
    Convert the current spectrogram data to dB using 10*log10(linear).
    SpecTransform dBSpec​(boolean power)
    Convert the current spectrogram data to dB using 10*log10(linear) for power or 20*log10(linear) for amplitude.
    SpecTransform dBSpec​(boolean power, double mindB)
    Convert the current spectrogram data to dB using 10*log10(linear) for power or 20*log10(linear) for amplitude.
    static double[][] dBSpec​(double[][] array, boolean power, double minddB)
    Convert a spectrogram to dB.
    SpecTransform dBSpec​(Double mindB)
    Convert the current spectrogram data to dB using 10*log10(linear).
    SpecTransform enhance​(double enhanceFactor)
    Enhance the contrast between regions of high and low intensity, while preserving the range of pixel values.
    static double[][] enhance​(double[][] img, double enhancement)
    Enhance the contrast between regions of high and low intensity, while preserving the range of pixel values.
    static double[][] filterIsolatedSpots​(double[][] img, int[][] struct)
    Discard pixels that are lower than the median threshold.
    SpecTransform gaussianFilter​(double sigma)
    Apply a Gaussian filter to the current spectrogram.
    static double[][] generateKernal​(double sigma)
    Generate the kernel for a sigma value.
    double[][] getImag()
    Get the imaginary data from the transformed spectrogram.
    double[][] getReal()
    Get the real data from the transformed spectrogram.
    Spectrogram getSpectrgram()
    Get the raw spectrogram.
    double[][] getTransformedData()
    Get the transformed spectrogram data.
    static double[][] interpolate​(double[][] array, double fMin, double fMax, int freqBins, float sR)
    Interpolate a spectrogram so that it has a specified number of frequency bins and sits between two frequency limits.
    SpecTransform interpolate​(double fMin, double fMax, int freqBins)
    Interpolate a spectrogram so that it has a specified number of frequency bins and sits between two frequency limits.
    static double[][] medianFilter​(double[][] img1, double rowfactor, double colfactor)
    Discard pixels that are lower than the median threshold.
    SpecTransform medianFilter​(double rowfactor, double colfactor)
    Discard pixels that are lower than the median threshold.
    static double[] nearestNeighbourInterp​(double[] inputArray, int w2)
    Perform a nearest neighbour interpolation of a 1D array of evenly spaced values.
    static double[][] normalise​(double[][] array, double min_leveldB, double ref_level_dB)
    Normalise a spectrogram.
    SpecTransform normalise​(double min_leveldB, double ref_level_dB)
    Normalise the current spectrogram between two reference values.
    SpecTransform normaliseMinMax()
    Normalise the current spectrogram between the minimum and maximum of the array
    static double[][] normaliseMinMax​(double[][] img)
    A minimum/maximum spectrogram normalisation.
    SpecTransform normaliseRowSum()
    Normalise the current spectrogram by dividing by sum of the square of the sum of all rows.
    static double[][] normaliseRowSum​(double[][] img)
    Normalise a spectrogram by summing each row and squaring it then dividing the entire array by that value.
    static double[][] normaliseStd​(double[][] img, double mean, double std)
    Normalize the data array to specified mean and standard deviation.
    SpecTransform normaliseStd​(double mean, double std)
    Normalize the data array to specified mean and standard deviation.
    static double[][] reduceTonalNoiseMean​(double[][] input, double timeLenConst)
    Reduce continuous tonal noise produced by e.g.
    SpecTransform reduceTonalNoiseMean​(int timeConstLen)
    Reduce continuous tonal noise produced by e.g.
    SpecTransform reduceTonalNoiseMedian()
    Subtracts from each row the median value of that row.
    static double[][] reduceTonalNoiseMedian​(double[][] img)
    Subtracts from each row the median value of that row.
    void setTransformedData​(double[][] absoluteSpectrogram)
    Manually set the transformed data.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • SpecTransform

      public SpecTransform​(Spectrogram spectrgram)
      Constructor for the spectrogram transform.
      Parameters:
      spectrgram - the spectrogram
  • Method Details

    • dBSpec

      public SpecTransform dBSpec()
      Convert the current spectrogram data to dB using 10*log10(linear).
      Returns:
      reference to the spectrogram object.
    • dBSpec

      public SpecTransform dBSpec​(boolean power)
      Convert the current spectrogram data to dB using 10*log10(linear) for power or 20*log10(linear) for amplitude.
      Parameters:
      power - true for power, 10*log10(X), or false for amplitude, 20*log10(X).
      Returns:
      reference to the spectrogram object.
    • dBSpec

      public SpecTransform dBSpec​(Double mindB)
      Convert the current spectrogram data to dB using 10*log10(linear).
      Parameters:
      mindB - the minimum allowed dB value.
      Returns:
      reference to the spectrogram object.
    • dBSpec

      public SpecTransform dBSpec​(boolean power, double mindB)
      Convert the current spectrogram data to dB using 10*log10(linear) for power or 20*log10(linear) for amplitude.
      Parameters:
      power - true for power, 10*log10(X), or false for amplitude, 20*log10(X).
      mindB - the minimum allowed dB value.
      Returns:
      reference to the spectrogram object.
    • normalise

      public SpecTransform normalise​(double min_leveldB, double ref_level_dB)
      Normalise the current spectrogram between two reference values.
      Parameters:
      min_leveldB - the minimum dB level
      ref_level_dB - the reference dB level to normalise to
      Returns:
      reference to the normalised spectrogram
    • normaliseMinMax

      public SpecTransform normaliseMinMax()
      Normalise the current spectrogram between the minimum and maximum of the array
      Returns:
      reference to the normalised spectrogram
    • normaliseRowSum

      public SpecTransform normaliseRowSum()
      Normalise the current spectrogram by dividing by sum of the square of the sum of all rows.
      Returns:
      reference to the normalised spectrogram
    • normaliseStd

      public SpecTransform normaliseStd​(double mean, double std)
      Normalize the data array to specified mean and standard deviation. For the data array to be normalizable, it must have non-zero standard deviation. If this is not the case, the array is unchanged by calling this method.

      From Ketos (Meridian).

      Parameters:
      mean - Mean value of the normalized array. The default is 0.
      std - Standard deviation of the normalized array. The default is 1.
      Returns:
      reference to the normalised spectrogram
    • reduceTonalNoiseMean

      public SpecTransform reduceTonalNoiseMean​(int timeConstLen)
      Reduce continuous tonal noise produced by e.g. ships and slowly varying background noise by subtracting from each row a running mean, computed according to the formula given in Baumgartner & Mussoline, Journal of the Acoustical Society of America 129, 2889 (2011); doi: 10.1121/1.3562166

      From Ketos (Meridian).

      Parameters:
      timeConstLen - time constant in number of samples, used for the computation of the running mean.
      Returns:
      reference to the corrected spectrogram.
    • reduceTonalNoiseMedian

      public SpecTransform reduceTonalNoiseMedian()
      Subtracts from each row the median value of that row.
      Returns:
      reference to the corrected spectrogram.
    • medianFilter

      public SpecTransform medianFilter​(double rowfactor, double colfactor)
      Discard pixels that are lower than the median threshold. The resulting image will have 0s for pixels below the threshold and 1s for the pixels above the threshold.

      Note: code adapted from Kahl et al. (2017).
      Paper: http://ceur-ws.org/Vol-1866/paper_143.pdf
      Code: https://github.com/kahst/BirdCLEF2017/blob/master/birdCLEF_spec.py

      From Ketos (Meridian).

      Parameters:
      rowfactor - factor by which the row-wise median pixel value is multiplied in order to define the threshold.
      colfactor - factor by which the column-wise median pixel value is multiplied in order to define the threshold.
      Returns:
      reference to the filtered spectrogram.
    • enhance

      public SpecTransform enhance​(double enhanceFactor)
      Enhance the contrast between regions of high and low intensity, while preserving the range of pixel values. Multiplies each pixel value by the factor

      $$f(x) = \left( e^{-(x - m_x - \sigma_m) / w} + 1 \right)^{-1}$$

      where $x$ is the pixel value, $m_x$ is the pixel value median of the image, and $w = \sigma_x / \epsilon$, where $\sigma_x$ is the pixel value standard deviation of the image and $\epsilon$ is the enhancement parameter.

      Some observations: $f(x)$ is a smoothly increasing function from 0 to 1. $f(m_x) = 0.5$, i.e. the median $m_x$ demarks the transition from "low intensity" to "high intensity". The smaller the width, $w$, the faster the transition from 0 to 1.

      From Ketos (Meridian).

      Parameters:
      enhanceFactor - the enhancement parameter (epsilon in the formula above).
      Returns:
      reference to the enhanced spectrogram.
    • interpolate

      public SpecTransform interpolate​(double fMin, double fMax, int freqBins)
      Interpolate a spectrogram so that it has a specified number of frequency bins and sits between two frequency limits.
      Parameters:
      fMin - the minimum frequency (Hz)
      fMax - the maximum frequency (Hz)
      freqBins - the number of frequency bins to interpolate to. This is the number of bins between fMin and fMax.
      Returns:
      the interpolated spectrogram object.
    • clamp

      public SpecTransform clamp​(double minVal, double maxVal)
      Clamp the current spectrogram between two values.
      Parameters:
      minVal - the minimum value to clamp to.
      maxVal - the maximum value to clamp to.
      Returns:
      reference to the clamped spectrogram.
    • gaussianFilter

      public SpecTransform gaussianFilter​(double sigma)
      Apply a Gaussian filter to the current spectrogram.
      Parameters:
      sigma - the standard deviation of the Gaussian kernel.
      Returns:
      reference to the filtered spectrogram.
    • dBSpec

      public static double[][] dBSpec​(double[][] array, boolean power, double minddB)
      Convert a spectrogram to dB.
      Parameters:
      array - the absolute spectrogram array.
      power - true for power, 10*log10(X), or false for amplitude, 20*log10(X).
      minddB - the minimum dB, i.e. if 10*log10(value) or 20*log10(value) for a value in array is below this, the value is set to minddB.
      Returns:
      the spectrogram in dB.
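      The conversion described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the library's code: the class name DbSketch and the method name toDb are hypothetical, and sending NaN results (from negative inputs) to the floor is an assumption of the sketch.

```java
/** Illustrative dB conversion with a lower floor; names are hypothetical. */
class DbSketch {

    /**
     * Converts linear values to dB.
     * power = true uses 10*log10(x); power = false uses 20*log10(x).
     * Results below minDb (including -Infinity from log10(0)) are set to minDb.
     */
    static double[][] toDb(double[][] array, boolean power, double minDb) {
        double scale = power ? 10.0 : 20.0;
        double[][] out = new double[array.length][];
        for (int i = 0; i < array.length; i++) {
            out[i] = new double[array[i].length];
            for (int j = 0; j < array[i].length; j++) {
                double db = scale * Math.log10(array[i][j]);
                // Assumption of this sketch: NaN (negative input) also maps to the floor.
                out[i][j] = (Double.isNaN(db) || db < minDb) ? minDb : db;
            }
        }
        return out;
    }
}
```

      With power set to true, a linear value of 100 maps to 20 dB; with power set to false, a linear value of 10 maps to 20 dB.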
    • normalise

      public static double[][] normalise​(double[][] array, double min_leveldB, double ref_level_dB)
      Normalise a spectrogram.
      Parameters:
      array - the absolute spectrogram array.
      min_leveldB - the minimum dB level
      ref_level_dB - the reference dB level to normalise to
      Returns:
      the normalised spectrogram.
    • normaliseMinMax

      public static double[][] normaliseMinMax​(double[][] img)
      A minimum/maximum spectrogram normalisation.
      Parameters:
      img - the absolute spectrogram array.
      Returns:
      the normalised spectrogram.
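      A minimum/maximum normalisation is commonly written as (x - min) / (max - min). The sketch below assumes that form and an output range of 0 to 1; whether the library uses exactly this scaling is not stated here. The class name MinMaxSketch and the choice to return zeros for a constant image are assumptions.

```java
/** Illustrative min/max normalisation to the range [0, 1]; names are hypothetical. */
class MinMaxSketch {

    static double[][] normaliseMinMax(double[][] img) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] row : img) {
            for (double v : row) {
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        double range = max - min;
        double[][] out = new double[img.length][];
        for (int i = 0; i < img.length; i++) {
            out[i] = new double[img[i].length];
            for (int j = 0; j < img[i].length; j++) {
                // Assumption: a constant image (range == 0) maps to all zeros.
                out[i][j] = range == 0.0 ? 0.0 : (img[i][j] - min) / range;
            }
        }
        return out;
    }
}
```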
    • copyArr

      public static double[][] copyArr​(double[][] array)
      Hard copy of an array.
      Parameters:
      array - the array to copy.
      Returns:
      a copy of the array.
    • normaliseRowSum

      public static double[][] normaliseRowSum​(double[][] img)
      Normalise a spectrogram by summing each row and squaring it then dividing the entire array by that value.
      Parameters:
      img - the absolute spectrogram array.
      Returns:
      the normalised spectrogram.
    • clamp

      public static double[][] clamp​(double[][] array, double minVal, double maxVal)
      Clamp a spectrogram between two values.
      Parameters:
      array - the spectrogram array.
      minVal - the minimum value to clamp to.
      maxVal - the maximum value to clamp to.
      Returns:
      the clamped spectrogram.
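      Clamping limits every value to the interval between minVal and maxVal. A minimal sketch, assuming the result is written to a new array rather than modifying the input (the class name ClampSketch is hypothetical):

```java
/** Illustrative element-wise clamp; names are hypothetical. */
class ClampSketch {

    static double[][] clamp(double[][] array, double minVal, double maxVal) {
        double[][] out = new double[array.length][];
        for (int i = 0; i < array.length; i++) {
            out[i] = new double[array[i].length];
            for (int j = 0; j < array[i].length; j++) {
                // Values below minVal become minVal, values above maxVal become maxVal.
                out[i][j] = Math.max(minVal, Math.min(maxVal, array[i][j]));
            }
        }
        return out;
    }
}
```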
    • interpolate

      public static double[][] interpolate​(double[][] array, double fMin, double fMax, int freqBins, float sR)
      Interpolate a spectrogram so that it has a specified number of frequency bins and sits between two frequency limits.
      Parameters:
      array - the spectrogram array. This should be a spectrogram covering its full frequency range.
      fMin - the minimum frequency (Hz)
      fMax - the maximum frequency (Hz)
      freqBins - the number of frequency bins to interpolate to.
      Returns:
      the interpolated spectrogram.
    • nearestNeighbourInterp

      public static double[] nearestNeighbourInterp​(double[] inputArray, int w2)
      Perform a nearest neighbour interpolation of a 1D array of evenly spaced values.
      Parameters:
      inputArray - the array to interpolate.
      w2 - the new length of the array.
      Returns:
      the interpolated array.
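      Nearest neighbour interpolation of evenly spaced values can be sketched as follows. The index mapping used here (sampling at the centre of each output bin) is one common convention and is an assumption; the library may map indices differently. The class name NearestNeighbourSketch is hypothetical.

```java
/** Illustrative nearest neighbour resampling of a 1D array; names are hypothetical. */
class NearestNeighbourSketch {

    static double[] nearestNeighbourInterp(double[] inputArray, int w2) {
        int w1 = inputArray.length;
        double[] out = new double[w2];
        for (int i = 0; i < w2; i++) {
            // Centre of output bin i, expressed in input-bin coordinates.
            int src = (int) Math.floor((i + 0.5) * w1 / w2);
            // Guard against rounding past the last input element.
            out[i] = inputArray[Math.min(src, w1 - 1)];
        }
        return out;
    }
}
```

      Doubling the length of {1, 2, 3} with this mapping repeats each value twice.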
    • reduceTonalNoiseMean

      public static double[][] reduceTonalNoiseMean​(double[][] input, double timeLenConst)
      Reduce continuous tonal noise produced by e.g. ships and slowly varying background noise by subtracting from each row a running mean, computed according to the formula given in Baumgartner & Mussoline, Journal of the Acoustical Society of America 129, 2889 (2011); doi: 10.1121/1.3562166

      From Ketos (Meridian).

      Parameters:
      input - a spectrogram image.
      timeLenConst - time constant in number of samples, used for the computation of the running mean.
      Returns:
      the corrected array.
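      The idea of subtracting a running mean can be illustrated with a simple exponentially weighted mean. This sketch is not the formula from Baumgartner & Mussoline and is not the library's code: the update weight 1 / timeConst, the initial mean (the time average of each frequency bin), and the layout input[time][frequency] are all assumptions, and the class name RunningMeanSketch is hypothetical.

```java
/** Illustrative running-mean subtraction; weights and layout are assumptions. */
class RunningMeanSketch {

    /** input[t][f]: t is the time slice, f the frequency bin (assumed layout). */
    static double[][] subtractRunningMean(double[][] input, double timeConst) {
        if (timeConst < 1.0) {
            throw new IllegalArgumentException("timeConst must be >= 1");
        }
        int nT = input.length;
        int nF = input[0].length;
        double eps = 1.0 / timeConst; // hypothetical update weight

        // Start from the time average of each frequency bin.
        double[] mean = new double[nF];
        for (double[] slice : input) {
            for (int f = 0; f < nF; f++) {
                mean[f] += slice[f] / nT;
            }
        }

        double[][] out = new double[nT][nF];
        for (int t = 0; t < nT; t++) {
            for (int f = 0; f < nF; f++) {
                out[t][f] = input[t][f] - mean[f];
                // Update the running mean after it has been subtracted.
                mean[f] = (1.0 - eps) * mean[f] + eps * input[t][f];
            }
        }
        return out;
    }
}
```

      A constant tone (the same value in a frequency bin at every time slice) is removed entirely, while a sudden change in level survives the subtraction and then decays as the running mean catches up.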
    • reduceTonalNoiseMedian

      public static double[][] reduceTonalNoiseMedian​(double[][] img)
      Subtracts from each row the median value of that row.

      From Ketos (Meridian).

      Parameters:
      img - a spectrogram image.
      Returns:
      the corrected array.
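      Subtracting the row median can be sketched as below. The class and method names are hypothetical, and the sketch assumes the median of an even-length row is the mean of its two middle values.

```java
import java.util.Arrays;

/** Illustrative row-median subtraction; names are hypothetical. */
class RowMedianSketch {

    /** Median of a copy of the values, so the input order is preserved. */
    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }

    static double[][] subtractRowMedian(double[][] img) {
        double[][] out = new double[img.length][];
        for (int r = 0; r < img.length; r++) {
            double med = median(img[r]);
            out[r] = new double[img[r].length];
            for (int c = 0; c < img[r].length; c++) {
                out[r][c] = img[r][c] - med;
            }
        }
        return out;
    }
}
```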
    • normaliseStd

      public static double[][] normaliseStd​(double[][] img, double mean, double std)
      Normalize the data array to specified mean and standard deviation. For the data array to be normalizable, it must have non-zero standard deviation. If this is not the case, the array is unchanged by calling this method.

      From Ketos (Meridian).

      Parameters:
      img - the img to normalise
      mean - Mean value of the normalized array. The default is 0.
      std - Standard deviation of the normalized array. The default is 1.
      Returns:
      the normalised array
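      The normalisation described above amounts to standardising the data and then rescaling it. A minimal sketch, assuming the population standard deviation (divide by N) is used and that an array with zero standard deviation is returned as an unchanged copy; the class name StdNormSketch is hypothetical.

```java
/** Illustrative normalisation to a target mean and standard deviation. */
class StdNormSketch {

    static double[][] normaliseStd(double[][] img, double mean, double std) {
        int n = 0;
        double sum = 0.0;
        for (double[] row : img) {
            for (double v : row) {
                sum += v;
                n++;
            }
        }
        double imgMean = sum / n;

        double sqDev = 0.0;
        for (double[] row : img) {
            for (double v : row) {
                sqDev += (v - imgMean) * (v - imgMean);
            }
        }
        // Assumption: population standard deviation (divide by n, not n - 1).
        double imgStd = Math.sqrt(sqDev / n);

        double[][] out = new double[img.length][];
        for (int i = 0; i < img.length; i++) {
            out[i] = new double[img[i].length];
            for (int j = 0; j < img[i].length; j++) {
                // Zero standard deviation: leave the values unchanged.
                out[i][j] = imgStd == 0.0
                        ? img[i][j]
                        : (img[i][j] - imgMean) / imgStd * std + mean;
            }
        }
        return out;
    }
}
```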
    • filterIsolatedSpots

      public static double[][] filterIsolatedSpots​(double[][] img, int[][] struct)
      Discard pixels that are lower than the median threshold. The resulting image will have 0s for pixels below the threshold and 1s for the pixels above the threshold.
      Parameters:
      img - a spectrogram image.
    • blurImage

      public static double[][] blurImage​(double[][] img, double sigma)
      Smooth the input image using a median or Gaussian blur filter. This description is carried over from the Ketos (Python) implementation: there, the input image is recast as np.float32 and the function is essentially a wrapper around the scipy.ndimage.median_filter and scipy.ndimage.gaussian_filter methods. For further details, see https://docs.scipy.org/doc/scipy/reference/ndimage.html

      From Ketos (Meridian).

      Parameters:
      img - image to be processed.
      size - only used by the median filter. Describes the shape that is taken from the input array, at every element position, to define the input to the filter function. (Described for the Ketos original; not part of this method's signature.)
      sigma - only used by the Gaussian filter. Standard deviation for the Gaussian kernel. May be given as a single number, in which case all axes have the same standard deviation, or as an array, allowing for the axes to have different standard deviations.
      gaussian - switch between median and Gaussian (default) filter. (Described for the Ketos original; not part of this method's signature.)
      Returns:
      blurred image.
    • generateKernal

      public static double[][] generateKernal​(double sigma)
      Generate the kernel for a sigma value.
      Parameters:
      sigma - the sigma value used to generate the kernel.
      Returns:
      the generated kernel.
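      As an illustration of what a kernel generated from a sigma value can look like, the sketch below builds a normalised 2D Gaussian kernel. The kernel radius of ceil(3 * sigma) and the normalisation to unit sum are assumptions of this sketch, not documented behaviour of generateKernal, and the class name GaussianKernelSketch is hypothetical.

```java
/** Illustrative 2D Gaussian kernel; size and normalisation are assumptions. */
class GaussianKernelSketch {

    static double[][] gaussianKernel(double sigma) {
        if (sigma <= 0.0) {
            throw new IllegalArgumentException("sigma must be positive");
        }
        // Assumption: cover three standard deviations on each side of the centre.
        int radius = (int) Math.ceil(3.0 * sigma);
        int size = 2 * radius + 1;

        double[][] kernel = new double[size][size];
        double sum = 0.0;
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                double dx = x - radius;
                double dy = y - radius;
                double v = Math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma));
                kernel[y][x] = v;
                sum += v;
            }
        }
        // Normalise so that the weights sum to 1 and filtering preserves the mean level.
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                kernel[y][x] /= sum;
            }
        }
        return kernel;
    }
}
```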
    • medianFilter

      public static double[][] medianFilter​(double[][] img1, double rowfactor, double colfactor)
      Discard pixels that are lower than the median threshold. The resulting image will have 0s for pixels below the threshold and 1s for the pixels above the threshold.

      Note: code adapted from Kahl et al. (2017).
      Paper: http://ceur-ws.org/Vol-1866/paper_143.pdf
      Code: https://github.com/kahst/BirdCLEF2017/blob/master/birdCLEF_spec.py

      From Ketos (Meridian).

      Parameters:
      img1 - array containing the image to be filtered.
      rowfactor - factor by which the row-wise median pixel value is multiplied in order to define the threshold.
      colfactor - factor by which the column-wise median pixel value is multiplied in order to define the threshold.
      Returns:
      the filtered image with 0s and 1s.
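      The thresholding can be sketched as follows. The sketch assumes a pixel is set to 1 only when it exceeds both the row threshold (row median times rowFactor) and the column threshold (column median times colFactor), and 0 otherwise; that combination rule is an assumption, and the class name MedianThresholdSketch is hypothetical. The image is assumed to be rectangular and non-empty.

```java
import java.util.Arrays;

/** Illustrative median-threshold filter producing a 0/1 image. */
class MedianThresholdSketch {

    static double median(double[] values) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : 0.5 * (sorted[n / 2 - 1] + sorted[n / 2]);
    }

    static double[][] medianFilter(double[][] img, double rowFactor, double colFactor) {
        int rows = img.length;
        int cols = img[0].length;

        double[] rowMedian = new double[rows];
        for (int r = 0; r < rows; r++) {
            rowMedian[r] = median(img[r]);
        }

        double[] colMedian = new double[cols];
        double[] column = new double[rows];
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < rows; r++) {
                column[r] = img[r][c];
            }
            colMedian[c] = median(column);
        }

        double[][] out = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Assumption: the pixel must exceed both thresholds to be kept.
                boolean keep = img[r][c] > rowMedian[r] * rowFactor
                        && img[r][c] > colMedian[c] * colFactor;
                out[r][c] = keep ? 1.0 : 0.0;
            }
        }
        return out;
    }
}
```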
    • enhance

      public static double[][] enhance​(double[][] img, double enhancement)
      Enhance the contrast between regions of high and low intensity, while preserving the range of pixel values. Multiplies each pixel value by the factor

      $$f(x) = \left( e^{-(x - m_x - \sigma_m) / w} + 1 \right)^{-1}$$

      where $x$ is the pixel value, $m_x$ is the pixel value median of the image, and $w = \sigma_x / \epsilon$, where $\sigma_x$ is the pixel value standard deviation of the image and $\epsilon$ is the enhancement parameter.

      Some observations: $f(x)$ is a smoothly increasing function from 0 to 1. $f(m_x) = 0.5$, i.e. the median $m_x$ demarks the transition from "low intensity" to "high intensity". The smaller the width, $w$, the faster the transition from 0 to 1.

      From Ketos (Meridian).

      Parameters:
      img - a spectrogram image.
      enhancement - the enhancement parameter (epsilon in the formula above).
      Returns:
      the enhanced image.
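      The weighting function above can be sketched directly. The text does not define the sigma_m term, so this sketch exposes it as an explicit offset argument, which is an assumption; with an offset of 0 the weight at the median is exactly 0.5, as stated in the observations. The class name EnhanceSketch, the use of the population standard deviation, and returning the values unchanged when the standard deviation is 0 or the enhancement is not positive are also assumptions.

```java
import java.util.Arrays;

/** Illustrative logistic contrast enhancement; names and offset are hypothetical. */
class EnhanceSketch {

    /** f(x) = 1 / (exp(-(x - median - offset) / w) + 1). */
    static double weight(double x, double median, double offset, double w) {
        return 1.0 / (Math.exp(-(x - median - offset) / w) + 1.0);
    }

    static double[][] enhance(double[][] img, double enhancement, double offset) {
        int rows = img.length;
        int cols = img[0].length;

        // Flatten the image to compute global statistics.
        double[] flat = new double[rows * cols];
        for (int r = 0; r < rows; r++) {
            System.arraycopy(img[r], 0, flat, r * cols, cols);
        }
        double mean = Arrays.stream(flat).average().orElse(0.0);
        double variance = Arrays.stream(flat)
                .map(v -> (v - mean) * (v - mean))
                .average().orElse(0.0);
        double std = Math.sqrt(variance);

        Arrays.sort(flat);
        int n = flat.length;
        double median = n % 2 == 1 ? flat[n / 2] : 0.5 * (flat[n / 2 - 1] + flat[n / 2]);

        double[][] out = new double[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                double x = img[r][c];
                // Assumption: no enhancement is possible without spread or a positive factor.
                out[r][c] = (std == 0.0 || enhancement <= 0.0)
                        ? x
                        : x * weight(x, median, offset, std / enhancement);
            }
        }
        return out;
    }
}
```

      Pixels above the median are scaled by a weight above 0.5 and pixels below it by a weight below 0.5, which is what increases the contrast between the two groups.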
    • getReal

      public double[][] getReal()
      Get the real data from the transformed spectrogram.
      Returns:
      the real data from the spectrogram transform.
    • getImag

      public double[][] getImag()
      Get the imaginary data from the transformed spectrogram.
      Returns:
      the imaginary data from the spectrogram transform.
    • getTransformedData

      public double[][] getTransformedData()
      Get the transformed spectrogram data.
      Returns:
      the transformed spectrogram data.
    • getSpectrgram

      public Spectrogram getSpectrgram()
      Get the raw spectrogram. This has not undergone any transformations. See getTransformedData for the transformed spectrogram.
      Returns:
      the original spectrogram data.
    • setTransformedData

      public void setTransformedData​(double[][] absoluteSpectrogram)
      Manually set the transformed data.
      Parameters:
      absoluteSpectrogram - the data to set.