Class OneWayAnova

java.lang.Object
org.apache.commons.math4.stat.inference.OneWayAnova

public class OneWayAnova
extends java.lang.Object
Implements one-way ANOVA (analysis of variance) statistics.

Tests for differences between two or more categories of univariate data (for example, the body mass index of accountants, lawyers, doctors and computer programmers). When exactly two categories are given, the test is equivalent to a two-sided two-sample t-test that assumes equal variances (see TTest).
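The two-category case can be checked numerically. The sketch below is not the library's code: it assumes the comparison is with the pooled-variance (equal-variance) two-sample t statistic, and the names TwoGroupAnovaSketch, fValue and pooledT are hypothetical. For two groups, the ANOVA F statistic equals the square of that t statistic.

```java
final class TwoGroupAnovaSketch {
    static double mean(double[] x) {
        double sum = 0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    // Sum of squared deviations from the group's own mean.
    static double sumSqDev(double[] x) {
        double m = mean(x);
        double ss = 0;
        for (double v : x) {
            ss += (v - m) * (v - m);
        }
        return ss;
    }

    // One-way ANOVA F statistic for exactly two groups: F = msbg / mswg.
    static double fValue(double[] a, double[] b) {
        int n = a.length + b.length;
        double grandMean = (mean(a) * a.length + mean(b) * b.length) / n;
        double da = mean(a) - grandMean;
        double db = mean(b) - grandMean;
        double msbg = (a.length * da * da + b.length * db * db) / 1.0; // dfbg = 2 - 1
        double mswg = (sumSqDev(a) + sumSqDev(b)) / (n - 2);           // dfwg = n - 2
        return msbg / mswg;
    }

    // Two-sample t statistic with pooled variance (equal variances assumed).
    static double pooledT(double[] a, double[] b) {
        double pooledVar = (sumSqDev(a) + sumSqDev(b)) / (a.length + b.length - 2);
        double se = Math.sqrt(pooledVar * (1.0 / a.length + 1.0 / b.length));
        return (mean(a) - mean(b)) / se;
    }
}
```

For a = {1, 2, 3} and b = {2, 4, 6}, fValue(a, b) and the square of pooledT(a, b) both evaluate to 2.4, up to floating-point rounding.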

Uses the commons-math F distribution implementation to compute p-values.

This implementation is based on a description at http://faculty.vassar.edu/lowry/ch13pt1.html

 Abbreviations: bg = between groups,
                wg = within groups,
                ss = sum of squared deviations
 
Since:
1.2
  • Constructor Summary

    Constructors 
    Constructor Description
    OneWayAnova()
    Default constructor.
  • Method Summary

    Modifier and Type Method Description
    double anovaFValue(java.util.Collection<double[]> categoryData)
    Computes the ANOVA F-value for a collection of double[] arrays.
    double anovaPValue(java.util.Collection<double[]> categoryData)
    Computes the ANOVA P-value for a collection of double[] arrays.
    double anovaPValue(java.util.Collection<SummaryStatistics> categoryData, boolean allowOneElementData)
    Computes the ANOVA P-value for a collection of SummaryStatistics.
    boolean anovaTest(java.util.Collection<double[]> categoryData, double alpha)
    Performs an ANOVA test, evaluating the null hypothesis that there is no difference among the means of the data categories.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • OneWayAnova

      public OneWayAnova()
      Default constructor.
  • Method Details

    • anovaFValue

      public double anovaFValue(java.util.Collection<double[]> categoryData) throws NullArgumentException, DimensionMismatchException
      Computes the ANOVA F-value for a collection of double[] arrays.

      Preconditions:

      • The categoryData Collection must contain double[] arrays.
      • There must be at least two double[] arrays in the categoryData collection and each of these arrays must contain at least two values.

      This implementation computes the F statistic using the definitional formula

         F = msbg/mswg
      where
        msbg = between group mean square
        mswg = within group mean square
      are as defined in the reference cited in the class description (http://faculty.vassar.edu/lowry/ch13pt1.html).
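The definitional formula can be sketched in plain Java as follows. This is an illustration, not the library's source: the class name AnovaFSketch is hypothetical, and standard JDK exceptions stand in for NullArgumentException and DimensionMismatchException.

```java
import java.util.Collection;

final class AnovaFSketch {
    // F = msbg / mswg, computed directly from the raw category data.
    static double fValue(Collection<double[]> categoryData) {
        if (categoryData == null) {
            throw new NullPointerException("categoryData");
        }
        if (categoryData.size() < 2) {
            throw new IllegalArgumentException("need at least two categories");
        }
        int n = 0;
        double total = 0;
        for (double[] group : categoryData) {
            if (group.length < 2) {
                throw new IllegalArgumentException("need at least two values per category");
            }
            n += group.length;
            for (double v : group) {
                total += v;
            }
        }
        double grandMean = total / n;

        double ssbg = 0; // between-groups sum of squared deviations
        double sswg = 0; // within-groups sum of squared deviations
        for (double[] group : categoryData) {
            double sum = 0;
            for (double v : group) {
                sum += v;
            }
            double mean = sum / group.length;
            ssbg += group.length * (mean - grandMean) * (mean - grandMean);
            for (double v : group) {
                sswg += (v - mean) * (v - mean);
            }
        }
        int dfbg = categoryData.size() - 1; // between-groups degrees of freedom
        int dfwg = n - categoryData.size(); // within-groups degrees of freedom
        return (ssbg / dfbg) / (sswg / dfwg);
    }
}
```

For the three categories {1, 2, 3}, {2, 3, 4} and {5, 6, 7}, ssbg = 26 and sswg = 6, so msbg = 13, mswg = 1 and F = 13.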
      Parameters:
      categoryData - Collection of double[] arrays each containing data for one category
      Returns:
      the F-value
      Throws:
      NullArgumentException - if categoryData is null
      DimensionMismatchException - if the categoryData collection contains fewer than two arrays or a contained double[] array does not have at least two values
    • anovaPValue

      public double anovaPValue(java.util.Collection<double[]> categoryData) throws NullArgumentException, DimensionMismatchException, ConvergenceException, MaxCountExceededException
      Computes the ANOVA P-value for a collection of double[] arrays.

      Preconditions:

      • The categoryData Collection must contain double[] arrays.
      • There must be at least two double[] arrays in the categoryData collection and each of these arrays must contain at least two values.

      This implementation uses the commons-math F distribution implementation to compute the p-value, using the formula

         p = 1 - cumulativeProbability(F)
      where F is the F-value and cumulativeProbability is the cumulative distribution function of the commons-math F distribution.
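In general the F distribution's cumulative probability requires the regularized incomplete beta function, which is why the library delegates to its F distribution class. For a hand check there is a closed form when the numerator degrees of freedom equal 2, that is, with exactly three categories: 1 - cumulativeProbability(F) = (1 + 2F/dfwg)^(-dfwg/2). The sketch below covers only that special case; the class and method names are hypothetical and this is not how the library computes the value.

```java
final class AnovaPValueSketch {
    // Upper-tail probability of an F distribution with 2 numerator and
    // dfwg denominator degrees of freedom (three-category ANOVA only).
    static double pValueThreeCategories(double f, int dfwg) {
        if (f < 0 || dfwg < 1) {
            throw new IllegalArgumentException("f must be >= 0 and dfwg >= 1");
        }
        return Math.pow(1.0 + 2.0 * f / dfwg, -dfwg / 2.0);
    }
}
```

With F = 13 and dfwg = 6 (three categories of three values each), this gives (16/3)^-3 = 27/4096, about 0.0066.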
      Parameters:
      categoryData - Collection of double[] arrays each containing data for one category
      Returns:
      the p-value
      Throws:
      NullArgumentException - if categoryData is null
      DimensionMismatchException - if the categoryData collection contains fewer than two arrays or a contained double[] array does not have at least two values
      ConvergenceException - if the p-value cannot be computed due to a convergence error
      MaxCountExceededException - if the maximum number of iterations is exceeded
    • anovaPValue

      public double anovaPValue(java.util.Collection<SummaryStatistics> categoryData, boolean allowOneElementData) throws NullArgumentException, DimensionMismatchException, ConvergenceException, MaxCountExceededException
      Computes the ANOVA P-value for a collection of SummaryStatistics.

      Preconditions:

      • The categoryData Collection must contain SummaryStatistics.
      • There must be at least two SummaryStatistics in the categoryData collection and each of these statistics must contain at least two values, unless allowOneElementData is true.

      This implementation uses the commons-math F distribution implementation to compute the p-value, using the formula

         p = 1 - cumulativeProbability(F)
      where F is the F-value and cumulativeProbability is the cumulative distribution function of the commons-math F distribution.
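The F-value that feeds this formula can be obtained from per-category summaries alone, without the raw data. The sketch below assumes each category is described by its count, mean and unbiased sample variance; GroupSummary is a hypothetical stand-in written for this example, not the library's SummaryStatistics class, and the library's internal computation may differ.

```java
import java.util.List;

final class SummaryAnovaSketch {
    // Hypothetical per-category summary: count, mean, unbiased sample variance.
    static final class GroupSummary {
        final long n;
        final double mean;
        final double variance;

        GroupSummary(long n, double mean, double variance) {
            this.n = n;
            this.mean = mean;
            this.variance = variance;
        }
    }

    // F = msbg / mswg, using only the summaries of each category.
    static double fValue(List<GroupSummary> groups) {
        long total = 0;
        double weightedSum = 0;
        for (GroupSummary g : groups) {
            total += g.n;
            weightedSum += g.n * g.mean;
        }
        double grandMean = weightedSum / total;

        double ssbg = 0; // between-groups sum of squared deviations
        double sswg = 0; // within-groups sum of squared deviations
        for (GroupSummary g : groups) {
            double d = g.mean - grandMean;
            ssbg += g.n * d * d;
            sswg += (g.n - 1) * g.variance;
        }
        long dfbg = groups.size() - 1;
        long dfwg = total - groups.size();
        return (ssbg / dfbg) / (sswg / dfwg);
    }
}
```

Three categories with n = 3, variance 1 and means 2, 3 and 6 give F = 13, the same value obtained from the raw data {1, 2, 3}, {2, 3, 4}, {5, 6, 7}.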
      Parameters:
      categoryData - Collection of SummaryStatistics each containing data for one category
      allowOneElementData - if true, allow computation for one category only or for one data element per category
      Returns:
      the p-value
      Throws:
      NullArgumentException - if categoryData is null
      DimensionMismatchException - if allowOneElementData is false and the categoryData collection contains fewer than two SummaryStatistics or a contained SummaryStatistics does not have at least two values
      ConvergenceException - if the p-value cannot be computed due to a convergence error
      MaxCountExceededException - if the maximum number of iterations is exceeded
      Since:
      3.2
    • anovaTest

      public boolean anovaTest(java.util.Collection<double[]> categoryData, double alpha) throws NullArgumentException, DimensionMismatchException, OutOfRangeException, ConvergenceException, MaxCountExceededException
      Performs an ANOVA test, evaluating the null hypothesis that there is no difference among the means of the data categories.

      Preconditions:

      • The categoryData Collection must contain double[] arrays.
      • There must be at least two double[] arrays in the categoryData collection and each of these arrays must contain at least two values.
      • alpha must be strictly greater than 0 and less than or equal to 0.5.

      This implementation uses the commons-math F distribution implementation to compute the p-value, using the formula

         p = 1 - cumulativeProbability(F)
      where F is the F-value and cumulativeProbability is the cumulative distribution function of the commons-math F distribution.

      Returns true if and only if the computed p-value is strictly less than alpha.
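The decision rule and the alpha precondition can be sketched as below. The class and method names are hypothetical, and IllegalArgumentException stands in for the library's OutOfRangeException; the p-value is taken as an input here rather than computed.

```java
final class AnovaDecisionSketch {
    // Rejects the null hypothesis only when pValue is strictly below alpha.
    static boolean rejectNull(double pValue, double alpha) {
        // alpha must lie in (0, 0.5]; this form also rejects NaN.
        if (!(alpha > 0 && alpha <= 0.5)) {
            throw new IllegalArgumentException("alpha must be in (0, 0.5]: " + alpha);
        }
        return pValue < alpha;
    }
}
```

Because the comparison is strict, a p-value exactly equal to alpha does not reject the null hypothesis.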

      Parameters:
      categoryData - Collection of double[] arrays each containing data for one category
      alpha - significance level of the test
      Returns:
      true if the null hypothesis can be rejected with confidence 1 - alpha
      Throws:
      NullArgumentException - if categoryData is null
      DimensionMismatchException - if the categoryData collection contains fewer than two arrays or a contained double[] array does not have at least two values
      OutOfRangeException - if alpha is not in the range (0, 0.5]
      ConvergenceException - if the p-value cannot be computed due to a convergence error
      MaxCountExceededException - if the maximum number of iterations is exceeded