Class Percentile
- All Implemented Interfaces:
Serializable, UnivariateStatistic
- Direct Known Subclasses:
Median
There are several commonly used methods for estimating percentiles (a.k.a. quantiles) based on sample data. For large samples, the different methods agree closely, but when sample sizes are small, different methods will give significantly different results. The algorithm implemented here works as follows:
- Let n be the length of the (sorted) array and 0 < p <= 100 be the desired percentile.
- If n = 1 return the unique array element (regardless of the value of p); otherwise
- Compute the estimated percentile position pos = p * (n + 1) / 100 and the difference, d, between pos and floor(pos) (i.e. the fractional part of pos). If pos >= n return the largest element in the array; otherwise
- Let lower be the element in position floor(pos) in the array and let upper be the next element in the array. Return lower + d * (upper - lower)
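The steps above can be sketched in a few lines of Java. This is an illustration only, not the class's actual code: it sorts a copy of the input rather than using selection, the class name PercentileSketch and the method estimate are hypothetical, positions are read as 1-based, and the handling of pos < 1 (return the smallest element) is an assumption that the description does not cover.

```java
final class PercentileSketch {
    private PercentileSketch() { }

    /** Estimates the p-th percentile (0 < p <= 100) of values by sorting a copy. */
    static double estimate(double[] values, double p) {
        if (values == null || !(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("values must be non-null and 0 < p <= 100");
        }
        int n = values.length;
        if (n == 0) {
            return Double.NaN;                 // empty input
        }
        if (n == 1) {
            return values[0];                  // unique element, regardless of p
        }
        double[] sorted = values.clone();      // leave the caller's array untouched
        java.util.Arrays.sort(sorted);         // NaN sorts last
        double pos = p * (n + 1) / 100;        // estimated position, read as 1-based
        double fpos = Math.floor(pos);
        double d = pos - fpos;                 // fractional part of pos
        if (pos < 1) {
            return sorted[0];                  // assumption: case not described in the text
        }
        if (pos >= n) {
            return sorted[n - 1];              // largest element
        }
        double lower = sorted[(int) fpos - 1]; // element in 1-based position floor(pos)
        double upper = sorted[(int) fpos];     // next element
        return lower + d * (upper - lower);
    }
}
```

For example, with the four values {1, 2, 3, 4} and p = 25, pos is 1.25, so the sketch interpolates a quarter of the way between the first and second elements.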
To compute percentiles, the data must be at least partially ordered. Input
arrays are copied and recursively partitioned using an ordering definition.
The ordering used by Arrays.sort(double[]) is the one determined
by Double.compareTo(Double). This ordering makes
Double.NaN larger than any other value (including
Double.POSITIVE_INFINITY). Therefore, for example, the median
(50th percentile) of
{0, 1, 2, 3, 4, Double.NaN} evaluates to 2.5.
Since percentile estimation usually involves interpolation between array
elements, arrays containing NaN or infinite values will often
result in NaN or infinite values returned.
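The ordering and its effect on interpolation can be checked with standard JDK calls. The sketch below is illustrative; the class NaNOrderingSketch and its two helper methods are hypothetical names, and interpolate simply applies the lower + d * (upper - lower) formula from the algorithm description.

```java
final class NaNOrderingSketch {
    private NaNOrderingSketch() { }

    /** Returns a sorted copy; Arrays.sort(double[]) places NaN after every other value. */
    static double[] sortedCopy(double[] values) {
        double[] copy = values.clone();
        java.util.Arrays.sort(copy);
        return copy;
    }

    /** Linear interpolation between two neighbouring elements. */
    static double interpolate(double lower, double upper, double d) {
        return lower + d * (upper - lower);
    }
}
```

Sorting {NaN, 1, +Infinity, 0} with sortedCopy leaves NaN in the last slot and +Infinity just before it. Interpolating between a finite value and NaN yields NaN, and interpolating between a finite value and +Infinity yields +Infinity, which is why such inputs tend to propagate into the result.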
Since 2.2, the Percentile implementation uses only selection instead of complete
sorting and caches selection algorithm state between calls to the various
evaluate methods when several percentiles are to be computed on the same data.
This greatly improves efficiency, both for single-percentile and multiple-percentile
computations. However, it also requires making sure that the data
passed to one call to evaluate is the same as the data associated with the cached algorithm
state from the previous calls. By default, Percentile does this by checking the array reference
itself and a checksum of its content. A caller who already knows that
evaluate is being called on an immutable array can save the checking time by calling the
evaluate methods that do not
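The reference-plus-checksum check described above might look roughly like the following. This is a hedged sketch, not the library's code: the class CacheCheckSketch and the method isSameData are hypothetical, and Arrays.hashCode stands in for whatever checksum the implementation actually computes.

```java
final class CacheCheckSketch {
    private double[] cachedRef;  // array reference seen on the previous call
    private int cachedChecksum;  // checksum of its content at that time

    /**
     * Returns true if values is the same array, with the same content, as on
     * the previous call, so cached state could be reused. Otherwise records
     * values as the new reference data and returns false.
     */
    boolean isSameData(double[] values) {
        int checksum = java.util.Arrays.hashCode(values); // stand-in checksum (assumption)
        boolean same = values == cachedRef && checksum == cachedChecksum;
        if (!same) {
            cachedRef = values;
            cachedChecksum = checksum;
        }
        return same;
    }
}
```

Checking the reference alone would miss in-place modification of the array between calls, while the checksum costs a full pass over the data, which is the checking time the paragraph above refers to.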
Note that this implementation is not synchronized. If
multiple threads access an instance of this class concurrently, and at least
one of the threads modifies it (for example through setData or
setQuantile), it must be synchronized externally.
Constructor Summary
- Percentile(): Constructs a Percentile with a default quantile value of 50.0.
- Percentile(double p): Constructs a Percentile with the specific quantile value.
- Percentile(Percentile original): Copy constructor, creates a new Percentile identical to the original
Method Summary
- Percentile copy(): Returns a copy of the statistic with the same internal state.
- static void copy(Percentile source, Percentile dest): Copies source to dest.
- double evaluate(double p): Returns the result of evaluating the statistic over the stored data.
- double evaluate(double[] values, double p): Returns an estimate of the pth percentile of the values in the values array.
- double evaluate(double[] values, int start, int length): Returns an estimate of the quantileth percentile of the designated values in the values array.
- double evaluate(double[] values, int begin, int length, double p): Returns an estimate of the pth percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values.
- double getQuantile(): Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
- void setData(double[] values): Set the data array.
- void setData(double[] values, int begin, int length): Set the data array.
- void setQuantile(double p): Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
Methods inherited from class org.apache.commons.math.stat.descriptive.AbstractUnivariateStatistic
evaluate, evaluate, getData
Constructor Details
Percentile
public Percentile()
Constructs a Percentile with a default quantile value of 50.0.
Percentile
public Percentile(double p)
Constructs a Percentile with the specific quantile value.
- Parameters:
p - the quantile
- Throws:
IllegalArgumentException - if p is not greater than 0 and less than or equal to 100
Percentile
public Percentile(Percentile original)
Copy constructor, creates a new Percentile identical to the original.
- Parameters:
original - the Percentile instance to copy
Method Details
setData
public void setData(double[] values)
Set the data array.
The stored value is a copy of the parameter array, not the array itself.
- Overrides:
setData in class AbstractUnivariateStatistic
- Parameters:
values - data array to store (may be null to remove stored data)
setData
public void setData(double[] values, int begin, int length)
Set the data array.
- Overrides:
setData in class AbstractUnivariateStatistic
- Parameters:
values - data array to store
begin - the index of the first element to include
length - the number of elements to include
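The copy semantics of the two setData variants can be illustrated with a small stand-alone class. Everything here is a sketch under assumptions: StoredDataSketch is a hypothetical name, the range check and its exception are choices made for the example, and getData returning a copy is assumed rather than taken from this page.

```java
final class StoredDataSketch {
    private double[] stored;  // private copy, never the caller's array

    /** Stores a copy of values; null removes the stored data. */
    void setData(double[] values) {
        stored = (values == null) ? null : values.clone();
    }

    /** Stores a copy of the length elements of values starting at index begin. */
    void setData(double[] values, int begin, int length) {
        if (values == null || begin < 0 || length < 0 || begin + length > values.length) {
            throw new IllegalArgumentException("invalid array range");
        }
        // copyOfRange would pad with zeros past the end, hence the explicit check above
        stored = java.util.Arrays.copyOfRange(values, begin, begin + length);
    }

    /** Returns a copy of the stored data, or null if none is stored. */
    double[] getData() {
        return (stored == null) ? null : stored.clone();
    }
}
```

Because the stored array is a copy, a caller that later writes to the original array does not change what a subsequent evaluation over the stored data would see.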
evaluate
public double evaluate(double p)
Returns the result of evaluating the statistic over the stored data.
The stored array is the one which was set by previous calls to
- Parameters:
p - the percentile value to compute
- Returns:
the value of the statistic applied to the stored data
evaluate
public double evaluate(double[] values, double p)
Returns an estimate of the pth percentile of the values in the values array.
Calls to this method do not modify the internal quantile state of this statistic.
- Returns Double.NaN if values has length 0
- Returns (for any value of p) values[0] if values has length 1
- Throws IllegalArgumentException if values is null or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)
See Percentile for a description of the percentile estimation algorithm used.
- Parameters:
values - input array of values
p - the percentile value to compute
- Returns:
the percentile value or Double.NaN if the array is empty
- Throws:
IllegalArgumentException - if values is null or p is invalid
evaluate
public double evaluate(double[] values, int start, int length)
Returns an estimate of the quantileth percentile of the designated values in the values array. The quantile estimated is determined by the quantile property.
- Returns Double.NaN if length = 0
- Returns (for any value of quantile) values[start] if length = 1
- Throws IllegalArgumentException if values is null, or start or length is invalid
See Percentile for a description of the percentile estimation algorithm used.
- Specified by:
evaluate in interface UnivariateStatistic
- Specified by:
evaluate in class AbstractUnivariateStatistic
- Parameters:
values - the input array
start - index of the first array element to include
length - the number of elements to include
- Returns:
the percentile value
- Throws:
IllegalArgumentException - if the parameters are not valid
evaluate
public double evaluate(double[] values, int begin, int length, double p)
Returns an estimate of the pth percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values.
Calls to this method do not modify the internal quantile state of this statistic.
- Returns Double.NaN if length = 0
- Returns (for any value of p) values[begin] if length = 1
- Throws IllegalArgumentException if values is null, begin or length is invalid, or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)
See Percentile for a description of the percentile estimation algorithm used.
- Parameters:
values - array of input values
begin - the first (0-based) element to include in the computation
length - the number of array elements to include
p - the percentile to compute
- Returns:
the percentile value
- Throws:
IllegalArgumentException - if the parameters are not valid or the input array is null
getQuantile
public double getQuantile()
Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
- Returns:
quantile
setQuantile
public void setQuantile(double p)
Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
- Parameters:
p - a value between 0 < p <= 100
- Throws:
IllegalArgumentException - if p is not greater than 0 and less than or equal to 100
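The quantile field and its range check can be sketched as follows. The class QuantileStateSketch is hypothetical and only mirrors the documented contract (default of 50.0, valid range 0 < p <= 100). Writing the check as a negated range test, which also rejects NaN, is a choice made for this example and not a statement about how the library treats NaN.

```java
final class QuantileStateSketch {
    private double quantile = 50.0;  // documented default of the no-argument constructor

    /** Returns the percentile used when no explicit p is passed to evaluate. */
    double getQuantile() {
        return quantile;
    }

    /** Sets the quantile; p must satisfy 0 < p <= 100. */
    void setQuantile(double p) {
        // Negated range test: also rejects NaN (a choice made in this sketch).
        if (!(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("quantile " + p + " is outside (0, 100]");
        }
        quantile = p;
    }
}
```

A rejected value leaves the previous quantile in place, so a failed call to setQuantile does not change which percentile later evaluations compute.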
copy
public Percentile copy()
Returns a copy of the statistic with the same internal state.
- Specified by:
copy in interface UnivariateStatistic
- Specified by:
copy in class AbstractUnivariateStatistic
- Returns:
a copy of the statistic
copy
public static void copy(Percentile source, Percentile dest)
Copies source to dest.
Neither source nor dest can be null.
- Parameters:
source - Percentile to copy
dest - Percentile to copy to
- Throws:
NullPointerException - if either source or dest is null