java.lang.Object
  org.jaitools.numeric.StreamingSampleStats
public class StreamingSampleStats
A class to calculate summary statistics for a sample of Double-valued buffers that is received as a (potentially long) stream of values rather than in a single batch. Any Double.NaN values in the stream will be ignored.
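The source does not say how the running statistics are computed internally. As a rough illustration of the single-pass idea, the sketch below uses Welford's online algorithm for mean and variance and skips NaN values. The class name `WelfordSketch`, its methods, and the choice of sample (n - 1) variance are assumptions made for this example, not part of the jaitools API.

```java
/** Illustrative single-pass mean/variance accumulator (hypothetical, not jaitools code). */
class WelfordSketch {
    private long n;        // number of accepted (non-NaN) values
    private double mean;   // running mean
    private double m2;     // running sum of squared deviations from the mean

    void offer(double value) {
        if (Double.isNaN(value)) {
            return;        // NaN values are ignored, as the class description states
        }
        n++;
        double delta = value - mean;
        mean += delta / n;
        m2 += delta * (value - mean);   // uses the updated mean (Welford's update)
    }

    long count() {
        return n;
    }

    double mean() {
        return n > 0 ? mean : Double.NaN;
    }

    /** Sample variance (n - 1 denominator); this choice is an assumption. */
    double sampleVariance() {
        return n > 1 ? m2 / (n - 1) : Double.NaN;
    }

    public static void main(String[] args) {
        WelfordSketch w = new WelfordSketch();
        for (double v : new double[] {2, 4, 4, 4, 5, 5, 7, 9, Double.NaN}) {
            w.offer(v);
        }
        System.out.println(String.format("n=%d mean=%.4f var=%.4f",
                w.count(), w.mean(), w.sampleVariance()));
    }
}
```

Only three numbers are kept between calls, so memory use does not grow with the length of the stream.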
Two options are offered for calculating the sample median. Where it is known a priori that the data stream can be accommodated in memory, the exact median can be requested with Statistic.MEDIAN. Where the length of the data stream is unknown, or known to be too large to be held in memory, an approximate median can be calculated using the 'remedian' estimator as described in:
PJ Rousseeuw and GW Bassett (1990) The remedian: a robust averaging method for large data sets. Journal of the American Statistical Association 85:97-104
This is requested with Statistic.APPROX_MEDIAN.
Note: the 'remedian' estimator performs badly with non-stationary data, e.g. a data stream that is monotonically increasing will result in an estimate for the median that is too high. If possible, it is best to de-trend or randomly order the data prior to streaming it.
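To make the remedian idea more concrete, here is a minimal sketch of the estimator with k buffers of b values each: a full buffer is replaced by its median, which is pushed to the next buffer, and the estimate is a weighted median of whatever the buffers currently hold (weight b^level). The class `RemedianSketch`, the odd-base restriction, and the capacity check are assumptions for this example; the jaitools implementation may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative remedian estimator (hypothetical class, not the jaitools code). */
class RemedianSketch {
    private final int base;            // buffer size b (odd, so each buffer has a middle value)
    private final double[][] buffers;  // buffers[level] holds up to b values
    private final int[] counts;        // number of values currently held at each level
    private final long capacity;       // b^k, the most values this sketch can accept
    private long offered;

    RemedianSketch(int base, int levels) {
        if (base < 3 || base % 2 == 0 || levels < 1) {
            throw new IllegalArgumentException("base must be odd and >= 3, levels >= 1");
        }
        this.base = base;
        this.buffers = new double[levels][base];
        this.counts = new int[levels];
        long cap = 1;
        for (int i = 0; i < levels; i++) {
            cap *= base;
        }
        this.capacity = cap;
    }

    void offer(double value) {
        if (Double.isNaN(value)) {
            return;                                     // NaN values are ignored
        }
        if (offered == capacity) {
            throw new IllegalStateException("capacity of base^levels values exceeded");
        }
        offered++;
        int level = 0;
        buffers[level][counts[level]++] = value;
        // A full buffer is replaced by its median, which moves up one level.
        while (counts[level] == base && level < buffers.length - 1) {
            double m = median(buffers[level]);
            counts[level] = 0;
            level++;
            buffers[level][counts[level]++] = m;
        }
    }

    /** Weighted median of all held values; a value at level j has weight b^j. */
    double estimate() {
        List<double[]> pairs = new ArrayList<>();       // each entry is {value, weight}
        double total = 0;
        double weight = 1;
        for (int level = 0; level < buffers.length; level++) {
            for (int i = 0; i < counts[level]; i++) {
                pairs.add(new double[] {buffers[level][i], weight});
                total += weight;
            }
            weight *= base;
        }
        if (pairs.isEmpty()) {
            return Double.NaN;
        }
        pairs.sort(Comparator.comparingDouble(p -> p[0]));
        double cumulative = 0;
        for (double[] p : pairs) {
            cumulative += p[1];
            if (cumulative >= total / 2) {
                return p[0];
            }
        }
        return pairs.get(pairs.size() - 1)[0];          // not reached for non-empty input
    }

    private static double median(double[] full) {
        double[] sorted = full.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        RemedianSketch r = new RemedianSketch(3, 2);
        for (double v : new double[] {9, 1, 8, 2, 7, 3, 6, 4, 5}) {
            r.offer(v);
        }
        System.out.println("approximate median: " + r.estimate());
    }
}
```

Storage is fixed at b × k values regardless of how many samples arrive, which is the trade-off that makes the estimate approximate.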
Example of use:
StreamingSampleStats strmStats = new StreamingSampleStats();

// set the statistics that will be calculated
Statistic[] stats = {
    Statistic.MEAN,
    Statistic.SDEV,
    Statistic.RANGE,
    Statistic.APPROX_MEDIAN
};
strmStats.setStatistics(stats);

// some process that generates a long stream of data
while (somethingBigIsRunning) {
    double value = ...
    strmStats.offer(value);
}

// report the results
for (Statistic s : stats) {
    System.out.println(String.format("%s: %.4f", s, strmStats.getStatisticValue(s)));
}
| Constructor Summary | |
|---|---|
| StreamingSampleStats() | Creates a new sampler and sets the default range type to Range.Type.EXCLUDE. |
| StreamingSampleStats(Range.Type rangesType) | Creates a new sampler with specified use of Ranges. |
| Method Summary | | |
|---|---|---|
| void | addNoDataRange(Range<Double> noData) | Adds a range of values to be considered as NoData and then to be excluded from the calculation of all statistics. |
| void | addNoDataValue(Double noData) | Adds a single value to be considered as NoData. |
| void | addRange(Range<Double> range) | Adds a range of values to include in or exclude from the calculation of all statistics. |
| void | addRange(Range<Double> range, Range.Type rangesType) | Adds a range of values to include in or exclude from the calculation of all statistics. |
| long | getNumAccepted(Statistic stat) | Gets the number of sample values that have been accepted for the specified Statistic. |
| long | getNumNaN(Statistic stat) | Gets the number of NaN values that have been offered. |
| long | getNumNoData(Statistic stat) | Gets the number of NoData values (including NaN) that have been offered. |
| long | getNumOffered(Statistic stat) | Gets the number of sample values that have been offered for the specified Statistic. |
| Set<Statistic> | getStatistics() | Gets the statistics that are currently set. |
| Double | getStatisticValue(Statistic stat) | Gets the current value of a running statistic. |
| Map<Statistic,Double> | getStatisticValues() | Gets the values of all statistics calculated by this sampler. |
| boolean | isSet(Statistic stat) | Tests whether the specified statistic is currently set. |
| void | offer(Double sample) | Offers a sample value. |
| void | offer(Double[] samples) | Offers an array of sample values. |
| void | setStatistic(Statistic stat) | Adds a statistic to those calculated by this sampler. |
| void | setStatistics(Statistic[] stats) | Adds the given statistics to those that will be calculated by this sampler. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
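| Constructor Detail |
|---|
public StreamingSampleStats()
Creates a new sampler and sets the default range type to Range.Type.EXCLUDE.

public StreamingSampleStats(Range.Type rangesType)
Creates a new sampler with specified use of Ranges.
Parameters:
rangesType - either Range.Type.INCLUDE or Range.Type.EXCLUDE

| Method Detail |
|---|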
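public void setStatistic(Statistic stat)
Adds a statistic to those calculated by this sampler.
Parameters:
stat - the statistic
See Also:
Statistic

public void setStatistics(Statistic[] stats)
Adds the given statistics to those that will be calculated by this sampler.
Parameters:
stats - the statistics
See Also:
setStatistic(Statistic)

public boolean isSet(Statistic stat)
Tests whether the specified statistic is currently set. If Statistic.MEAN is set then SDEV and VARIANCE will also be set, as these three are calculated together. The same is true for MIN, MAX and RANGE.
Parameters:
stat - the statistic
Returns:
true if the statistic has been set; false otherwise.

public void addNoDataRange(Range<Double> noData)
Adds a range of values to be considered as NoData and then to be excluded from the calculation of all statistics.
Parameters:
noData - the range defining NoData values

public void addNoDataValue(Double noData)
Adds a single value to be considered as NoData.
Parameters:
noData - the value to be treated as NoData
See Also:
addNoDataRange(Range)

public void addRange(Range<Double> range)
Adds a range of values to include in or exclude from the calculation of all statistics.
Parameters:
range - the range to include/exclude

public void addRange(Range<Double> range, Range.Type rangesType)
Adds a range of values to include in or exclude from the calculation of all statistics.
Parameters:
range - the range to include/exclude
rangesType - one of Range.Type.INCLUDE or Range.Type.EXCLUDE

public Set<Statistic> getStatistics()
Gets the statistics that are currently set.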
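
The isSet description notes that MEAN, SDEV and VARIANCE are set together, as are MIN, MAX and RANGE. One way such grouping could be modelled is sketched below. The enum `Stat` is a hypothetical stand-in for the library's Statistic type, and the grouping logic is an assumption for illustration, not the jaitools implementation.

```java
import java.util.EnumSet;
import java.util.Set;

/** Illustrative model of statistics that are set as a group (hypothetical). */
class StatisticGroupsSketch {
    /** Hypothetical stand-in for the library's Statistic enum. */
    enum Stat { MEAN, SDEV, VARIANCE, MIN, MAX, RANGE, MEDIAN, APPROX_MEDIAN }

    // Statistics described in the source as being calculated together.
    private static final Set<Stat> MOMENTS = EnumSet.of(Stat.MEAN, Stat.SDEV, Stat.VARIANCE);
    private static final Set<Stat> EXTREMA = EnumSet.of(Stat.MIN, Stat.MAX, Stat.RANGE);

    private final Set<Stat> active = EnumSet.noneOf(Stat.class);

    /** Setting one member of a group sets the whole group. */
    void setStatistic(Stat stat) {
        if (MOMENTS.contains(stat)) {
            active.addAll(MOMENTS);
        } else if (EXTREMA.contains(stat)) {
            active.addAll(EXTREMA);
        } else {
            active.add(stat);
        }
    }

    boolean isSet(Stat stat) {
        return active.contains(stat);
    }

    /** Returns a copy so callers cannot modify the internal set. */
    Set<Stat> getStatistics() {
        return EnumSet.copyOf(active);
    }

    public static void main(String[] args) {
        StatisticGroupsSketch s = new StatisticGroupsSketch();
        s.setStatistic(Stat.MEAN);
        System.out.println("SDEV set: " + s.isSet(Stat.SDEV));
        System.out.println("MIN set: " + s.isSet(Stat.MIN));
    }
}
```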

public Double getStatisticValue(Statistic stat)
Gets the current value of a running statistic.
Parameters:
stat - the statistic
Throws:
IllegalStateException - if stat was not previously set

public long getNumAccepted(Statistic stat)
Gets the number of sample values that have been accepted for the specified Statistic.
Note that different statistics might have been set at different times in the sampling process.
Parameters:
stat - the statistic
Throws:
IllegalArgumentException - if the statistic hasn't been set

public long getNumOffered(Statistic stat)
Gets the number of sample values that have been offered for the specified Statistic. This might be higher than the value returned by getNumAccepted(Statistic) due to nulls, Double.NaNs and excluded values in the data stream.
Note that different statistics might have been set at different times in the sampling process.
Parameters:
stat - the statistic
Throws:
IllegalArgumentException - if the statistic hasn't been set

public long getNumNaN(Statistic stat)
Gets the number of NaN values that have been offered.
Parameters:
stat - the statistic
Throws:
IllegalArgumentException - if the statistic hasn't been set

public long getNumNoData(Statistic stat)
Gets the number of NoData values (including NaN) that have been offered.
Parameters:
stat - the statistic
Throws:
IllegalArgumentException - if the statistic hasn't been set

public void offer(Double sample)
Offers a sample value. Double.NaNs and nulls are excluded by default.
Parameters:
sample - the sample value

public void offer(Double[] samples)
Offers an array of sample values.
Parameters:
samples - the sample values

public Map<Statistic,Double> getStatisticValues()
Gets the values of all statistics calculated by this sampler.
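
The relationship between the offered, accepted, NaN and NoData counts described above can be sketched with a small counter class. Everything here is hypothetical: the class `SampleCountersSketch`, the single NoData marker value, and in particular the choice to count a null only as offered (the source does not say whether nulls are also tallied as NoData).

```java
/** Illustrative tally of offered vs. accepted samples (hypothetical, not jaitools code). */
class SampleCountersSketch {
    private final double noDataValue;  // single NoData marker, assumed for this sketch
    private long offered;
    private long accepted;
    private long nan;
    private long noData;

    SampleCountersSketch(double noDataValue) {
        this.noDataValue = noDataValue;
    }

    void offer(Double sample) {
        offered++;
        if (sample == null) {
            return;                    // assumption: null is offered but not tallied further
        }
        if (sample.isNaN()) {
            nan++;
            noData++;                  // the NoData count includes NaN, per the method summary
            return;
        }
        if (sample == noDataValue) {   // unboxed numeric comparison
            noData++;
            return;
        }
        accepted++;
    }

    long getNumOffered() { return offered; }
    long getNumAccepted() { return accepted; }
    long getNumNaN() { return nan; }
    long getNumNoData() { return noData; }

    public static void main(String[] args) {
        SampleCountersSketch c = new SampleCountersSketch(-9999.0);
        Double[] samples = {1.0, null, Double.NaN, -9999.0, 2.0};
        for (Double s : samples) {
            c.offer(s);
        }
        System.out.println(String.format("offered=%d accepted=%d NaN=%d NoData=%d",
                c.getNumOffered(), c.getNumAccepted(), c.getNumNaN(), c.getNumNoData()));
    }
}
```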