public class GaussianDistribution extends AbstractDistribution implements ExponentialFamily
The family of normal distributions is closed under linear transformations.
That is, if X is normally distributed, then a linear transform aX + b
(for some real numbers a ≠ 0 and b) is also normally distributed.
If X1, X2 are two independent normal random variables, then their linear
combination will also be normally distributed. The converse is also true: if
X1 and X2 are independent and their sum X1 + X2 is distributed normally,
then both X1 and X2 must also be normal, which is known as Cramér's theorem.
Of all probability distributions over the real domain with mean μ
and variance σ², the normal distribution N(μ, σ²) is the one with the
maximum entropy.

The central limit theorem states that under certain, fairly common conditions,
the sum of a large number of random variables will have an approximately normal
distribution. For example, if X1, …, Xn is a sequence of iid random variables,
each having mean μ and variance σ² but otherwise arbitrary distributions,
then the central limit theorem states that

√n ((1/n) Σ Xi − μ) → N(0, σ²) in distribution.

The theorem will hold even if the summands Xi are not iid,
although some constraints on the degree of dependence and the growth rate
of moments still have to be imposed.

Therefore, certain other distributions can be approximated by the normal
distribution, for example:

- B(n, p) is approximately normal N(np, np(1-p)) for large n and for p not too close to zero or one.
- Poisson(λ) is approximately normal N(λ, λ) for large values of λ.
- χ²(k) is approximately normal N(k, 2k) for large k.
- t(ν) is approximately normal N(0, 1) when ν is large.
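
The central limit statement above can be checked numerically. The following is a minimal sketch, not part of this class: the name `CltSketch`, the choice of Uniform(0, 1) summands (μ = 1/2, σ² = 1/12), and the seed are all assumptions made for illustration. It simulates √n ((1/n) Σ Xi − μ) many times and reports the empirical mean and variance, which should be close to 0 and σ².

```java
import java.util.Random;

/** Illustrative sketch only; not part of the documented library. */
final class CltSketch {
    /**
     * Simulates z = sqrt(n) * (sampleMean - mu) for Uniform(0, 1) summands
     * and returns {empirical mean of z, empirical variance of z}.
     */
    static double[] simulate(int n, int trials, long seed) {
        Random rng = new Random(seed);
        double mu = 0.5;                 // mean of Uniform(0, 1)
        double sum = 0.0;
        double sumSq = 0.0;
        for (int t = 0; t < trials; t++) {
            double s = 0.0;
            for (int i = 0; i < n; i++) {
                s += rng.nextDouble();   // one iid summand Xi
            }
            double z = Math.sqrt(n) * (s / n - mu);
            sum += z;
            sumSq += z * z;
        }
        double mean = sum / trials;
        double variance = sumSq / trials - mean * mean;
        return new double[] {mean, variance};
    }

    public static void main(String[] args) {
        double[] r = simulate(100, 20000, 42L);
        System.out.printf("mean = %.4f, variance = %.4f, sigma^2 = %.4f%n",
                r[0], r[1], 1.0 / 12.0);
    }
}
```

The summands here are far from normal, yet the scaled sample mean already has variance close to σ² = 1/12 for n = 100.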
| Modifier and Type | Field and Description |
|---|---|
| double | mu: The mean. |
| double | sigma: The standard deviation. |
| Constructor and Description |
|---|
| GaussianDistribution(double mu, double sigma): Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| double | cdf(double x): Cumulative distribution function. |
| double | entropy(): Shannon entropy of the distribution. |
| static GaussianDistribution | fit(double[] data): Estimates the distribution parameters by MLE. |
| static GaussianDistribution | getInstance() |
| int | length(): The number of parameters of the distribution. |
| double | logp(double x): The density at x in log scale, which may prevent the underflow problem. |
| Mixture.Component | M(double[] x, double[] posteriori): The M step in the EM algorithm, which depends on the specific distribution. |
| double | mean(): The mean of the distribution. |
| double | p(double x): The probability density function for a continuous distribution or the probability mass function for a discrete distribution at x. |
| double | quantile(double p): The quantile; the probability to the left of quantile(p) is p. |
| double | rand(): Uses the Box-Muller algorithm to transform Random.random() values into Gaussian deviates. |
| double | randInverseCDF(): Uses the inverse CDF method to generate a Gaussian deviate. |
| double | sd(): The standard deviation of the distribution. |
| java.lang.String | toString() |
| double | variance(): The variance of the distribution. |
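
To make the descriptions of logp, entropy, and rand above concrete, here is a minimal self-contained sketch of the underlying formulas. It is not the library's implementation: the class name `GaussianDensitySketch` and its static methods are hypothetical, and `java.util.Random` stands in for whatever uniform source the library uses. It shows the Gaussian log density, the density obtained by exponentiating it, the closed-form entropy ½ ln(2πeσ²), and one common form of the Box-Muller transform.

```java
import java.util.Random;

/** Illustrative sketch only; not the library's implementation. */
final class GaussianDensitySketch {
    static final double LOG_SQRT_2PI = 0.5 * Math.log(2.0 * Math.PI);

    /** Log density of N(mu, sigma^2) at x; stays finite far in the tails. */
    static double logp(double x, double mu, double sigma) {
        double z = (x - mu) / sigma;
        return -0.5 * z * z - Math.log(sigma) - LOG_SQRT_2PI;
    }

    /** Density of N(mu, sigma^2) at x; underflows to 0.0 far in the tails. */
    static double p(double x, double mu, double sigma) {
        return Math.exp(logp(x, mu, sigma));
    }

    /** Differential entropy of N(mu, sigma^2), which does not depend on mu. */
    static double entropy(double sigma) {
        return 0.5 * Math.log(2.0 * Math.PI * Math.E * sigma * sigma);
    }

    /** Box-Muller: maps two independent uniforms to one Gaussian deviate. */
    static double boxMuller(Random rng, double mu, double sigma) {
        double u1 = 1.0 - rng.nextDouble();  // in (0, 1], so log(u1) is finite
        double u2 = rng.nextDouble();
        double z = Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math.PI * u2);
        return mu + sigma * z;
    }

    public static void main(String[] args) {
        System.out.println("p(40)    = " + p(40.0, 0.0, 1.0));
        System.out.println("logp(40) = " + logp(40.0, 0.0, 1.0));
        System.out.println("entropy  = " + entropy(1.0));
        System.out.println("deviate  = " + boxMuller(new Random(1L), 0.0, 1.0));
    }
}
```

For a standard normal, the density at x = 40 is too small to represent as a double and evaluates to 0.0, while the log density remains an ordinary finite number. This is the underflow problem that working in log scale avoids.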
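Methods inherited from class AbstractDistribution: inverseTransformSampling, quantile, quantile, rejection
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Distribution: likelihood, logLikelihood, rand

public final double mu
public final double sigma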
public GaussianDistribution(double mu, double sigma)
Parameters:
mu - mean.
sigma - standard deviation.

public static GaussianDistribution fit(double[] data)
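
As a rough illustration of what maximum likelihood estimation means for a Gaussian, the sketch below computes the textbook estimates: the sample mean, and the standard deviation with divisor n. The class `GaussianMleSketch` and its return format are hypothetical. This page does not state whether the library's fit divides by n or by n − 1, so treat the divisor as an assumption.

```java
/** Illustrative sketch only; not the library's fit implementation. */
final class GaussianMleSketch {
    /** Returns {mu, sigma}, the textbook maximum likelihood estimates. */
    static double[] fit(double[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("data must not be empty");
        }
        double sum = 0.0;
        for (double x : data) {
            sum += x;
        }
        double mu = sum / data.length;

        double squares = 0.0;
        for (double x : data) {
            double d = x - mu;
            squares += d * d;
        }
        // Assumption: divisor n (the MLE), not the unbiased n - 1.
        double sigma = Math.sqrt(squares / data.length);
        return new double[] {mu, sigma};
    }

    public static void main(String[] args) {
        double[] est = fit(new double[] {2, 4, 4, 4, 5, 5, 7, 9});
        System.out.println("mu = " + est[0] + ", sigma = " + est[1]);
    }
}
```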
public static GaussianDistribution getInstance()

public int length()
Specified by: length in interface Distribution

public double mean()
Specified by: mean in interface Distribution

public double variance()
Specified by: variance in interface Distribution

public double sd()
Specified by: sd in interface Distribution

public double entropy()
Specified by: entropy in interface Distribution

public java.lang.String toString()
Overrides: toString in class java.lang.Object

public double rand()
Specified by: rand in interface Distribution

public double randInverseCDF()

public double p(double x)
Specified by: p in interface Distribution

public double logp(double x)
Specified by: logp in interface Distribution

public double cdf(double x)
Specified by: cdf in interface Distribution

public double quantile(double p)
Specified by: quantile in interface Distribution

public Mixture.Component M(double[] x, double[] posteriori)
Specified by: M in interface ExponentialFamily
Parameters:
x - the input data for estimation
posteriori - the posteriori probability.
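
For a Gaussian component, the M step of EM is usually a weighted maximum likelihood update, with the posteriori probabilities acting as weights. The sketch below shows that standard update. It is not the library's code: `GaussianMStepSketch` is a hypothetical name, and it returns a plain array {total weight, mu, sigma} rather than a Mixture.Component, whose structure is not described on this page.

```java
/** Illustrative sketch only; not the library's M implementation. */
final class GaussianMStepSketch {
    /**
     * Weighted Gaussian update. Returns {sum of weights, mu, sigma}, where
     * the weights are the posteriori probabilities of each sample.
     */
    static double[] mStep(double[] x, double[] posteriori) {
        if (x.length != posteriori.length) {
            throw new IllegalArgumentException("x and posteriori differ in length");
        }
        double weight = 0.0;
        double mean = 0.0;
        for (int i = 0; i < x.length; i++) {
            weight += posteriori[i];
            mean += posteriori[i] * x[i];
        }
        if (weight <= 0.0) {
            throw new IllegalArgumentException("posteriori weights sum to zero");
        }
        mean /= weight;

        double variance = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - mean;
            variance += posteriori[i] * d * d;
        }
        variance /= weight;
        return new double[] {weight, mean, Math.sqrt(variance)};
    }

    public static void main(String[] args) {
        double[] r = mStep(new double[] {1, 2, 3, 4}, new double[] {1, 1, 0, 0});
        System.out.println("weight = " + r[0] + ", mu = " + r[1] + ", sigma = " + r[2]);
    }
}
```

With weights {1, 1, 0, 0} only the first two samples contribute, so the update reduces to the ordinary estimates for the values 1 and 2. The total weight is typically what a mixture model uses to update the component's mixing proportion.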