# Central Limit Theorem

## The Central Limit Theorem

Gaussian (normal) distributions are so important because they describe the probability distributions of most measurements made in the natural world. This phenomenon can be attributed to the central limit theorem.

### Definition

Let $x_1, x_2, \dots, x_n$ be a set of $n$ independent random variables drawn from some distribution that need not be Gaussian, with each $x_i$ distributed around a mean value $\mu_i$ with a finite variance $\sigma_i^2$. Then the variable

$z \equiv \frac{\sum_{i=1}^{n} (x_i - \mu_i)}{\sqrt{\sum_{i=1}^{n} \sigma_i^2}}$

in the limit of $n \to \infty$, will have a Gaussian distribution with zero mean ($\mu = 0$) and unit variance ($\sigma^2 = 1$).
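This convergence is easy to check numerically. The sketch below (the choice of an exponential distribution and the specific sample sizes are illustrative assumptions, not part of the theorem) draws many sets of $n$ decidedly non-Gaussian samples and standardizes each sum exactly as in the definition of $z$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Each trial draws n samples from an exponential distribution
# (mean mu_i = 1, variance sigma_i^2 = 1), which is strongly skewed.
n, trials = 1000, 100_000
x = rng.exponential(scale=1.0, size=(trials, n))

# Standardize each trial's sum as in the definition of z:
# subtract the sum of the means, divide by the root of the summed variances.
z = (x.sum(axis=1) - n * 1.0) / np.sqrt(n * 1.0)

# z should be approximately standard normal: mean near 0, variance near 1.
print(z.mean(), z.var())
```

A histogram of `z` would look Gaussian even though each individual $x_i$ is drawn from a skewed distribution.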

It turns out that the Central Limit Theorem can also hold when the random variables are each drawn from a different, not-necessarily-Gaussian distribution, but only under the additional (not unreasonable) condition that at least one higher-order moment of the distribution of the ensemble converges to zero.

### Implications

There are two big implications of the Central Limit Theorem:

1. Ensembles of many random processes/variables converge to Gaussian distributions. That’s why normal distributions are everywhere.
2. When adding together independent random numbers, the variance of the sum is the sum of the variances of those numbers.
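The additivity of variances can be verified directly; here is a minimal sketch, where the particular distributions (a uniform and an exponential) are arbitrary choices made for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

# Two independent, non-Gaussian random variables with known variances.
a = rng.uniform(0.0, 1.0, size=N)       # variance of Uniform(0,1) is 1/12
b = rng.exponential(2.0, size=N)        # variance of Exp(scale=2) is 4

# The sample variance of the sum should be close to 1/12 + 4.
s = a + b
print(s.var(), 1/12 + 4.0)
```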

Statement 2 is important. It means that, if you are averaging a bunch of samples drawn from the same distribution (e.g. they are all measurements with the same random error):

$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

then the standard deviation of $\bar x$ (which you’d see if you computed this average with new random samples over and over again) decreases as the square root of the number of samples averaged:

$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}}$

That’s why you get a better estimate of a quantity by making lots of (independent) measurements, but the uncertainty only shrinks as the square root of the number of measurements.
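The square-root scaling can be seen by repeating the $n$-sample average many times and measuring its spread. A minimal sketch, assuming Gaussian per-sample noise with $\sigma_x = 2$ (both arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_x = 2.0      # per-sample standard deviation (assumed known here)
trials = 50_000    # number of independent repetitions of the average

# For each n, compute the n-sample mean over many trials and take its
# standard deviation; it should track sigma_x / sqrt(n).
stds = {n: rng.normal(0.0, sigma_x, size=(trials, n)).mean(axis=1).std()
        for n in (4, 16, 64)}

for n, s in stds.items():
    print(n, s, sigma_x / np.sqrt(n))
```

Quadrupling the number of samples averaged only halves the spread of the estimate, exactly as the $\sqrt{n}$ in the denominator predicts.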