The Gaussian distribution — 2: Frankenstein Monsters

In Part 1, we introduced the Gaussian distribution as naturally arising from the mixing of a large number of independent random variables. Random mixing has an averaging effect that can be described by a sequence of approximations. As a first-order approximation, random mixing reduces the size of fluctuations; asymptotically, the sample average of iid random variables converges to the mean. If we zoom into the small deviations of the mixture around its mean for a second-order approximation, we discover that the deviations converge to a Gaussian distribution regardless of the distribution of the underlying random variables. This is the famous Central Limit Theorem (CLT).
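To make the two approximations concrete, here is a minimal simulation sketch (assuming NumPy is available; the exponential inputs, sample sizes, and seed are illustrative choices of mine, not anything from Part 1). It averages iid exponential variables, watches the fluctuations of the average shrink, and then zooms in by a factor of \sqrt{n} to see the Gaussian shape emerge:

```python
import numpy as np

rng = np.random.default_rng(0)

# Mix n iid exponential(1) variables (mean 1, variance 1) per trial.
n, trials = 1_000, 20_000
samples = rng.exponential(scale=1.0, size=(trials, n))

# First-order effect: the sample average concentrates around the mean 1.0,
# with fluctuations of size ~ 1/sqrt(n), i.e. about 0.032 here.
averages = samples.mean(axis=1)
print("spread of averages:", averages.std())

# Second-order effect: rescale the deviations by sqrt(n); they look
# standard Gaussian even though the inputs are exponential (the CLT).
deviations = np.sqrt(n) * (averages - 1.0)
print("mean ~ 0:", deviations.mean(), "| std ~ 1:", deviations.std())
```

A histogram of `deviations` plotted against the standard Gaussian density makes the convergence visually obvious.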

Once you go Gaussian… In addition to its ubiquity in nature, the popularity of the Gaussian distribution can also be attributed to its many very nice mathematical properties, which make it attractive to work with. First and most important, linear mixtures of independent Gaussian random variables are also Gaussian. From our previous reasoning that led up to the CLT, this should be unsurprising.

Consider a sequence of iid standard Gaussian random variables collected into a column vector \underline{Z} \doteq [Z_1, Z_2 \dots Z_M]^T, and a sequence of derived random variables \underline{X} = A \underline{Z}, where A is an N \times M matrix. Thus \underline{X} \doteq [X_1, X_2 \dots X_N]^T is a sequence of random variables that are linear combinations of the Z_i's.
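As a quick sanity check of this construction (a sketch assuming NumPy; the particular mixing matrix A below is an arbitrary illustrative choice), we can draw the iid Z_i's, mix them through A, and compare the empirical covariance of \underline{X} with the theoretical value A A^T:

```python
import numpy as np

rng = np.random.default_rng(0)

M, N, trials = 3, 2, 100_000
A = np.array([[1.0, -1.0, 0.5],
              [2.0,  1.0, 0.0]])        # arbitrary N x M mixing matrix

Z = rng.standard_normal((M, trials))    # rows: iid standard Gaussians Z_1..Z_M
X = A @ Z                               # each X_i is a linear mix of the Z_j's

# The X_i's are individually Gaussian but correlated through the shared Z_j's;
# since Cov(Z) = I, the covariance of X is A A^T.
print("empirical covariance of X:\n", np.cov(X))
print("theoretical covariance A A^T:\n", A @ A.T)
```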

Natural Gaussians. By the previous reasoning, the X_i's are all Gaussian. Furthermore, all linear mixtures of the X_i's are also Gaussian. Note, however, that the X_i's are not in general independent of each other; they depend on each other through a shared dependence on one or more of the underlying independent variables Z_i.

We call such random variables X_i multi-variate Gaussian or jointly Gaussian, but they could also reasonably be called natural Gaussians. Indeed, there is a multi-variable version of the Central Limit Theorem which shows that a set of multi-causal random variables, each an aggregate of a large number of independent random variables, converges to a multi-variable generalization of the single-variable Gaussian distribution regardless of the underlying distributions.

But calling these natural Gaussians raises the question: is there another kind?

Gaussians with… Unnatural Dependencies. Sometimes the nice mathematical properties of the Gaussian distribution can lead to… annoying mathematical corner cases. We can take two iid standard Gaussian rvs Z_1, Z_2 and create nice little baby Gaussians, e.g. X_1 = Z_1 - Z_2,~X_2 = 2 Z_1 + Z_2. These are the nice, natural kind of Gaussians we discussed above.

But we can also create weird Frankenstein monster Gaussians like V_1 = Z_1 \mathrm{sgn}(Z_2), where V_1 is equal to \pm Z_1 depending on the sign of Z_2. Randomly flipping the sign of a zero-mean Gaussian still yields a Gaussian rv because of the even symmetry of the Gaussian density function. The new rv V_1 is uncorrelated with Z_1, but certainly not independent of it. And the mixture V_1 + Z_1 is very far from Gaussian: whenever Z_2 is negative we have V_1 = -Z_1, so the sum is exactly zero with 50% probability!
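All three claims are easy to check numerically. Here is a small sketch (assuming NumPy and SciPy are available; the sample size and the Kolmogorov–Smirnov normality check are my own illustrative choices): V_1 is marginally Gaussian, uncorrelated with Z_1 yet strongly dependent on it, and V_1 + Z_1 puts half its probability mass exactly at zero:

```python
import numpy as np
from scipy import stats  # used only for a quick normality check

rng = np.random.default_rng(0)
n = 100_000

Z1 = rng.standard_normal(n)
Z2 = rng.standard_normal(n)
V1 = Z1 * np.sign(Z2)    # flip the sign of Z1 whenever Z2 < 0

# V1 is marginally standard Gaussian (even symmetry of the density):
print("KS test vs N(0,1), p-value:", stats.kstest(V1, "norm").pvalue)

# V1 is uncorrelated with Z1, yet clearly not independent of it
# (their squares are identical, hence perfectly correlated):
print("corr(V1, Z1):    ", np.corrcoef(V1, Z1)[0, 1])         # ~ 0
print("corr(V1^2, Z1^2):", np.corrcoef(V1**2, Z1**2)[0, 1])   # = 1

# The mixture V1 + Z1 is far from Gaussian: exactly 0 whenever Z2 < 0.
S = V1 + Z1
print("P(S == 0):", np.mean(S == 0.0))   # ~ 0.5
```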

While there are numerous ways to cook up weird Gaussian random variables like these, they almost never appear in the wild.

If we limit ourselves to the nice kind of multi-variate Gaussians, we can apply some very nice and intuitive geometric ideas to them. That’s a topic for Part 3.
