Independence of Normals

A well known fact about joint normally distributed random variables is that they are independent if and only if their covariance is zero. In one direction, this statement is trivial: any independent pair of random variables has zero covariance (assuming that they are integrable, so that the covariance is well defined). The strength of the statement is in the other direction. Knowing the value of the covariance tells us very little about the joint distribution in general so, in the case that the variables are joint normal, the fact that we can determine independence from it is a rather strong statement.

Theorem 1 A joint normal pair of random variables are independent if and only if their covariance is zero.

Proof: Suppose that X,Y are joint normal, such that ${X\overset d= N(\mu_X,\sigma^2_X)}$ and ${Y\overset d=N(\mu_Y,\sigma_Y^2)}$, and that their covariance is c. Then, the characteristic function of ${(X,Y)}$ can be computed as

 \displaystyle \begin{aligned} {\mathbb E}\left[e^{iaX+ibY}\right] &=e^{ia\mu_X+ib\mu_Y-\frac12(a^2\sigma_X^2+2abc+b^2\sigma_Y^2)}\\ &=e^{-abc}{\mathbb E}\left[e^{iaX}\right]{\mathbb E}\left[e^{ibY}\right] \end{aligned}

for all ${(a,b)\in{\mathbb R}^2}$. It is standard that the joint characteristic function of a pair of random variables is equal to the product of their characteristic functions if and only if they are independent which, in this case, corresponds to the covariance c being zero. ⬜
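As a numerical sanity check (not part of the post itself), the factorization used in the proof can be verified by quadrature for one specific bivariate normal; the values of a, b and the covariance c below are arbitrary choices of mine.

```python
import numpy as np
from scipy import integrate

# Bivariate normal with zero means, unit variances and covariance c.
c, a, b = 0.6, 0.9, -1.3

def density(x, y):
    det = 1.0 - c * c
    return np.exp(-(x * x - 2 * c * x * y + y * y) / (2 * det)) / (2 * np.pi * np.sqrt(det))

def cf(part):
    # Real or imaginary part of E[exp(iaX + ibY)], computed by quadrature.
    f = np.cos if part == "re" else np.sin
    val, _ = integrate.dblquad(lambda y, x: f(a * x + b * y) * density(x, y),
                               -8, 8, -8, 8)
    return val

joint = cf("re") + 1j * cf("im")
# Closed form from the proof: exp(-(a^2 + 2abc + b^2)/2), which factors
# as exp(-abc) * exp(-a^2/2) * exp(-b^2/2).
closed = np.exp(-0.5 * (a * a + 2 * a * b * c + b * b))
factored = np.exp(-a * b * c) * np.exp(-0.5 * a * a) * np.exp(-0.5 * b * b)
print(abs(joint - closed), abs(closed - factored))
```

In particular, the exponential factor `exp(-abc)` is identically one for all a, b exactly when c is zero, which is the independence criterion of the theorem.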

To demonstrate the necessity of the joint normality condition, consider the example from the previous post.

Example 1 A pair of standard normal random variables X,Y with zero covariance such that ${X+Y}$ is not normal.

As their sum is not normal, X and Y cannot be independent. This example was constructed by setting ${Y={\rm sgn}(\lvert X\rvert -K)X}$ for some fixed ${K > 0}$, which is standard normal whenever X is. As explained in the previous post, the intermediate value theorem ensures that there is a unique value of K making the covariance ${{\mathbb E}[XY]}$ equal to zero.
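The value of K can be found numerically (a sketch, assuming scipy is available; the helper names are my own). By symmetry of ${\varphi}$, the covariance reduces to ${1-4\int_0^Kx^2\varphi(x)\,dx}$, which decreases continuously from 1 to -1, so the root is located by bracketing:

```python
import numpy as np
from scipy import integrate, optimize

def phi(x):
    return np.exp(-x * x / 2) / np.sqrt(2 * np.pi)

def cov(K):
    # E[XY] with Y = sgn(|X| - K) X; by symmetry of phi this equals
    # 1 - 4 * integral_0^K x^2 phi(x) dx.
    val, _ = integrate.quad(lambda x: x * x * phi(x), 0, K)
    return 1 - 4 * val

# cov is continuous with cov(0) = 1 and cov(K) -> -1 as K -> infinity,
# so the intermediate value theorem guarantees a unique root.
K = optimize.brentq(cov, 0.1, 5.0)
print(K, cov(K))
# Note: X + Y equals 2X on {|X| > K} and 0 on {|X| < K}, so it has an
# atom at zero and cannot be normal.
```

The root lands near 1.54, and the comment at the end makes the non-normality of ${X+Y}$ concrete: it takes the value 0 with positive probability.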

Multivariate Normal Distributions

I looked at normal random variables in an earlier post, but what does it mean for a sequence of real-valued random variables ${X_1,X_2,\ldots,X_n}$ to be jointly normal? We could simply require each of them to be normal, but this says very little about their joint distribution and is not much help in handling expressions involving more than one of the ${X_i}$ at once. In the case that the random variables are independent, the following result is a very useful property of the normal distribution. All random variables in this post will be real-valued, except where stated otherwise, and we assume that they are defined with respect to some underlying probability space ${(\Omega,\mathcal F,{\mathbb P})}$.

Lemma 1 Linear combinations of independent normal random variables are again normal.

Proof: More precisely, if ${X_1,\ldots,X_n}$ is a sequence of independent normal random variables and ${a_1,\ldots,a_n}$ are real numbers, then ${Y=a_1X_1+\cdots+a_nX_n}$ is normal. Let us suppose that ${X_k}$ has mean ${\mu_k}$ and variance ${\sigma_k^2}$. Then, the characteristic function of Y can be computed using the independence property and the characteristic functions of the individual normals,

 \displaystyle \begin{aligned} {\mathbb E}\left[e^{i\lambda Y}\right] &={\mathbb E}\left[\prod_ke^{i\lambda a_k X_k}\right] =\prod_k{\mathbb E}\left[e^{i\lambda a_k X_k}\right]\\ &=\prod_ke^{-\frac12\lambda^2a_k^2\sigma_k^2+i\lambda a_k\mu_k} =e^{-\frac12\lambda^2\sigma^2+i\lambda\mu} \end{aligned}

where we have set ${\mu=\sum_ka_k\mu_k}$ and ${\sigma^2=\sum_ka_k^2\sigma_k^2}$. This is the characteristic function of a normal random variable with mean ${\mu}$ and variance ${\sigma^2}$. ⬜
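A quick Monte Carlo illustration of the lemma (my own addition, with arbitrary coefficients): the sample mean and variance of a linear combination of independent normals should match ${\mu=\sum_ka_k\mu_k}$ and ${\sigma^2=\sum_ka_k^2\sigma_k^2}$.

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0, 0.5])      # means mu_k
sigma = np.array([1.0, 0.5, 2.0])    # standard deviations sigma_k
a = np.array([2.0, -1.0, 3.0])       # coefficients a_k

# Y = a_1 X_1 + ... + a_n X_n for independent X_k ~ N(mu_k, sigma_k^2).
n = 1_000_000
X = rng.normal(mu, sigma, size=(n, 3))
Y = X @ a

mu_Y = a @ mu                        # sum_k a_k mu_k
var_Y = (a * a) @ (sigma * sigma)    # sum_k a_k^2 sigma_k^2
print(Y.mean(), mu_Y, Y.var(), var_Y)
```

This only checks the first two moments, of course; the characteristic function computation in the proof is what pins down normality itself.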

The definition of joint normal random variables will include the case of independent normals, so that any linear combination is also normal. We use this result as the defining property for the general multivariate normal case.

Definition 2 A collection ${\{X_i\}_{i\in I}}$ of real-valued random variables is multivariate normal (or joint normal) if and only if all of its finite linear combinations are normal.
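A standard way to build examples satisfying Definition 2 is to apply a linear map to independent standard normals; this sketch (my own, with a hypothetical matrix `A`) checks the variance of one linear combination against the theoretical value.

```python
import numpy as np

rng = np.random.default_rng(1)
# Any matrix A maps iid standard normals Z to a joint normal vector X = A Z:
# a linear combination b.X equals (A^T b).Z, which is a linear combination
# of the independent Z_k and hence normal by Lemma 1.
A = np.array([[1.0, 0.0],
              [0.6, 0.8]])
Z = rng.standard_normal((1_000_000, 2))
X = Z @ A.T

b = np.array([2.0, -1.0])
comb = X @ b
# Since the Z_k have unit variance, Var(b.X) = |A^T b|^2.
print(comb.var(), (A.T @ b) @ (A.T @ b))
```

Note that the components of X constructed this way are correlated, yet every linear combination remains normal, which is exactly the content of the definition.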

Manipulating the Normal Distribution

The normal (or Gaussian) distribution is ubiquitous throughout probability theory for various reasons, including the central limit theorem, the fact that it is realistic for many practical applications, and because it satisfies nice properties making it amenable to mathematical manipulation. It is, therefore, one of the first continuous distributions that students encounter at school. As such, it is not something that I have spent much time discussing on this blog, which is usually concerned with more advanced topics. However, there are many nice properties and methods that can be performed with normal distributions, greatly simplifying the manipulation of expressions in which it is involved. While it is usually possible to ignore these, and instead just substitute in the density function and manipulate the resulting integrals, that approach can get very messy. So, I will describe some of the basic results and ideas that I use frequently.

Throughout, I assume the existence of an underlying probability space ${(\Omega,\mathcal F,{\mathbb P})}$. Recall that a real-valued random variable X has the standard normal distribution if it has a probability density function given by,

 $\displaystyle \varphi(x)=\frac1{\sqrt{2\pi}}e^{-\frac{x^2}2}.$

For it to function as a probability density, it is necessary that it integrates to one. While it is not obvious that the normalization factor ${1/\sqrt{2\pi}}$ is the correct value for this to hold, it is the one fact that I state here without proof; Wikipedia lists a couple of proofs which can be referred to. By symmetry, ${-X}$ and ${X}$ have the same distribution, so they have the same mean and, therefore, ${{\mathbb E}[X]=0}$.

The derivative of the density function satisfies the useful identity

 $\displaystyle \varphi^\prime(x)=-x\varphi(x).$ (1)

This allows us to quickly verify that standard normal variables have unit variance, by an application of integration by parts.

 \displaystyle \begin{aligned} {\mathbb E}[X^2] &=\int x^2\varphi(x)dx\\ &= -\int x\varphi^\prime(x)dx\\ &=\int\varphi(x)dx-[x\varphi(x)]_{-\infty}^\infty=1 \end{aligned}
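Both the normalization and the unit variance are easy to confirm numerically (a quick check of my own, assuming scipy):

```python
import numpy as np
from scipy import integrate

# Standard normal density phi.
phi = lambda x: np.exp(-x * x / 2) / np.sqrt(2 * np.pi)

# Normalization: integral of phi over the real line should be 1.
total, _ = integrate.quad(phi, -np.inf, np.inf)
# Second moment: E[X^2] should also be 1, as derived above.
second, _ = integrate.quad(lambda x: x * x * phi(x), -np.inf, np.inf)
print(total, second)
```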

Another identity satisfied by the normal density function is,

 $\displaystyle \varphi(x+y)=e^{-xy - \frac{y^2}2}\varphi(x)$ (2)
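Identities (1) and (2) are both one-line computations with the exponential, and can be spot-checked at arbitrary points (my own sketch; x and y below are arbitrary):

```python
import numpy as np

phi = lambda x: np.exp(-x * x / 2) / np.sqrt(2 * np.pi)

x, y, h = 0.7, -1.2, 1e-6
# Identity (1): phi'(x) = -x * phi(x), checked via a central difference.
deriv = (phi(x + h) - phi(x - h)) / (2 * h)
print(deriv, -x * phi(x))

# Identity (2): phi(x + y) = exp(-x*y - y^2/2) * phi(x).
print(phi(x + y), np.exp(-x * y - y * y / 2) * phi(x))
```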

This enables us to prove the following very useful result. In fact, it is difficult to overstate how helpful this result can be. I make use of it frequently when manipulating expressions involving normal variables, as it significantly simplifies the calculations. It is also easy to remember, and simple to derive if needed.

Theorem 1 Let X be standard normal and ${f\colon{\mathbb R}\rightarrow{\mathbb R}_+}$ be measurable. Then, for all ${\lambda\in{\mathbb R}}$,

 \displaystyle \begin{aligned} {\mathbb E}[e^{\lambda X}f(X)] &={\mathbb E}[e^{\lambda X}]{\mathbb E}[f(X+\lambda)]\\ &=e^{\frac{\lambda^2}{2}}{\mathbb E}[f(X+\lambda)]. \end{aligned} (3)