Multivariate Normal Distributions

I looked at normal random variables in an earlier post, but what does it mean for a sequence of real-valued random variables {X_1,X_2,\ldots,X_n} to be jointly normal? We could simply require each of them to be normal, but this says very little about their joint distribution and is not much help in handling expressions involving more than one of the {X_i} at once. In the case where the random variables are independent, the following result is a very useful property of the normal distribution. All random variables in this post will be real-valued, except where stated otherwise, and we assume that they are defined with respect to some underlying probability space {(\Omega,\mathcal F,{\mathbb P})}.

Lemma 1 Linear combinations of independent normal random variables are again normal.

Proof: More precisely, if {X_1,\ldots,X_n} is a sequence of independent normal random variables and {a_1,\ldots,a_n} are real numbers, then {Y=a_1X_1+\cdots+a_nX_n} is normal. Let us suppose that {X_k} has mean {\mu_k} and variance {\sigma_k^2}. Then, the characteristic function of Y can be computed using the independence property and the characteristic functions of the individual normals,

\displaystyle  \begin{aligned} {\mathbb E}\left[e^{i\lambda Y}\right] &={\mathbb E}\left[\prod_ke^{i\lambda a_k X_k}\right] =\prod_k{\mathbb E}\left[e^{i\lambda a_k X_k}\right]\\ &=\prod_ke^{-\frac12\lambda^2a_k^2\sigma_k^2+i\lambda a_k\mu_k} =e^{-\frac12\lambda^2\sigma^2+i\lambda\mu} \end{aligned}

where we have set {\mu=\sum_ka_k\mu_k} and {\sigma^2=\sum_ka_k^2\sigma_k^2}. This is the characteristic function of a normal random variable with mean {\mu} and variance {\sigma^2}. ⬜
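As a rough numerical sanity check of Lemma 1, the sketch below compares the empirical characteristic function of a simulated linear combination with the closed form from the proof. The coefficients, means, standard deviations, sample size and helper names are arbitrary choices for illustration, not taken from the post, and a Monte Carlo comparison is only evidence, not a proof.

```python
import cmath
import random


def combination_params(a, mu, sigma):
    """Mean and variance of Y = sum_k a_k X_k for independent X_k ~ N(mu_k, sigma_k^2)."""
    mean = sum(ak * mk for ak, mk in zip(a, mu))
    var = sum((ak * sk) ** 2 for ak, sk in zip(a, sigma))
    return mean, var


def normal_cf(lam, mean, var):
    """Characteristic function of N(mean, var) evaluated at lam."""
    return cmath.exp(-0.5 * lam * lam * var + 1j * lam * mean)


def empirical_cf(lam, samples):
    """Sample average of exp(i * lam * Y)."""
    return sum(cmath.exp(1j * lam * y) for y in samples) / len(samples)


def sample_combination(a, mu, sigma, n, seed=0):
    """Draw n samples of Y = sum_k a_k X_k with independent normal X_k."""
    rng = random.Random(seed)
    return [
        sum(ak * rng.gauss(mk, sk) for ak, mk, sk in zip(a, mu, sigma))
        for _ in range(n)
    ]


# Illustrative (assumed) parameters.
a = [2.0, -1.0, 0.5]
mu = [1.0, 0.0, -2.0]
sigma = [1.0, 2.0, 3.0]

mean, var = combination_params(a, mu, sigma)
samples = sample_combination(a, mu, sigma, 200_000)

lam = 0.5
cf_exact = normal_cf(lam, mean, var)
cf_sampled = empirical_cf(lam, samples)
print("mean, variance of Y:", mean, var)
print("exact cf:", cf_exact, "empirical cf:", cf_sampled)
```

The two characteristic function values should agree up to sampling error of order {1/\sqrt{n}}.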

The definition of joint normal random variables will include the case of independent normals, so that any linear combination is also normal. We use this result as the defining property for the general multivariate normal case.

Definition 2 A collection {\{X_i\}_{i\in I}} of real-valued random variables is multivariate normal (or joint normal) if and only if all of its finite linear combinations are normal.
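To see why Definition 2 asks for more than each variable being normal, here is a small simulation of a standard counterexample (not taken from the post): X is standard normal, S is an independent random sign, and Y = SX. Both X and Y are standard normal, yet X + Y equals zero with probability one half, so it is not normal and the pair is not joint normal. The function name, sample size and seed are assumptions made for illustration.

```python
import random


def sign_flip_pairs(n, seed=0):
    """Sample pairs (X, Y) with X ~ N(0, 1) and Y = S * X for an independent random sign S."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        s = 1.0 if rng.random() < 0.5 else -1.0
        pairs.append((x, s * x))
    return pairs


pairs = sign_flip_pairs(100_000)

# Y on its own looks standard normal: mean near 0 and variance near 1.
ys = [y for _, y in pairs]
y_mean = sum(ys) / len(ys)
y_var = sum((y - y_mean) ** 2 for y in ys) / (len(ys) - 1)

# The linear combination X + Y has an atom at zero, so it cannot be normal.
zero_fraction = sum(1 for x, y in pairs if x + y == 0.0) / len(pairs)

print("sample mean and variance of Y:", y_mean, y_var)
print("fraction of samples with X + Y == 0:", zero_fraction)
```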


Manipulating the Normal Distribution

The normal (or Gaussian) distribution is ubiquitous throughout probability theory for various reasons, including the central limit theorem, the fact that it is realistic for many practical applications, and because it satisfies nice properties making it amenable to mathematical manipulation. It is, therefore, one of the first continuous distributions that students encounter at school. As such, it is not something that I have spent much time discussing on this blog, which is usually concerned with more advanced topics. However, normal distributions have many nice properties, and there are methods that greatly simplify the manipulation of expressions in which they are involved. While it is usually possible to ignore these, and instead just substitute in the density function and manipulate the resulting integrals, that approach can get very messy. So, I will describe some of the basic results and ideas that I use frequently.

Throughout, I assume the existence of an underlying probability space {(\Omega,\mathcal F,{\mathbb P})}. Recall that a real-valued random variable X has the standard normal distribution if it has a probability density function given by,

\displaystyle  \varphi(x)=\frac1{\sqrt{2\pi}}e^{-\frac{x^2}2}.

For it to function as a probability density, it is necessary that it integrates to one. While it is not obvious that the normalization factor {1/\sqrt{2\pi}} is the correct value for this to be true, it is the one fact that I state here without proof. Wikipedia lists a couple of proofs for reference. By symmetry, {-X} and {X} have the same distribution, so that they have the same mean and, therefore, {{\mathbb E}[X]=0}.

The derivative of the density function satisfies the useful identity

\displaystyle  \varphi^\prime(x)=-x\varphi(x). (1)

This allows us to quickly verify that standard normal variables have unit variance, by an application of integration by parts.

\displaystyle  \begin{aligned} {\mathbb E}[X^2] &=\int x^2\varphi(x)dx\\ &= -\int x\varphi^\prime(x)dx\\ &=\int\varphi(x)dx-[x\varphi(x)]_{-\infty}^\infty=1 \end{aligned}
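The facts used so far (normalization, zero mean, identity (1) and unit variance) can be checked numerically. The following is a minimal sketch using a midpoint rule on a truncated interval; the helper names, the interval [-10, 10] and the step counts are assumptions chosen for illustration.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate(g, lo=-10.0, hi=10.0, n=20_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


total = integrate(phi)                            # should be close to 1
first = integrate(lambda x: x * phi(x))           # E[X], should be close to 0
second = integrate(lambda x: x * x * phi(x))      # E[X^2], should be close to 1


def phi_prime_numeric(x, h=1e-5):
    """Central finite-difference approximation of phi'(x)."""
    return (phi(x + h) - phi(x - h)) / (2.0 * h)


# Identity (1): phi'(x) = -x * phi(x), checked at a few arbitrary points.
identity_gap = max(
    abs(phi_prime_numeric(x) + x * phi(x)) for x in (-2.0, -0.5, 0.0, 1.0, 3.0)
)

print("integral of phi:", total)
print("E[X]:", first, "E[X^2]:", second)
print("largest gap in identity (1):", identity_gap)
```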

Another identity satisfied by the normal density function is,

\displaystyle  \varphi(x+y)=e^{-xy - \frac{y^2}2}\varphi(x) (2)

This enables us to prove the following very useful result. In fact, it is difficult to overstate how helpful this result can be. I make use of it frequently when manipulating expressions involving normal variables, as it significantly simplifies the calculations. It is also easy to remember, and simple to derive if needed.

Theorem 1 Let X be standard normal and {f\colon{\mathbb R}\rightarrow{\mathbb R}_+} be measurable. Then, for all {\lambda\in{\mathbb R}},

\displaystyle  \begin{aligned} {\mathbb E}[e^{\lambda X}f(X)] &={\mathbb E}[e^{\lambda X}]{\mathbb E}[f(X+\lambda)]\\ &=e^{\frac{\lambda^2}{2}}{\mathbb E}[f(X+\lambda)]. \end{aligned} (3)
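As an illustrative check of identity (2) and of (3), the sketch below takes the test function f(x) = x^2, for which the right-hand side of (3) has the closed form {e^{\lambda^2/2}(1+\lambda^2)} because X has zero mean and unit variance. The choice of f, the value of λ, the integration range and the helper names are assumptions made for this example only.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate(g, lo=-12.0, hi=12.0, n=24_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def f(x):
    """Nonnegative test function (an arbitrary illustrative choice)."""
    return x * x


lam = 0.7

# Identity (2): phi(x + y) = exp(-x*y - y^2/2) * phi(x), at an arbitrary point.
x0, y0 = 0.3, -1.2
identity2_lhs = phi(x0 + y0)
identity2_rhs = math.exp(-x0 * y0 - 0.5 * y0 * y0) * phi(x0)

# Left-hand side of (3): E[exp(lam * X) f(X)].
lhs = integrate(lambda x: math.exp(lam * x) * f(x) * phi(x))

# Right-hand side of (3): exp(lam^2 / 2) * E[f(X + lam)].
rhs = math.exp(0.5 * lam * lam) * integrate(lambda x: f(x + lam) * phi(x))

# Closed form for f(x) = x^2, since E[(X + lam)^2] = 1 + lam^2.
closed_form = math.exp(0.5 * lam * lam) * (1.0 + lam * lam)

print("identity (2):", identity2_lhs, identity2_rhs)
print("lhs:", lhs, "rhs:", rhs, "closed form:", closed_form)
```

The three values for (3) should agree to within the quadrature error.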
