Manipulating the Normal Distribution

The normal (or Gaussian) distribution is ubiquitous throughout probability theory for various reasons, including the central limit theorem, the fact that it is realistic for many practical applications, and because it satisfies nice properties making it amenable to mathematical manipulation. It is, therefore, one of the first continuous distributions that students encounter at school. As such, it is not something that I have spent much time discussing on this blog, which is usually concerned with more advanced topics. However, there are many nice properties and methods that can be performed with normal distributions, greatly simplifying the manipulation of expressions in which it is involved. While it is usually possible to ignore these, and instead just substitute in the density function and manipulate the resulting integrals, that approach can get very messy. So, I will describe some of the basic results and ideas that I use frequently.

Throughout, I assume the existence of an underlying probability space {(\Omega,\mathcal F,{\mathbb P})}. Recall that a real-valued random variable X has the standard normal distribution if it has a probability density function given by,

\displaystyle  \varphi(x)=\frac1{\sqrt{2\pi}}e^{-\frac{x^2}2}.

For it to function as a probability density, it must integrate to one. It is not obvious that the normalization factor {1/\sqrt{2\pi}} is the correct value for this to hold, and it is the one fact that I state here without proof; Wikipedia lists a couple of proofs for anyone wanting the details. By symmetry, {-X} and {X} have the same distribution, so that they have the same mean and, therefore, {{\mathbb E}[X]=0}.

The derivative of the density function satisfies the useful identity

\displaystyle  \varphi^\prime(x)=-x\varphi(x). (1)

This allows us to quickly verify that standard normal variables have unit variance, by an application of integration by parts.

\displaystyle  \begin{aligned} {\mathbb E}[X^2] &=\int x^2\varphi(x)dx\\ &= -\int x\varphi^\prime(x)dx\\ &=\int\varphi(x)dx-[x\varphi(x)]_{-\infty}^\infty=1 \end{aligned}
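
As a quick numerical sanity check of these facts (not a proof), the density, mean and second moment can be integrated with standard scientific Python. This is just a sketch assuming NumPy and SciPy are available; none of it is needed for the mathematics.

```python
# Check that phi integrates to one and that X has zero mean and unit variance.
import numpy as np
from scipy.integrate import quad

phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

total, _ = quad(phi, -np.inf, np.inf)                    # normalization
mean, _ = quad(lambda x: x * phi(x), -np.inf, np.inf)    # E[X]
var, _ = quad(lambda x: x**2 * phi(x), -np.inf, np.inf)  # E[X^2]
print(total, mean, var)  # approximately 1, 0, 1
```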

Another identity satisfied by the normal density function is,

\displaystyle  \varphi(x+y)=e^{-xy - \frac{y^2}2}\varphi(x) (2)

This enables us to prove the following very useful result. In fact, it is difficult to overstate how helpful this result can be. I make use of it frequently when manipulating expressions involving normal variables, as it significantly simplifies the calculations. It is also easy to remember, and simple to derive if needed.

Theorem 1 Let X be standard normal and {f\colon{\mathbb R}\rightarrow{\mathbb R}_+} be measurable. Then, for all {\lambda\in{\mathbb R}},

\displaystyle  \begin{aligned} {\mathbb E}[e^{\lambda X}f(X)] &={\mathbb E}[e^{\lambda X}]{\mathbb E}[f(X+\lambda)]\\ &=e^{\frac{\lambda^2}{2}}{\mathbb E}[f(X+\lambda)]. \end{aligned} (3)

Proof: Using identity (2), we can evaluate {{\mathbb E}[e^{\lambda X}f(X)]} as,

\displaystyle  \begin{aligned} \int e^{\lambda x}f(x)\varphi(x)dx &=\int f(x)e^{\frac12\lambda^2}\varphi(x-\lambda)dx\\ &=\int f(y+\lambda)e^{\frac12\lambda^2}\varphi(y)dy\\ &=e^{\frac12\lambda^2}{\mathbb E}[f(X+\lambda)]. \end{aligned}

Here, the substitution {x=y+\lambda} was used. This proves the second line of (3). For the first line, we put {f=1} in the above to obtain {{\mathbb E}[e^{\lambda X}]=e^{\frac12\lambda^2}}. ⬜
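
For readers who like to check identities like (3) numerically, here is a minimal sketch using SciPy quadrature. The test function f and the value of {\lambda} are arbitrary choices of mine, purely for illustration.

```python
# Verify E[e^{lam X} f(X)] = e^{lam^2/2} E[f(X + lam)] by quadrature.
import numpy as np
from scipy.integrate import quad

phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
f = lambda x: np.abs(np.sin(x)) + x**2   # arbitrary nonnegative test function
lam = 0.7                                # arbitrary shift

lhs, _ = quad(lambda x: np.exp(lam * x) * f(x) * phi(x), -np.inf, np.inf)
rhs, _ = quad(lambda x: f(x + lam) * phi(x), -np.inf, np.inf)
print(lhs, np.exp(lam**2 / 2) * rhs)  # should agree to quadrature precision
```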

The above result clearly applies more generally to complex-valued functions {f\colon{\mathbb R}\rightarrow{\mathbb C}}, by linearity. It just needs to be checked that {e^{\lambda X}f(X)} or, equivalently, {f(X+\lambda)} is integrable in order for the expressions to make sense.

Although (3) is simple enough as it is, I find it easier to understand as two separate statements. The first of these is that the expected value of the product of {e^{\lambda X}} and an arbitrary function of X equals the product of the expectations. We just need to remember to shift X by an amount {\lambda} inside the second expectation,

\displaystyle  {\mathbb E}[e^{\lambda X}f(X)]={\mathbb E}[e^{\lambda X}]{\mathbb E}[f(X+\lambda)].

The second statement is the identity {{\mathbb E}[e^{\lambda X}]=e^{\frac12\lambda^2}} which, being the moment generating function, completely determines the standard normal distribution just as well as its probability density. In fact, this expression generalizes to complex values of {\lambda}.

Theorem 2 Let X be a standard normal. Then, {\exp(\lambda X)} is integrable for all {\lambda\in{\mathbb C}} and,

\displaystyle  {\mathbb E}\left[e^{\lambda X}\right]=e^{\frac12\lambda^2}. (4)

Proof: For real {\lambda}, the result is given by theorem 1 and, in particular, it shows that {e^{\lambda X}} is integrable. For complex values, we have {\lvert e^{\lambda X}\rvert=e^{\Re(\lambda)X}} which, again, is integrable. By dominated convergence, the left hand side of (4) is differentiable with,

\displaystyle  \frac{d}{d\lambda}{\mathbb E}[e^{\lambda X}]={\mathbb E}[Xe^{\lambda X}].

As both sides of (4) are differentiable, and agree on {\lambda\in{\mathbb R}}, they agree everywhere by analytic continuation. ⬜
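
Identity (4) is also easy to check numerically for complex {\lambda}, integrating the real and imaginary parts separately. The value of {\lambda} below is an arbitrary choice.

```python
# Check E[e^{lam X}] = e^{lam^2/2} for a complex lam.
import numpy as np
from scipy.integrate import quad

phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
lam = 0.5 + 1.2j  # arbitrary complex parameter

re, _ = quad(lambda x: np.real(np.exp(lam * x)) * phi(x), -np.inf, np.inf)
im, _ = quad(lambda x: np.imag(np.exp(lam * x)) * phi(x), -np.inf, np.inf)
print(re + 1j * im, np.exp(lam**2 / 2))  # the two values should agree
```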

As characteristic functions are defined for all probability distributions, and they uniquely determine the distribution, a particularly common case of (4) is obtained by taking imaginary {\lambda = iu}. This gives the characteristic function of the standard normal,

\displaystyle  {\mathbb E}[e^{iuX}]=e^{-\frac12 u^2}.

Although this identity holds for all complex values of {u}, for the characteristic function {u} is usually taken to be real.

Another application of the moment generating function (4) is, as the name suggests, to generate the moments. We expand out the exponentials as power series,

\displaystyle  \begin{aligned} &{\mathbb E}[e^{\lambda X}]=\sum_{n=0}^\infty\frac{\lambda^n}{n!}{\mathbb E}[X^n],\\ &e^{\frac12\lambda^2}=\sum_{n=0}^\infty\frac{\lambda^{2n}}{2^nn!}. \end{aligned}

Comparing coefficients of powers of {\lambda} gives the moments.

Corollary 3 A standard normal variable X has odd moments equal to zero and even moments

\displaystyle  {\mathbb E}[X^{2n}]=\frac{(2n)!}{2^nn!}.
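
A short numerical check of corollary 3, comparing quadrature against the closed form (a sketch assuming NumPy/SciPy):

```python
# Compare E[X^{2n}] computed by quadrature with (2n)!/(2^n n!).
import numpy as np
from math import factorial
from scipy.integrate import quad

phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

for n in range(5):
    numeric, _ = quad(lambda x: x**(2 * n) * phi(x), -np.inf, np.inf)
    exact = factorial(2 * n) / (2**n * factorial(n))
    print(2 * n, numeric, exact)
```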

Now, let’s move on and consider normal distributions with arbitrary mean and variance. Let {\mu} and {\sigma\ge0} be real numbers. If Y is a standard normal variable, then {X\equiv\mu+\sigma Y} has the normal distribution with mean {\mu} and variance {\sigma^2}. We denote this distribution by {N(\mu,\sigma^2)}, and will write {X\overset d=N(\mu,\sigma^2)}. For {\sigma=0}, X is just equal to the constant value {\mu}; otherwise, for {\sigma > 0}, it has a continuous probability density.

Lemma 4 For {\sigma > 0}, the distribution {N(\mu,\sigma^2)} has probability density

\displaystyle  p(x) = \frac1{\sigma}\varphi\left(\frac{x-\mu}{\sigma}\right)=\frac1{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}

Proof: Write {X=\mu+\sigma Y} for standard normal Y. For nonnegative measurable {f\colon{\mathbb R}\rightarrow{\mathbb R}_+}, the expected value of {f(X)} is given by

\displaystyle  \begin{aligned} {\mathbb E}[f(\mu+\sigma Y)] &=\int f(\mu+\sigma y)\varphi(y)dy\\ &=\int f(x)\varphi(\sigma^{-1}(x-\mu))\sigma^{-1}dx \end{aligned}

as required. Here, I substituted in {y=\sigma^{-1}(x-\mu)}. ⬜

The manipulations above for standard normal random variables carry across to general normal distributions without much trouble.

Theorem 5 If X is normal then {e^{\lambda X}} is integrable for all {\lambda\in{\mathbb C}} and,

\displaystyle  {\mathbb E}[e^{\lambda X}] =e^{\lambda{\mathbb E}[X]+\frac12\lambda^2{\rm Var}(X)} (5)

for all {\lambda\in{\mathbb C}}.

Proof: When X is standard normal, this is just the same as (4). However, it remains true if we multiply X by a real value {\sigma}, since it is just the same as multiplying {\lambda} by {\sigma} on both sides of (5). Also, adding a real value {\mu} to X scales both sides of (5) by {e^{\lambda\mu}}, so it remains true and, hence, holds for all normal X. ⬜

This result should be very easy to remember. As a very naive, or first order, approximation, we might expect {{\mathbb E}[\exp(X)]} to be well approximated by {\exp({\mathbb E}[X])}. This cannot hold exactly, because of the convexity of the exponential and, instead, we just need to remember the adjustment term of half the variance,

\displaystyle  {\mathbb E}[\exp(X)]=\exp\left({\mathbb E}[X]+\frac12{\rm Var}(X)\right).

This is (5) for {\lambda=1} and the more general expression is obtained by scaling X by {\lambda}.
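
A Monte Carlo check of this identity; the parameters are arbitrary choices for the demonstration.

```python
# E[exp(X)] versus exp(E[X] + Var(X)/2) for a sampled normal.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.3, 1.5  # arbitrary mean and standard deviation
X = rng.normal(mu, sigma, size=10**7)

print(np.exp(X).mean())           # Monte Carlo estimate of E[exp(X)]
print(np.exp(mu + sigma**2 / 2))  # exp(E[X] + Var(X)/2)
```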

Theorem 5 implies the following simple characterisation of normal distributions.

Corollary 6 A real-valued random variable is normal if and only if its characteristic function is of the form {e^{q}} for a quadratic {q}.

Proof: First, if X is normal, then its characteristic function is of the stated form by theorem 5. Conversely, suppose that the characteristic function is of the stated form for a quadratic {q(u)=-\frac12au^2+ibu+c}. That is,

\displaystyle  {\mathbb E}[e^{iuX}]=\exp\left(-\frac12au^2+ibu+c\right).

Taking {u=0} gives {\exp(c)=1} and, hence, we can take {c=0}. Then, since flipping the sign of u replaces the left hand side by its complex conjugate, the same holds on the right hand side. From this, we see that a and b are both real. Moreover, as characteristic functions are bounded by 1 in absolute value, {e^{-\frac12au^2}\le1} for all real u, so that {a\ge0}. Then, we see from theorem 5 that this is the characteristic function of a normal with mean b and variance a and, hence, X is normal. ⬜

Theorem 1 also carries across in the same way to arbitrary normal random variables.

Theorem 7 Let X be normal and {f\colon{\mathbb R}\rightarrow{\mathbb R}_+} be measurable. Then,

\displaystyle  {\mathbb E}[e^{\lambda X}f(X)] ={\mathbb E}[e^{\lambda X}]{\mathbb E}\left[f(X+\lambda{\rm Var}(X))\right] (6)

for all {\lambda\in{\mathbb R}}.

Proof: For standard normal X, this is just (3). Then, it remains true if we multiply X by a real value {\sigma}, since this is the same as multiplying {\lambda} by {\sigma} and replacing {f(x)} by {f(\sigma x)} on both sides of (6). Similarly, it remains true if we add a real value {\mu} to X, as this is the same as multiplying both sides of (6) by {e^{\lambda\mu}} and replacing {f(x)} by {f(x+\mu)}. So, it holds for all normal X. ⬜

I actually find this result slightly easier to remember in a modified form. For two random variables X and Y with finite variances, their covariance is defined as

\displaystyle  \begin{aligned} {\rm Cov}(X,Y) &={\mathbb E}[(X-{\mathbb E} X)(Y-{\mathbb E} Y)]\\ &={\mathbb E}[XY]-{\mathbb E}[X]{\mathbb E}[Y]. \end{aligned}

Note that this is bilinear and symmetric in X and Y and that {{\rm Cov}(X,X)={\rm Var}(X)}.

Theorem 8 Let X be a normal random variable and {Y=aX+b} for some {a,b\in{\mathbb R}}. Then,

\displaystyle  {\mathbb E}\left[e^Yf(X)\right]={\mathbb E}\left[e^Y\right]{\mathbb E}\left[f\left(X+{\rm Cov}(X,Y)\right)\right] (7)

for all measurable functions {f\colon{\mathbb R}\rightarrow{\mathbb R}_+}.

Proof: This is immediate from (6) with {\lambda=a}, since {{\rm Cov}(X,Y)=a{\rm Var}(X)}. ⬜

So, the expectation of the product of {e^{Y}} and {f(X)} is equal to the product of their expectations, but we need to shift X by its covariance with Y. While it may not seem obvious, identity (7) certainly makes sense. If X and Y have positive covariance, then the term {e^Y} will tend to more highly weight larger values of X and apply a lower weight when X is small. So, if we take it out of the expectation then, to compensate, we should shift X by an amount that depends on their covariance. In fact, (7) holds much more generally, as it applies to any joint normal random variables X and Y. However, I am not covering joint normality in this post, so do not prove this here.
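
Here is a minimal Monte Carlo sketch of (7); again, the parameters and the choice of f are mine, purely for illustration.

```python
# Check E[e^Y f(X)] = E[e^Y] E[f(X + Cov(X, Y))] for Y = aX + b.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, a, b = 0.2, 0.8, 1.3, -0.5  # arbitrary parameters
X = rng.normal(mu, sigma, size=10**7)
Y = a * X + b
f = lambda x: np.maximum(x, 0.0)       # arbitrary nonnegative function
cov = a * sigma**2                     # Cov(X, Y)

lhs = (np.exp(Y) * f(X)).mean()
rhs = np.exp(Y).mean() * f(X + cov).mean()
print(lhs, rhs)  # equal up to Monte Carlo error
```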

I find theorem 8 more intuitive when understood in terms of changes of measure. For a nonnegative random variable Z with mean 1, we can use this as a weighting to define a new probability measure {{\mathbb Q}} on the same underlying measurable space,

\displaystyle  {\mathbb Q}(A)={\mathbb E}[1_AZ]

for all measurable sets A. Note that the probability of the whole space under the new measure is {{\mathbb Q}(\Omega)={\mathbb E}[Z]}, explaining why we require Z to have mean 1. I will write this as {d{\mathbb Q}=Zd{\mathbb P}}, where the weight Z is called the Radon-Nikodym derivative, and is alternatively written as {Z=d{\mathbb Q}/d{\mathbb P}}. Expectation with respect to this new measure will be denoted by {{\mathbb E}_{\mathbb Q}}, and satisfies

\displaystyle  {\mathbb E}_{\mathbb Q}[X]={\mathbb E}[XZ]

for all nonnegative random variables X. If instead we are given a nonnegative random variable Z whose mean is not necessarily equal to 1, but is nonzero and finite, then we can normalize by dividing through by its mean. This defines the new probability measure {d{\mathbb Q}=(Z/{\mathbb E}[Z])d{\mathbb P}}, which I denote by {d{\mathbb Q}\sim Zd{\mathbb P}} to avoid having to explicitly write out the normalization every time. Expectation with respect to {{\mathbb Q}} is then given by

\displaystyle  {\mathbb E}_{\mathbb Q}[X]={\mathbb E}[XZ]/{\mathbb E}[Z]

for nonnegative random variables X.
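
Numerically, such a change of measure is nothing more than a reweighting of samples. A minimal sketch, with a lognormal weight chosen purely for illustration:

```python
# E_Q[V] = E[VZ]/E[Z]: a change of measure as a weighted average.
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal(10**6)
Z = np.exp(X)  # nonnegative weight; dQ ~ Z dP

EQ = lambda V: (V * Z).mean() / Z.mean()
print(EQ(X))  # mean of X under Q; compare with theorem 9 below
```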

Normal random variables are not themselves nonnegative, so cannot be used directly for measure changes. However, their exponentials, known as lognormal random variables, are nonnegative. I now rewrite theorem 8 in terms of measure changes, which is my preferred form.

Theorem 9 Let X be a normal random variable and {Y=aX+b} for some {a,b\in{\mathbb R}}. Under the probability measure {d{\mathbb Q}\sim e^Yd{\mathbb P}}, X is normal with the same variance as under {{\mathbb P}} but with mean {{\mathbb E}[X]+{\rm Cov}(X,Y)}.

Proof: Set {\tilde X=X+{\rm Cov}(X,Y)}, which is normal with the same variance as X but with mean {{\mathbb E}[X]+{\rm Cov}(X,Y)}. For any measurable function {f\colon{\mathbb R}\rightarrow{\mathbb R}_+}, (7) gives

\displaystyle  {\mathbb E}_{\mathbb Q}[f(X)]={\mathbb E}[e^Yf(X)]/{\mathbb E}[e^Y]={\mathbb E}[f(\tilde X)]

as required. ⬜

So, when we apply lognormal measure changes, the original normal random variable remains normal with the same variance, but with a shifted mean.
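
Theorem 9 can be checked by computing weighted moments of simulated samples. A sketch with arbitrary parameter choices:

```python
# Under dQ ~ e^Y dP, X keeps its variance but its mean shifts by Cov(X, Y).
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, a, b = 0.1, 0.7, 0.9, 0.4  # arbitrary parameters
X = rng.normal(mu, sigma, size=10**7)
w = np.exp(a * X + b)
w /= w.mean()                          # normalized weights for Q

mean_Q = (X * w).mean()
var_Q = ((X - mean_Q)**2 * w).mean()
print(mean_Q, mu + a * sigma**2)  # shifted mean
print(var_Q, sigma**2)            # unchanged variance
```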

I now apply the ideas described above to the Black-Scholes formula for financial option pricing. I use {\Phi(x)} for the normal distribution function, which is defined by

\displaystyle  \Phi(x)={\mathbb P}(X < x ) = \int_{-\infty}^x\varphi(y)dy

for a standard normal X.

Example 1 (Black-Scholes formula) Suppose that S is lognormal with mean {F={\mathbb E}[S]} and nonzero log-variance {{\rm Var}(\log S)=\sigma^2}. Then, for all {K > 0},

\displaystyle  {\mathbb E}[(S-K)_+]=F\Phi(d_1)-K\Phi(d_2) (8)

where,

\displaystyle  \begin{aligned} &d_1=\sigma^{-1}\log(F/K)+\sigma/2\\ &d_2=\sigma^{-1}\log(F/K)-\sigma/2 \end{aligned}

While (8) is not difficult to prove by writing out the expectation as an integral, and directly applying changes of variables to this, it can get rather messy. Instead, start by expanding {(S-K)_+} into the difference {S1_{\{S > K\}} - K1_{\{S > K\}}} and using linearity of expectations,

\displaystyle  \begin{aligned} {\mathbb E}[(S-K)_+] &={\mathbb E}[S1_{\{S > K\}}]-K{\mathbb P}(S > K)\\ &=F{\mathbb Q}(S > K)-K{\mathbb P}(S > K), \end{aligned} (9)

where we substituted in the probability measure {d{\mathbb Q}\sim S d{\mathbb P}}. If we set {X = \log S} then, by definition, this is normal with variance {\sigma^2}. Also, applying (5),

\displaystyle  F={\mathbb E}[S]=e^{{\mathbb E}[X]+\frac12\sigma^2}

giving the mean of X as {{\mathbb E}[X]=\log F-\frac12\sigma^2}. Hence, {X=\log F-\sigma^2/2-\sigma Y} for a standard normal Y, and we obtain,

\displaystyle  {\mathbb P}(S > K)={\mathbb P}(\sigma Y < \log(F/K)-\sigma^2/2)=\Phi(d_2).

The final term on the right hand side of (9) then agrees with the final term on the right of (8).

The calculation for {{\mathbb Q}} is the same except that, now, theorem 9 says that {{\mathbb E}_{\mathbb Q}[S]=e^{\sigma^2}F}. Replacing {F} by {e^{\sigma^2}F} in the equality above gives,

\displaystyle  \begin{aligned} {\mathbb Q}(S > K) &={\mathbb Q}(\sigma Y < \log(F/K)-\sigma^2/2)\\ &={\mathbb P}(\sigma Y < \log(F/K)+\sigma^2/2)=\Phi(d_1). \end{aligned}

Substituting this into (9) gives (8) as claimed.
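
The formula is easily coded up and compared against a Monte Carlo estimate. This sketch assumes NumPy/SciPy; the function name black_call and the parameter values are my own choices.

```python
# Black-Scholes formula (8) for E[(S - K)_+], checked by simulation.
import numpy as np
from scipy.stats import norm

def black_call(F, K, sigma):
    d1 = np.log(F / K) / sigma + sigma / 2
    d2 = d1 - sigma
    return F * norm.cdf(d1) - K * norm.cdf(d2)

rng = np.random.default_rng(4)
F, K, sigma = 100.0, 95.0, 0.3
S = F * np.exp(sigma * rng.standard_normal(10**7) - sigma**2 / 2)
print(np.maximum(S - K, 0).mean())  # Monte Carlo
print(black_call(F, K, sigma))      # closed form
```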

The approach to the Black-Scholes formula here is entirely mathematical, involving the manipulations described above for normal variables. The method, including the use of a measure change, can also be described financially in terms of option cashflows. Suppose that we want to value a payout at some future time given in terms of a dollar amount, say V dollars. Under the forward dollar pricing measure, this is given by an expectation {{\mathbb E}[V]}. Now suppose that S represents the FX rate with respect to a foreign currency, such as the euro. That is, S is the future dollar value of one euro. Now consider the value of a future payout of V euros. This will have a value of {SV} dollars and, hence, we value it as {{\mathbb E}[SV]}. In particular, taking {V=1} gives the forward price {F={\mathbb E}[S]}, which is the number of dollars we would now agree to exchange, at the future date, for one euro.

Again consider a future payment of V euros. As mentioned above, the dollar value is {{\mathbb E}[SV]}. The euro value is given by dividing through by the forward F. This is just the same as the expectation {{\mathbb E}_{\mathbb Q}[V]}, so that {{\mathbb Q}} is the euro pricing measure. The inverted FX cross is just the value of one dollar in euros, or {S^{-1}}. The expected value of this in the euro measure is the number of euros we would agree to pay at the future date for one dollar which, to be consistent with the above, must be {F^{-1}}. This can be confirmed mathematically,

\displaystyle  {\mathbb E}_{{\mathbb Q}}[S^{-1}]={\mathbb E}[SS^{-1}]/{\mathbb E}[S]=F^{-1}.

Now consider a call option on the FX cross with strike K. This will pay us one euro in exchange for K dollars if, on the future date, we decide to exercise. Converted to dollars, this is an amount of {S-K} but, as we would only exercise the option if the payout is positive for us, we receive {(S-K)_+}. Hence, the Black-Scholes formula (8) gives the dollar value of this option.

Note that the following two are the same,

  1. Receive one euro and pay K dollars, if {S > K}.
  2. Receive one euro if {S^{-1} < K^{-1}}, and pay K dollars if {S> K}.

The first of these describes a call option of strike K. The second describes a binary put option on {S^{-1}} denominated in euros minus K binary call options on S denominated in dollars. So, these two setups have the same value, and identity (9) is just the mathematical expression of this. The dollar binary call has value {{\mathbb P}(S > K)} which, by the manipulations above has value {\Phi(d_2)}, whereas the euro binary put has value {{\mathbb Q}(S^{-1} < K^{-1})}.

It is well understood that FX options have the same volatility from the viewpoint of both foreign and domestic observers, even though they may be using different measures to express it. This is the first part of the statement of theorem 9 above. This means that {F^2S^{-1}} has the same mean and log-variance under the euro measure as {S} has under {{\mathbb P}}, so we have

\displaystyle  {\mathbb Q}(S^{-1} < K^{-1})={\mathbb P}(S < F^2/K) = 1 - {\mathbb P}(S > F^2/K).

As {\Phi(-d)=1-\Phi(d)}, we obtain {d_1} from {d_2} by replacing K with {F^2/K} and changing the sign.

Note that, in the process of deriving the Black-Scholes formula, we also obtained the following simple result.

Lemma 10 Let S be a lognormal random variable with mean {F={\mathbb E}[S]}. Then, {F^2/S} has the same distribution under the measure {d{\mathbb Q}\sim S d{\mathbb P}} as S has under {{\mathbb P}}.

Writing this out explicitly gives

\displaystyle  {\mathbb E}\left[Sf\left({\mathbb E}[S]^2/S\right)\right]={\mathbb E}[S]{\mathbb E}\left[f(S)\right]

for all lognormal random variables S and measurable {f\colon{\mathbb R}\rightarrow{\mathbb R}_+}.
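
A quick Monte Carlo check of this identity, with an arbitrary log-variance and test function:

```python
# Check E[S f(F^2/S)] = F E[f(S)] for lognormal S.
import numpy as np

rng = np.random.default_rng(5)
sigma = 0.4
S = np.exp(sigma * rng.standard_normal(10**7))  # lognormal, E[S] = e^{sigma^2/2}
F = np.exp(sigma**2 / 2)
f = np.sqrt                                     # arbitrary nonnegative function

print((S * f(F**2 / S)).mean())
print(F * f(S).mean())
```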

The mean of the absolute value and positive part of a standard normal random variable are straightforward to compute.

Lemma 11 Let X be standard normal. Then, {{\mathbb E}[X_+]=(2\pi)^{-1/2}} and {{\mathbb E}[\lvert X\rvert]=(2/\pi)^{1/2}}.

Proof: We apply identity (1) to evaluate {{\mathbb E}[X_+]},

\displaystyle  \int_0^\infty x\varphi(x)dx=-\int_0^\infty\varphi^\prime(x)dx = \varphi(0).

Then, by symmetry, {{\mathbb E}[\lvert X\rvert]=2{\mathbb E}[X_+]}. ⬜

Considering the Black-Scholes example again, sometimes traders use an approximate formula which is simple enough to be able to roughly value options in their head, without having to resort to a calculator. This applies to at-the-money options for which the strike K is close to the forward F, and where the log-variance is low enough that the lognormal distribution can be approximated by a normal. This is usually the case for options which are relatively close to their expiration date, implying a small variance.

Example 2 (Simplified Black-Scholes) Let S be normal with mean F and standard deviation {F\sigma}. Then,

\displaystyle  {\mathbb E}[(S-F)_+]=\frac{F\sigma}{\sqrt{2\pi}}\approx0.4 F\sigma.

Proof: We write {S=F(1+\sigma Y)} for a standard normal Y. So, by lemma 11,

\displaystyle  {\mathbb E}[(S-F)_+]=F\sigma{\mathbb E}[Y_+]=(2\pi)^{-1/2}F\sigma.

By direct calculation, {(2\pi)^{-1/2}\approx0.4} holds to within a relative error of 0.3%. ⬜

Moving on, there are also various expressions which help when looking at quadratic functions of normals. Recall that the gamma distribution with shape parameter {k > 0} (and unit rate parameter) is the nonnegative distribution with probability density proportional to {x^{k-1}e^{-x}} over {x > 0}.

Lemma 12 Let X be standard normal. Then, {\frac{1}{2}X^2} has the gamma distribution with shape parameter {1/2}. This has probability density

\displaystyle  p(y)=\pi^{-1/2}y^{-1/2}e^{-y}

over {y > 0}.

Proof: For any measurable function {f\colon{\mathbb R}\rightarrow{\mathbb R}_+}, compute the expectation of {f(X^2/2)} as,

\displaystyle  \begin{aligned} \frac{2}{\sqrt{2\pi}}\int_0^\infty f(x^2/2)e^{-\frac{x^2}2}dx &=\frac{2}{\sqrt{2\pi}}\int_0^\infty f(y)e^{-y}\frac{dy}{\sqrt{2y}}\\ &=\frac1{\sqrt\pi}\int_0^\infty f(y)y^{-1/2}e^{-y}dy, \end{aligned}

as required. Here, the substitution {y=x^2/2} was applied. ⬜
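
This distributional statement can also be checked with a Kolmogorov-Smirnov test on simulated samples, a sketch assuming SciPy:

```python
# X^2/2 for standard normal X should match a Gamma(1/2) distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
Y = rng.standard_normal(10**6)**2 / 2
print(stats.kstest(Y, stats.gamma(a=0.5).cdf))  # tiny KS statistic expected
```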

A consequence is that we can easily compute all moments of a standard normal, including the noninteger moments, in terms of the gamma function. Recall that this is defined by

\displaystyle  \Gamma(s)=\int_0^\infty x^{s-1}e^{-x}dx

over {\Re(s) > 0}.

Corollary 13 If X is standard normal then, for all {s\in{\mathbb C}} with {\Re(s) > -1},

\displaystyle  {\mathbb E}[\lvert X\rvert^s]=2^{s/2}\pi^{-1/2}\Gamma\left(\frac{s+1}{2}\right).

Proof: As {Y=X^2/2} has the gamma distribution with parameter 1/2, the expected value of {\lvert X\rvert^s} is

\displaystyle  {\mathbb E}[(2Y)^{s/2}] =2^{s/2}\pi^{-1/2}\int_0^\infty y^{s/2} y^{-1/2}e^{-y}dy

as required. ⬜
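
A numerical check of this moment formula at a noninteger exponent (an arbitrary choice of s):

```python
# E[|X|^s] versus 2^{s/2} Gamma((s+1)/2) / sqrt(pi).
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
s = 1.7  # arbitrary noninteger exponent

numeric, _ = quad(lambda x: np.abs(x)**s * phi(x), -np.inf, np.inf)
print(numeric, 2**(s / 2) * gamma((s + 1) / 2) / np.sqrt(np.pi))
```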

This result is not immediately obvious, even at {s=0} since, there, the moment is equal to one and the result is equivalent to the identity {\Gamma(1/2)=\sqrt\pi}. This is indeed satisfied by the gamma function. However, we have not stumbled upon a new way of proving this since, by a simple substitution, it can be seen to be equivalent to the fact that the density function {\varphi(x)} integrates to one, so that {\frac1{\sqrt{2\pi}}} is the correct normalization factor, which was assumed above.

For nonnegative even integer values {s=2n}, it is interesting to compare this with the moments given in corollary 3,

\displaystyle  {\mathbb E}[X^{2n}]=2^n\pi^{-1/2}\Gamma(n+1/2)=\frac{(2n)!}{2^nn!}

In particular, this can only be true if the gamma function at half-integer values satisfies

\displaystyle  \Gamma(n+1/2)=\frac{\pi^{1/2}(2n)!}{4^n n!}.

For {n=0}, this is the identity {\Gamma(1/2)=\sqrt{\pi}} discussed above and, for all positive integers n, it follows from the recurrence {\Gamma(x+1)=x\Gamma(x)} and induction.
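
The half-integer values of the gamma function are simple to confirm against SciPy's implementation:

```python
# Gamma(n + 1/2) versus sqrt(pi) (2n)! / (4^n n!).
from math import factorial, pi, sqrt
from scipy.special import gamma

for n in range(5):
    print(gamma(n + 0.5), sqrt(pi) * factorial(2 * n) / (4**n * factorial(n)))
```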

As another consequence of lemma 12, expectations involving a normal variable can always be expressed using a gamma distribution.

Corollary 14 Let X be standard normal. Then, for all measurable {f\colon{\mathbb R}\rightarrow{\mathbb R}_+},

\displaystyle  {\mathbb E}[f(X)]=\frac12{\mathbb E}\left[f(\sqrt{2Y})+f(-\sqrt{2Y})\right]

where {Y=X^2/2} has the gamma distribution with shape parameter 1/2.

Proof: By symmetry,

\displaystyle  \begin{aligned} &{\mathbb E}[1_{\{X > 0\}}f(X)]=\frac12{\mathbb E}[f(\lvert X\rvert)]=\frac12{\mathbb E}\left[f(\sqrt{2Y})\right],\\ &{\mathbb E}[1_{\{X < 0\}}f(X)]=\frac12{\mathbb E}[f(-\lvert X\rvert)]=\frac12{\mathbb E}\left[f(-\sqrt{2Y})\right], \end{aligned}

and, adding these together gives the result. ⬜

The normal density function also satisfies the following simple identity

\displaystyle  e^{-\frac12\lambda x^2}\varphi(x)=\varphi(\sqrt{1+\lambda}x), (10)

for all real {\lambda > -1}. We can use this to prove the following result, which is of a similar flavour to theorem 1, except that it involves the square of X.

Theorem 15 If X is standard normal then,

\displaystyle  {\mathbb E}[e^{-\frac12\lambda X^2}f(X)]=\frac1{\sqrt{1+\lambda}}{\mathbb E}\left[f\left(\frac X{\sqrt{1+\lambda}}\right)\right] (11)

for all measurable {f\colon{\mathbb R}\rightarrow{\mathbb R}_+} and {\lambda > -1}.

Proof: The expectation of {e^{-\frac12\lambda X^2}f(X)} can be computed using (10),

\displaystyle  \begin{aligned} \int e^{-\frac12\lambda x^2}f(x)\varphi(x)dx &=\int f(x)\varphi(\sqrt{1+\lambda}x)dx\\ &=\int f(y/\sqrt{1+\lambda})\frac{dy}{\sqrt{1+\lambda}} \end{aligned}

as required. Here, the substitution {y=\sqrt{1+\lambda}x} was used. ⬜
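
As before, (11) can be verified by quadrature; the test function and value of {\lambda} are arbitrary choices for the check.

```python
# Check E[e^{-lam X^2/2} f(X)] = E[f(X/sqrt(1+lam))] / sqrt(1+lam).
import numpy as np
from scipy.integrate import quad

phi = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
f = np.cosh  # arbitrary test function
lam = 0.6    # arbitrary parameter greater than -1

lhs, _ = quad(lambda x: np.exp(-lam * x**2 / 2) * f(x) * phi(x), -np.inf, np.inf)
rhs, _ = quad(lambda x: f(x / np.sqrt(1 + lam)) * phi(x), -np.inf, np.inf)
print(lhs, rhs / np.sqrt(1 + lam))
```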

The following result follows from theorem 15 in the same way that theorem 2 followed from theorem 1.

Theorem 16 If X is standard normal and {\lambda\in{\mathbb C}}, then {e^{-\frac12\lambda X^2}} is integrable if and only if {\Re(\lambda) > -1}, in which case

\displaystyle  {\mathbb E}[e^{-\frac12\lambda X^2}]=\frac1{\sqrt{1+\lambda}}. (12)

Proof: If {\Re(\lambda)\le-1} then

\displaystyle  \lvert e^{-\frac12\lambda x^2}\varphi(x)\rvert=e^{-\frac12\Re(\lambda)x^2}\varphi(x)\ge\frac1{\sqrt{2\pi}}

has infinite integral over the real numbers and, hence, {e^{-\frac12\lambda X^2}} is not integrable. On the other hand, for real {\lambda > -1}, taking {f=1} in (11) gives (12). For complex values of {\lambda}, we have to be careful which of the square roots to take in (12). We take the one with positive real part, which is standard and is complex differentiable. Hence, as in the proof of theorem 2, analytic continuation implies that (12) holds for all complex values of {\lambda} with real part greater than -1. ⬜

Using real values of {\lambda} gives the moment generating function of {X^2/2} and imaginary values gives the characteristic function. By lemma 12, these are the moment generating and characteristic function of the gamma distribution with shape parameter 1/2.

The remaining results given above for standard normals also carry across to the case with arbitrary mean and variance in a straightforward way. For example, theorem 16 extends as follows. This result is a bit less easy to remember than the others, so if needed, I would just derive it in the same way as done here.

Lemma 17 If X is normal with mean {\mu} and variance {\sigma^2} then, for {\lambda\in{\mathbb C}}, {e^{-\frac12\lambda X^2}} is integrable if and only if {\Re(\lambda) > -\sigma^{-2}}, in which case

\displaystyle  {\mathbb E}\left[e^{-\frac12\lambda X^2}\right]=\frac{1}{\sqrt{1+\lambda \sigma^2}}e^{-\frac12\lambda\mu^2/(1+\lambda\sigma^2)}. (13)

Proof: As {X=\mu+\sigma Y} for a standard normal Y then, for real {\lambda > -\sigma^{-2}}, the expected value of {e^{-\frac12\lambda X^2}} is given by,

\displaystyle  \begin{aligned} {\mathbb E}\left[e^{-\frac12\lambda\sigma^2Y^2}e^{-\lambda\sigma\mu Y-\frac12\lambda\mu^2}\right] &=\frac1{\sqrt{1+\lambda\sigma^2}}{\mathbb E}\left[e^{-\lambda\sigma\mu Y/\sqrt{1+\lambda\sigma^2}-\frac12\lambda\mu^2}\right]\\ &=\frac1{\sqrt{1+\lambda\sigma^2}}e^{\frac12\lambda^2\sigma^2\mu^2/(1+\lambda\sigma^2)-\frac12\lambda\mu^2} \end{aligned}

The first equality is applying (11) and the second is using (4). Rearranging gives (13). As previously, analytic continuation extends this to all {\Re(\lambda) > -\sigma^{-2}}. Letting {\lambda} decrease to {-\sigma^{-2}}, the right hand side of (13) increases to infinity, so {e^{-\frac12\lambda X^2}} is not integrable for {\lambda=-\sigma^{-2}}. Hence, it is not integrable for {\Re(\lambda)\le-\sigma^{-2}}. ⬜
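
A quadrature check of (13), with arbitrary parameters satisfying {\lambda > -\sigma^{-2}}:

```python
# Check E[e^{-lam X^2/2}] for X normal(mu, sigma^2) against formula (13).
import numpy as np
from scipy.integrate import quad

mu, sigma, lam = 0.4, 1.2, 0.9  # arbitrary, with lam > -1/sigma^2
p = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

numeric, _ = quad(lambda x: np.exp(-lam * x**2 / 2) * p(x), -np.inf, np.inf)
exact = np.exp(-lam * mu**2 / (2 * (1 + lam * sigma**2))) / np.sqrt(1 + lam * sigma**2)
print(numeric, exact)
```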

Theorem 9 showing that normal variables remain normal under a lognormal change of measure also extends to changes of measure involving the square of a normal.

Theorem 18 If X is normal then it remains normal under the measure given by {d{\mathbb Q}\sim e^{-\frac12\lambda X^2}d{\mathbb P}}, for any {\lambda\in{\mathbb R}} with {\lambda{\rm Var}(X) > -1}.

Proof: As {X=\mu+\sigma Y} for a standard normal Y, we have {d{\mathbb Q}\sim e^{-\frac12\lambda\sigma^2 Y^2-\lambda\sigma\mu Y}d{\mathbb P}}. By theorem 15, Y is normal under the measure {d\tilde{\mathbb P}\sim e^{-\frac12\lambda\sigma^2 Y^2}d{\mathbb P}} and then, by theorem 9, it is also normal under the measure {d{\mathbb Q}\sim e^{-\lambda\sigma\mu Y}d\tilde{\mathbb P}}. ⬜

As stated, this is a very easy result to remember. The exact distribution of X under the measure change requires also computing its mean and variance. For example, the variance of X under the measure {\tilde{\mathbb P}} is given by theorem 15 to be {\sigma^2/(1+\lambda\sigma^2)} and, as the measure change given by theorem 9 does not affect variances, it is the same under {{\mathbb Q}},

\displaystyle  {\rm Var}_{\mathbb Q}(X)=\frac{\sigma^2}{1+\lambda\sigma^2}. (14)

Alternatively, the moment generating function under {{\mathbb Q}} can be computed from (13). Using {\sim} to denote equality up to a scaling factor independent of u,

\displaystyle  \begin{aligned} {\mathbb E}_{\mathbb Q}[e^{-u\lambda X}] &\sim{\mathbb E}[e^{-u\lambda X-\frac12\lambda X^2}]\\ &=e^{\frac12\lambda u^2}{\mathbb E}[e^{-\frac12\lambda(X+u)^2}]\\ &\sim e^{\frac12\lambda u^2}e^{-\frac12\lambda(u+\mu)^2/(1+\lambda\sigma^2)} \end{aligned}

Noting that this is the exponential of a quadratic in u, we can read off the mean and variance from the coefficients of {u^2} and {u}. For example, the coefficient of {u} in the exponent on the right hand side is {-\lambda\mu/(1+\lambda\sigma^2)}, giving the mean of X under {{\mathbb Q}} as,

\displaystyle  {\mathbb E}_{\mathbb Q}[X]=\frac{\mu}{1+\lambda\sigma^2}. (15)

We see that, under the change of measure in theorem 18, both the mean and variance of X are divided by {1+\lambda\sigma^2}.
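
Finally, (14) and (15) can be confirmed by reweighting simulated samples, as in the earlier sketches:

```python
# Under dQ ~ e^{-lam X^2/2} dP, mean and variance are divided by 1 + lam sigma^2.
import numpy as np

rng = np.random.default_rng(7)
mu, sigma, lam = 0.8, 1.1, 0.5  # arbitrary parameters
X = rng.normal(mu, sigma, size=10**7)
w = np.exp(-lam * X**2 / 2)
w /= w.mean()

mean_Q = (X * w).mean()
var_Q = ((X - mean_Q)**2 * w).mean()
print(mean_Q, mu / (1 + lam * sigma**2))       # (15)
print(var_Q, sigma**2 / (1 + lam * sigma**2))  # (14)
```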
