Stochastic Calculus Notes

I have decided to use my blog to post some notes that I made on stochastic calculus when learning the subject myself. I wrote these after reading through some books which took an unnecessarily long and difficult route to reach the material I was interested in. Complicated and rather obscure subjects, such as optional and predictable projection and much of the theory of continuous-time martingales, were dealt with at length before getting round to the general theory of stochastic integration. Consequently, I decided to go through the theory myself in a more direct way, while still working out rigorous proofs of the more useful theorems. The result was three small notepads containing the following.

  • Basic definitions regarding continuous-time filtrations, adapted processes, predictable processes, stopping times, martingales, etc.
  • Some useful elementary results such as the debut theorem for right continuous processes and the existence of cadlag versions of martingales.
  • Definition of stochastic integration and elementary properties.
  • Definition and elementary properties of quadratic variations.
  • Itô’s formula, including the generalized Itô formula for non-continuous processes.
  • Stochastic integration with respect to martingales.
  • The Doob-Meyer decomposition.
  • Quasimartingale decompositions.
  • Decompositions of semimartingales.
  • Decompositions and integration with respect to special semimartingales.

This covers a lot of the general underlying theory required. Of course, applying it in practice requires further knowledge of topics such as stochastic differential equations. Time permitting, I’ll start to post these notes here. As I have learned much more since originally making these notes, I will attempt to simplify or improve on the originals where possible.

The prerequisite knowledge required to properly understand these notes is measure theoretic probability theory (e.g., properties of the Lebesgue integral such as dominated convergence, Fubini’s theorem, L^p spaces, convergence in probability, etc.).

Integrating with respect to Brownian motion

In this post I attempt to give a rigorous definition of integration with respect to Brownian motion (as introduced by Itô in 1944), while keeping it as concise as possible. The stochastic integral can also be defined for a much more general class of processes called semimartingales. However, as Brownian motion is such an important special case which can be handled directly, I start with this as the subject of this post. If {\{X_s\}_{s\ge 0}} is a standard Brownian motion defined on a probability space {(\Omega,\mathcal{F},{\mathbb P})} and {\alpha_s} is a stochastic process, the aim is to define the integral

\displaystyle  \int_0^t\alpha_s\,dX_s. (1)

In ordinary calculus, this can be approximated by Riemann sums, which converge for continuous integrands whenever the integrator {X} is of finite variation. This leads to the Riemann-Stieltjes integral and, generalizing to measurable integrands, the Lebesgue-Stieltjes integral. Unfortunately this method does not work for Brownian motion which, as discussed in my previous post, has infinite variation over all nontrivial compact intervals.
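The contrast between finite and infinite variation can be seen numerically. The following is a minimal sketch in Python, not taken from the original notes. It simulates a discretized Brownian path on {[0,1]} from independent Gaussian increments, then evaluates the sum of absolute increments and the sum of squared increments along successively finer partitions of the same path. The helper names `brownian_path` and `variation_sums`, the seed and the grid sizes are illustrative assumptions.

```python
import math
import random


def brownian_path(n_steps, t_end=1.0, seed=0):
    """Sample X at n_steps + 1 equally spaced times in [0, t_end], with X_0 = 0."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    path = [0.0]
    for _ in range(n_steps):
        # Increments over disjoint intervals are independent N(0, dt) variables.
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def variation_sums(path, stride=1):
    """Return (sum of |increments|, sum of squared increments) on a coarsened grid."""
    points = path[::stride]
    increments = [b - a for a, b in zip(points, points[1:])]
    return sum(abs(d) for d in increments), sum(d * d for d in increments)


path = brownian_path(2 ** 14)
for stride in (256, 16, 1):
    total, quadratic = variation_sums(path, stride)
    n_intervals = (len(path) - 1) // stride
    print(f"{n_intervals:6d} intervals: sum|dX| = {total:8.3f}, sum dX^2 = {quadratic:.4f}")
```

On a simulated path, the sum of absolute increments should keep growing as the partition is refined (roughly like the square root of the number of intervals), while the sum of squared increments stays close to the length of the time interval.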

The standard approach is to start by writing out the integral explicitly for piecewise constant integrands. If there are times {0=t_0\le t_1\le\cdots\le t_n=t} such that {\alpha_s=\alpha_{t_k}} for each {s\in(t_{k-1},t_k)} then the integral is given by the summation,

\displaystyle  \int_0^t\alpha\,dX = \sum_{k=1}^n\alpha_{t_k}(X_{t_k}-X_{t_{k-1}}). (2)
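Formula (2) is just a finite sum, and can be written out directly. The following Python sketch is an illustration only: the function name `simple_integral` and the sample values of {X} are hypothetical, and `alpha_values[k-1]` is assumed to hold the constant value of the integrand on the k-th interval.

```python
def simple_integral(alpha_values, x_values):
    """Evaluate sum_k alpha_{t_k} (X_{t_k} - X_{t_{k-1}}) as in formula (2).

    alpha_values[k-1] is the constant value of the integrand on the k-th interval;
    x_values[k] is the value of X at time t_k, for k = 0, ..., n.
    """
    if len(x_values) != len(alpha_values) + 1:
        raise ValueError("need one more value of X than of alpha")
    increments = [b - a for a, b in zip(x_values, x_values[1:])]
    return sum(a * dx for a, dx in zip(alpha_values, increments))


# Hypothetical observed values of X at times t_0 < t_1 < t_2 < t_3.
x_values = [0.0, 0.4, -0.1, 0.3]
# With alpha identically 1 the sum telescopes to X_t - X_0.
print(simple_integral([1.0, 1.0, 1.0], x_values))
# A piecewise constant integrand taking the values 2, 0, -1 on the three intervals.
print(simple_integral([2.0, 0.0, -1.0], x_values))
```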

We could try to extend to more general integrands by approximating by piecewise constant processes but, as mentioned above, Brownian motion has infinite variation paths and this will diverge in general.

Fortunately, when working with random processes, there are a couple of observations which improve the chances of being able to consistently define the integral. They are

  • The integral is not a single real number, but is instead a random variable defined on the probability space. It therefore only has to be defined up to a set of zero probability and not on every possible path of {X}.
  • Rather than requiring limits of integrals to converge for each path of {X} (e.g., dominated convergence), the much weaker convergence in probability can be used.

These observations are still not enough, and the main insight is to only look at integrands which are adapted. That is, the value of {\alpha_t} can only depend on {X} through its values at prior times. This condition is met in most situations where we need to use stochastic calculus, such as with (forward) stochastic differential equations. To make this rigorous, for each time {t\ge 0} let {\mathcal{F}_t} be the sigma-algebra generated by {X_s} for all {s\le t}. This is a filtration ({\mathcal{F}_s\subseteq\mathcal{F}_t} for {s\le t}), and {(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{t\ge 0},{\mathbb P})} is referred to as a filtered probability space. Then, {\alpha} is adapted if {\alpha_t} is {\mathcal{F}_t}-measurable for all times {t}. Piecewise constant and left-continuous processes, such as {\alpha} in (2), which are also adapted are commonly referred to as simple processes.
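To see why adaptedness matters, consider approximating the integrand {\alpha=X} by piecewise constant processes. Taking the value of {X} at the start of each interval gives an adapted approximation, whereas taking the value at the end of each interval does not. The Python sketch below is an illustration under assumptions (a discretized path on {[0,1]}, hypothetical helper names, an arbitrary seed) and compares the two resulting sums.

```python
import math
import random


def brownian_path(n_steps, t_end=1.0, seed=1):
    """Sample X at n_steps + 1 equally spaced times in [0, t_end], with X_0 = 0."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path


def endpoint_sums(path):
    """Return (left-endpoint sum, right-endpoint sum, sum of squared increments)."""
    left = right = quadratic = 0.0
    for x_prev, x_next in zip(path, path[1:]):
        dx = x_next - x_prev
        left += x_prev * dx    # adapted: uses X at the start of the interval
        right += x_next * dx   # not adapted: uses X at the end of the interval
        quadratic += dx * dx
    return left, right, quadratic


path = brownian_path(10_000)
left, right, quadratic = endpoint_sums(path)
x_t = path[-1]
print("left-endpoint sum: ", left)
print("right-endpoint sum:", right)
print("difference:        ", right - left)
print("(X_t^2 - t)/2:     ", (x_t ** 2 - 1.0) / 2)
```

The difference between the two sums is exactly the sum of squared increments, which stays close to {t} rather than vanishing as the partition is refined. The two choices therefore have different limits, and only the adapted one is consistent with the Itô integral, for which {\int_0^tX\,dX=(X_t^2-t)/2}.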

However, as with standard Lebesgue integration, we must further impose a measurability property. A stochastic process {\alpha} can be viewed as a map from the product space {{\mathbb R}_+\times\Omega} to the real numbers, given by {(t,\omega)\mapsto\alpha_t(\omega)}. It is said to be jointly measurable if it is measurable with respect to the product sigma-algebra {\mathcal{B}({\mathbb R}_+)\otimes\mathcal{F}}, where {\mathcal{B}} refers to the Borel sigma-algebra. Finally, it is called progressively measurable, or just progressive, if its restriction to {[0,t]\times\Omega} is {\mathcal{B}([0,t])\otimes\mathcal{F}_t}-measurable for each positive time {t}. It is easily shown that progressively measurable processes are adapted, and the simple processes introduced above are progressive.

With these definitions, the stochastic integral of a progressively measurable process {\alpha} with respect to Brownian motion {X} is defined whenever {\int_0^t\alpha^2\,ds<\infty} almost surely (that is, with probability one). The integral (1) is a random variable, defined uniquely up to sets of zero probability by the following two properties.

  • The integral agrees with the explicit formula (2) for simple integrands.
  • If {\alpha^n} and {\alpha} are progressive processes such that {\int_0^t(\alpha^n-\alpha)^2\,ds} tends to zero in probability as {n\rightarrow\infty}, then
    \displaystyle  \int_0^t\alpha^n\,dX\rightarrow\int_0^t\alpha\,dX, (3)

    where, again, convergence is in probability.
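One standard way to see why a continuity property like (3) is plausible is the following calculation, given here as a sketch under extra assumptions rather than as the argument used in these notes. Suppose that {\alpha} is a bounded simple process as in (2), with each {\alpha_{t_k}} determined by the path of {X} up to time {t_{k-1}}. The increment {X_{t_k}-X_{t_{k-1}}} is then independent of {\alpha_{t_k}} and of all earlier terms, with mean zero and variance {t_k-t_{k-1}}, so the cross terms vanish.

```latex
\begin{aligned}
{\mathbb E}\left[\left(\int_0^t\alpha\,dX\right)^2\right]
&= \sum_{j,k=1}^n{\mathbb E}\left[\alpha_{t_j}\alpha_{t_k}(X_{t_j}-X_{t_{j-1}})(X_{t_k}-X_{t_{k-1}})\right]\\
&= \sum_{k=1}^n{\mathbb E}\left[\alpha_{t_k}^2\right](t_k-t_{k-1})
= {\mathbb E}\left[\int_0^t\alpha_s^2\,ds\right].
\end{aligned}
```

For such integrands, closeness of the integrands in mean square therefore gives closeness of the integrals in mean square. Property (3) is stated with the weaker notion of convergence in probability, which also covers integrands that are not square integrable.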


The Pathological Properties of Brownian Motion

I turn away with fear and horror from the lamentable plague of continuous functions which do not have derivatives – Charles Hermite (1893)

Brownian motion

Despite being of central importance to the theory of stochastic processes and to many applications in areas such as physics and economics, Brownian motion has some nasty properties such as being nowhere differentiable, which are in stark contrast to the usual well-behaved functions studied in elementary differential calculus. As I intend to post entries on stochastic calculus, it seems that a good place to start is by describing some of the properties of Brownian motion which rule out the use of the standard techniques of differential calculus. Strictly speaking, these properties should not really be regarded as pathological although they can seem so to someone not familiar with such processes and would have been regarded as such at the time of Hermite’s statement above.

Historically, the term ‘Brownian motion’ refers to the experiments performed by Robert Brown in 1827, in which pollen and dust particles floating on the surface of water were observed to move about with a jittery motion. This was explained mathematically by Albert Einstein in 1905 and Marian Smoluchowski in 1906, and is caused by the particles being continuously bombarded by water molecules. Louis Bachelier also studied the mathematical properties of Brownian motion in 1900, applying it to the evolution of stock prices.

Mathematically, Brownian motion is a stochastic process with continuous sample paths whose increments over disjoint time intervals of equal length are independent and identically distributed random variables. In the case of the random motion of particles due to collisions with water molecules, as in the experiments performed by Robert Brown, each bombardment by a molecule will not produce a sudden change in the position of the particle. Instead, it will produce a sudden change in the particle’s velocity. So mathematical Brownian motion as described here is better used as a model of the velocity of the particle rather than its position (even better, the velocity can be modeled by an Ornstein-Uhlenbeck process). More generally, it is used as a source of random noise in many models of physical and economic systems. It is also referred to as a Wiener process after Norbert Wiener and often represented using a capital W.
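As a rough illustration of the remark about velocities, here is a minimal Python sketch (not from the original post) of an Euler-Maruyama discretization of an Ornstein-Uhlenbeck process {dV=-\theta V\,dt+\sigma\,dW}, driven by independent Gaussian increments of a Brownian motion {W}. The function name, the parameters `theta` and `sigma`, and the values used are all assumptions chosen for the example.

```python
import math
import random


def ornstein_uhlenbeck_path(v0, theta, sigma, n_steps, t_end=1.0, seed=0):
    """Euler-Maruyama approximation of dV = -theta * V dt + sigma dW on [0, t_end]."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    path = [v0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        v = path[-1]
        path.append(v - theta * v * dt + sigma * dw)
    return path


velocity = ornstein_uhlenbeck_path(v0=1.0, theta=2.0, sigma=0.5, n_steps=1000)
print("final velocity:", velocity[-1])
```

Setting `sigma=0` removes the noise and leaves a deterministic decay towards zero, which is a convenient sanity check on the discretization.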

The Gaussian Correlation Conjecture 2


Update: This conjecture has now been solved! See “A simple proof of the Gaussian correlation conjecture extended to multivariate gamma distributions” by T. Royen, and “Royen’s proof of the Gaussian correlation inequality” by Rafał Latała and Dariusz Matlak.


We continue investigating the Gaussian correlation conjecture in this post. This states that if {\mu_n} is the standard Gaussian measure on {{\mathbb R}^n} then

\displaystyle  \mu_n(A\cap B)\ge \mu_n(A)\mu_n(B) (1)

for all symmetric and convex sets {A,B}. In this entry, we consider a stronger ‘local’ version of the conjecture, which has the advantage that it can be approached using differential calculus. Inequality (1) can alternatively be stated in terms of integrals,

\displaystyle  \mu_n(fg)\ge\mu_n(f)\mu_n(g). (2)

Here, {\mu_n(f)} denotes the integral {\int f\,d\mu_n}. This is clearly equivalent to (1) when {f,g} are indicator functions of convex symmetric sets. More generally, using linearity, it extends to all nonnegative functions such that {f^{-1}([a,\infty))} and {g^{-1}([a,\infty))} are symmetric and convex subsets of {{\mathbb R}^n} for positive {a}. A class of functions lying between these two extremes, which I consider here, are the log-concave functions.
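Inequality (1) can be checked numerically in simple cases. The following Python sketch is an illustration only and is not part of the argument: it estimates both sides of (1) by Monte Carlo for the standard Gaussian measure on {{\mathbb R}^2}, taking {A} and {B} to be two symmetric strips. The choice of sets, the function names, the sample size and the seed are all assumptions.

```python
import math
import random


def in_strip_a(x, y):
    """A = {|x| <= 1}: a symmetric convex strip."""
    return abs(x) <= 1.0


def in_strip_b(x, y):
    """B = {|x + y| <= sqrt(2)}: the same strip rotated by 45 degrees."""
    return abs(x + y) <= math.sqrt(2.0)


def estimate_measures(n_samples=100_000, seed=0):
    """Monte Carlo estimates of mu_2(A), mu_2(B) and mu_2 of their intersection."""
    rng = random.Random(seed)
    count_a = count_b = count_ab = 0
    for _ in range(n_samples):
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        a, b = in_strip_a(x, y), in_strip_b(x, y)
        count_a += a
        count_b += b
        count_ab += a and b
    return count_a / n_samples, count_b / n_samples, count_ab / n_samples


mu_a, mu_b, mu_ab = estimate_measures()
print("mu(A) * mu(B)      ~", mu_a * mu_b)
print("mu(A intersect B)  ~", mu_ab)
```

A simulation like this can only illustrate the inequality for particular sets, up to sampling error; it says nothing about the general statement.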