# Martingale Marginals

[This post originates from twitter threads posted on 9 Sep 21 and 2 Feb 22.]

A while ago, I proved the result: A continuous strong Markov martingale is uniquely determined by its marginal distributions. This is discussed in a recent paper by Beiglböck, Pammer, and Schachermayer.

Time for a thread discussing some of these ideas, which have been studied for decades in both stochastic calculus and mathematical finance.

➢ This is a surprisingly subtle result, which is difficult to prove in full generality. I managed this in the following paper, but the proof is long.

➢ I will go over some of the history of this problem, before moving on to my own contributions. First, let’s start with the basics:

• A (real valued) martingale $X_t$ is a random process whose expected future value, conditional on all currently observable information, equals its current value:
 $\displaystyle X_t={\mathbb E}[X_T\;\vert\mathcal F_t]$

for times t ≤ T. E.g., standard Brownian motion, the cumulative winnings playing a fair gambling game, etc.
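As a small illustration of the definition, here is a sketch using the fair gambling game in its simplest form: a walk of independent ±1 coin tosses started at 0. The function name `conditional_mean` and the choice of times are assumptions made for the example. It computes the conditional expectation exactly by enumerating all equally likely futures.

```python
from itertools import product

def conditional_mean(prefix, extra_steps):
    """Exact E[X_T | the first steps equal prefix] for a fair +/-1 coin-toss walk."""
    start = sum(prefix)  # current value X_t after the observed steps
    futures = list(product((-1, 1), repeat=extra_steps))
    return sum(start + sum(steps) for steps in futures) / len(futures)

# For every 3-step history, the expected value at time 7 equals the value at time 3.
for prefix in product((-1, 1), repeat=3):
    assert conditional_mean(prefix, 4) == sum(prefix)
```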

• Next, the (1-dimensional) marginals are the distributions of $X_t$ at each individual time t. On their own, these carry far too little information to determine the joint distributions. But what if X is known to be a martingale?

➢ The question has a long history. Lévy proved various characterisations for a continuous martingale to be a Brownian motion. E.g., if $X_t^2-t$ is also a martingale, or if X has IID increments.

➢ In 1972 Kellerer proved a necessary and sufficient condition for a martingale to exist matching a given set of marginals: They must be increasing in the convex order. This means that E[f(Xt)] must be increasing in t, for all convex functions f.

A stochastic process or collection of marginal distributions which is increasing in the convex order is sometimes referred to as a ‘peacock’, which is a pun on the French acronym PCOC for “processus croissant pour l’ordre convexe”.

➢ The necessity of Kellerer’s condition follows easily from Jensen’s inequality. The significance of the result is that it is also sufficient for such a martingale to exist.
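To spell out the Jensen step (assuming the expectations involved are finite): for convex f and times t ≤ T, the tower property of conditional expectation followed by the conditional Jensen inequality and the martingale property gives

 $\displaystyle {\mathbb E}[f(X_T)]={\mathbb E}\left[{\mathbb E}[f(X_T)\;\vert\mathcal F_t]\right]\ge{\mathbb E}\left[f\left({\mathbb E}[X_T\;\vert\mathcal F_t]\right)\right]={\mathbb E}[f(X_t)].$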

But is it uniquely determined by the marginals? Financial option pricing theory has a lot to say.

➢ The prices of a traded financial asset through time can be modelled as a martingale $X_t$. Then, the time-0 price of a call option on that asset of strike price x and maturity t is:

 $\displaystyle C(t,x)={\mathbb E}[(X_t-x)^+].$

This is fully determined by the marginals of X and, in the other direction, the distribution function of $X_t$ is obtained from the derivative $C_x(t,x)$, since ${\mathbb P}(X_t\le x)=1+C_x(t,x)$ wherever the derivative exists.

➢ So, having call prices C(t, x) of all strikes and maturities is the same as having the marginals. As call options are traded in the markets, their prices are often known.

C(t, x) is automatically convex in x, and the property of being increasing in t is equivalent to the marginals increasing in the convex order.
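The correspondence between call prices and marginals can be checked numerically. The following is a minimal sketch, assuming the simplest possible marginals: $X_t$ normal with mean 0 and variance σ²t, for which the call price has a closed form. The function names and the finite-difference step `h` are illustrative choices. Differentiating the call price in the strike gives back the distribution function through 1 + C_x(t, x).

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def call_price(t, x, sigma=1.0):
    """C(t, x) = E[(X_t - x)^+] when X_t is normal with mean 0 and variance sigma^2 t."""
    s = sigma * math.sqrt(t)
    return s * norm_pdf(x / s) - x * (1.0 - norm_cdf(x / s))

def cdf_from_calls(t, x, h=1e-5):
    """Recover P(X_t <= x) = 1 + C_x(t, x) using a central difference in the strike."""
    return 1.0 + (call_price(t, x + h) - call_price(t, x - h)) / (2.0 * h)

# The recovered distribution function agrees with the normal one (sigma = 1, t = 1).
assert abs(cdf_from_calls(1.0, 0.3) - norm_cdf(0.3)) < 1e-6
# Call prices increase with maturity, as the convex order requires.
assert call_price(2.0, 0.3) > call_price(1.0, 0.3)
```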

Local volatility is a popular option pricing model. This specifies the asset price X as a solution to the stochastic differential equation (SDE)

 $\displaystyle dX_t = \sigma(t,X_t)dB_t$

for driving Brownian motion B and local volatility surface σ. In 1994 Bruno Dupire noted that option prices satisfy the partial differential equation (PDE)

 $\displaystyle \frac{\partial}{\partial t}C(t,x)=\frac12\sigma(t,x)^2\frac{\partial^2}{\partial x^2}C(t,x).$

As the probability density of $X_t$ is given by the second derivative $C_{xx}(t,x)$, Dupire’s approach links the call prices/marginal densities to the local volatility. This requires that the call prices are smooth and $C_{xx}$ is strictly positive.

➢ So, given marginal distributions, the local volatility method is to back out the local vol surface σ,

 $\displaystyle \sigma(t,x)=\sqrt{2\frac{C_t(t,x)}{C_{xx}(t,x)}}$

and solve the SDE

 $\displaystyle dX_t=\sigma(t,X_t)dB_t$

to find a martingale X with the required marginals.
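The first half of this recipe, backing out σ from call prices, can be sketched numerically. The example below assumes call prices generated by the simplest model, X = σB with constant σ, so that the answer is known in advance. The names `call_price` and `dupire_local_vol` and the finite-difference step sizes are illustrative assumptions, not taken from any library.

```python
import math

def call_price(t, x, sigma):
    """E[(X_t - x)^+] for X_t normal with mean 0 and variance sigma^2 t."""
    s = sigma * math.sqrt(t)
    z = x / s
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return s * pdf - x * (1.0 - cdf)

def dupire_local_vol(C, t, x, dt=1e-4, dx=1e-3):
    """sigma(t, x) = sqrt(2 C_t / C_xx), with central finite differences for both derivatives."""
    C_t = (C(t + dt, x) - C(t - dt, x)) / (2.0 * dt)
    C_xx = (C(t, x + dx) - 2.0 * C(t, x) + C(t, x - dx)) / dx ** 2
    return math.sqrt(2.0 * C_t / C_xx)

true_sigma = 0.7

def prices(t, x):
    return call_price(t, x, true_sigma)

# The local vol backed out of the call prices matches the constant we started from.
recovered = dupire_local_vol(prices, 1.0, 0.25)
assert abs(recovered - true_sigma) < 1e-3
```

With non-smooth or noisy call prices the two finite differences become unstable, which is the practical face of the regularity constraints mentioned here.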

➢ So, there is a unique local volatility model matching given marginals! At least, subject to various smoothness and regularity constraints on the option prices and volatilities…

➢ The images below show an example where the marginals are chosen according to the plotted ‘implied volatilities’. The (normal) implied volatility for each maturity time t and strike price x is, by definition, the constant value of volatility at which the call price equals C(t, x). That is, ${\mathbb E}[(X_t-x)^+]$ is equal to the value it would take if $X_t$ were normal with mean 0 and standard deviation $\sigma_{\rm imp}(t,x)\sqrt t$. Implied volatilities are just a convenient alternative way of quoting call prices. I also show the call prices, the marginal densities, and the obtained local vols.
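A minimal sketch of the definition of normal implied volatility, assuming mean 0 as in the text. The function names, the bisection bracket and the iteration count are illustrative choices. Since the normal call price is increasing in the volatility, a simple bisection inverts it.

```python
import math

def normal_call(t, x, vol):
    """E[(X_t - x)^+] for X_t normal with mean 0 and standard deviation vol * sqrt(t)."""
    s = vol * math.sqrt(t)
    z = x / s
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return s * pdf - x * (1.0 - cdf)

def implied_normal_vol(t, x, price, lo=1e-8, hi=10.0, iterations=100):
    """Find the vol at which normal_call(t, x, vol) equals price, by bisection."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if normal_call(t, x, mid) < price:
            lo = mid  # price too low at mid, so the implied vol is higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip: quote a price at vol 0.35, then recover the vol from the price.
quoted = normal_call(2.0, 0.4, 0.35)
assert abs(implied_normal_vol(2.0, 0.4, quoted) - 0.35) < 1e-9
```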

Moving on to my own contributions to this problem, we start with the question: what can we do if the call prices C(t, x) are not smooth? We could try approximating by smooth functions but, in the limit, the local volatilities will not converge to anything, and will oscillate wildly.

➢ We can hope for the following: when the local vols are plugged into the SDE and we compute the joint distributions of the martingale X, the oscillations all cancel out when we take the limit. There does not seem to be any way to take this limit directly using the SDE…

➢ …but there is one property which suggests that the limit does have well-defined distributions. As we are fitting specified marginals we know, by construction, that these one-dimensional marginals will converge. Is it such a stretch to suppose that the joint distributions also converge to a limit?

➢ Rather than using the SDE above, the local vol model can be described by a ‘backwards equation’. For any positive time T, consider the expected value of a function g(XT) of the martingale at earlier times t:

 $\displaystyle f(t,X_t)={\mathbb E}[g(X_T)\;\vert X_t].$

➢ The backwards equation says that f satisfies the PDE

 $\displaystyle \frac{\partial}{\partial t}f(t,x)+\frac12\sigma(t,x)^2\frac{\partial^2}{\partial x^2}f(t,x)=0.$

This is the same as $f(t,X_t)$ being a (local) martingale. If this has a solution for the boundary condition f(T, x) = g(x) then we can compute the conditional probabilities without ever resorting to the SDE.

➢ By a backwards induction (backwards in time) this allows us to compute the expectation of functions of the form $g(X_{t_1})g(X_{t_2})\cdots g(X_{t_n})$, completely determining the joint distributions of X. Step back from the final time $t_n$ to $t_{n-1}$, and so on, until we have a function of $X_{t_1}$ only. Specifically, if we write

 $\displaystyle \tilde g(X_{t_{n-1}})=g(X_{t_{n-1}}){\mathbb E}\left[g(X_{t_{n}})\;\vert X_{t_{n-1}}\right]$

then,

 $\displaystyle {\mathbb E}\left[g(X_{t_1})g(X_{t_2})\cdots g(X_{t_n})\right]= {\mathbb E}\left[g(X_{t_1})g(X_{t_2})\cdots \tilde g(X_{t_{n-1}})\right],$

reducing the number of terms inside the expectation. Inductively, this reduces to an expectation of a function of X at the initial time only, which can be computed using the marginal distribution.
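The induction can be sketched in a discrete-time analogue. This is an illustration only, not the continuous local vol model: X is assumed to be a fair ±1 random walk started at 0, whose backwards equation is f(t, x) = (f(t+1, x+1) + f(t+1, x−1))/2. The function names are hypothetical, and a different function is allowed at each time. The result is compared against brute-force enumeration of all paths.

```python
from itertools import product

def expectation_by_backward_induction(times, funcs):
    """E[g_1(X_{t_1}) ... g_n(X_{t_n})] for a fair +/-1 walk started at 0.

    times: increasing positive integers; funcs: one function per time.
    """
    t_last = times[-1]
    payoff = dict(zip(times, funcs))
    # values[x] is the conditional expectation at the current time, given X = x.
    values = {x: payoff[t_last](x) for x in range(-t_last, t_last + 1)}
    for t in range(t_last - 1, -1, -1):
        # Backwards equation: average over the two equally likely next positions.
        values = {x: 0.5 * (values[x + 1] + values[x - 1]) for x in range(-t, t + 1)}
        if t in payoff:
            # Multiply in g(X_t), giving the function called g-tilde in the text.
            values = {x: payoff[t](x) * v for x, v in values.items()}
    return values[0]

def expectation_by_enumeration(times, funcs):
    """The same expectation, by brute force over all 2^T equally likely paths."""
    t_last = times[-1]
    total = 0.0
    for steps in product((-1, 1), repeat=t_last):
        path = [0]
        for s in steps:
            path.append(path[-1] + s)
        term = 1.0
        for t, g in zip(times, funcs):
            term *= g(path[t])
        total += term
    return total / 2 ** t_last

times = [2, 3, 5]
funcs = [lambda x: x * x, lambda x: max(x, 0), lambda x: abs(x) + 1]
difference = (expectation_by_backward_induction(times, funcs)
              - expectation_by_enumeration(times, funcs))
assert abs(difference) < 1e-12
```

The backward pass costs on the order of T² operations, against 2^T paths for the enumeration.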

The main observation: The forwards equation for C (Dupire formula) and backwards equation for f can be combined to eliminate the local volatility term σ:

 $\displaystyle \begin{aligned} &\frac{\partial C}{\partial t}=\frac12\sigma^2\frac{\partial^2C}{\partial x^2},\\ &\frac{\partial f}{\partial t}+\frac12\sigma^2\frac{\partial^2 f}{\partial x^2}=0\\ \\ \Rightarrow\quad&\frac{\partial^2f}{\partial x^2}\frac{\partial C}{\partial t}+\frac{\partial^2C}{\partial x^2}\frac{\partial f}{\partial t}=0. \end{aligned}$

That is, $f_{xx}C_t+C_{xx}f_t=0$. From call prices C, this gives a backwards equation for f without involving the local vols.
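As a sanity check under the simplest assumptions, take X to be standard Brownian motion (so σ = 1) and g(x) = x². The conditional expectation is then explicit, and the combined equation holds for whatever the call prices are, using only the forwards equation for C:

 $\displaystyle \begin{aligned} &f(t,x)=x^2+(T-t),\quad \frac{\partial^2f}{\partial x^2}=2,\quad \frac{\partial f}{\partial t}=-1,\quad \frac{\partial C}{\partial t}=\frac12\frac{\partial^2C}{\partial x^2}\\ \\ \Rightarrow\quad&\frac{\partial^2f}{\partial x^2}\frac{\partial C}{\partial t}+\frac{\partial^2C}{\partial x^2}\frac{\partial f}{\partial t}=2\cdot\frac12\frac{\partial^2C}{\partial x^2}-\frac{\partial^2C}{\partial x^2}=0. \end{aligned}$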

➢ This is satisfyingly simple and symmetric in f and C, suggesting a kind of time reversal where the call prices C and conditional expectations f have their roles reversed. There are still two problems:

1. C might not be smooth.
2. There might not be a smooth solution for f.

➢ There is one advantage of expressing the model using a PDE rather than an SDE: it can be smoothed. With any luck, this will cancel out some of the oscillations when f and C approximate non-smooth functions.

Multiply the left-hand side of the equation by a smooth function θ(t, x) and integrate. I denote this by $\mu_{fC}(\theta)$. In a stroke of luck, a few applications of integration by parts remove all 2nd derivatives!

We obtain, writing subscripts for partial derivatives, $\mu_{fC}(\theta)=\iint(f_xC_x\theta_t-\theta_xf_xC_t-C_x\theta_xf_t)\,dt\,dx$:

 $\displaystyle \begin{aligned} \mu_{fC}(\theta) &=\iint\left(\frac{\partial^2f}{\partial x^2}\frac{\partial C}{\partial t}+\frac{\partial^2C}{\partial x^2}\frac{\partial f}{\partial t}\right)\theta\,dtdx\\ &=\iint\left( \frac{\partial f}{\partial x}\frac{\partial C}{\partial x}\frac{\partial \theta}{\partial t} - \frac{\partial \theta}{\partial x}\frac{\partial f}{\partial x}\frac{\partial C}{\partial t} - \frac{\partial C}{\partial x}\frac{\partial \theta}{\partial x}\frac{\partial f}{\partial t} \right)dtdx. \end{aligned}$
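The integration by parts is pure calculus, so it can be checked numerically with any smooth test functions. The sketch below uses illustrative polynomials on the unit square. They are assumptions chosen only so that the derivatives are easy to write down; in particular this C is not a genuine call price function. Because θ vanishes on the boundary of the square, no boundary terms appear. Both forms of the integral are evaluated with a midpoint rule.

```python
# Test functions: f(t, x) = t x^3, C(t, x) = t^2 x^2, theta(t, x) = t (1 - t) x (1 - x).
def f_x(t, x):
    return 3 * t * x ** 2

def f_xx(t, x):
    return 6 * t * x

def f_t(t, x):
    return x ** 3

def C_x(t, x):
    return 2 * t ** 2 * x

def C_xx(t, x):
    return 2 * t ** 2

def C_t(t, x):
    return 2 * t * x ** 2

def theta(t, x):
    return t * (1 - t) * x * (1 - x)

def theta_t(t, x):
    return (1 - 2 * t) * x * (1 - x)

def theta_x(t, x):
    return t * (1 - t) * (1 - 2 * x)

def midpoint_integral(integrand, n=200):
    """Midpoint rule for a double integral over 0 <= t <= 1, 0 <= x <= 1."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        for j in range(n):
            x = (j + 0.5) * h
            total += integrand(t, x)
    return total * h * h

def second_order_form(t, x):
    return (f_xx(t, x) * C_t(t, x) + C_xx(t, x) * f_t(t, x)) * theta(t, x)

def first_order_form(t, x):
    return (f_x(t, x) * C_x(t, x) * theta_t(t, x)
            - theta_x(t, x) * f_x(t, x) * C_t(t, x)
            - C_x(t, x) * theta_x(t, x) * f_t(t, x))

lhs = midpoint_integral(second_order_form)
rhs = midpoint_integral(first_order_form)
# For these polynomials both integrals equal 7/300 exactly.
assert abs(lhs - rhs) < 1e-4
assert abs(lhs - 7 / 300) < 1e-4
```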

➢ Combine partial time derivatives with time integrals to replace with Lebesgue-Stieltjes integrals. I.e., replace $f_t\,dt$ by $d_tf$.

 $\displaystyle \mu_{fC}(\theta)=\iint\left( \frac{\partial f}{\partial x}\frac{\partial C}{\partial x}d_t\theta - \frac{\partial \theta}{\partial x}\frac{\partial f}{\partial x}d_tC - \frac{\partial C}{\partial x}\frac{\partial \theta}{\partial x}d_tf\right)dx.$

As C is convex in x and monotonic in t, it already has x-derivatives almost-everywhere and, by monotonicity, the Lebesgue-Stieltjes integral in t exists. We just need f to also satisfy these properties.

➢ Let’s pause for a moment to consider what the expression $\mu_{fC}(\theta)$ really means. As used here, it is just an expression involving 1st order derivatives which vanishes in the event that $f(t,X_t)$ is a martingale. However, it can also be shown to be given by

 $\displaystyle \mu_{fC}(\theta)={\mathbb E}\left[\int_0^\infty\theta(t,X_t)\,df(t,X_t)\right].$

Alternatively, assuming that f is sufficiently regular that we can decompose $f(t,X_t)=M_t+V_t$ for a martingale M and a continuous process V of integrable variation (i.e., the drift term) then,

 $\displaystyle \mu_{fC}(\theta)={\mathbb E}\left[\int_0^\infty\theta(t,X_t)\,dV_t\right].$

So, $\mu_{fC}(\theta)$ is the expected value of the integral of $\theta(t,X_t)$ with respect to the drift of $f(t,X_t)$, and the vanishing of this drift is our martingale condition. The fact that this can be expressed only using first order derivatives of f, C and θ is surprising.

➢ If we try to prove that $\mu_{fC}=0$ is a valid martingale condition by using the methods of this thread, we must assume that X satisfies the SDE above and that f and C are both smooth. Constructing a general proof without imposing smoothness constraints is difficult. I achieved this in the following paper, but it required the theory of Dirichlet processes and a time-reversed Doob-Meyer quasimartingale decomposition.

➢ The first step in the proofs is to identify the class of processes X to which the procedure applies. It should include solutions to the local vol SDE, but also certain limits of these. It turns out that the suitable properties are that X should be a continuous strong Markov martingale.

➢ I showed that the continuous strong Markov martingale property is preserved under the relevant limits here:

➢ I also showed that if X is of this type the conditional expectations

 $\displaystyle f(t,X_t)={\mathbb E}[g(X_T)\;\vert X_t]$

are sufficiently regular in the following paper.

Specifically, if g is convex then f is convex in x and decreasing in t. So $\mu_{fC}$ is well-defined, and the approach works!
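These two properties are easy to see in a discrete-time analogue. The sketch below is an illustration only and says nothing about the coupling proof: it assumes X is a fair ±1 random walk, so that f(t, x) = (f(t+1, x+1) + f(t+1, x−1))/2. The function names are hypothetical. It tabulates f on a lattice and checks convexity in x and monotonicity in t directly.

```python
def backward_table(g, T):
    """table[t][x] = E[g(X_T) | X_t = x] for a fair +/-1 walk, for |x| <= T + t."""
    table = {T: {x: g(x) for x in range(-2 * T, 2 * T + 1)}}
    for t in range(T - 1, -1, -1):
        nxt = table[t + 1]
        table[t] = {x: 0.5 * (nxt[x + 1] + nxt[x - 1])
                    for x in range(-T - t, T + t + 1)}
    return table

def is_convex_in_x(row):
    """Check row[x - 1] + row[x + 1] >= 2 row[x] at every interior lattice point."""
    xs = sorted(row)
    return all(row[x - 1] + row[x + 1] >= 2 * row[x] for x in xs[1:-1])

def is_decreasing_in_t(table):
    """Check f(t, x) >= f(t + 1, x) wherever both are tabulated."""
    T = max(table)
    return all(table[t][x] >= table[t + 1][x] for t in range(T) for x in table[t])

table = backward_table(lambda x: abs(x - 1), 6)  # a convex terminal function g
assert all(is_convex_in_x(row) for row in table.values())
assert is_decreasing_in_t(table)
```

In this analogue, monotonicity in t is a one-line consequence of convexity in x, since f(t, x) is the average of f(t+1, ·) at the two neighbours of x.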

➢ This used coupling techniques introduced by David Hobson here:

Putting together everything described above gives a complete solution. A backwards equation with well-defined solutions guaranteeing uniqueness of the continuous strong Markov martingale X!

➢ The question of existence of strong Markov continuous martingales with given marginals can be answered by taking limits of smooth (or otherwise tractable) marginals, and using the fact that the property is preserved under such limits.

I did this here:

➢ I did allow certain mild kinds of discontinuity for X in my linked papers, to handle those marginals which are inconsistent with a continuous process.

Now that we have existence and uniqueness, we are done!