Do Convex and Decreasing Functions Preserve the Semimartingale Property?

Some years ago, I spent considerable effort trying to prove the hypothesis below. After failing at this, I spent time trying to find a counterexample, but also with no success. I did post this as a question on mathoverflow, but it has so far received no conclusive answers. So, as far as I am aware, the following statement remains unproven either way.

Hypothesis H1 Let {f\colon{\mathbb R}_+\times{\mathbb R}\rightarrow{\mathbb R}} be such that {f(t,x)} is convex in x and right-continuous and decreasing in t. Then, for any semimartingale X, {f(t,X_t)} is a semimartingale.

It is well known that convex functions of semimartingales are themselves semimartingales. See, for example, the Ito-Tanaka formula. More generally, if {f(t,x)} is increasing in t rather than decreasing, then it can be shown without much difficulty that {f(t,X_t)} is a semimartingale. Consider decomposing {f(t,X_t)} as

\displaystyle  f(t,X_t)=\int_0^tf_x(s,X_{s-})\,dX_s+V_t, (1)

for some process V. By convexity, the right hand derivative of {f(t,x)} with respect to x always exists, and I am denoting this by {f_x}. In the case where f is twice continuously differentiable, the process V is given by Ito’s formula which, in particular, shows that it is a finite variation process. If {f(t,x)} is convex in x and increasing in t, then the terms in Ito’s formula for V are all increasing and, so, it is an increasing process. By taking limits of smooth functions, it follows that V is increasing even when the differentiability constraints are dropped, so {f(t,X_t)} is a semimartingale. Returning to the case where {f(t,x)} is decreasing in t, Ito’s formula only says that V is of finite variation, and it is generally not monotonic. As limits of finite variation processes need not be of finite variation themselves, this says nothing about the case where f is not assumed to be differentiable, and does not help us to determine whether or not {f(t,X_t)} is a semimartingale.
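As a quick numerical illustration of decomposition (1), and not part of the argument, here is a Python sketch with Brownian motion standing in for X and the smooth test function {f(t,x)=x^2+t}, both of which are my own illustrative choices. The residual process V is recovered by subtracting the discretized stochastic integral and, as this f is increasing in t, its increments come out non-negative.

```python
# Minimal sketch of decomposition (1): X is a Brownian motion and
# f(t,x) = x^2 + t is a smooth test function, convex in x and increasing
# in t (both illustrative choices, not from the text). The residual
# V_t = f(t,X_t) - \int_0^t f_x(s,X_s) dX_s should be increasing.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
t = np.linspace(0.0, 1.0, n + 1)
dX = rng.normal(0.0, np.sqrt(1.0 / n), n)
X = np.concatenate([[0.0], np.cumsum(dX)])   # Brownian motion path

f = lambda t, x: x**2 + t                    # convex in x, increasing in t
f_x = lambda t, x: 2.0 * x                   # right derivative in x

# Left-endpoint (Ito) discretization of the stochastic integral in (1).
stoch_int = np.concatenate([[0.0], np.cumsum(f_x(t[:-1], X[:-1]) * dX)])
V = f(t, X) - stoch_int

# For this f each increment of V is exactly (dX_i)^2 + dt >= 0, and
# V_1 approximates the quadratic variation [X]_1 plus 1.
print("min increment of V:", np.diff(V).min())
print("V_1 =", V[-1], " [X]_1 + 1 =", np.sum(dX**2) + 1.0)
```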

Hypothesis H1 can be weakened by restricting to continuous functions of continuous martingales.

Hypothesis H2 Let {f\colon{\mathbb R}_+\times{\mathbb R}\rightarrow{\mathbb R}} be such that {f(t,x)} is convex in x and continuous and decreasing in t. Then, for any continuous martingale X, {f(t,X_t)} is a semimartingale.

As continuous martingales are special cases of semimartingales, hypothesis H1 implies H2. In fact, the reverse implication also holds so that hypotheses H1 and H2 are equivalent.

Hypotheses H1 and H2 can also be recast as a simple real analysis statement which makes no reference to stochastic processes.

Hypothesis H3 Let {f\colon{\mathbb R}_+\times{\mathbb R}\rightarrow{\mathbb R}} be such that {f(t,x)} is convex in x and decreasing in t. Then, {f=g-h} where {g(t,x)} and {h(t,x)} are convex in x and increasing in t.

Before going any further, I will give some motivation for hypotheses H1 and H2 and explain why it would be good to know whether or not they are true.

Suppose that X is a Markov process and {g\colon{\mathbb R}\rightarrow{\mathbb R}} is such that {g(X_T)} is integrable, for a fixed time {T > 0}. Then, we can define a function {f\colon[0,T]\times{\mathbb R}\rightarrow{\mathbb R}} by

\displaystyle  f(t,X_t) = {\mathbb E}[g(X_T)\;\vert\mathcal{F}_t].

This is equivalent to the following two conditions,

  1. {f(T,x) = g(x)}.
  2. {f(t,X_t)} is a martingale over {t\le T}.

If we are able to determine which functions f satisfy the martingale condition 2 above, then we can reconstruct the Markov transition function and the distribution of X. For twice differentiable functions, the Kolmogorov backwards equation or Feynman-Kac formula can be used. However, in general, what regularity properties can be imposed on f? It is known that, for diffusions with smoothly defined coefficients, f will be smooth. This does not hold for more general Markov processes. In the case that X is a continuous and strong Markov martingale then, if g is convex, {f(t,x)} can also be chosen to be convex in x and decreasing in t (see Hobson, Volatility misspecification, option pricing and superreplication via coupling). This is a familiar property in finance, where call and put options have a price which is convex in the underlying asset price and increasing with time to maturity. Although convex and monotonic functions need not be differentiable, the derivatives {f_{xx}\,dx} and {f_t\,dt} can be interpreted in a measure theoretic sense, and extensions of the backwards equation can be applied. I did post a paper on the arXiv, Fitting Martingales to Given Marginals, using these ideas to show that continuous and strong Markov martingales are uniquely determined by their 1-dimensional marginal distributions (for diffusions with smooth coefficients, this is a well known property of local volatility models used in finance). However, the proof is rather complicated because, without knowing hypothesis H1 to be true, it is not even known a priori whether or not {f(t,X_t)} is a semimartingale. The lack of a positive answer to hypothesis H1 means that a lot of extra work is required, including the papers Nondifferentiable functions of one-dimensional semimartingales and A Generalized Backward Equation For One Dimensional Processes, which were written in order to develop an alternative technique to get around this obstacle.
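To make the convexity and monotonicity of f concrete, here is a small Monte Carlo sketch. Brownian motion (a continuous strong Markov martingale) and the convex payoff {g(x)=\lvert x\rvert} are my own illustrative choices; the estimated {f(t,x)={\mathbb E}[g(X_T)\,\vert X_t=x]} should come out convex in x and decreasing in t.

```python
# Monte Carlo sketch of f(t,x) = E[g(X_T) | X_t = x] for X a Brownian
# motion and convex g(x) = |x|; both are toy choices illustrating
# Hobson's result that f can be taken convex in x and decreasing in t.
import numpy as np

rng = np.random.default_rng(1)
T, n_paths = 1.0, 200_000
Z = rng.normal(size=n_paths)
g = np.abs

def f(t, x):
    # For Brownian motion, X_T | X_t = x  ~  x + sqrt(T - t) * Z.
    return g(x + np.sqrt(T - t) * Z).mean()

xs = np.linspace(-1.0, 1.0, 9)
for t in (0.0, 0.5, 0.9):
    print(f"t={t}:", np.round([f(t, x) for x in xs], 3))
# Each row is convex in x (second differences >= 0) and, for each fixed
# x, the values decrease as t increases (less time to maturity).
```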

Before moving on to the equivalence of the different forms of the hypothesis I will mention that, on balance, I expect that the hypothesis is probably false. As I will show below, it is equivalent to certain sets of integrals being uniformly bounded (H7), and I see no reason for such a bound to exist. However, actually finding examples making these integrals large is difficult, and I still do not have any proven counterexamples to the hypothesis.

I will state various alternative forms of hypothesis H1 in this post and give explanations showing why they are equivalent. For brevity, I do not intend to give complete and rigorous proofs here. However, by filling in the details, all of the arguments can be made fully rigorous. In total I will state nine different forms of the hypothesis, H1 through to H9, and sketch how each of the following implications holds.

\displaystyle  \begin{array}{ccccccccc} H2 &\Leftarrow& H1 &\Leftarrow& H3 &\Leftarrow& H4 &\Leftarrow& H5\smallskip\\ \Downarrow &&&&&&&& \Uparrow\smallskip\\ H9 &\Rightarrow& H8 &\Rightarrow& H7 && \Rightarrow && H6 \end{array}

Combining these implications shows that all of H1 to H9 are equivalent. In most cases, the reverse implications can also be shown without too much work, the exceptions being {H3\Rightarrow H1\Rightarrow H2}. I do not know a quick proof of the converses of these without going all the way round the circle of implications above.

As continuous martingales are special cases of semimartingales, the implication {H1\Rightarrow H2} is trivial. The implication {H3\Rightarrow H1} is also straightforward. As argued above, when f is increasing in time, equation (1) decomposes {f(t,X_t)} as a stochastic integral plus an increasing process. Hence, whenever {f(t,x)} decomposes as the difference of functions which are convex in x and increasing in t then (1) expresses {f(t,X_t)} as a stochastic integral plus a finite variation process and, if it is also right-continuous, it is a semimartingale.

In hypothesis H3, the terms g and h in the decomposition {f=g-h} are given equal status. However, when trying to find such decompositions, I find that it is convenient to regard h as the primary function to be constructed. That is, the problem is to find a function {h(t,x)} which is convex in x such that {f(t,x)+h(t,x)} is increasing in t. Then, the function g is defined by {g=f+h}. Now, restricting to the unit interval {I=[0,1]}, hypothesis H3 can be localized.

Hypothesis H4 Let {f\colon I^2\rightarrow{\mathbb R}} be such that {f(t,x)} is convex and Lipschitz continuous in x and decreasing in t. Then, {f=g-h} where {g(t,x)} and {h(t,x)} are convex in x and increasing in t.

The idea is that, if f is as in hypothesis H3, then we can apply the decompositions given by H4 on a set of overlapping rectangles covering the right half-plane, and glue these together to obtain the global decomposition. For example, suppose that we have already decomposed {f=g_1-h_1} on a rectangle {[0,T]\times[a,b]}. Then, for any {a\le a^\prime < b < b^\prime}, apply H4 (rescaled to the rectangle) to obtain the decomposition {f=g_2-h_2} on {[0,T]\times[a^\prime,b^\prime]}. These can be glued together by choosing a smooth function {\theta} on the reals such that {\theta(x)} is 0 for {x\le a^\prime} and 1 for {x\ge b}. Setting

\displaystyle  h(t,x) =\begin{cases} h_1(t,x),&x\le a^\prime,\\ h_2(t,x),&x\ge b,\\ (1-\theta(x))h_1(t,x)+\theta(x)h_2(t,x),&a^\prime < x < b, \end{cases}

extends {h_1} to the larger rectangle {[0,T]\times[a,b^\prime]}. It need not be the case that {h} and {f+h} are convex in x, but they will have second derivative bounded below and can be made convex by adding a multiple of {(x-a^\prime)_+^2}. Continuing in this way, we can extend h to ever larger rectangles to obtain the global decomposition required by H3.
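The following toy sketch illustrates the gluing and the convexity repair for h alone; the concrete functions, cutoff and grid are all my own choices, not taken from the argument.

```python
# Gluing step sketch: blend two convex-in-x functions h1, h2 on
# overlapping intervals with a smooth cutoff theta, then restore
# convexity by adding a multiple of (x - a')_+^2. All toy choices.
import numpy as np

a, ap, b, bp = 0.0, 1.0, 2.0, 3.0       # a <= a' < b < b'
x = np.linspace(a, bp, 601)
dx = x[1] - x[0]

h1 = (x - 1.0)**2                        # convex on [a, b] (toy)
h2 = np.exp(x - 2.0)                     # convex on [a', b'] (toy)

def theta(x):
    # smooth-enough cutoff for this sketch: 0 for x <= a', 1 for x >= b
    s = np.clip((x - ap) / (b - ap), 0.0, 1.0)
    return s * s * (3.0 - 2.0 * s)

h = (1.0 - theta(x)) * h1 + theta(x) * h2
second_diff = np.diff(h, 2) / dx**2
print("min h_xx before repair:", second_diff.min())   # can be negative

# Adding lam * (x - a')_+^2 raises h_xx by 2*lam on (a', b'), which is
# the only region where convexity can fail here.
lam = max(0.0, -second_diff.min()) / 2.0
h_fixed = h + lam * np.maximum(x - ap, 0.0)**2
print("min h_xx after repair:",                        # ~0 up to grid error
      (np.diff(h_fixed, 2) / dx**2).min())
```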

It will be useful to further restrict attention to functions which are equal to zero on the upper and lower edges of the unit rectangle. With this in mind, I make the following definitions.

  • {\mathcal{D}} is the space of functions {f\colon I^2\rightarrow{\mathbb R}} such that {f(t,x)} is convex in x and satisfies {f(t,0)=f(t,1)=0}.
  • {\mathcal{D}^+} is the space of {f\in\mathcal{D}} such that {f(t,x)} is increasing in t.
  • {\mathcal{D}^-} is the space of {f\in\mathcal{D}} such that {f(t,x)} is decreasing in t.

In the remainder of the post, for functions of {(t,x)} I will frequently disregard the arguments to keep the expressions reasonably short. I will also use the notation {f_t} and {f_x} to denote the derivatives of f with respect to its first and second argument respectively. If f is not differentiable, then these can be understood in the sense of distributions. The notation {\lVert\cdot\rVert} will be used for the supremum norm. In particular, {\lVert f_x\rVert} is finite if and only if f is Lipschitz continuous with respect to x, with Lipschitz constant {\lVert f_x\rVert}.

Hypothesis H5 Every {f\in\mathcal{D}^-} such that {\lVert f_x\rVert < \infty} decomposes as {f=g-h} for {g,h\in\mathcal{D}^+}.

For any {f\colon I^2\rightarrow{\mathbb R}} which is convex and Lipschitz continuous in x and decreasing in t, hypothesis H5 can be used to obtain the decomposition stated in H4. By adding a constant to f, without loss of generality we can suppose that {f\le-c} for any {c > 0}. We can then extend f to a larger rectangle {[0,1]\times[-\epsilon,1+\epsilon]} by setting {f(t,x)=0} for x equal to {-\epsilon} and {1+\epsilon}, and linearly interpolating in x across the intervals {(-\epsilon,0)} and {(1,1+\epsilon)}. So long as {c/\epsilon\ge\lVert f_x\rVert}, this retains convexity in x. Then, hypothesis H5 can be used to decompose {f=g-h} and, by restricting to {I^2}, we obtain the decomposition required by hypothesis H4.

Now, if we have a collection {f=g^i-h^i} of decompositions ({i\in I}), as in H5, then we can set

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle h&\displaystyle=\sup_{i\in I}h^i,\smallskip\\ \displaystyle g&\displaystyle=f+h=\sup_{i\in I}g^i. \end{array}

As taking the supremum preserves both convexity in x and monotonicity in t, this gives a decomposition {f=g-h} as in H5 with {g\ge g^i} and {h\ge h^i} for all {i\in I}. Applying this to the collection of all such decompositions gives a maximal decomposition.

Lemma 1 Suppose that {f\in\mathcal{D}^-} is such that the decomposition in (H5) exists. Then, there exists a unique maximal choice for g, h. That is, if {f=g_1-h_1} is any other such decomposition, then {g\ge g_1} and {h\ge h_1}.

Next, hypothesis H5 is equivalent to the following, apparently weaker, statement.

Hypothesis H6 There exists a constant K such that every {f\in\mathcal{D}^-} with {\lVert f_x\rVert < \infty} decomposes as {f=g-h} for {g,h\in\mathcal{D}^+} and

\displaystyle  \lVert h\rVert\le K\lVert f_x\rVert (2)

Clearly, hypothesis H5 follows immediately from H6 simply by dropping (2) from the conclusion. However, H6 does have one advantage — it is only necessary to prove it for a dense subset of {\mathcal{D}^-}. A function {f\in\mathcal{D}} is smooth if it is continuous and all of its partial derivatives exist, to all orders, on the interior of {I^2} and extend continuously to {I^2}. For any {f\in\mathcal{D}^-}, it is not difficult to construct a sequence of smooth {f^n\in\mathcal{D}^-} with {\lVert f^n_x\rVert\le\lVert f_x\rVert} and converging pointwise to f. So, to prove H6, it is enough to prove it just for smooth functions.

I’ll now describe how to construct the decomposition in H5. This will always converge to the maximal decomposition whenever it exists, or diverge to {-\infty} if there is no such decomposition. Start by choosing a partition of the unit interval

\displaystyle  \mathbb{T}=\left\{0=t_0 < t_1 <\cdots< t_r=1\right\}.

We now construct {h\colon\mathbb{T}\times I\rightarrow{\mathbb R}} such that {h} is convex in x and {f+h} is increasing in t. The second condition implies the inequality

\displaystyle  h(t_{k-1},x)\le h(t_k,x)+f(t_k,x)-f(t_{k-1},x).

Also, from the definition of {\mathcal{D}}, we are looking for functions satisfying {h(1,x)\le0}. Use {{\rm CH}(u(x))} to denote the convex hull of a function {u\colon I\rightarrow{\mathbb R}}. This is the maximum convex function bounded above by u. Then, {h(t_k,x)} is constructed starting at {k=r} and then, inductively and backwards, through {k=r-1,r-2,\ldots,0}.

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle h(t_r,x)&\displaystyle=0,\smallskip\\ \displaystyle h(t_{k-1},x)&\displaystyle={\rm CH}\left(h(t_k,x)+f(t_k,x)-f(t_{k-1},x)\right). \end{array} (3)

We can extend h in between the times of the partition however we like, so long as monotonicity in t is preserved. Choosing partitions with mesh {\max_k\lvert t_k-t_{k-1}\rvert} going to zero, we do indeed get convergence to the maximal decomposition whenever it exists.
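Construction (3) is straightforward to implement on a grid. In the sketch below, the grid, the partition and the test function {f(t,x)=-(1+t)x(1-x)\in\mathcal{D}^-} are my own illustrative choices; for this f one can check directly that the maximal decomposition exists with {h(0,x)=-x(1-x)}, which the iteration reproduces.

```python
# Sketch of construction (3): start from h(t_r, .) = 0 and, working
# backwards, set h(t_{k-1}, .) to the convex hull (largest convex
# minorant) of h(t_k, .) + f(t_k, .) - f(t_{k-1}, .).
import numpy as np

def convex_minorant(x, u):
    """Largest convex function below the points (x_i, u_i), on the grid x."""
    hull_x, hull_u = [x[0]], [u[0]]
    for xi, ui in zip(x[1:], u[1:]):
        # pop points that would make the piecewise-linear minorant concave
        while len(hull_x) >= 2 and (
            (ui - hull_u[-1]) * (hull_x[-1] - hull_x[-2])
            <= (hull_u[-1] - hull_u[-2]) * (xi - hull_x[-1])
        ):
            hull_x.pop(); hull_u.pop()
        hull_x.append(xi); hull_u.append(ui)
    return np.interp(x, hull_x, hull_u)

# Toy f in D^-: convex in x, decreasing in t, zero at x = 0 and x = 1.
f = lambda t, x: -(1.0 + t) * x * (1.0 - x)

x = np.linspace(0.0, 1.0, 201)
ts = np.linspace(0.0, 1.0, 51)               # partition of [0, 1]

h = np.zeros_like(x)                          # h(t_r, .) = 0
for tk, tk1 in zip(ts[:0:-1], ts[-2::-1]):    # k = r, r-1, ..., 1
    h = convex_minorant(x, h + f(tk, x) - f(tk1, x))

# Finite limit, so the construction does not diverge for this f;
# the exact maximal value here is h(0, 1/2) = -1/4.
print("h(0, 1/2) =", h[len(x) // 2])
```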

Lemma 2 Suppose that {f\in\mathcal{D}^-} and, for each {n\in{\mathbb N}}, let {0=t^n_0 < t^n_1 < \cdots < t^n_{r_n}=1} be a partition of the unit interval. We suppose that the partitions have mesh going to 0, and eventually include all times at which f is discontinuous. For each n, let {h^n\in\mathcal{D}^+} be the function constructed as above using the n’th partition. Then, exactly one of the following holds.

  • f decomposes as in (H5) and {h^n\rightarrow h} pointwise on {I^2}, where {f=g-h} is the maximal decomposition.
  • f does not have a decomposition as in (H5), and {h^n(0,x)\rightarrow-\infty} for all {0 < x < 1}.

Proof: For any given partition, h defined as in (3) is the maximal non-positive function {\mathbb{T}\times I\rightarrow{\mathbb R}} which is convex in x and such that {f+h} is increasing in t. This means that {h^n\le h} on {\mathbb{T}\times I} whenever the partition {\{t^n_k\}} is a refinement of {\mathbb{T}}. It follows that

\displaystyle  \limsup_{n\rightarrow\infty} h^n\le h

on {\mathbb{T}\times I}. Using the fact that the sequence of partitions has mesh going to zero and eventually includes each discontinuity time of f,

\displaystyle  \limsup_{n\rightarrow\infty} h^n = \liminf_{n\rightarrow\infty} h^n.

So, the constructions along the partitions do converge to a limit, although it could be infinite.

Now, suppose that a decomposition {f=g-h} as in H5 does exist. Then, the constructions along partitions are bounded below, {h^n\ge h}, and must converge to a finite limit. As limits of convex and monotonic functions are, respectively, convex and monotonic, the limit {h^\infty=\lim_n h^n} is convex in x and {f+h^\infty} is increasing in t. As {h^\infty\ge h}, the limit is the maximal decomposition.

Conversely, suppose that there is no such decomposition as in H5. Then, the sequence {\{h^n\}} cannot converge to a finite limit everywhere. Hence {h^n(t,y)\rightarrow-\infty} for some t and y. By monotonicity in t, {h^n(0,y)} tends to minus infinity. By convexity, for all {0 < x \le y},

\displaystyle  h^n(0,x) \le \frac xy h^n(0,y)\rightarrow-\infty.

Similarly, the same limit holds for {y \le x < 1}. ⬜

The construction described above does indeed converge to a finite limit for smooth f.

Lemma 3 Suppose that {f\in\mathcal{D}^-} is smooth. Then, the maximal decomposition {f=g-h} exists, the derivative {h_{xx}} is bounded, and satisfies

\displaystyle  \frac12\int_0^1h_x(0,x)^2\,dx = -\iint h_{xx}f_t\,dt\,dx. (4)

Proof: For a given partition of the unit interval, construct h as in (3), and interpolate linearly in between the times of the partition. Setting

\displaystyle  h^0(t_{k-1},x)=h(t_k,x)+f(t_k,x)-f(t_{k-1},x),

then {h(t_{k-1},x)} is the convex hull of {h^0(t_{k-1},x)}. The bound

\displaystyle  h(t_k,x)\ge h^0(t_{k-1},x)\ge h(t_k,x)-\lVert f_t\rVert \delta t_k

is immediate, where {\delta t_k} denotes {t_k-t_{k-1}}. Hence, {h(t_k,x)-h(t_{k-1},x)} is bounded by {\lVert f_t\rVert\delta t_k}. So, {\lVert h_t\rVert} is bounded by {\lVert f_t\rVert}. Integrating over t also gives {\lVert h\rVert\le \lVert f_t\rVert}.

We can also bound {h^0_{xx}(t_{k-1},x)} above by {h_{xx}(t_k,x)+\lVert f_{xxt}\rVert\delta t_k}. Taking its convex hull preserves the bound, so {\lVert h_{xx}\rVert} is bounded by {\lVert f_{xxt}\rVert}. Similarly, {h^0_x(t_{k-1},x)} is bounded by {\lvert h_x(t_k,x)\rvert + \lVert f_{xt}\rVert\delta t_k}, so {\lVert h_x\rVert\le\lVert f_{xt}\rVert}.

Next, {h^0_{xx}(t_{k-1},x)} is bounded below by {-\lVert f_{xxt}\rVert\delta t_k} and, from this, it can be shown that {h^0_x(t_{k-1},x)-h_x(t_{k-1},x)} is bounded by {\lVert f_{xxt}\rVert\delta t_k}. So, {h_x(t_k,x)-h_x(t_{k-1},x)} is bounded by {(\lVert f_{xxt}\rVert+\lVert f_{xt}\rVert)\delta t_k}.

Putting all of these together gives the following set of inequalities.

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle\lVert h\rVert &\displaystyle\le\lVert f_t\rVert\smallskip\\ \displaystyle\lVert h_x\rVert &\displaystyle\le \lVert f_{xt}\rVert\smallskip\\ \displaystyle\lVert h_{xx}\rVert &\displaystyle\le \lVert f_{xxt}\rVert\smallskip\\ \displaystyle\lVert h_t\rVert &\displaystyle\le \lVert f_t\rVert\smallskip\\ \displaystyle\lVert h_{xt}\rVert &\displaystyle\le \lVert f_{xxt}\rVert + \lVert f_{xt}\rVert \end{array} (5)

Now, if {u\colon I\rightarrow{\mathbb R}} is a continuous function with convex hull v, then {v\le u} and, on each interval for which {v < u}, we have {v_{xx}=0}. It follows that {v_{xx}(u-v)=0}. Applying this to (3) gives

\displaystyle  h_{xx}(t_{k-1},x)\left(h(t_k,x)-h(t_{k-1},x)+f(t_k,x)-f(t_{k-1},x)\right)=0. (6)

We can now take limits as the mesh of the partition goes to zero. The inequality {\lVert h\rVert\le\lVert f_t\rVert} shows that the construction cannot diverge and, by Lemma 2, the decomposition of hypothesis H5 exists and we have convergence to the maximal decomposition. The inequalities (5) follow for the maximal decomposition and, taking limits of (6) gives

\displaystyle  h_{xx}(h_t+f_t)=0.

Integrating over {I^2} and applying integration by parts,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \iint h_{xx} h_t\,dt\,dx &\displaystyle= -\iint h_x h_{xt}\,dt\,dx\smallskip\\ &\displaystyle= -\frac12\iint (h_x^2)_t\,dt\,dx\smallskip\\ &\displaystyle=\frac12\int \left(h_x(0,x)^2-h_x(1,x)^2\right)\,dx. \end{array}

Equality (4) follows from this. ⬜

To construct the decomposition in H5, we can apply the decomposition for smooth f and then take limits to obtain the decomposition for arbitrary {f\in\mathcal{D}^-}. The problem is that, when taking the limit, the terms g and h could diverge to minus infinity. In some cases, (4) can be used to bound h and avoid this potential divergence. However, for this to work, the following alternative version of the hypothesis is required.

Hypothesis H7 There exists a constant K such that all smooth {f\in\mathcal{D}^+} and {g\in\mathcal{D}^-} satisfy

\displaystyle  -\iint f_{xx}g_t\,dt\,dx\le K\lVert f_x\rVert\lVert g_x\rVert.

For smooth f, Lemma 3 states that the maximal decomposition {f=g-h} as in H5 exists and, applying hypothesis H7 to the pair {(h,f)}, we have the inequality

\displaystyle  \frac12\int_0^1 h_x(0,x)^2\,dx=-\iint h_{xx}f_t\,dt\,dx \le K \lVert h_x\rVert \lVert f_x\rVert.

If the left hand side can be replaced by a multiple of {\lVert h\rVert \lVert h_x\rVert} then, cancelling {\lVert h_x\rVert}, this would give inequality (2) as required by hypothesis H6. To do this, we can consider applying the decomposition over a slightly larger region. For any {\epsilon, c > 0}, define {\tilde f} on {I\times[-\epsilon,1 + \epsilon]} by setting

\displaystyle  \tilde f(t,x) = \begin{cases} f(t,x) - c,&0\le x\le1,\\ (x-1-\epsilon)c/\epsilon,& 1 < x \le 1+\epsilon,\\ -(x+\epsilon)c/\epsilon,& -\epsilon\le x < 0. \end{cases}

This is convex in x so long as {c/\epsilon\ge\lVert f_x\rVert}. Applying the decomposition given by Lemma 2 to {\tilde f} gives a {\tilde h} such that {\tilde h} is convex in x, {\tilde f + \tilde h} is increasing in t and,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \int_{-\epsilon}^{1+\epsilon} \tilde h_x(0,x)^2\,dx &\displaystyle\le 2K(1+2\epsilon)\lVert\tilde h_x\rVert\lVert \tilde f_x\rVert\smallskip\\ &\displaystyle= 2K(1+2\epsilon)\lVert\tilde h_x\rVert c/\epsilon \end{array}

The additional factor of {1+2\epsilon} on the right hand side is because we are applying the decomposition over an interval of width {1+2\epsilon} rather than the unit interval. Using the fact that {\tilde h(t,x)} is linear over the intervals {[-\epsilon,0]} and {[1,1+\epsilon]},

\displaystyle  \int \tilde h_x(0,x)^2\,dx\ge \epsilon\lVert \tilde h_x\rVert^2

Combining with the previous inequality,

\displaystyle  \lVert \tilde h_x\rVert \le 2K(1+2\epsilon) c/\epsilon^2

As {\tilde h=0} on the edges of the rectangle {I\times[-\epsilon,1+\epsilon]}, this can be integrated to give

\displaystyle  \lVert \tilde h\rVert \le K(1+2\epsilon)^2 c/\epsilon^2.

Restricting to the unit square {I^2}, this gives a non-positive h which is convex in x and such that {f+h} is increasing in t. Setting {c=\lVert f_x\rVert/2} and {\epsilon=1/2},

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \lVert h\rVert &\displaystyle\le K(1+2\epsilon)^2 c/\epsilon^2\smallskip\\ &\displaystyle= 8 K \lVert f_x\rVert. \end{array}

This shows that if hypothesis H7 holds for some constant K, then hypothesis H6 also holds, with constant K replaced by {8K}.

Now, I move on to yet another form of the hypothesis. For smooth {f\in\mathcal{D}^+} and {g\in\mathcal{D}^-}, consider defining

\displaystyle  \mu_{[f,g]}(\theta)=\iint (f_{xx}g_t+g_{xx}f_t)\theta\,dt\,dx (7)

for all smooth {\theta\colon(0,1)^2\rightarrow{\mathbb R}} of compact support. If hypothesis H7 is true, then the inequality

\displaystyle  \lvert\mu_{[f,g]}(\theta)\rvert\le 2K\lVert f_x\rVert\lVert g_x\rVert\lVert\theta\rVert

would hold. Using integration by parts, {\mu_{[f,g]}(\theta)} can be rearranged as

\displaystyle  \mu_{[f,g]}(\theta)=\iint\left(f_xg_x\theta_t-f_xg_t\theta_x-f_tg_x\theta_x\right)\,dt\,dx.

This form has the advantage that it makes sense for all {f\in\mathcal{D}^+} and {g\in\mathcal{D}^-} without imposing any smoothness constraints. By convexity, {f_x} and {g_x} are well-defined bounded functions and, by monotonicity, {\int\cdot f_t\,dt} and {\int\cdot g_t\,dt} are well-defined measures (i.e., using Lebesgue-Stieltjes integration). I used this idea in the papers A Generalized Backward Equation For One Dimensional Processes and Fitting Martingales to Given Marginals to derive a martingale condition for {f(t,X_t)} where X is a continuous strong Markov martingale, and f is convex in x and decreasing in t, without imposing any differentiability constraints. For our purposes in this post, we just use it for the following form of the hypothesis.

Hypothesis H8 There exists a constant K such that

\displaystyle  \lvert\mu_{[f,g]}(\theta)\rvert\le K\lVert f_x\rVert\lVert g_x\rVert\lVert\theta\rVert

for all {f\in\mathcal{D}^+}, {g\in\mathcal{D}^-} and smooth {\theta\colon(0,1)^2\rightarrow{\mathbb R}} of compact support.

For smooth {f\in\mathcal{D}^+} and {g\in\mathcal{D}^-}, equation (7) shows that hypothesis H8 is equivalent to

\displaystyle  \iint\left\lvert f_{xx}g_t+g_{xx}f_t\right\rvert\,dt\,dx\le K\lVert f_x\rVert\lVert g_x\rVert.

As the terms {f_{xx}g_t} and {g_{xx}f_t} have opposite signs, putting a bound on their sum does not imply any bound on the individual terms. Instead, consider choosing a partition {0=t_0 < t_1 < \cdots < t_n=1}, let {s_k=(t_{k-1}+t_k)/2} be the mid-points, and define {\tilde f} and {\tilde g} by

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \tilde f(t,x)&\displaystyle=\begin{cases} f(2t - t_{k-1},x),&t_{k-1}\le t\le s_k,\\ f(t_k,x),&s_k\le t\le t_k, \end{cases}\smallskip\\ \displaystyle \tilde g(t,x)&\displaystyle=\begin{cases} g(t_{k-1},x),&t_{k-1}\le t\le s_k,\\ g(2t-t_k,x),&s_k\le t\le t_k. \end{cases} \end{array}

Now, we have {\tilde f_t=0} over the intervals {[s_k,t_k]} and {\tilde g_t=0} over the intervals {[t_{k-1},s_k]}. So,

\displaystyle  \left\lvert\tilde f_{xx}\tilde g_t+\tilde g_{xx}\tilde f_t\right\rvert=\left\lvert\tilde f_{xx}\tilde g_t\right\rvert+\left\lvert\tilde g_{xx}\tilde f_t\right\rvert.

Applying hypothesis H8 to the functions {\tilde f} and {\tilde g} gives

\displaystyle  \iint\left(\left\lvert\tilde f_{xx}\tilde g_t\right\rvert+\left\lvert\tilde g_{xx}\tilde f_t\right\rvert\right)\,dt\,dx\le K\lVert f_x\rVert \lVert g_x\rVert.

Taking the limit as the mesh of the partition goes to zero gives the inequality

\displaystyle  \iint\left(\left\lvert f_{xx} g_t\right\rvert+\left\lvert g_{xx} f_t\right\rvert\right)\,dt\,dx\le K\lVert f_x\rVert \lVert g_x\rVert.

and hypothesis H7 follows immediately from this.
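The reparametrization can be written as a pair of explicit time changes, {\tilde f(t,x)=f(\varphi_f(t),x)} and {\tilde g(t,x)=g(\varphi_g(t),x)}. A minimal sketch, with a uniform partition of my own choosing:

```python
# On each partition interval, phi_f moves at double speed then rests,
# while phi_g rests then moves at double speed; so f~_t * g~_t = 0.
import numpy as np

def warp_times(t, partition):
    # index k with partition[k] <= t < partition[k+1]
    k = np.clip(np.searchsorted(partition, t, side="right") - 1,
                0, len(partition) - 2)
    t0, t1 = partition[k], partition[k + 1]
    phi_f = np.minimum(2 * t - t0, t1)   # f at double speed, then constant
    phi_g = np.maximum(2 * t - t1, t0)   # g constant, then at double speed
    return phi_f, phi_g

partition = np.linspace(0.0, 1.0, 5)     # arbitrary uniform partition
t = np.linspace(0.0, 1.0, 9)
phi_f, phi_g = warp_times(t, partition)
print(np.round(phi_f, 3))                # flat exactly where phi_g moves
print(np.round(phi_g, 3))
```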

I now move on to the final form of the hypothesis, which re-introduces the stochastic calculus element.

Hypothesis H9 There exists a constant {K > 0} such that, for every {f\in\mathcal{D}^-} and continuous martingale {\{X_t\}_{t\in[0,1]}} with {0\le X\le1}, {Y_t\equiv f(t,X_t)} has mean variation

\displaystyle  {\rm Var}_1(Y)\le K\lVert f_x\rVert.

In particular, this implies that Y is a quasimartingale. Suppose that this hypothesis holds. Given any smooth {g\in\mathcal{D}^+} with {g_{xx} > 0} on the interior of {I^2} and {\lVert g_x\rVert\le1}, define the function {C\colon[0,1]\times{\mathbb R}\rightarrow{\mathbb R}_+} by

\displaystyle  C(t,x)=\begin{cases} 1/2-x,&x\le 0,\\ (1-x+g(t,x))/2,&0\le x\le 1,\\ 0,&x\ge1. \end{cases}

This is convex in x and increasing in t. Now, consider the stochastic differential equation

\displaystyle  dX_t=\sigma(t,X_t)\,dW (8)

for a Brownian motion W and

\displaystyle  \sigma(t,x)^2=\frac{2C_t(t,x)}{C_{xx}(t,x)}=\frac{2g_t(t,x)}{g_{xx}(t,x)}.

We only consider solving (8) up until the first time at which X hits 0 or 1, after which X is constant. If the initial distribution of X is chosen so that

\displaystyle  {\mathbb E}[(X_t-x)_+]=C(t,x)

for {t=0}, then this holds for all {t\in[0,1]}. This is a well known result, used in financial option pricing by the local volatility model, where {C(t,x)} represents the price of a call option of strike price x and maturity t. For any bounded measurable {\theta\colon(0,1)^2\rightarrow{\mathbb R}}, the following identities hold,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle{\mathbb E}\left[\theta(t,X_t)\right]&\displaystyle=\int_0^1\theta(t,x)C_{xx}(t,x)\,dx\smallskip\\ &\displaystyle= \frac12\int_0^1\theta(t,x)g_{xx}(t,x)\,dx,\medskip\\ \displaystyle{\mathbb E}\left[\int_0^1\theta(s,X_s)\,d[X]_s\right]&\displaystyle=2\iint\theta C_t\,dt\,dx\smallskip\\ &\displaystyle = \iint\theta g_t\,dt\,dx. \end{array}
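As a numerical sanity check of this construction (no more than that), the sketch below uses the toy choice {g(t,x)=-(2-t)x(1-x)/2}, which is smooth, lies in {\mathcal{D}^+}, has {g_{xx} > 0} and {\lVert g_x\rVert\le1}. Then {C(0,x)=(1-x)^2/2}, so {X_0} is uniform on {(0,1)}, and an Euler scheme for (8) should approximately reproduce {{\mathbb E}[(X_t-x)_+]=C(t,x)}.

```python
# Local volatility sketch: sigma^2 = 2 g_t / g_xx = x(1-x)/(2-t) for the
# toy g(t,x) = -(2-t) x (1-x) / 2; C below is the 0 <= x <= 1 branch.
import numpy as np

rng = np.random.default_rng(2)

g_t = lambda t, x: x * (1.0 - x) / 2.0
g_xx = lambda t, x: 2.0 - t
C = lambda t, x: (1.0 - x - (2.0 - t) * x * (1.0 - x) / 2.0) / 2.0
sigma = lambda t, x: np.sqrt(2.0 * g_t(t, x) / g_xx(t, x))

# C(0,x) = (1-x)^2 / 2 = E[(U - x)_+] for U uniform, so X_0 ~ U(0,1).
n_paths, n_steps = 200_000, 500
dt = 1.0 / n_steps
X = rng.uniform(size=n_paths)
for i in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    X = np.clip(X + sigma(i * dt, X) * dW, 0.0, 1.0)  # sigma vanishes at 0, 1

# Compare Monte Carlo call prices with C at t = 1; agreement is only
# approximate (Euler discretization and boundary clipping bias).
for x in (0.25, 0.5, 0.75):
    print(x, np.mean(np.maximum(X - x, 0.0)).round(4), round(C(1.0, x), 4))
```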

Now, for smooth {f\in\mathcal{D}^-}, (1) expresses {Y_t=f(t,X_t)} as a martingale plus a finite variation term

\displaystyle  V_t=f(0,X_0) + \frac12\int_0^t f_{xx}(s,X_s)\,d[X]_s + \int_0^t f_t(s,X_s)\,ds.

Now, using the identities above, for bounded measurable {\theta\colon(0,1)^2\rightarrow{\mathbb R}},

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle {\mathbb E}\left[\int_0^1\theta(t,X_t)\,dV_t\right]&\displaystyle=\frac12\iint(f_{xx}g_t+g_{xx}f_t)\theta\,dt\,dx\smallskip\\ &\displaystyle=\frac12\mu_{[f,g]}(\theta). \end{array}

The left hand side is bounded by the mean variation of Y whenever {\theta} is bounded by 1 so, if hypothesis H9 holds, it is bounded by {K\lVert f_x\rVert}. Scaling g and {\theta} gives,

\displaystyle  \left\lvert\mu_{[f,g]}(\theta)\right\rvert\le 2K\lVert f_x\rVert \lVert g_x\rVert \lVert\theta\rVert.

Approximating arbitrary f and g by smooth functions implies hypothesis H8.

It only remains to show how hypothesis H2 implies H9, which I will do now. This is the most difficult of the implications shown in this post, and I will instead show the contrapositive. That is, supposing that H9 is false, we show that H2 is also false. The idea is to construct a continuous martingale X and {f\colon[0,1]\times{\mathbb R}\rightarrow{\mathbb R}} such that the process V in decomposition (1) has infinite variation. This will imply that {f(t,X_t)} is not a semimartingale.

Start by choosing a sequence of continuous martingales {\{X^n_t\}_{t\in[0,1]}} with {0\le X^n\le1} and smooth {f^n\in\mathcal{D}^-} such that the mean variation of {f^n(t,X^n_t)} is greater than {4^n}. Passing to a larger probability space if necessary, we suppose that we have a doubly indexed sequence {X^{mn}} of independent continuous martingales, where {X^{mn}} has the same distribution as {X^m}. We then decompose

\displaystyle  f^m(t,X^{mn}_t)=\int_0^tf^m_x(s,X^{mn}_s)\,dX^{mn}_s+V^{mn}_t.

The variation, {W^{mn}}, of {V^{mn}} over {[0,1]} then has expectation equal to the mean variation of {f^m(t,X^{mn}_t)}. By the weak law of large numbers,

\displaystyle  4^{-r_m}\sum_{k=1}^{4^{r_m}}W^{mk} > 4^m

with probability at least 1/2, for large enough {r_m}. We also suppose that {r_{m+1} > r_m}. Set {\epsilon_{mn}=2^{-r_m-m}\sqrt{3}} for {n\le 4^{r_m}} and {\epsilon_{mn}=0} otherwise. Then,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle \sum_{mn}\epsilon^2_{mn}=1,\smallskip\\ &\displaystyle \sum_{mn}\epsilon^2_{mn}W^{mn}=\infty, \end{array}

and {\epsilon_{mn}/\epsilon_{(m+1)n}} is an integer. Rearranging the non-zero terms gives a singly indexed sequence {\epsilon_n} with

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle \sum_{n}\epsilon^2_{n}=1,\smallskip\\ &\displaystyle \sum_{n}\epsilon^2_{n}W^{n}=\infty, \end{array}

where {\epsilon_n/\epsilon_{n+1}} is an integer, {f^n\in\mathcal{D}^-} are smooth with {\lVert f^n_x\rVert\le1}, {\{X^n_t\}_{t\in[0,1]}} is an independent sequence of continuous martingales with decomposition

\displaystyle  f^n(t,X^n_t)=\int_0^tf^n_x(s,X^n_s)\,dX^n_s+V^n_t,

and {V^n} has variation {W^n} over {[0,1]}.

Let {R} be the rectangle {[0,1]\times[-1,1]}. By extrapolating {f^n} and applying a change of variables, we can suppose that the {X^n} are continuous martingales with {X^n_0=0} and {X^n_1=\pm1}, and {f^n\colon R\rightarrow{\mathbb R}} is convex in x and decreasing in t, such that {f^n(t,x)=x^2-t} on the boundary of {R} and {f^n_x(t,x)=2x} at {x=\pm1}.

Now define the sequence of times {t_n=\sum_{k\le n}\epsilon^2_k}. Define the martingale M by

\displaystyle  M_t=\sum_{k < n}\epsilon_k X^k_1 + \epsilon_nX^n_{\epsilon_n^{-2}(t-t_{n-1})}

for {t_{n-1}\le t\le t_n}. This is a random walk, interpolated by the continuous martingales {X^n}. The times {t_n} are increasing to 1 and,

\displaystyle  {\mathbb E}[M_{t_n}^2]=\sum_{k=1}^n\epsilon_k^2\le1.

This is {L^2}-bounded so, by martingale convergence, the limit {M_1=\lim_{t\rightarrow1}M_t} exists in {L^2} and with probability 1, giving a continuous {L^2}-bounded martingale {\{M_t\}_{t\in[0,1]}}.

The fact that {\epsilon_n/\epsilon_{n+1}} is an integer implies that the support of {M_{t_{n-1}}} is contained in the set {S_n=2\mathbb{Z}\epsilon_n} or {S_n=(2\mathbb{Z}+1)\epsilon_n} for each {n}.
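For illustration only, the construction of M can be simulated by standing in for each {X^n} with a concrete continuous martingale on {[0,1]} started at 0 and ending at {\pm1}; below I use a time-changed Brownian motion stopped at {\pm1}, and truncate the sequence at ten terms. The actual {X^n} of the argument would come from putative counterexamples, which are not explicitly available.

```python
# Sketch of the interpolated random walk M with toy interpolating
# martingales (stopped, time-changed Brownian motions).
import numpy as np

rng = np.random.default_rng(3)

def stopped_bm(n_grid=4_000):
    # Continuous martingale {X_t, t in [0,1]} with X_0 = 0, X_1 = +-1:
    # Brownian motion on the time change s = t/(1-t), stopped at |B| = 1.
    t = np.linspace(0.0, 1.0, n_grid, endpoint=False)
    ds = np.diff(t / (1.0 - t), prepend=0.0)
    B = np.cumsum(rng.normal(size=n_grid) * np.sqrt(ds))
    hit = np.flatnonzero(np.abs(B) >= 1.0)
    k = hit[0] if hit.size else n_grid - 1   # a.s. hits; grid safeguard
    X = np.clip(B, -1.0, 1.0)
    X[k:] = 1.0 if B[k] >= 1.0 else -1.0     # stopped at the hitting time
    return np.append(X, X[-1])               # append the value at t = 1

# eps_n = sqrt(3) 2^{-n}: sum eps_n^2 = 1 and eps_n / eps_{n+1} = 2.
eps = np.sqrt(3.0) * 0.5 ** np.arange(1, 11)   # truncated at n = 10
t_n = np.concatenate([[0.0], np.cumsum(eps**2)])

# One sample path of M: a random walk with steps eps_n * X^n_1,
# interpolated over [t_{n-1}, t_n] by the rescaled martingale eps_n * X^n.
pieces, level = [], 0.0
for e in eps:
    X = stopped_bm()
    pieces.append(level + e * X[:-1])
    level += e * X[-1]
M = np.concatenate(pieces)
print("last t_n =", t_n[-1], " M there =", level,
      " sup|M| =", np.abs(M).max())
```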

Define {g\colon[0,1]\times{\mathbb R}\rightarrow{\mathbb R}} by {g(t,x)=x^2-t} for {t= t_0,t_1,\ldots} and {t=1}. We interpolate between these times by setting

\displaystyle  g(t,x)=2ax-a^2-t_{n-1} + \epsilon_n^2f^n(\epsilon_n^{-2}(t-t_{n-1}),\epsilon_n^{-1}(x-a))

for {t_{n-1}\le t\le t_n} and {\lvert x-a\rvert\le\epsilon_n}, some {a\in S_n}. This is convex in x and decreasing in t. It can be seen that

\displaystyle  g(t,M_t)=g(t_{n-1},M_{t_{n-1}})+2M_{t_{n-1}}(M_t-M_{t_{n-1}})+\epsilon_n^2f^n\left(\epsilon_n^{-2}(t-t_{n-1}),X^n_{\epsilon_n^{-2}(t-t_{n-1})}\right).

Then, if we decompose

\displaystyle  g(t,M_t) = g(0,M_0)+\int_0^tg_x(s,M_s)\,dM_s+V_t (9)

the process V is continuous with variation

\displaystyle  \int_0^{t_n}\,\lvert dV_t\rvert=\sum_{k=1}^n\epsilon_k^2W^k.

This is finite, but tends to infinity almost surely as n goes to infinity.

We can now conclude that {g(t,M_t)} is not a semimartingale as, otherwise, it would decompose uniquely as

\displaystyle  g(t,M_t)=N_t+A_t

for a continuous local martingale N and continuous FV process A with {A_0=0}. Comparing with (9) over each interval {[0,t]} for {t < 1} gives {A_t=V_t} and, hence, A has almost surely infinite variation on {[0,1]}, contradicting the fact that it is an FV process. This contradicts hypothesis H2.
