On The Integral ∫I(W ≥ 0)dW

In this post I look at the integral X_t = ∫₀^t 1_{W≥0} dW for standard Brownian motion W. This is a particularly interesting example of stochastic integration with connections to local times, option pricing and hedging, and demonstrates behaviour not seen for deterministic integrals that can seem counter-intuitive. For a start, X is a martingale so has zero expectation. To some it might, at first, seem that X is nonnegative and — furthermore — equals W ∨ 0. However, this has positive expectation contradicting the first property. In fact, X can go negative and we can compute its distribution. In a Twitter post, Oswin So asked about this very point, showing some plots demonstrating the behaviour of the integral.

Figure 1: Numerically evaluating ∫₀¹ 1_{W≥0} dW

We can evaluate the integral as X_t = W_t ∨ 0 – ½L_t⁰ where L_t⁰ is the local time of W at 0. The local time is a continuous increasing process starting from 0, and only increases at times where W = 0. That is, it is constant over intervals on which W is nonzero. The first term, W_t ∨ 0, has probability density p(x) equal to that of a normal density over x > 0 and has a delta function at zero. Subtracting the nonnegative value ½L⁰_t spreads out the density of this delta function to the left, leading to the odd looking density computed numerically in So’s Twitter post, with a peak just to the left of the origin and dropping instantly to a smaller value on the right. We will compute an exact form for this probability density but, first, let’s look at an intuitive interpretation in the language of option pricing.

Consider a financial asset such as a stock, whose spot price at time t is S_t. We suppose that the price is defined at all times t ≥ 0 and has continuous sample paths. Furthermore, suppose that we can buy and sell at spot any time with no transaction costs. A call option of strike price K and maturity T pays out the cash value (S_T - K)₊ at time T. For simplicity, assume that this is ‘out of the money’ at the initial time, meaning that S₀ ≤ K.

The idea of option hedging is, starting with an initial investment, to trade in the stock in such a way that at maturity T, the value of our trading portfolio is equal to (S_T - K)₊. This synthetically replicates the option. A naive suggestion which is sometimes considered is to hold one unit of stock at all times t for which S_t ≥ K and zero units at all other times. The profit from such a strategy is given by the integral X_T = ∫₀^T 1_{S≥K} dS. If the stock only equals the strike price at finitely many times then this works. If it first hits K at time s and does not drop back below it on the interval (s, t) then the profit at t is equal to the amount S_t – K that it has gone up since we purchased it. If it drops back below the strike then we sell at K for zero profit or loss, and this repeats for subsequent times that it exceeds K. So, at time T, we hold one unit of stock if its value is above K for a profit of S_T – K and zero units for zero profit otherwise. This replicates the option payoff.

The idea described works if S_t hits the strike K at a finite set of times, and also if the path of S_t has finite variation, in which case Lebesgue-Stieltjes integration gives X_T = (S_T - K)₊. It cannot work for stock prices though! If it did, then we would have a trading strategy which is guaranteed to never lose money but generates profits on the positive probability event that S_T > K. This is arbitrage, generating money with zero risk, which should be impossible.

What goes wrong? First, Brownian motion does not have sample paths with finite variation and will not hit a level finitely often. Instead, if it reaches K then it hits the level uncountably often. As our simple trading strategy would involve buying and selling infinitely often, it is not so easy. Instead, we can approximate by a discrete-time strategy and take the limit. Choosing a finite sequence of times 0 = t₀ < t₁ < ⋯ < t_n = T, the discrete approximation is to hold one unit of the asset over the interval (t_i, t_{i+1}] if S_{t_i} ≥ K and zero units otherwise.

The discrete strategy involves buying one unit of the asset whenever its price reaches K at one of the discrete times and selling whenever it drops back below. This replicates the option payoff, except for the fact that when we buy above K we effectively overpay by amount S_{t_i} – K and, when we sell below K, we lose K – S_{t_i}. This results in some slippage from not being able to execute at the exact level,

$\displaystyle A_T=\sum_{i=1}^{n}1_{\{S_{t_{i-1}} < K\le S_{t_i}{\rm\ or\ }S_{t_{i-1}}\ge K > S_{t_i}\}}\lvert S_{t_i}-K\rvert.$

So, our simple trading strategy generates profit (S_T - K)₊ – A_T, missing the option value by amount A_T. In the limit as n goes to infinity with time step size going to zero, the slippage A_T does not go to zero. For equally spaced times, it can be shown that the number of times that spot crosses K is of order √n, and each of these times generates slippage of order 1/√n on average. So, in the limit, A_T does not vanish and, instead, converges on a positive value equal to half the local time L_T^K.

Figure 2: Naive option hedge with slippage

Figure 2 shows the situation, with the slippage A shown on the same plot (using K as the zero axis, so they are on the same scale). We can just take K = 0 for an asset whose spot price can be positive or negative. Then, with S = W, our integral X_T = ∫₀^T 1_{W≥0} dW is the same as the payoff from the naive option hedge, or (S_T)₊ minus slippage L⁰_T/2.
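The slippage can also be checked numerically. The following is a minimal sketch, assuming K = 0, S = W, T = 1 and an equally spaced grid; the function names `naive_hedge` and `mean_slippage` are hypothetical and chosen only for illustration. Since the trading profit of the discrete strategy has zero mean, the average slippage should stay close to E[(W₁)₊] = 1/√(2π) ≈ 0.399 for every grid size, rather than shrinking as n grows.

```python
import math
import random


def naive_hedge(n_steps, rng, horizon=1.0, strike=0.0):
    """Simulate one path of S = W on an equally spaced grid.

    Returns (profit, payoff, slippage) for the strategy holding one unit
    over (t_i, t_{i+1}] whenever S_{t_i} >= strike.
    """
    dt = horizon / n_steps
    spot = strike  # start at the money, so the initial option payoff is zero
    profit = 0.0
    for _ in range(n_steps):
        increment = rng.gauss(0.0, math.sqrt(dt))
        if spot >= strike:  # position is fixed at the left end of the interval
            profit += increment
        spot += increment
    payoff = max(spot - strike, 0.0)
    return profit, payoff, payoff - profit


def mean_slippage(n_steps, n_paths, seed=0):
    """Monte Carlo average of the slippage A_T over n_paths simulated paths."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += naive_hedge(n_steps, rng)[2]
    return total / n_paths
```

Calling, say, `mean_slippage(10, 2000)` and `mean_slippage(1000, 2000)` should both return values near 0.4, up to Monte Carlo error.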

Now let’s turn to a computation of the probability density of X_T = W_T ∨ 0 – L_T⁰/2. By the scaling property of Brownian motion, the distribution of X_T/√T does not depend on T, so we take T = 1 without loss of generality. The first trick to this is to make use of the fact that, if M_t = sup_{s≤t} W_s is the running maximum then (|W_t|, L_t⁰) has the same joint distribution as (M_t - W_t, M_t). This immediately tells us that L₁⁰ has the same distribution as M₁ which, by the reflection principle, has the same distribution as |W₁|. Using

$\displaystyle \varphi(x)=\frac1{\sqrt{2\pi}}e^{-\frac12x^2}$

for the standard normal density, this shows that the local time L₁⁰ has probability density 2φ(x) over x > 0.

Next, as flipping the sign of W does not impact either |W₁| or L₁⁰, sgn(W₁) is independent of these. On the event W₁ < 0 we have X₁ = –L₁⁰/2 which has density 4φ(2x) over x < 0. On the event W₁ > 0, we have X₁ = |W₁| – L₁⁰/2, which has the same distribution as M₁/2 – W₁.

To complete the computation of the probability density of X₁, we need to know the joint distribution of M₁ and W₁, which can be done as described in the post on the reflection principle. The probability that W₁ is in an interval of width δx about a point x and that M₁ > y, for some y > x is, by reflection, equal to the probability that W₁ is in an interval of width δx about the point 2y – x. This has probability φ(2y - x)δx and, by differentiating in y, gives a joint probability density of 2φ′(x - 2y) for (W₁, M₁).

The expectation of f(X₁) for bounded measurable function f can be computed by integrating over this joint probability density.

$\displaystyle \begin{aligned} {\mathbb E}[f(X_1)\vert\;W_1 > 0] &={\mathbb E}[f(M_1/2-W_1)]\\ &=2\int_{-\infty}^\infty\int_{x_+}^\infty f(y/2-x)\varphi'(x-2y)\,dydx\\ &=4\int_{-\infty}^\infty\int_{(-x)\vee(-x/2)}^\infty f(z)\varphi'(-3x-4z)\,dzdx\\ &=4\int_{-\infty}^\infty\int_{(-z)\vee(-2z)}^\infty f(z)\varphi'(-3x-4z)\,dxdz\\ &=\frac43\int_{-\infty}^0 f(z)\varphi(2z)\,dz+\frac43\int_0^\infty f(z)\varphi(z)\,dz. \end{aligned}$

The substitution z = y/2 – x was applied in the inner integral, and the order of integration switched. The probability density of X₁ conditioned on W₁ > 0 is therefore,

$\displaystyle p_{X_1}(x\vert\; W_1 > 0)=\begin{cases} \frac43\varphi(x),&{\rm for\ }x > 0,\\ \frac43\varphi(2x),&{\rm for\ }x < 0. \end{cases}$

Conditioned on W₁ < 0, we have already shown that the density is 4φ(2x) over x < 0 so, taking the average of these, we obtain

$\displaystyle p_{X_1}(x)=\begin{cases} \frac23\varphi(x),&{\rm for\ }x > 0,\\ \frac83\varphi(2x),&{\rm for\ }x < 0. \end{cases}$

This is plotted in figure 3 below, agreeing with So’s numerical estimation from the Twitter post shown in figure 1 above.
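As a sanity check on the formula, the density can be integrated numerically. This is a small sketch using only the Python standard library; the helper names `phi`, `density_x1` and `midpoint_integral` are hypothetical, and the range is truncated to [-10, 10] on the assumption that the tails beyond it are negligible. It should confirm total mass one and mean zero (as required for a martingale started at zero), and show that X₁ is negative with probability 2/3.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def density_x1(x):
    """Density of X_1: (2/3) phi(x) for x > 0 and (8/3) phi(2x) for x < 0."""
    if x > 0:
        return 2.0 / 3.0 * phi(x)
    if x < 0:
        return 8.0 / 3.0 * phi(2.0 * x)
    return 0.0  # the value at the single point x = 0 does not affect integrals


def midpoint_integral(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


# Integrate separately on each side of the jump at zero.
mass_negative = midpoint_integral(density_x1, -10.0, 0.0)
mass_positive = midpoint_integral(density_x1, 0.0, 10.0)
mean = (midpoint_integral(lambda x: x * density_x1(x), -10.0, 0.0)
        + midpoint_integral(lambda x: x * density_x1(x), 0.0, 10.0))

print(mass_negative + mass_positive)  # total probability
print(mean)                           # expectation of X_1
print(mass_negative)                  # probability that X_1 < 0
```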

Stochastic Differential Equations

Stochastic differential equations (SDEs) form a large and very important part of the theory of stochastic calculus. Much like ordinary differential equations (ODEs), they describe the behaviour of a dynamical system over infinitesimal time increments, and their solutions show how the system evolves over time. The difference with SDEs is that they include a source of random noise, typically given by a Brownian motion. Since Brownian motion has many pathological properties, such as being everywhere nondifferentiable, classical differential techniques are not well equipped to handle such equations. Standard results regarding the existence and uniqueness of solutions to ODEs do not apply in the stochastic case, and cannot readily describe what it even means to solve such a system. I will make some posts explaining how the theory of stochastic calculus applies to systems described by an SDE.

Consider a stochastic differential equation describing the evolution of a real-valued process {X_t}_{t≥0},

$\displaystyle dX_t = \sigma(X_t)\,dW_t + b(X_t)\,dt$

(1)

which can be specified along with an initial condition X₀ = x₀. Here, b is the drift specifying how X moves on average across the dt time, σ is a volatility term giving the amplitude of the random noise and W is a driving Brownian motion providing the source of the randomness. There are numerous situations where equations such as (1) are used, with applications in physics, finance, filtering theory, and many other areas.

In the case where σ is zero, (1) is just an ordinary differential equation dX/dt = b(X). In the general case, we can informally think of dividing through by dt to give an ODE plus an additional noise term

$\displaystyle \frac{dX_t}{dt}=b(X_t)+\sigma(X_t)\xi_t.$

(2)

I have set ξ_t = dW_t/dt which can be thought of as a process whose values at each time are independent zero-mean random variables. As mentioned above, though, Brownian motion is not differentiable so this does not exist in the usual sense. While it can be described by a kind of random distribution, even distribution theory is not well-equipped to handle such equations involving multiplying by the nondifferentiable process σ(X_t). Instead, (1) can be integrated to obtain

$\displaystyle X_t=X_0+\int_0^t\sigma(X_s)\,dW_s+\int_0^tb(X_s)\,ds,$

(3)

where the right-hand-side is interpreted using stochastic integration with respect to the semimartingale W. Likewise, X will be a semimartingale, and such solutions are often referred to as diffusions.
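The integral form (3) also suggests a simple way of simulating solutions numerically, by replacing the integrals with sums over a time grid. The following is a minimal sketch of this Euler–Maruyama scheme for the one-dimensional case; the function name `euler_maruyama` and its parameters are hypothetical, the coefficients in the usage lines are arbitrary illustrative choices, and no claim is made here about convergence for general σ and b.

```python
import math
import random


def euler_maruyama(sigma, b, x0, t_end, n_steps, rng):
    """Approximate a path of dX = sigma(X) dW + b(X) dt on [0, t_end].

    Returns the list [X_0, X_{dt}, ..., X_{t_end}] of n_steps + 1 values.
    """
    dt = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        x = x + sigma(x) * dw + b(x) * dt
        path.append(x)
    return path


# Usage: constant volatility with mean-reverting drift b(x) = -x.
rng = random.Random(0)
path = euler_maruyama(lambda x: 0.5, lambda x: -x, 1.0, 1.0, 1000, rng)
print(path[-1])
```

Setting σ to zero reduces the scheme to the usual Euler method for the ODE dX/dt = b(X).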

The differential form (1) can be interpreted as a shorthand for the integral expression (3), which I will do in these notes. It can be generalized to n-dimensional processes by allowing b to take values in ℝⁿ, σ(x) to be an n × m matrix, and W to be an m-dimensional Brownian motion. That is, W = (W¹, …, W^m) where Wⁱ are independent Brownian motions. I will sometimes write this as

$\displaystyle dX^i_t=\sigma_{ij}(X_t)\,dW^j_t+b_i(X_t)\,dt$

where the summation convention is being applied, with subscripts or superscripts occurring more than once in a single term being summed over their range (here, j runs from 1 to m).

Unlike ODEs, when dealing with SDEs we need to consider what underlying probability space the solution is defined with respect to. This leads to the existence of different classes of solutions.

Strong solutions where X can be expressed as a measurable function of the Brownian motion W or, equivalently, X is adapted to its natural filtration.
Weak solutions where X need not be a function of W. Such cases may require additional randomness so may not exist on the probability space with respect to which the Brownian motion W is defined. It can be necessary to extend the filtered probability space to construct these solutions.

Likewise, when considering uniqueness of solutions, there are different ways this occurs.

Pathwise uniqueness where, up to indistinguishability, there is only one solution X. This should hold not just on one specific space containing a Brownian motion W, but on all such spaces. That is, weak solutions should be unique.
Uniqueness in law where there may be multiple pathwise solutions, but their distribution is uniquely determined by the SDE.

There are various general conditions under which strong solutions and pathwise uniqueness are guaranteed for SDE (1), such as the Itô result for Lipschitz continuous coefficients. I covered this situation in a previous post.

Other than using the SDE (1), such systems can also be described by an associated differential operator. For the n-dimensional case set a(x) = σ(x)σ(x)^T, which is an n × n positive semidefinite matrix. Then, the second order operator L can be defined

$\displaystyle Lf(x)=\frac12a_{ij}(x)f_{,ij}(x)+b_{i}(x)f_{,i}(x)$

operating on twice continuously differentiable functions f: ℝⁿ → ℝ. Being able to effortlessly switch between descriptions using the SDE (1) and the operator L is a huge benefit when working with such systems. There are several different ways in which the operator can be used to describe a stochastic process, all of which relate to weak solutions and uniqueness in law of the SDE.

Markov Generator: A Markov process is a weak solution to the SDE (1) if its infinitesimal generator is L. That is, if the transition function is P_t then,

$\displaystyle \lim_{t\rightarrow0}t^{-1}(P_tf-f)=Lf$

for suitably regular functions f.

Backwards Equation: For a function f: ℝⁿ × ℝ₊ → ℝ, f(X_t, t) is a local martingale if and only if f solves the partial differential equation (PDE)

$\displaystyle \frac{\partial f}{\partial t}+Lf=0.$

Consequently, for any time t > 0 and function g: ℝⁿ → ℝ, if we let f be a solution to the PDE above with boundary condition f(x, t) = g(x) then, assuming integrability conditions, the conditional expectations at times s < t are

$\displaystyle {\mathbb E}[g(X_t)\;\vert\mathcal F_s]=f(X_s,s).$

If the conditions are satisfied, this describes a Markov process and gives its transition probabilities, describing the distribution of X and implying uniqueness in law.

Forward Equation: Assuming that it is sufficiently smooth, the probability density p(t, x) of X_t satisfies the PDE

$\displaystyle \frac{\partial p}{\partial t}=L^Tp,$

where L^T is the transpose of operator L

$\displaystyle L^Tp=\frac12(a_{ij}p)_{,ij}-(b_ip)_{,i}.$

If this PDE has a unique solution for given initial distribution, then this uniquely determines the distribution of X_t. So, if unique solutions to the forward equation exist starting at every future time, it gives uniqueness in law for X.

Martingale problem: Any weak solution to SDE (1) satisfies the property that

$\displaystyle f(X_t)-\int_0^t Lf(X_s)\,ds$

is a local martingale for twice continuously differentiable functions f: ℝⁿ → ℝ. This approach, which was pioneered by Stroock and Varadhan, has many benefits over the other applications of operator L described above, since it applies much more generally. We do not need to a-priori impose any properties on X such as being Markov, and as the test functions f are chosen at will, they automatically satisfy the necessary regularity properties. As well as being a very general way to describe solutions to a stochastic dynamical system, it turns out to be very fruitful. The striking and far-reaching Stroock–Varadhan uniqueness theorem, in particular, guarantees existence and uniqueness in law so long as a is continuous and positive definite and b is locally bounded.
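To see how these descriptions fit together in the simplest case, consider one-dimensional Brownian motion, so that n = m = 1 with σ = 1 and b = 0. This example is added purely as an illustration. The operator is

$\displaystyle Lf(x)=\frac12f^{\prime\prime}(x).$

For the backwards equation with g(x) = x², the function f(x, s) = x² + t – s satisfies f(x, t) = g(x) and

$\displaystyle \frac{\partial f}{\partial s}+Lf=-1+1=0,$

recovering the identity 𝔼[W_t² | ℱ_s] = W_s² + t – s. For the forward equation, the density of W_t started from 0 is p(t, x) = (2πt)^{-1/2} exp(–x²/(2t)), and direct differentiation gives

$\displaystyle \frac{\partial p}{\partial t}=\left(\frac{x^2}{2t^2}-\frac1{2t}\right)p=\frac12\frac{\partial^2p}{\partial x^2}.$

Finally, taking f(x) = x² in the martingale problem gives Lf = 1, so that

$\displaystyle f(W_t)-\int_0^tLf(W_s)\,ds=W_t^2-t,$

which is the familiar martingale.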

The Ito-Tanaka-Meyer Formula

Ito’s lemma is one of the most important and useful results in the theory of stochastic calculus. This is a stochastic generalization of the chain rule, or change of variables formula, and differs from the classical deterministic formulas by the presence of a quadratic variation term. One drawback which can limit the applicability of Ito’s lemma in some situations, is that it only applies for twice continuously differentiable functions. However, the quadratic variation term can alternatively be expressed using local times, which relaxes the differentiability requirement. This generalization of Ito’s lemma was derived by Tanaka and Meyer, and applies to one dimensional semimartingales.

The local time of a stochastic process X at a fixed level x can be written, very informally, as an integral of a Dirac delta function with respect to the continuous part of the quadratic variation ${[X]^{c}}$,

$\displaystyle L^x_t=\int_0^t\delta(X-x)d[X]^c.$

(1)

This was explained in an earlier post. As the Dirac delta is only a distribution, and not a true function, equation (1) is not really a well-defined mathematical expression. However, as we saw, with some manipulation a valid expression can be obtained which defines the local time whenever X is a semimartingale.

Going in a slightly different direction, we can try multiplying (1) by a bounded measurable function ${f(x)}$ and integrating over x. Commuting the order of integration on the right hand side, and applying the defining property of the delta function, that ${\int f(X-x)\delta(x)dx}$ is equal to ${f(X)}$, gives

$\displaystyle \int_{-\infty}^{\infty} L^x_t f(x)dx=\int_0^tf(X)d[X]^c.$

(2)

By eliminating the delta function, the right hand side has been transformed into a well-defined expression. In fact, it is now the left side of the identity that is a problem, since the local time was only defined up to probability one at each level x. Ignoring this issue for the moment, recall the version of Ito’s lemma for general non-continuous semimartingales,

$\displaystyle \begin{aligned} f(X_t)=& f(X_0)+\int_0^t f^{\prime}(X_-)dX+\frac12A_t\\ &\quad+\sum_{s\le t}\left(\Delta f(X_s)-f^\prime(X_{s-})\Delta X_s\right). \end{aligned}$

(3)

where ${A_t=\int_0^t f^{\prime\prime}(X)d[X]^c}$. Equation (2) allows us to express this quadratic variation term using local times,

$\displaystyle A_t=\int_{-\infty}^{\infty} L^x_t f^{\prime\prime}(x)dx.$

The benefit of this form is that, even though it still uses the second derivative of ${f}$, it is only really necessary for this to exist in a weaker, measure theoretic, sense. Suppose that ${f}$ is convex, or a linear combination of convex functions. Then, its right-hand derivative ${f^\prime(x+)}$ exists, and is itself of locally finite variation. Hence, the Stieltjes integral ${\int L^xdf^\prime(x+)}$ exists. The infinitesimal ${df^\prime(x+)}$ is alternatively written ${f^{\prime\prime}(dx)}$ and, in the twice continuously differentiable case, equals ${f^{\prime\prime}(x)dx}$. Then,

$\displaystyle A_t=\int _{-\infty}^{\infty} L^x_t f^{\prime\prime}(dx).$

(4)

Using this expression in (3) gives the Ito-Tanaka-Meyer formula.
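To illustrate (4) with the simplest non-smooth example, which is added here only as a sketch, take ${f(x)=x\vee0}$. This is convex with right-hand derivative ${f^\prime(x+)=1_{\{x\ge0\}}}$, which jumps by one at the origin and is constant elsewhere. So, ${f^{\prime\prime}(dx)}$ is the unit point mass ${\delta_0(dx)}$ at zero, and (4) gives

$\displaystyle A_t=\int_{-\infty}^{\infty} L^x_t\,\delta_0(dx)=L^0_t.$

Similarly, ${f(x)=\lvert x\rvert}$ has ${f^{\prime\prime}(dx)=2\delta_0(dx)}$, giving ${A_t=2L^0_t}$.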

Failure of the Martingale Property For Stochastic Integration

If X is a cadlag martingale and ${\xi}$ is a uniformly bounded predictable process, then is the integral

$\displaystyle Y=\int\xi\,dX$

(1)

a martingale? If ${\xi}$ is elementary this is one of the most basic properties of martingales. If X is a square integrable martingale, then so is Y. More generally, if X is an ${L^p}$-integrable martingale, for any ${p > 1}$, then so is Y. Furthermore, integrability of the maximum ${\sup_{s\le t}\lvert X_s\rvert}$ is enough to guarantee that Y is a martingale. Also, it is a fundamental result of stochastic integration that Y is at least a local martingale and, for this to be true, it is only necessary for X to be a local martingale and ${\xi}$ to be locally bounded. In the general situation for cadlag martingales X and bounded predictable ${\xi}$, it need not be the case that Y is a martingale. In this post I will construct an example showing that Y can fail to be a martingale.

Special Semimartingales

For stochastic processes in discrete time, the Doob decomposition uniquely decomposes any integrable process into the sum of a martingale and a predictable process. If ${\{X_n\}_{n=0,1,\ldots}}$ is an integrable process adapted to a filtration ${\{\mathcal{F}_n\}_{n=0,1,\ldots}}$ then we write ${X_n=M_n+A_n}$. Here, M is a martingale, so that ${M_{n-1}={\mathbb E}[M_n\vert\mathcal{F}_{n-1}]}$, and A is predictable with ${A_0=0}$. By saying that A is predictable, we mean that ${A_n}$ is ${\mathcal{F}_{n-1}}$ measurable for each ${n\ge1}$. It can be seen that this implies that

$\displaystyle A_n-A_{n-1}={\mathbb E}[A_n-A_{n-1}\vert\mathcal{F}_{n-1}]={\mathbb E}[X_n-X_{n-1}\vert\mathcal{F}_{n-1}].$

Then it is possible to write A and M as

$\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle A_n&\displaystyle=\sum_{k=1}^n{\mathbb E}[X_k-X_{k-1}\vert\mathcal{F}_{k-1}],\smallskip\\ \displaystyle M_n&\displaystyle=X_n-A_n. \end{array}$

(1)

So, the Doob decomposition is unique and, conversely, the processes A and M constructed according to equation (1) can be seen to be, respectively, a predictable process starting from zero and a martingale. For many purposes, this allows us to reduce problems concerning processes in discrete time to simpler statements about martingales and separately about predictable processes. In the case where X is a submartingale then things reduce further as, in this case, A will be an increasing process.
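Equation (1) translates directly into code. The sketch below is an illustration under stated assumptions rather than anything taken from these notes: the function names are hypothetical, the conditional expectations of the increments must be supplied by the caller, and the example process is ${X_n=S_n^2}$ for a symmetric simple random walk S, for which each conditional increment equals 1 so that ${A_n=n}$ and ${M_n=S_n^2-n}$.

```python
from itertools import product


def doob_decomposition(x_path, conditional_increments):
    """Split a path X_0, ..., X_N into martingale and predictable parts.

    conditional_increments[k-1] must hold E[X_k - X_{k-1} | F_{k-1}]
    evaluated along this path, for k = 1, ..., N.
    """
    a_path = [0.0]
    for increment in conditional_increments:
        a_path.append(a_path[-1] + increment)
    m_path = [x - a for x, a in zip(x_path, a_path)]
    return m_path, a_path


def squared_walk_example(steps):
    """Decompose X_n = S_n^2 for a walk S with the given +/-1 steps."""
    s, x_path = 0, [0.0]
    for step in steps:
        s += step
        x_path.append(float(s * s))
    # For a symmetric walk, E[X_k - X_{k-1} | F_{k-1}] = 1 at every step.
    return doob_decomposition(x_path, [1.0] * len(steps))


# Averaging M_N over all equally likely paths of length N should give M_0 = 0.
N = 8
terminal_values = [squared_walk_example(steps)[0][-1]
                   for steps in product((-1, 1), repeat=N)]
print(sum(terminal_values) / len(terminal_values))
```

Enumerating all paths, rather than sampling them, keeps the check of the martingale property exact for this small example.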

The situation is considerably more complicated when looking at processes in continuous time. The extension of the Doob decomposition to continuous time processes, known as the Doob-Meyer decomposition, was an important result historically in the development of stochastic calculus. First, we would usually restrict attention to sufficiently nice modifications of the processes and, in particular, suppose that X is cadlag. When attempting an analogous decomposition to the one above, it is not immediately clear what should be meant by the predictable component. The continuous time predictable processes are defined to be the set of all processes which are measurable with respect to the predictable sigma algebra, which is the sigma algebra generated by the space of processes which are adapted and continuous (or, equivalently, left-continuous). In particular, all continuous and adapted processes are predictable but, due to the existence of continuous martingales such as Brownian motion, this means that decompositions as sums of martingales and predictable processes are not unique. It is therefore necessary to impose further conditions on the term A in the decomposition. It turns out that we obtain unique decompositions if, in addition to being predictable, A is required to be cadlag with locally finite variation (an FV process). The processes which can be decomposed into a local martingale and a predictable FV process are known as special semimartingales. This is precisely the space of locally integrable semimartingales. As usual, we work with respect to a complete filtered probability space ${(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{t\ge0},{\mathbb P})}$ and two stochastic processes are considered to be the same if they are equivalent up to evanescence.

Theorem 1 For a process X, the following are equivalent.

X is a locally integrable semimartingale.

X decomposes as

$\displaystyle X=M+A$ (2)

for a local martingale M and predictable FV process A.

Furthermore, choosing ${A_0=0}$, decomposition (2) is unique.

Theorem 1 is a general version of the Doob-Meyer decomposition. However, the name ‘Doob-Meyer decomposition’ is often used to specifically refer to the important special case where X is a submartingale. Historically, the theorem was first stated and proved for that case, and I will look at the decomposition for submartingales in more detail in a later post.

Predictable FV Processes

By definition, an FV process is a cadlag adapted stochastic process which almost surely has finite variation over finite time intervals. These are always semimartingales, because the stochastic integral for bounded integrands can be constructed by taking the Lebesgue-Stieltjes integral along sample paths. Also, from the previous post on continuous semimartingales, we know that the class of continuous FV processes is particularly well behaved under stochastic integration. For one thing, given a continuous FV process X and predictable ${\xi}$ , then ${\xi}$ is X-integrable in the stochastic sense if and only if it is almost surely Lebesgue-Stieltjes integrable along the sample paths of X. In that case the stochastic and Lebesgue-Stieltjes integrals coincide. Furthermore, the stochastic integral preserves the class of continuous FV processes, so that ${\int\xi\,dX}$ is again a continuous FV process. It was also shown that all continuous semimartingales decompose in a unique way as the sum of a local martingale and a continuous FV process, and that the stochastic integral preserves this decomposition.

Moving on to studying non-continuous semimartingales, it would be useful to extend the results just mentioned beyond the class of continuous FV processes. The first thought might be to simply drop the continuity requirement and look at all FV processes. After all, we know that every FV process is a semimartingale and, by the Bichteler-Dellacherie theorem, that every semimartingale decomposes as the sum of a local martingale and an FV process. However, this does not work out very well. The existence of local martingales with finite variation means that the decomposition given by the Bichteler-Dellacherie theorem is not unique, and need not commute with stochastic integration for integrands which are not locally bounded. Also, it is possible for the stochastic integral of a predictable ${\xi}$ with respect to an FV process X to be well-defined even if ${\xi}$ is not Lebesgue-Stieltjes integrable with respect to X along its sample paths. In this case, the integral ${\int\xi\,dX}$ is not itself an FV process. See this post for examples where this happens.

Instead, when we do not want to restrict ourselves to continuous processes, it turns out that the class of predictable FV processes is the correct generalisation to use. By definition, a process is predictable if it is measurable with respect to the sigma-algebra generated by the adapted and left-continuous processes so, in particular, continuous FV processes are predictable. We can show that all predictable FV local martingales are constant (Lemma 2 below), which will imply that decompositions into the sum of local martingales and predictable FV processes are unique (up to constant processes). I do not look at general semimartingales in this post, so will not prove the existence of such decompositions, although they do follow quickly from the results stated here. We can also show that predictable FV processes are very well behaved with respect to stochastic integration. A predictable process ${\xi}$ is integrable with respect to a predictable FV process X in the stochastic sense if and only if it is Lebesgue-Stieltjes integrable along the sample paths, in which case stochastic and Lebesgue-Stieltjes integrals agree. Also, ${\int\xi\,dX}$ will again be a predictable FV process. See Theorem 6 below.

In the previous post on continuous semimartingales, it was also shown that the continuous FV processes can be characterised in terms of their quadratic variations and covariations. They are precisely the semimartingales with zero quadratic variation. Alternatively, they are continuous semimartingales which have zero quadratic covariation with all local martingales. We start by extending this characterisation to the class of predictable FV processes. As always, we work with respect to a complete filtered probability space ${(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{t\ge0},{\mathbb P})}$ and two stochastic processes are considered to be equal if they are equivalent up to evanescence. Recall that, in these notes, the notation ${[X]^c_t=[X]_t-\sum_{s\le t}(\Delta X_s)^2}$ is used to denote the continuous part of the quadratic variation of a semimartingale X.

Theorem 1 For a process X, the following are equivalent.

X is a predictable FV process.

X is a predictable semimartingale with ${[X]^c=0}$.

X is a semimartingale such that ${[X,M]}$ is a local martingale for all local martingales M.

X is a semimartingale such that ${[X,M]}$ is a local martingale for all uniformly bounded cadlag martingales M.


Continuous Semimartingales

A stochastic process is a semimartingale if and only if it can be decomposed as the sum of a local martingale and an FV process. This is stated by the Bichteler-Dellacherie theorem or, alternatively, is often taken as the definition of a semimartingale. For continuous semimartingales, which are the subject of this post, things simplify considerably. The terms in the decomposition can be taken to be continuous, in which case they are also unique. As usual, we work with respect to a complete filtered probability space ${(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{t\ge0},{\mathbb P})}$ , all processes are real-valued, and two processes are considered to be the same if they are indistinguishable.

Theorem 1 A continuous stochastic process X is a semimartingale if and only if it decomposes as

$\displaystyle X=M+A$ (1)

for a continuous local martingale M and continuous FV process A. Furthermore, assuming that ${A_0=0}$, decomposition (1) is unique.

Proof: As sums of local martingales and FV processes are semimartingales, X is a semimartingale whenever it satisfies the decomposition (1). Furthermore, if ${X=M+A=M^\prime+A^\prime}$ were two such decompositions with ${A_0=A^\prime_0=0}$ then ${M-M^\prime=A^\prime-A}$ is both a local martingale and a continuous FV process. Therefore, ${A^\prime-A}$ is constant, so ${A=A^\prime}$ and ${M=M^\prime}$ .

It just remains to prove the existence of decomposition (1). However, X is continuous and, hence, is locally square integrable. So, Lemmas 4 and 5 of the previous post say that we can decompose ${X=M+A}$ where M is a local martingale, A is an FV process and the quadratic covariation ${[M,A]}$ is a local martingale. As X is continuous we have ${\Delta M=-\Delta A}$ so that, by the properties of covariations,

$\displaystyle -[M,A]_t=-\sum_{s\le t}\Delta M_s\Delta A_s=\sum_{s\le t}(\Delta A_s)^2.$

(2)

We have shown that ${-[M,A]}$ is a nonnegative local martingale so, in particular, it is a supermartingale. This gives ${\mathbb{E}[-[M,A]_t]\le\mathbb{E}[-[M,A]_0]=0}$ . Then (2) implies that ${\Delta A}$ is zero and, hence, A and ${M=X-A}$ are continuous. ⬜

Using decomposition (1), it can be shown that a predictable process ${\xi}$ is X-integrable if and only if it is both M-integrable and A-integrable. Then, the integral with respect to X breaks down into the sum of the integrals with respect to M and A. This greatly simplifies the construction of the stochastic integral for continuous semimartingales. The integral with respect to the continuous FV process A is equivalent to Lebesgue-Stieltjes integration along sample paths, and it is possible to construct the integral with respect to the continuous local martingale M for the full set of M-integrable integrands using the Ito isometry. Many introductions to stochastic calculus focus on integration with respect to continuous semimartingales, which is made much easier because of these results.

Theorem 2 Let ${X=M+A}$ be the decomposition of the continuous semimartingale X into a continuous local martingale M and continuous FV process A. Then, a predictable process ${\xi}$ is X-integrable if and only if

$\displaystyle \int_0^t\xi^2\,d[M]+\int_0^t\vert\xi\vert\,\vert dA\vert < \infty$ (3)

almost surely, for each time ${t\ge0}$ . In that case, ${\xi}$ is both M-integrable and A-integrable and,

$\displaystyle \int\xi\,dX=\int\xi\,dM+\int\xi\,dA$ (4)

gives the decomposition of ${\int\xi\,dX}$ into its local martingale and FV terms.
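
As a small worked illustration of condition (3), which is my own example and not taken from the post, consider Brownian motion with drift, ${X_t=W_t+\mu t}$, for a standard Brownian motion W and a constant ${\mu\not=0}$. Here ${M=W}$ and ${A_t=\mu t}$, so that ${d[M]_s=ds}$ and ${\vert dA_s\vert=\vert\mu\vert\,ds}$. For the deterministic integrand ${\xi_s=s^{-\alpha}}$ with ${\alpha>0}$ (setting ${\xi_0=0}$), condition (3) reads

$\displaystyle \int_0^t s^{-2\alpha}\,ds+\vert\mu\vert\int_0^t s^{-\alpha}\,ds<\infty.$

The first integral is finite precisely when ${\alpha<1/2}$ and the second precisely when ${\alpha<1}$. So ${\xi}$ is X-integrable for ${\alpha<1/2}$ whereas, for ${1/2\le\alpha<1}$, it is A-integrable but not M-integrable and, by Theorem 2, is not X-integrable.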


The Bichteler-Dellacherie Theorem

In this post, I will give a statement and proof of the Bichteler-Dellacherie theorem describing the space of semimartingales. A semimartingale, as defined in these notes, is a cadlag adapted stochastic process X such that the stochastic integral ${\int\xi\,dX}$ is well-defined for all bounded predictable integrands ${\xi}$ . More precisely, an integral should exist which agrees with the explicit expression for elementary integrands, and satisfies bounded convergence in the following sense. If ${\{\xi^n\}_{n=1,2,\ldots}}$ is a uniformly bounded sequence of predictable processes tending to a limit ${\xi}$ , then ${\int_0^t\xi^n\,dX\rightarrow\int_0^t\xi\,dX}$ in probability as n goes to infinity. If such an integral exists, then it is uniquely defined up to zero probability sets.
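
To make the "explicit expression for elementary integrands" concrete, here is a minimal Python sketch. It assumes the usual convention that an elementary integrand takes the value z_k on the interval (s_k, t_k], in which case the integral up to time t is the sum of z_k (X_{t_k∧t} - X_{s_k∧t}). The function name `elementary_integral`, the tuple layout and the deterministic test path are hypothetical choices made only for illustration.

```python
def elementary_integral(X, pieces, t):
    """Integral over [0, t] of an elementary integrand against the path X.

    X      -- function returning the path value X(s) at time s
    pieces -- list of (s_k, t_k, z_k): the integrand equals z_k on (s_k, t_k]
    t      -- upper limit of integration
    """
    total = 0.0
    for s_k, t_k, z_k in pieces:
        # Each piece contributes z_k times the increment of X over (s_k, t_k],
        # with both endpoints capped at the upper limit t.
        total += z_k * (X(min(t_k, t)) - X(min(s_k, t)))
    return total


# Hypothetical deterministic path, chosen only so the result can be checked by hand.
path = lambda s: s * s

# Integrand equal to 2 on (0, 1] and to -1 on (1, 3].
pieces = [(0.0, 1.0, 2.0), (1.0, 3.0, -1.0)]

value_at_2 = elementary_integral(path, pieces, 2.0)     # 2*(1 - 0) - (4 - 1)
value_at_half = elementary_integral(path, pieces, 0.5)  # 2*(0.25 - 0)
print(value_at_2, value_at_half)
```

In the stochastic setting the coefficients z_k would be random variables known at time s_k, and X a sample path of the process. The formula is applied path by path in exactly the same way.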

An immediate consequence of bounded convergence is that the set of integrals ${\int_0^t\xi\,dX}$ for a fixed time t and bounded elementary integrands ${\vert\xi\vert\le1}$ is bounded in probability. That is,

$\displaystyle \left\{\int_0^t\xi\,dX\colon\xi{\rm\ is\ elementary},\ \vert\xi\vert\le1\right\}$ (1)

is bounded in probability, for each ${t\ge0}$. For cadlag adapted processes, it was shown in a previous post that this is both a necessary and sufficient condition to be a semimartingale. Some authors use the property that (1) is bounded in probability as the definition of semimartingales (e.g., Protter, Stochastic Integration and Differential Equations). The existence of the stochastic integral for arbitrary predictable integrands does not follow particularly easily from this definition, at least not without using results on extensions of vector valued measures. On the other hand, if you are content to restrict to integrands which are left-continuous with right limits, the integral can be constructed very efficiently and, furthermore, such integrands are sufficient for many uses (integration by parts, Ito’s formula, a large class of stochastic differential equations, etc.).

It was previously shown in these notes that, if X can be decomposed as ${X=M+V}$ for a local martingale M and FV process V, then it is possible to construct the stochastic integral, so X is a semimartingale. The importance of the Bichteler-Dellacherie theorem is that it tells us that a process is a semimartingale if and only if it is the sum of a local martingale and an FV process. In fact, this was the historical definition of semimartingales, and it is probably still the most common one.

Throughout, we work with respect to a complete filtered probability space ${(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{t\ge0},{\mathbb P})}$ , and all processes are real-valued.

Theorem 1 (Bichteler-Dellacherie) For a cadlag adapted process X, the following are equivalent.

1. X is a semimartingale.

2. For each ${t\ge0}$, the set given by (1) is bounded in probability.

3. X is the sum of a local martingale and an FV process.

Furthermore, the local martingale term in 3 can be taken to be locally bounded.


Failure of Pathwise Integration for FV Processes

Figure 1: A non-pathwise stochastic integral of an FV Process

The motivation for developing a theory of stochastic integration is that many important processes, such as standard Brownian motion, have sample paths which are extraordinarily badly behaved. With probability one, the path of a Brownian motion is nowhere differentiable and has infinite variation over all nonempty time intervals. This rules out the application of the techniques of ordinary calculus. In particular, the Stieltjes integral can be applied with respect to integrators of finite variation, but fails to give a well-defined integral with respect to Brownian motion. The Ito stochastic integral was developed to overcome this difficulty, at the cost both of restricting the integrand to be an adapted process, and the loss of pathwise convergence in the dominated convergence theorem (convergence in probability holds instead).

However, as I demonstrate in this post, the stochastic integral represents a strict generalization of the pathwise Lebesgue-Stieltjes integral even for processes of finite variation. That is, if V has finite variation, then there can still be predictable integrands ${\xi}$ such that the integral ${\int\xi\,dV}$ is undefined as a Lebesgue-Stieltjes integral on the sample paths, but is well-defined in the Ito sense.
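
The following Python sketch illustrates one construction of this kind. It is an assumption chosen for illustration and not necessarily the example used in the full post. Let V jump by ε_n/n² at time t_n = 1 - 1/(n+1), where the ε_n are independent signs ±1 with equal probability, and take the deterministic (hence predictable) integrand equal to n at time t_n. Then V has total variation Σ 1/n² < ∞, while the pathwise integral would require Σ 1/n to be finite, which it is not. The signed sum Σ ε_n/n does converge almost surely, though, and this is the value of the stochastic integral at time 1. The function name `partial_sums` is hypothetical.

```python
import random


def partial_sums(n_terms, seed=0):
    """Partial sums for the jump process V and integrand xi described above."""
    rng = random.Random(seed)
    signed = 0.0     # sum of xi * (jump of V): partial sum of eps_n / n
    absolute = 0.0   # sum of |xi| * |jump of V|: partial sum of 1 / n
    variation = 0.0  # total variation of V so far: partial sum of 1 / n^2
    for n in range(1, n_terms + 1):
        eps = rng.choice((-1.0, 1.0))  # independent random sign
        jump = eps / (n * n)           # jump of V at time 1 - 1/(n+1)
        xi = float(n)                  # integrand value at that time
        signed += xi * jump
        absolute += abs(xi * jump)
        variation += abs(jump)
    return signed, absolute, variation


signed, absolute, variation = partial_sums(100_000)
print("total variation of V:", variation)  # stays below pi^2 / 6
print("sum of |xi dV|      :", absolute)   # grows like log(n_terms)
print("sum of xi dV        :", signed)     # settles down as n_terms grows
```

Increasing `n_terms` makes the absolute sum grow without bound, which is why the Lebesgue-Stieltjes integral is undefined on every sample path, while the signed sum stabilizes.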

The Martingale Representation Theorem

The martingale representation theorem states that any martingale adapted with respect to a Brownian motion can be expressed as a stochastic integral with respect to the same Brownian motion.

Theorem 1 Let B be a standard Brownian motion defined on a probability space ${(\Omega,\mathcal{F},{\mathbb P})}$ and ${\{\mathcal{F}_t\}_{t\ge 0}}$ be its natural filtration.

Then, every ${\{\mathcal{F}_t\}}$-local martingale M can be written as

$\displaystyle M = M_0+\int\xi\,dB$

for a predictable, B-integrable, process ${\xi}$ .

As stochastic integration preserves the local martingale property for continuous processes, this result characterizes the space of all local martingales starting from 0 defined with respect to the filtration generated by a Brownian motion as being precisely the set of stochastic integrals with respect to that Brownian motion. Equivalently, Brownian motion has the predictable representation property. This result is often used in mathematical finance as the statement that the Black-Scholes model is complete. That is, any contingent claim can be exactly replicated by trading in the underlying stock. This does involve some rather large and somewhat unrealistic assumptions on the behaviour of financial markets and ability to trade continuously without incurring additional costs. However, in this post, I will be concerned only with the mathematical statement and proof of the representation theorem.
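
As a concrete instance of the representation, Ito's formula gives ${B_t^2-t=\int_0^t2B_s\,dB_s}$, so the martingale ${M_t=B_t^2-t}$ has integrand ${\xi=2B}$. The sketch below checks this numerically on a single simulated path, approximating the stochastic integral by a left-endpoint Riemann sum. The step count, seed and function name `check_representation` are arbitrary illustrative choices, and agreement holds only up to discretization error.

```python
import math
import random


def check_representation(n_steps=100_000, horizon=1.0, seed=0):
    """Compare B_T^2 - T with a left-endpoint sum approximating int 2 B dB."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    sd = math.sqrt(dt)  # standard deviation of each Brownian increment
    b = 0.0             # current value of the simulated Brownian path
    integral = 0.0      # running approximation of the stochastic integral
    for _ in range(n_steps):
        db = rng.gauss(0.0, sd)
        # The integrand 2B is evaluated before the increment is added,
        # as required for the Ito integral.
        integral += 2.0 * b * db
        b += db
    martingale = b * b - horizon
    return martingale, integral


martingale_value, integral_value = check_representation()
print(martingale_value, integral_value)
```

The two values differ by the sum of squared increments minus the horizon, which tends to zero as the step size shrinks. Evaluating the integrand at the right endpoint instead would not converge to the same limit.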

In more generality, the martingale representation theorem can be stated for a d-dimensional Brownian motion as follows.

Theorem 2 Let ${B=(B^1,\ldots,B^d)}$ be a d-dimensional Brownian motion defined on the filtered probability space ${(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{t\ge 0},{\mathbb P})}$ , and suppose that ${\{\mathcal{F}_t\}}$ is the natural filtration generated by B and ${\mathcal{F}_0}$ .

$\displaystyle \mathcal{F}_t=\sigma\left(\{B_s\colon s\le t\}\cup\mathcal{F}_0\right)$

Then, every ${\{\mathcal{F}_t\}}$ -local martingale M can be expressed as

$\displaystyle M=M_0+\sum_{i=1}^d\int\xi^i\,dB^i$ (1)

for predictable processes ${\xi^i}$ satisfying ${\int_0^t(\xi^i_s)^2\,ds<\infty}$ , almost surely, for each ${t\ge0}$ .

