# Existence of the Stochastic Integral

The principal reason for introducing the concept of semimartingales in stochastic calculus is that they are precisely those processes with respect to which stochastic integration is well defined. Often, semimartingales are defined in terms of decompositions into martingale and finite variation components. Here, I have taken a different approach, and simply defined semimartingales to be processes with respect to which a stochastic integral exists satisfying some necessary properties. That is, integration must agree with the explicit form for piecewise constant elementary integrands, and must satisfy a bounded convergence condition. If it exists, then such an integral is uniquely defined. Furthermore, whatever method is used to actually construct the integral is unimportant to many applications. Only its elementary properties are required to develop a theory of stochastic calculus, as demonstrated in the previous posts on integration by parts, Ito’s lemma and stochastic differential equations.

The purpose of this post is to give an alternative characterization of semimartingales in terms of a simple and seemingly rather weak condition, stated in Theorem 1 below. The necessity of this condition follows from the requirement of integration to satisfy a bounded convergence property, as was commented on in the original post on stochastic integration. That it is also a sufficient condition is the main focus of this post. The aim is to show that the existence of the stochastic integral follows in a relatively direct way, requiring mainly just standard measure theory and no deep results on stochastic processes.

Recall that throughout these notes, we work with respect to a complete filtered probability space ${(\Omega,\mathcal{F},\{\mathcal{F}_t\}_{t\ge 0},{\mathbb P})}$. To recap, elementary predictable processes are of the form

 $\displaystyle \xi_t=Z_01_{\{t=0\}}+\sum_{k=1}^n Z_k1_{\{s_k<t\le t_k\}}$ (1)

for an ${\mathcal{F}_0}$-measurable random variable ${Z_0}$, real numbers ${s_k,t_k\ge 0}$ and ${\mathcal{F}_{s_k}}$-measurable random variables ${Z_k}$. The integral of ${\xi}$ with respect to any process X up to time t can be written out explicitly as

 $\displaystyle \int_0^t\xi\,dX = \sum_{k=1}^n Z_k(X_{t_k\wedge t}-X_{s_k\wedge t}).$ (2)
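For a single fixed sample path, formula (2) is just a finite sum, which the following minimal sketch evaluates. The function name `elementary_integral`, the argument `u` (standing for the right endpoints ${t_k}$) and the representation of the path as a Python callable are illustrative assumptions, not part of the notes.

```python
from typing import Callable, Sequence


def elementary_integral(
    Z: Sequence[float],
    s: Sequence[float],
    u: Sequence[float],
    X: Callable[[float], float],
    t: float,
) -> float:
    """Evaluate sum_k Z_k * (X(u_k ^ t) - X(s_k ^ t)) along one path X.

    Z[k] is the value taken on the interval (s[k], u[k]]; the Z_0 term of
    (1) is omitted because it does not contribute to the integral (2).
    """
    total = 0.0
    for z_k, s_k, u_k in zip(Z, s, u):
        # min(., t) implements the truncation at time t in formula (2).
        total += z_k * (X(min(u_k, t)) - X(min(s_k, t)))
    return total


# Example path X(r) = r^2, integrand 2 on (0, 1] and -1 on (1, 3], up to t = 2.
example = elementary_integral(
    [2.0, -1.0], [0.0, 1.0], [1.0, 3.0], lambda r: r * r, 2.0
)
```

In the example, the second interval is cut off at t = 2, so the sum is 2(1 - 0) - (4 - 1).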

The predictable sigma algebra, ${\mathcal{P}}$, on ${{\mathbb R}_+\times\Omega}$ is generated by the set of left-continuous and adapted processes or, equivalently, by the elementary predictable processes. The idea behind stochastic integration is to extend the integral (2) to all bounded and predictable integrands ${\xi\in{\rm b}\mathcal{P}}$. Other than agreeing with (2) for elementary integrands, the only other property required is bounded convergence in probability. That is, if ${\xi^n\in{\rm b}\mathcal{P}}$ is a sequence uniformly bounded by some constant K, so that ${\vert\xi^n\vert\le K}$, and converging to a limit ${\xi}$, then ${\int_0^t\xi^n\,dX\rightarrow\int_0^t\xi\,dX}$ in probability. Nothing else is required. Other properties, such as linearity of the integral with respect to the integrand, follow from this, as was previously noted. Note that we are considering two random variables to be the same if they are almost surely equal. Similarly, uniqueness of the stochastic integral means that, for each integrand, the integral is uniquely defined up to probability one.

Using the definition of a semimartingale as a cadlag adapted process with respect to which the stochastic integral is well defined for bounded and predictable integrands, the main result is as follows. To be clear, in this post all stochastic processes are real-valued.

Theorem 1 A cadlag adapted process X is a semimartingale if and only if, for each ${t\ge 0}$, the set

 $\displaystyle \left\{\int_0^t\xi\,dX\colon \xi{\rm\ is\ elementary}, \vert\xi\vert\le 1\right\}$ (3)

is bounded in probability.

As was previously noted, the necessity of this condition follows from bounded convergence. In fact, boundedness in probability of the set in (3) is equivalent to the statement that, for any sequence ${\xi^n}$ of bounded predictable processes converging uniformly to zero, ${\int_0^t\xi^n\,dX}$ converges to zero in probability. It is interesting to note that this seemingly much weaker uniform convergence property is enough to imply the stronger property of bounded convergence in probability.

The proof of Theorem 1 is given below. However, it helps to take a step back at this point and ask the following question. What do we need to do, in general, to construct linear maps satisfying bounded convergence of sequences? This question is not restricted to stochastic calculus, and can be asked in a much more general situation. Given a measurable space ${(E,\mathcal{E})}$ and a topological vector space V, what does it take to construct a linear map ${\mu\colon{\rm b}\mathcal{E}\rightarrow V}$ which satisfies bounded convergence? In this general situation, bounded convergence means that if ${\xi^n\in{\rm b}\mathcal{E}}$ is a sequence uniformly bounded by a constant K, so that ${\vert\xi^n\vert\le K}$, and ${\xi^n\rightarrow\xi}$ then, ${\mu(\xi^n)\rightarrow\mu(\xi)}$ in the topology of V. Such maps will be referred to as V-valued measures. For stochastic integration, we are only really concerned with the case ${(E,\mathcal{E})=({\mathbb R}_+\times\Omega,\mathcal{P})}$ and where V is the space ${L^0(\Omega,\mathcal{F},{\mathbb P})}$ of random variables under the topology of convergence in probability. However, even in the general situation the following result is true. Here, a subalgebra of ${{\rm b}\mathcal{E}}$ is a subset closed under linear combinations and pointwise multiplication, and containing the constant functions.

Theorem 2 Let ${(E,\mathcal{E})}$ be a measurable space, ${\mathcal{A}}$ be a subalgebra of ${{\rm b}\mathcal{E}}$ generating ${\mathcal{E}}$, and V be a complete topological vector space. Then, a linear map ${\mu\colon\mathcal{A}\rightarrow V}$ extends to a V-valued measure on ${(E,\mathcal{E})}$ if and only if it satisfies the following properties for sequences ${\xi^n\in\mathcal{A}}$.

1. If ${\xi^n\downarrow 0}$ then ${\mu(\xi^n)\rightarrow 0}$.
2. If ${\sum_n\vert\xi^n\vert\le 1}$, then ${\mu(\xi^n)\rightarrow 0}$.

The proof of this result will be given in the next post, as the details would cloud the main argument here and, in any case, it is just a statement involving pure measure theory with no stochastic calculus whatsoever. Both of the conditions of this theorem involve a uniformly bounded sequence ${\xi^n\in\mathcal{A}}$ tending to zero so that, if ${\mu}$ is to extend to a measure satisfying bounded convergence, it is necessary that ${\mu(\xi^n)\rightarrow 0}$. The main result is that these are also sufficient conditions.

It is instructive to pause here for a moment, and consider how these conditions are shown to be true in the case of Lebesgue integration and Stieltjes integration with respect to finite variation functions, and to compare with the stochastic case. The first condition, of monotone convergence, can be shown by making use of the compactness of the interval ${[0,t]}$, and follows for stochastic integration in much the same way as for the Lebesgue integral. In fact, if the set ${\mathcal{A}}$ of functions from which the integral is to be extended are continuous then, by compactness, any sequence decreasing monotonically to zero will also converge uniformly. This useful property is used in the proof below, first extending the integral to continuous integrands and, then, applying Theorem 2 to extend to all bounded predictable integrands.

The second condition of Theorem 2 is really where the difference between standard integration and the stochastic integral comes in. Consider, for example, the usual Lebesgue integral on the interval ${[0,t]}$. This is a nonnegative map so that, as long as the absolute value ${\vert\xi\vert}$ of any ${\xi\in\mathcal{A}}$ is also in ${\mathcal{A}}$, the following inequality holds

 $\displaystyle \sum_{k=1}^n\vert\mu(\xi^k)\vert\le\mu\left(\sum_{k=1}^n\vert\xi^k\vert\right)\le\mu(1).$

It follows that ${\sum_{k=1}^\infty\vert\mu(\xi^k)\vert<\infty}$ so that ${\mu(\xi^n)\rightarrow 0}$ as required by Theorem 2.

Similarly, if ${\mu}$ represents integration with respect to a function with finite total variation V, then ${\vert\mu(\xi)\vert}$ will be bounded by V whenever ${\vert\xi\vert\le 1}$. Defining constants ${\epsilon_k= 1}$ whenever ${\mu(\xi^k)\ge 0}$ and ${\epsilon_k=-1}$ otherwise then, under the second condition of the theorem, the partial sums ${\sum_{k=1}^n\epsilon_k\xi^k}$ are all bounded by 1. This gives the inequality

 $\displaystyle \sum_{k=1}^n\vert\mu(\xi^k)\vert=\sum_{k=1}^n\epsilon_k\mu(\xi^k)=\mu\left(\sum_{k=1}^n\epsilon_k\xi^k\right)\le V.$ (4)

So, once again, ${\sum_{k=1}^\infty\vert\mu(\xi^k)\vert}$ is finite.
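As a small deterministic illustration of inequality (4), the sketch below takes indicator integrands of disjoint intervals, so that ${\sum_k\vert\xi^k\vert\le 1}$, and integrates against a function F of finite variation. The helper names and the particular choice ${F(x)=\vert x-1\vert}$ on ${[0,3]}$, with total variation 3, are assumptions made only for this example.

```python
from typing import Callable, List, Sequence, Tuple


def stieltjes_increments(
    F: Callable[[float], float], intervals: Sequence[Tuple[float, float]]
) -> List[float]:
    """mu(xi^k) = F(b_k) - F(a_k) for the integrand xi^k = 1_{(a_k, b_k]}."""
    return [F(b) - F(a) for a, b in intervals]


def sign_trick(increments: Sequence[float]) -> Tuple[List[int], float]:
    """Choose eps_k = +1 if mu(xi^k) >= 0 and -1 otherwise, as in the text.

    Returns the signs and sum_k eps_k * mu(xi^k), which equals
    sum_k |mu(xi^k)| by construction.
    """
    signs = [1 if inc >= 0 else -1 for inc in increments]
    signed_total = sum(e * inc for e, inc in zip(signs, increments))
    return signs, signed_total


# Hypothetical example: F(x) = |x - 1| on [0, 3] has total variation 1 + 2 = 3.
F = lambda x: abs(x - 1.0)
increments = stieltjes_increments(F, [(0.0, 0.5), (0.5, 2.0), (2.0, 3.0)])
signs, signed_total = sign_trick(increments)
```

Here the increments are -0.5, 0.5 and 1, so the signed total is 2, which is below the total variation 3 as (4) requires.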

This argument does not apply to stochastic integration, which takes values in the space ${L^0}$ of random variables. This is because ${\epsilon_k}$ as defined above would not be real numbers but, instead, are themselves random variables. So, ${\epsilon_k\xi^k}$ need not be predictable processes and equation (4) just does not make sense. In fact, in the stochastic case it is not even true that ${\sum_n\vert\mu(\xi^n)\vert}$ is finite. For example, suppose that ${\mu(\xi)=\int_0^1\xi\,dB}$ is integration with respect to standard Brownian motion on the unit interval. If ${\xi^n_t=1_{\{1/(n+1)<t\le 1/n\}}}$ then ${\mu(\xi^n)=B_{1/n}-B_{1/(n+1)}}$ are independent centered Gaussians with variance ${1/n-1/(n+1)}$. Their absolute values have mean going to zero at rate 1/n and, with probability 1, ${\sum_n\vert\mu(\xi^n)\vert=\infty}$.
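The rates in the Brownian example can be checked with exact formulas rather than simulation. The sketch below uses the standard fact that a centered Gaussian with variance v has mean absolute value ${\sqrt{2v/\pi}}$. The function names are illustrative assumptions. Note that it only compares expected values: the partial sums of ${{\mathbb E}\vert\mu(\xi^n)\vert}$ grow like a multiple of ${\log N}$, while the sums of the variances telescope to ${1-1/(N+1)}$ and stay bounded. The almost sure divergence stated above needs a separate argument.

```python
import math
from typing import Tuple


def mean_abs_increment(n: int) -> float:
    """E|B_{1/n} - B_{1/(n+1)}| = sqrt(2 v / pi), where v = 1/n - 1/(n+1)."""
    v = 1.0 / n - 1.0 / (n + 1)
    return math.sqrt(2.0 * v / math.pi)


def partial_sums(N: int) -> Tuple[float, float]:
    """Return (sum of E|mu(xi^n)|, sum of E[mu(xi^n)^2]) over n = 1..N."""
    abs_sum = sum(mean_abs_increment(n) for n in range(1, N + 1))
    var_sum = sum(1.0 / n - 1.0 / (n + 1) for n in range(1, N + 1))
    return abs_sum, var_sum


if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        abs_sum, var_sum = partial_sums(N)
        print(N, round(abs_sum, 4), round(var_sum, 6))
```

The contrast between the two columns is the same one exploited later in the post, where it is the sum of squares ${\sum_n\mu(\xi^n)^2}$ that is shown to be finite.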

However, it turns out that a modified argument along similar lines does indeed work for stochastic integration and, more generally, for all vector-valued measures taking values in a space ${L^0}$ of random variables under convergence in probability. Instead of carefully choosing ${\epsilon_k}$ to make ${\epsilon_k\mu(\xi^k)}$ positive, as in equation (4), the total opposite is done. That is, ${\epsilon_k}$ are chosen completely at random in the set ${\{1,-1\}}$. This will be enough to show that ${\sum_n\mu(\xi^n)^2}$ is finite. The idea is to apply a variant of the Khintchine inequalities, as stated below in Lemma 3.

Suppose that ${\epsilon_1,\ldots,\epsilon_n}$ is a sequence of independent random variables, each taking the values 1,-1 both with probability 1/2. If ${\alpha_1,\ldots,\alpha_n}$ is a sequence of real numbers, define a random variable Z and ${\sigma\ge 0}$ by,

 $\displaystyle Z = \sum_{k=1}^n\epsilon_k\alpha_k,\ \sigma^2 = \sum_{k=1}^n\alpha_k^2,$ (5)

so that Z has mean zero and variance ${\sigma^2}$.

Lemma 3 Given a real number ${0<K<1}$, there exists a ${\delta>0}$ such that for any ${Z,\sigma}$ of the form (5),

 $\displaystyle {\mathbb P}\left(\vert Z\vert\ge K\sigma\right)\ge\delta.$

Here, ${\delta}$ only depends on K and not on the choice of ${\alpha_1,\ldots,\alpha_n}$.

Proof: We shall make use of the inequality

 $\displaystyle {\mathbb P}(Y> 0)\ge{\mathbb E}[Y]^2/{\mathbb E}[Y^2].$ (6)

This holds for any integrable random variable ${Y}$ with positive mean, and follows from taking expectations of ${1_{\{Y>0\}}\ge L^{-2}Y(2L-Y)}$ and substituting ${{\mathbb E}[Y^2]/{\mathbb E}[Y]}$ for ${L}$.

Now use the properties that ${\epsilon_k}$ are independent with zero mean and ${\epsilon_k^2=1}$ to get ${{\mathbb E}[Z^2]=\sigma^2}$ and,

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle{\mathbb E}[Z^4] &\displaystyle=3\sum_{j\not=k}\alpha_j^2\alpha_k^2+\sum_k\alpha_k^4=3\left(\sum_k\alpha_k^2\right)^2-2\sum_k\alpha_k^4\smallskip\\ &\displaystyle\le 3\sigma^4. \end{array}$

If all of the ${\alpha_k}$ are zero, the result is immediate for any ${\delta\le 1}$, otherwise put ${Y=Z^2-K^2\sigma^2}$ into (6),

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle{\mathbb P}(\vert Z\vert>K\sigma)&\displaystyle\ge{\mathbb E}[Z^2-K^2\sigma^2]^2/{\mathbb E}[(Z^2-K^2\sigma^2)^2]\smallskip\\ &\displaystyle=\sigma^4(1-K^2)^2/({\mathbb E}[Z^4]-2K^2\sigma^4+K^4\sigma^4)\smallskip\\ &\displaystyle\ge (1-K^2)^2/(3-2K^2+K^4) \end{array}$

and the result follows by taking ${\delta}$ to be ${(1-K^2)^2/(3-2K^2+K^4)}$. ⬜
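For small n, the bound of Lemma 3 can be checked exactly by enumerating all ${2^n}$ sign patterns. The following sketch does this, along with the fourth moment identity used in the proof. The function names are illustrative assumptions, and brute-force enumeration is only practical for short coefficient lists.

```python
import itertools
import math
from typing import Sequence


def delta_bound(K: float) -> float:
    """The constant delta = (1 - K^2)^2 / (3 - 2 K^2 + K^4) from the proof."""
    return (1 - K**2) ** 2 / (3 - 2 * K**2 + K**4)


def exact_tail_probability(alpha: Sequence[float], K: float) -> float:
    """P(|Z| > K * sigma), averaging over all 2^n equally likely sign patterns."""
    sigma = math.sqrt(sum(a * a for a in alpha))
    hits = 0
    for signs in itertools.product((1, -1), repeat=len(alpha)):
        z = sum(e * a for e, a in zip(signs, alpha))
        if abs(z) > K * sigma:
            hits += 1
    return hits / 2 ** len(alpha)


def exact_fourth_moment(alpha: Sequence[float]) -> float:
    """E[Z^4] by enumeration; the proof gives 3*sigma^4 - 2*sum(alpha_k^4)."""
    total = 0.0
    for signs in itertools.product((1, -1), repeat=len(alpha)):
        z = sum(e * a for e, a in zip(signs, alpha))
        total += z**4
    return total / 2 ** len(alpha)
```

For instance, with coefficients (3, 1, 1) the enumeration gives ${{\mathbb E}[Z^4]=197=3\cdot 11^2-2\cdot 83}$, in agreement with the formula in the proof.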

Lemma 3 can be used to give the following result concerning sums of random variables.

Lemma 4 Let ${Z_1,Z_2,\ldots}$ be a sequence of random variables such that the set

 $\displaystyle S\equiv\left\{\sum_{k=1}^n\epsilon_kZ_k\colon n\in{\mathbb N}, \epsilon_1,\ldots,\epsilon_n\in\{ 1, -1\}\right\}$

is bounded in probability. Then, ${\sum_{k=1}^\infty Z_k^2}$ is almost surely finite. In particular, ${Z_k\rightarrow 0}$ almost surely and, hence, also in probability.

Proof: Set ${\sigma_n\equiv(\sum_{k=1}^nZ_k^2)^{1/2}}$. Then, if ${\mu}$ is the uniform probability measure on ${\{1,-1\}^n}$ and ${K,\delta>0}$ are as in Lemma 3,

 $\displaystyle \int 1_{\{\vert\sum_{k=1}^n\epsilon_kZ_k\vert\ge K\sigma_n\}}\,d\mu({\bf\epsilon})\ge\delta.$

Now, choose a positive constant L, multiply both sides by ${1_{\{\sigma_n\ge L\}}}$ and take expectations. Exchanging the order of integration with respect to ${{\mathbb P}}$ and ${\mu}$ gives,

 $\displaystyle \int {\mathbb P}\left(\left\vert\sum_{k=1}^n\epsilon_kZ_k\right\vert\ge K\sigma_n\ge KL\right)\,d\mu({\bf\epsilon})\ge\delta{\mathbb P}(\sigma_n\ge L).$ (7)

For any ${\beta>0}$, the boundedness of S in probability means that L can be chosen large enough that ${{\mathbb P}(\vert Y\vert\ge KL)\le\beta}$ for all Y in S. Substituting this inequality in the left hand side of (7),

 $\displaystyle \beta\ge\delta{\mathbb P}(\sigma_n\ge L).$

Now, let n increase to infinity,

 $\displaystyle {\mathbb P}(\sum_k Z_k^2=\infty)\le {\mathbb P}(\sum_k Z_k^2> L^2)=\lim_n{\mathbb P}(\sigma_n>L)\le \beta/\delta.$

Finally, as ${\beta>0}$ can be chosen arbitrarily small, this implies that ${\{\sum_kZ_k^2=\infty\}}$ has zero probability. ⬜

This result shows that, in the case where V is a space ${L^0(\Omega,\mathcal{F},{\mathbb P})}$ of random variables under the topology of convergence in probability, the second condition in Theorem 2 can be dropped altogether.

Theorem 5 Let ${(E,\mathcal{E})}$ be a measurable space, ${\mathcal{A}}$ be a subalgebra of ${{\rm b}\mathcal{E}}$ generating ${\mathcal{E}}$, and ${(\Omega,\mathcal{F},{\mathbb P})}$ be a probability space. Then, a linear map ${\mu\colon\mathcal{A}\rightarrow L^0(\Omega,\mathcal{F},{\mathbb P})}$ extends to an ${L^0}$-valued measure on ${(E,\mathcal{E})}$ if and only if it satisfies the following property.

• If ${\xi^n\in\mathcal{A}}$ is a sequence with ${\xi^n\downarrow 0}$, then ${\mu(\xi^n)\rightarrow 0}$ (in probability).

Proof: It just needs to be shown that the second condition of Theorem 2 is satisfied. First, note that the condition of the theorem is strong enough to ensure that the set

 $\displaystyle S=\left\{\mu(\xi)\colon\xi\in\mathcal{A},\ \vert\xi\vert\le 1\right\}$

is bounded in probability. This follows from the sequential characterization of boundedness: if ${\xi^n\in\mathcal{A}}$ are bounded by 1 then ${\zeta^n\equiv 2^{-n}(3+\xi^n)}$ is decreasing to zero, since ${\zeta^n-\zeta^{n+1}=2^{-n-1}(3+2\xi^n-\xi^{n+1})\ge 0}$ and ${0\le\zeta^n\le 2^{2-n}}$. So, ${2^{-n}\mu(\xi^n)=\mu(\zeta^n)-2^{-n}\mu(3)}$ tends to zero in probability.

Now consider a sequence ${\xi^n\in\mathcal{A}}$ satisfying ${\sum_n\vert\xi^n\vert\le 1}$. Then, for any ${\epsilon_1,\ldots,\epsilon_n\in\{1,-1\}}$, the partial sum ${\sum_{k=1}^n\epsilon_k\xi^k\in\mathcal{A}}$ is bounded by 1. So, the set of all such sums ${\sum_{k=1}^n\epsilon_k\mu(\xi^k)}$ is contained in S and, hence, is bounded in probability. Then, Lemma 4 says that ${\sum_n\mu(\xi^n)^2}$ is almost surely finite, so that ${\mu(\xi^n)\rightarrow 0}$ in probability. The result now follows from Theorem 2. ⬜

#### Proof of the existence of the stochastic integral

Throughout this section, it is assumed that X is a cadlag adapted process and that the set in (3) is bounded in probability. Fixing a time t, define ${\mu(\xi)\equiv\int_0^t\xi\,dX}$ for elementary predictable processes ${\xi}$, which is a linear map from the elementary predictable processes to ${L^0(\Omega,\mathcal{F},{\mathbb P})}$. The condition on X implies that ${\mu(\xi)}$ is continuous under the topology of uniform convergence for ${\xi}$. This can be strengthened to uniform convergence on compacts in probability.

Lemma 6 If ${\xi^n}$ is a sequence of elementary predictable processes converging ucp to zero, then ${\mu(\xi^n)\rightarrow 0}$ in probability.

Proof: By definition of ucp convergence, ${\sup_{s\le t}\vert\xi^n_s\vert}$ tends to zero in probability as n goes to infinity. It follows that there is a sequence ${\epsilon_n\rightarrow 0}$ such that ${{\mathbb P}(\sup_{s\le t}\vert\xi^n_s\vert>\epsilon_n)\rightarrow 0}$. Then, ${\zeta^n\equiv\max(\min(\xi^n,\epsilon_n),-\epsilon_n)}$ are elementary processes converging uniformly to zero, so ${\mu(\zeta^n)\rightarrow 0}$ in probability. From the explicit expression (2), ${\mu(\xi^n)=\mu(\zeta^n)}$ whenever ${\xi^n_s}$ and ${\zeta^n_s}$ agree on ${s\le t}$, which has probability ${{\mathbb P}(\sup_{s\le t}\vert\xi^n_s\vert\le\epsilon_n)\rightarrow 1}$, so ${\mu(\xi^n)\rightarrow 0}$ in probability as required. ⬜

By extension of continuous linear functions, this allows us to uniquely extend ${\mu}$ to the closure of the elementary processes under the ucp topology, say, ${\mathcal{\bar E}\subseteq{\rm b}\mathcal{P}}$. So, ${\mu\colon\mathcal{\bar E}\rightarrow L^0}$ is linear and continuous under the ucp topology on ${\mathcal{\bar E}}$ and convergence in probability on ${L^0}$.

Every uniformly bounded, continuous and adapted process ${\xi}$ is in ${\mathcal{\bar E}}$. In particular, the elementary processes

 $\displaystyle \xi^n_t = \xi_0 1_{\{t=0\}} + \sum_{k=1}^{n^2}\xi_{(k-1)/n}1_{\{(k-1)/n<t\le k/n\}}$

converge uniformly on compacts to ${\xi}$. Theorem 5 can then be applied with ${\mathcal{A}}$ being the set of uniformly bounded, continuous and adapted processes, as carried out in Lemma 7 below, completing the proof of Theorem 1.
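The left-endpoint approximation ${\xi^n}$ described above can be tried out on a single continuous path. The sketch below is only a pathwise illustration under stated assumptions: the path is a deterministic function f (here the sine function, which is 1-Lipschitz), adaptedness plays no role, and the supremum is estimated on a finite grid. The function names are hypothetical.

```python
import math
from typing import Callable


def left_endpoint_approx(f: Callable[[float], float], n: int, t: float) -> float:
    """Value at time t of the step path built from f on the mesh {k/n}.

    For 0 < t <= n the value is f((k-1)/n), where k is the integer with
    (k-1)/n < t <= k/n (up to floating point rounding of t*n).
    Beyond time n (that is, k > n^2) the approximation is zero.
    """
    if t <= 0.0:
        return f(0.0)
    k = math.ceil(t * n)
    if k > n * n:
        return 0.0
    return f((k - 1) / n)


def sup_error(f: Callable[[float], float], n: int, T: float, grid: int = 2000) -> float:
    """Largest gap between f and its approximation over a grid on [0, T]."""
    times = [T * j / grid for j in range(grid + 1)]
    return max(abs(left_endpoint_approx(f, n, t) - f(t)) for t in times)
```

For a 1-Lipschitz path and ${n\ge T}$, the error on ${[0,T]}$ is at most the mesh size 1/n, which is the uniform convergence on compacts used in the text.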

Lemma 7 ${\mu}$ extends to an ${L^0}$-valued measure ${\mu^*\colon{\rm b}\mathcal{P}\rightarrow L^0}$.

Proof: As noted above, ${\mu}$ can be extended to the closure ${\mathcal{\bar E}\subseteq{\rm b}\mathcal{P}}$ of the elementary predictable processes under ucp convergence, which contains the set ${\mathcal{A}}$ of uniformly bounded, continuous and adapted processes.

Suppose that a sequence of nonnegative processes ${\xi^n\in\mathcal{A}}$ decreases to zero. Then, for each ${T,\epsilon>0}$ and ${\omega\in\Omega}$, the sets ${C_n\equiv\{s\in[0,T]\colon\xi^n_{s}(\omega)\ge\epsilon\}}$ are closed, bounded and have empty intersection. By compactness, ${C_n=\emptyset}$ for large n, showing that ${\sup_{s\le T}\xi^n_s}$ decreases to zero. So, ${\xi^n}$ converges ucp to zero and, by continuity, ${\mu(\xi^n)}$ goes to zero in probability.

Theorem 5 can now be applied to find an ${L^0}$-valued measure ${\mu^*\colon{\rm b}\mathcal{P}\rightarrow L^0}$ agreeing with ${\mu}$ on ${\mathcal{A}}$. To complete the proof, it only remains to show that they agree on the elementary predictable processes.

Choosing ${\epsilon>0}$, any elementary predictable process ${\xi}$ can be smoothed out as follows to obtain a continuous and adapted process,

 $\displaystyle \zeta^\epsilon_s \equiv \int_0^1\xi_{(s-u\epsilon)\vee 0}\,du.$
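To see what this smoothing does, consider the deterministic integrand ${\xi=1_{(a,b]}}$ with ${0\le a<b}$, for which the integral over u has a closed form. The sketch below is a minimal illustration under that assumption. The function name and the restriction to a single indicator are choices made here, not part of the notes.

```python
def smoothed_indicator(s: float, a: float, b: float, eps: float) -> float:
    """zeta^eps_s for xi = 1_{(a, b]} with 0 <= a < b, so that xi_0 = 0.

    The integrand xi at time max(s - u*eps, 0) equals 1 exactly when
    (s - b)/eps <= u < (s - a)/eps, so the integral over u in [0, 1] is
    the length of that interval intersected with [0, 1].
    """

    def clip01(x: float) -> float:
        return min(max(x, 0.0), 1.0)

    return clip01((s - a) / eps) - clip01((s - b) / eps)


# With a = 1, b = 2 and eps = 0.5 the smoothed path ramps up linearly on
# [1, 1.5], equals 1 on [1.5, 2] and ramps back down on [2, 2.5].
ramp_values = [smoothed_indicator(s, 1.0, 2.0, 0.5) for s in (1.0, 1.25, 2.0, 2.25, 2.5)]
```

The result is piecewise linear and continuous in s, and shrinking eps recovers ${\xi_s}$ at every s, including the endpoints, where left-continuity of ${\xi}$ gives ${\xi_a=0}$ and ${\xi_b=1}$.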

Approximating uniformly by the elementary processes ${n^{-1}\sum_{k=1}^n\xi_{(s-k\epsilon/n)\vee 0}}$ gives

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle\mu(\zeta^\epsilon)&\displaystyle=\lim_{n\rightarrow\infty}n^{-1}\int_0^t\sum_{k=1}^n\xi_{(s-k\epsilon/n)\vee 0}\,dX_s\smallskip\\ &\displaystyle=\lim_{n\rightarrow\infty}\int_0^t\xi\,dX^n+\xi_0(X^n_0-X_0), \end{array}$

where ${X^n_s\equiv n^{-1}\sum_{k=1}^nX_{(s+k\epsilon/n)\wedge t}}$. Taking the limit as n goes to infinity,

 $\displaystyle \mu^*(\zeta^\epsilon)=\mu(\zeta^\epsilon) = \int_0^t\xi\,d\tilde X^\epsilon + \xi_0(\tilde X^\epsilon_0-X_0),$

where ${\tilde X^\epsilon_s=\int_0^1 X_{(s+u\epsilon)\wedge t}\,du}$. Finally, as ${\epsilon}$ decreases to zero, ${\zeta^\epsilon\rightarrow\xi}$ by left-continuity of ${\xi}$, and ${\tilde X^\epsilon\rightarrow X}$ by right-continuity of X. This gives ${\mu^*(\xi)=\int_0^t\xi\,dX}$ as required. ⬜

#### Notes

The definition I gave for a semimartingale as being any cadlag adapted process with respect to which the stochastic integral is well-defined differs somewhat from the ‘classical’ definition. Historically, semimartingales have been defined as processes which can be decomposed into the sum of a local martingale and an FV process. Stochastic integration was developed with respect to such processes using generalizations of the Ito isometry applied to locally square integrable martingales. It was later shown, by the Bichteler-Dellacherie theorem, that any process satisfying the conditions of Theorem 1 above does have such a decomposition. So, the classical semimartingale definition does indeed agree with the one given in these notes.

A different approach is used here, because it is possible to approach subjects such as Ito’s lemma and stochastic differential equations immediately in their full generality, without first developing a lot of technical machinery. Also, the proof given in this post that stochastic integration can be performed with respect to all processes satisfying the conditions of Theorem 1 is much, much more direct than the classical one. For example, it should be clear that the hypothesis that X is adapted can be dropped altogether, as it wasn’t even used anywhere above. Furthermore, right-continuity is only required right at the end to ensure that the constructed integral agrees with the formula for elementary processes.

Later authors have used an approach closer to the one in these notes. For example, Protter (Stochastic Integration and Differential Equations) gives two definitions of semimartingales: the classical definition and the one given by Theorem 1 above. Using this newer semimartingale definition, he is able to derive the stochastic integral for integrands which are L-processes (left continuous with right limits) using continuity under ucp convergence as stated in Lemma 6 above. This is enough for many applications, such as Ito’s lemma. However, he doesn’t extend the integral to arbitrary predictable integrands directly, instead relying on the Bichteler-Dellacherie theorem to show that the classical and newer semimartingale definitions agree. Then, for the full development of the integral, he uses techniques closer to the standard ones.

The only author I am aware of who has developed stochastic calculus without relying on the classical definition is Bichteler (Stochastic Integration with Jumps), who defines a semimartingale as any process satisfying the conditions of Theorem 1 and develops the integral directly from this. This method seems to date back to a 1981 paper by the same author (Stochastic Integration and Lp-Theory of Semimartingales). It is from his treatment of stochastic integration that the idea of using the Khintchine inequality (Lemma 3) to simplify the construction of the integral was obtained. This allowed us to remove the second condition in Theorem 2. There are other methods of proving that the integral satisfies this property, but the use of the Khintchine inequality shows that it is much more general, applying to arbitrary ${L^0}$-valued measures. In fact, Theorem 5 can be similarly proved for all ${L^p}$ spaces, ${0\le p<\infty}$, although that is not used in these notes.

## 2 thoughts on “Existence of the Stochastic Integral”

1. Hello. First, I must thank you for your great blog! I have read them (multiple times! they are sometimes not easy to follow…) and learned a lot from them.

I have a question regarding the proof of Theorem 5. In the proof you prove that, given a sequence $\xi^n$ such that $\vert\xi^n\vert\le 1$, then $2^{-n}\,\mu(\xi^n)$ converges to 0 in probability. From here you imply the boundedness in probability of the image of the unit ball under the (continuous) linear operator, by invoking the sequential characterization of boundedness. However, if I understand it correctly, that characterization requires convergence to 0 of all, not just some, sequences of the form $z_n \mu(\xi^n)$, with $z_n\rightarrow 0$. For real numbers it is obvious, since the sequence $x_n = (\sqrt{2})^n$ satisfies that $x_n/2^n$ converges to zero but is not bounded.

On the other hand, for the particular case at hand (images of bounded, adapted, continuous processes) boundedness follows from the continuity under uniform convergence, which is guaranteed by continuity under UCP convergence.

1. Sorry, I did not answer this comment when it was posted. The sequential characterisation of boundedness applies here because $\xi^n$ is an arbitrary sequence bounded by 1. Let $\alpha_n$ be any sequence of positive reals and S be a set of random variables. If $\alpha_n X_n\to0$ in probability for every sequence $X_n\in S$, then S is bounded in probability.