States on *-Algebras

So far, we have been considering positive linear maps on a *-algebra. Taking things a step further, we want to consider positive maps which are normalized so as to correspond to expectations under a probability measure. That is, we require ${p(1)=1}$, which only makes sense for unital algebras. I use the definitions and notation of the previous post on *-algebras.

Definition 1 A state on a unital *-algebra ${\mathcal A}$ is a positive linear map ${p\colon\mathcal A\rightarrow{\mathbb C}}$ satisfying ${p(1)=1}$.

Examples 3 and 4 of the previous post can be extended to give states.

Example 1 Let ${(X,\mathcal E,\mu)}$ be a probability space, and ${\mathcal A}$ be the bounded measurable maps ${X\rightarrow{\mathbb C}}$. Then, integration w.r.t. ${\mu}$ defines a state on ${\mathcal A}$,

$\displaystyle p(f)=\int f d\mu.$
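As a concrete illustration, here is a minimal sketch of example 1 for a finite state space, so that integration reduces to a weighted sum (the three-point space and the particular weights and function are arbitrary choices). It checks the normalisation ${p(1)=1}$ and positivity ${p(\bar ff)\ge0}$.

```python
# Finite probability space: expectation under mu defines a state on the
# *-algebra of complex-valued functions on X.
X = ["a", "b", "c"]
mu = {"a": 0.5, "b": 0.3, "c": 0.2}  # probability measure: weights sum to 1

def p(f):
    # integration of f with respect to mu
    return sum(f(x) * mu[x] for x in X)

one = lambda x: 1.0
f = lambda x: {"a": 1 + 2j, "b": -1j, "c": 0.5}[x]

print(p(one))                        # normalisation: p(1) = 1
print(p(lambda x: abs(f(x)) ** 2))   # positivity: p(conj(f) f) >= 0
```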

Example 2 Let ${V}$ be an inner product space, and ${\mathcal A}$ be a *-algebra of linear maps ${a\colon V\rightarrow V}$ as in example 2 of the previous post, containing the identity map ${I}$. Then, any ${\xi\in V}$ with ${\lVert\xi\rVert=1}$ defines a state on ${\mathcal A}$,

$\displaystyle p(a)=\langle\xi,a\xi\rangle.$
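Example 2 can be checked numerically. A minimal sketch, assuming ${\mathcal A}$ is the 2×2 complex matrices and taking an arbitrary unit vector ${\xi}$ (the particular entries are made up): it verifies ${p(I)=1}$, positivity ${p(a^*a)=\lVert a\xi\rVert^2\ge0}$, and linearity.

```python
import numpy as np

xi = np.array([3 / 5, 4j / 5])   # unit vector, ||xi|| = 1

def p(a):
    # vector state <xi, a xi>; np.vdot conjugates its first argument
    return np.vdot(xi, a @ xi)

I = np.eye(2)
a = np.array([[1, 2 - 1j], [0.5j, -1]])   # an arbitrary element of A

print(p(I))                                      # p(I) = <xi, xi> = 1
print(p(a.conj().T @ a))                         # p(a* a) = ||a xi||^2 >= 0
print(np.isclose(p(2 * I + a), 2 * p(I) + p(a)))  # linearity
```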

*-Algebras

After the previous posts motivating the idea of studying probability spaces by looking at states on algebras, I will now make a start on the theory. The idea is that an abstract algebra can represent the collection of bounded, and complex-valued, random variables, with a state on this algebra taking the place of the probability measure. By allowing the algebra to be noncommutative, we also incorporate quantum probability.

I will take very small first steps in this post, considering only the basic definition of a *-algebra and positive maps. To effectively emulate classical probability theory in this context will involve additional technical requirements. However, that is not the aim here. We take a bare-bones approach, to get a feeling for the underlying constructs, and start with the definition of a *-algebra. I use ${\bar\lambda}$ to denote the complex conjugate of a complex number ${\lambda}$.

Definition 1 An algebra ${\mathcal A}$ over field ${K}$ is a ${K}$-vector space together with a binary product ${(a,b)\mapsto ab}$ satisfying

$\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle a(bc)=(ab)c,\smallskip\\ &\displaystyle \lambda(ab)=(\lambda a)b=a(\lambda b),\smallskip\\ &\displaystyle a(b+c)=ab+ac,\smallskip\\ &\displaystyle (a+b)c=ac+bc, \end{array}$

for all ${a,b,c\in\mathcal A}$ and ${\lambda\in K}$.

A *-algebra ${\mathcal A}$ is an algebra over ${{\mathbb C}}$ with a unary involution, ${a\mapsto a^*}$, satisfying

$\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle (\lambda a+\mu b)^*=\bar\lambda a^*+\bar\mu b^*,\smallskip\\ &\displaystyle (ab)^*=b^*a^*,\smallskip\\ &\displaystyle a^{**}=a, \end{array}$

for all ${a,b\in\mathcal A}$ and ${\lambda,\mu\in{\mathbb C}}$.

An algebra is called unital if there exists ${1\in\mathcal A}$ such that

$\displaystyle 1a=a1=a$

for all ${a\in\mathcal A}$. Then, ${1}$ is called the unit or identity of ${\mathcal A}$.
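The motivating example of a unital *-algebra is the algebra of ${n\times n}$ complex matrices, with the conjugate transpose as involution and the identity matrix as unit. The following sketch spot-checks the involution axioms numerically; the particular matrices and scalars are random choices with no significance.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
b = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
lam, mu = 2 - 1j, 0.5 + 3j

star = lambda x: x.conj().T   # the involution: conjugate transpose

print(np.allclose(star(lam * a + mu * b),
                  np.conj(lam) * star(a) + np.conj(mu) * star(b)))  # anti-linearity
print(np.allclose(star(a @ b), star(b) @ star(a)))                  # (ab)* = b* a*
print(np.allclose(star(star(a)), a))                                # a** = a
print(np.allclose(np.eye(2) @ a, a) and np.allclose(a @ np.eye(2), a))  # unit
```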

Algebraic Probability: Quantum Theory

We continue the investigation of representing probability spaces as states on algebras. Whereas I previously focused on the commutative case and on classical probability, in the current post I will look at noncommutative quantum probability.

Quantum theory is concerned with computing probabilities of outcomes of measurements of a physical system, as conducted by an observer. The standard approach is to start with a Hilbert space ${\mathcal H}$, which is used to represent the states of the system. This is a vector space over the complex numbers, together with an inner product ${\langle\cdot,\cdot\rangle}$. By definition, this is linear in one argument and anti-linear in the other,

$\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle\langle\phi,\lambda\psi+\mu\chi\rangle=\lambda\langle\phi,\psi\rangle+\mu\langle\phi,\chi\rangle,\smallskip\\ &\displaystyle\langle\lambda\phi+\mu\psi,\chi\rangle=\bar\lambda\langle\phi,\chi\rangle+\bar\mu\langle\psi,\chi\rangle,\smallskip\\ &\displaystyle\langle\psi,\phi\rangle=\overline{\langle\phi,\psi\rangle}, \end{array}$

for ${\phi,\psi,\chi\in\mathcal H}$ and ${\lambda,\mu\in{\mathbb C}}$. Positive definiteness is required, so that ${\langle\psi,\psi\rangle > 0}$ for ${\psi\not=0}$. I am using the physicists’ convention, where the inner product is linear in the second argument and anti-linear in the first. Furthermore, physicists often use the bra–ket notation ${\langle\phi\vert\psi\rangle}$, which can be split up into the ‘bra’ ${\langle\phi\vert}$ and ‘ket’ ${\vert\psi\rangle}$, considered as elements of the dual space of ${\mathcal H}$ and of ${\mathcal H}$ respectively. For a linear operator ${A\colon\mathcal H\rightarrow\mathcal H}$, the expression ${\langle\phi,A\psi\rangle}$ is often expressed as ${\langle\phi\vert A\vert\psi\rangle}$ in the physicists’ language. By the Hilbert space definition, ${\mathcal H}$ is complete with respect to the norm ${\lVert\psi\rVert=\sqrt{\langle\psi,\psi\rangle}}$.
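For a finite-dimensional illustration of this convention: numpy's `vdot` conjugates its first argument, so it computes inner products in exactly the physicists' form. A minimal sketch with arbitrary vectors, checking linearity in the second slot, anti-linearity in the first, and conjugate symmetry:

```python
import numpy as np

phi = np.array([1 + 1j, 2j])
psi = np.array([3, -1j])
lam = 2 - 3j

# <phi, psi> = np.vdot(phi, psi): conjugate-linear in phi, linear in psi
print(np.isclose(np.vdot(phi, lam * psi), lam * np.vdot(phi, psi)))          # linear in 2nd slot
print(np.isclose(np.vdot(lam * phi, psi), np.conj(lam) * np.vdot(phi, psi))) # anti-linear in 1st
print(np.isclose(np.vdot(psi, phi), np.conj(np.vdot(phi, psi))))             # conjugate symmetry
print(np.sqrt(np.vdot(psi, psi).real))                                       # the norm ||psi||
```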

Algebraic Probability (continued)

Continuing on from the previous post, I look at cases where the abstract concept of states on algebras correspond to classical probability measures. Up until now, we have considered commutative real algebras but, before going further, it will help to look instead at algebras over the complex numbers ${{\mathbb C}}$. In the commutative case, we will see that this is equivalent to using real algebras, but can be more convenient, and in the non-commutative case it is essential. When using complex algebras, we will require the existence of an involution, which can be thought of as a generalisation of complex conjugation.

Recall that, by an algebra ${\mathcal A}$ over a field ${K}$, we mean that ${\mathcal A}$ is a ${K}$-vector space together with a binary product operation satisfying associativity, distributivity over addition, compatibility with scalars, and which has a multiplicative identity.

Definition 1 A *-algebra ${\mathcal A}$ is an algebra over ${{\mathbb C}}$ together with an involution, which is a unary operator ${\mathcal A\rightarrow\mathcal A}$, ${a\mapsto a^*}$, satisfying,

1. Anti-linearity: ${(\lambda a+\mu b)^*=\bar\lambda a^*+\bar\mu b^*}$.
2. ${(ab)^*=b^*a^*}$.
3. ${a^{**}=a}$,

for all ${a,b\in\mathcal A}$ and ${\lambda,\mu\in{\mathbb C}}$.

Algebraic Probability

The aim of this post is to motivate the idea of representing probability spaces as states on a commutative algebra. We will consider how this abstract construction relates directly to classical probabilities.

In the standard axiomatization of probability theory, due to Kolmogorov, the central construct is a probability space ${(\Omega,\mathcal F,{\mathbb P})}$. This consists of a state space ${\Omega}$, an event space ${\mathcal F}$, which is a sigma-algebra of subsets of ${\Omega}$, and a probability measure ${{\mathbb P}}$. The measure ${{\mathbb P}}$ is defined as a map ${{\mathbb P}\colon\mathcal F\rightarrow{\mathbb R}^+}$ satisfying countable additivity and normalised as ${{\mathbb P}(\Omega)=1}$.

A measure space allows us to define integrals of real-valued measurable functions or, in the language of probability, expectations of random variables. We construct the set ${L^\infty(\Omega,\mathcal F)}$ of all bounded measurable functions ${X\colon\Omega\rightarrow{\mathbb R}}$. This is a real vector space and, as it is closed under multiplication, is an algebra. Expectation, by definition, is the unique linear map ${L^\infty\rightarrow{\mathbb R}}$, ${X\mapsto{\mathbb E}[X]}$ satisfying ${{\mathbb E}[1_A]={\mathbb P}(A)}$ for ${A\in\mathcal F}$ and monotone convergence: if ${X_n\in L^\infty}$ is a nonnegative sequence increasing to a bounded limit ${X}$, then ${{\mathbb E}[X_n]}$ tends to ${{\mathbb E}[X]}$.

In the opposite direction, any nonnegative linear map ${p\colon L^\infty(\Omega,\mathcal F)\rightarrow{\mathbb R}}$ satisfying monotone convergence and ${p(1)=1}$ defines a probability measure by ${{\mathbb P}(A)=p(1_A)}$. This is the unique measure with respect to which expectation agrees with the linear map, ${{\mathbb E}=p}$. So, probability measures are in one-to-one correspondence with such linear maps, and they can be viewed as one and the same thing. The Kolmogorov definition of a probability space can be thought of as representing the expectation on the subset of ${L^\infty}$ consisting of indicator functions ${1_A}$. In practice, it is often more convenient to start with a different subset of ${L^\infty}$. For example, probability measures on ${{\mathbb R}^+}$ can be defined via their Laplace transform, ${\mathcal L_{{\mathbb P}}(a)=\int e^{-ax}d{\mathbb P}(x)}$, which represents the expectation on exponential functions ${x\mapsto e^{-ax}}$. Generalising to complex-valued random variables, probability measures on ${{\mathbb R}}$ are often represented by their characteristic function ${\varphi(a)=\int e^{iax}d{\mathbb P}(x)}$, which is just the expectation of the complex exponentials ${x\mapsto e^{iax}}$. In fact, by the monotone class theorem, we can uniquely represent probability measures on ${(\Omega,\mathcal F)}$ by the expectations on any subset ${\mathcal K\subseteq L^\infty}$ which is closed under taking products and generates the sigma-algebra ${\mathcal F}$.
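As a small illustration of representing a measure by its characteristic function, the following sketch computes ${\varphi(a)={\mathbb E}[e^{iaX}]}$ for a discrete probability measure on ${{\mathbb R}}$ (the support points and weights are arbitrary choices) and checks the basic properties ${\varphi(0)=1}$, ${\lvert\varphi(a)\rvert\le1}$ and ${\varphi(-a)=\overline{\varphi(a)}}$.

```python
import numpy as np

xs = np.array([-1.0, 0.0, 2.0])   # support of the measure
ps = np.array([0.25, 0.25, 0.5])  # probabilities, summing to 1

def phi(a):
    # characteristic function: expectation of exp(i a X)
    return np.sum(ps * np.exp(1j * a * xs))

print(np.isclose(phi(0.0), 1.0))                 # phi(0) = P(R) = 1
print(abs(phi(0.7)) <= 1.0)                      # |phi(a)| <= E|exp(iaX)| = 1
print(np.isclose(np.conj(phi(0.7)), phi(-0.7)))  # conjugate symmetry for real measures
```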

The Functional Monotone Class Theorem

The monotone class theorem is a very helpful and frequently used tool in measure theory. As measurable functions are a rather general construct, and can be difficult to describe explicitly, it is common to prove results by initially considering just a very simple class of functions. For example, we would start by looking at continuous or piecewise constant functions. Then, the monotone class theorem is used to extend to arbitrary measurable functions. There are different, but related, ‘monotone class theorems’ which apply, respectively, to sets and to functions. As the theorem for sets was covered in a previous post, this entry will be concerned with the functional version. In fact, even for the functional version, there are various similar, but slightly different, statements of the monotone class theorem. In practice, it is beneficial to use the version which most directly applies to the specific application. So, I will state and prove several different versions in this post.

The Monotone Class Theorem

The monotone class theorem, and closely related ${\pi}$-system lemma, are simple but fundamental theorems in measure theory, and form an essential step in the proofs of many results. General measurable sets are difficult to describe explicitly so, when proving results in measure theory, it is often necessary to start by considering much simpler sets. The monotone class theorem is then used to extend to arbitrary measurable sets. For example, when proving a result about Borel subsets of ${{\mathbb R}}$, we may start by considering compact intervals and then apply the monotone class theorem. I include this post on the monotone class theorem for reference.

Logical Consequence

The aim of these notes is to achieve a basic understanding of the concepts of mathematical logic. The process of logical deduction is clearly a central theme of mathematics, where the idea is to prove a stated result by means of an argument which is broken down into small steps, each of which should be obviously valid to the intended audience. This is typically done in a slightly informal fashion, where the validity of each stage of the proof is supposed to be clear, but does not necessarily follow a fixed and clearly stated framework. The study of logic is meant to clarify how this process works, and give a clearly defined framework for logical reasoning. While the history of logic goes back thousands of years, a solid mathematical foundation was only developed in the late nineteenth and early twentieth centuries. For example, the Principia Mathematica published in 1910 by Whitehead and Russell was an attempt to show that all of mathematics can be derived from some precisely stated set of axioms together with inference rules for obtaining conclusions from lists of premises. From this standpoint, logic becomes another field of mathematical study. Just as rings, fields, measure spaces, etc., are objects of mathematical study, so is logic. As in any of these other areas, we use standard mathematical reasoning to prove results about logical systems. However, these results can then shed light on the process of mathematical reasoning itself, as well as providing justification for the underlying frameworks used for much of mathematics.

This has been a very successful endeavor, with theories such as Zermelo-Fraenkel set theory (ZF) providing the standard set of axioms used by mathematicians in many different fields of study. Many important ideas and results have arisen from the study of logic, such as the relative consistency of different theories and independence of certain statements. The axiom of choice, for example, is often considered to be intuitively obvious but, at other times, has been considered controversial. Thanks to the mathematical study of logic, it is now known to be independent of the other axioms of ZFC. Similarly, the continuum hypothesis, which Georg Cantor spent many years trying to prove, has also been shown to be independent of ZFC. It is also known that, although the axiom of choice can be used to construct sets which are not Lebesgue measurable, it is consistent with the axiom of dependent choice that all sets of real numbers are measurable. In the other direction, reverse mathematics has been successful in determining precisely which axioms are really required for various mathematical theorems. At a higher level, results such as the completeness theorem have put the theory on a solid footing, establishing the equivalence of semantic truth and syntactic provability in first order logics. The incompleteness theorem, on the other hand, shows that no recursively enumerable proof system can prove all true statements about the arithmetic of the natural numbers and, furthermore, no sufficiently strong proof system is able to prove its own consistency (unless it is actually inconsistent). For example, it is not possible to prove the consistency of ZFC just by using the axioms and rules of ZFC itself, although it is possible in the presence of additional large cardinal axioms. Similarly, Peano arithmetic is not able to prove its own consistency, but it is possible if the well-ordering property of the ordinal ${\epsilon_0}$ is added.

The mathematical study of logic has also made clear the distinction between classical and intuitionistic or constructive logics where the law of the excluded middle does not hold. Another consequence of putting logic on a solid mathematical foundation is that it should be possible to check the validity of mathematical proofs in an entirely systematic way and could, in theory, be checked by computer. There are various proof-checkers available, although they are currently much more difficult and tedious to use than writing out proofs for human readers, so tend not to be used for most mathematics. There is also a large intersection between mathematical logic and computer science. For example, the Curry-Howard correspondence gives a one-to-one relation between statements of intuitionistic implicational logic and the types of valid programs in simply typed lambda calculus, with the programs or lambda expressions playing the part of proofs. This correspondence has been used as the basis for various computer implementations of proof systems.

The starting point for most logical theories is a language in which statements can be formed according to rules for what constitutes a valid statement (wffs, or well-formed formulas). This generally includes certain special logical connectives which allow statements to be formed which express a logical connection between their component parts. For example, if ‘P’ and ‘Q’ are valid sentences, then ‘${P\rightarrow Q}$’ is also valid (meaning, ‘if P then Q’). These logical connectives come with prescribed rules of inference and, together with a fixed list of axioms, we can prove theorems. This does, however, raise various questions. What is the ‘correct’ set of connectives, and what is the correct set of rules of inference that they should follow? It is possible that, with different connectives or rules of inference, an entirely separate set of theorems would result. In this post, I take a step back from such specific theories or rules. The idea is to first look at the most general concept of what a logic is and, only after that, can we determine exactly what connectives or rules of inference are possible or desirable.

Consequence Relations

Possibly the most basic concept in logic is that of entailment or inference. We start with a collection of premises, which are statements in some language and, according to some rules, we establish a result. The starting point is a set ${L}$, which can be thought of as a set of well-formed statements in some language although, to be as general as possible, we just assume that ${L}$ is a set with no specific restriction or interpretation assumed of its elements. We write

 $\displaystyle a_1,a_2,a_3,\ldots,a_n\vdash b,$ (1)

for ${a_i,b\in L}$, to mean that ${a_1,\ldots,a_n}$ entails ${b}$ or, equivalently, that ${b}$ is a logical consequence of ${a_1,a_2,\ldots,a_n}$. Quite what the relation ${\vdash}$ really means is left open at this stage. For example, considering a collection of interpretations or models, each of which assigns truth values to the elements of ${L}$, (1) can be taken to mean that any model assigning the truth value 1 to the ${a_i}$ also assigns the value 1 to ${b}$. Alternatively, (1) could mean that, with respect to some formal proof system, there exists a proof of ${b}$ from the premises ${a_i}$. More generally,

 $\displaystyle \Gamma\vdash a$ (2)

means that ${a\in L}$ is a logical consequence of the set of premises ${\Gamma\subseteq L}$. For convenience, we often put a list of subsets or elements of ${L}$ as the list of premises, so that

 $\displaystyle \Gamma_1,\Gamma_2,\ldots,\Gamma_m,a_1,a_2,\ldots,a_n\vdash b,$

for ${\Gamma_i\subseteq L}$ and ${a_i,b\in L}$, is just another way of writing

 $\displaystyle \Gamma_1\cup\Gamma_2\cup\ldots\cup\Gamma_m\cup\{a_1,a_2,\ldots,a_n\}\vdash b.$

Similarly, the statement ${\vdash a}$ with no premises is just another way of writing ${\emptyset\vdash a}$.

As a matter of terminology, expression (2) is a sequent, and the ${\vdash}$ symbol is referred to as a turnstile. The left hand side, ${\Gamma}$, is the set of premises or the antecedent, ${a}$ is the consequent, and we say that ${\Gamma}$ entails ${a}$ or that ${a}$ is a consequence of ${\Gamma}$. Although we are being rather general here and not considering any particular interpretation of the set ${L}$ or of the relation ${\vdash}$, there is a short list of ‘obvious’ properties which should hold for logical inference.

Definition 1 A consequence relation ${\vdash}$ on ${L}$ is a subset of ${\mathcal{P}L\times L}$ satisfying

1. ${\Gamma\!,a\vdash a}$ (reflexivity/axiom of identity),
2. if ${\Gamma\vdash a}$ then ${\Gamma\!,\Delta\vdash a}$ (weakening),
3. if ${\Gamma\vdash x}$ for all ${x\in\Delta}$ and ${\Gamma\!,\Delta\vdash a}$ then ${\Gamma\vdash a}$ (transitivity/rule of cut),

for all ${\Gamma,\Delta\subseteq L}$ and ${a\in L}$. We will say that ${\vdash}$ is finitary if, whenever ${\Gamma\vdash a}$ then ${\Delta\vdash a}$ for some finite ${\Delta\subseteq\Gamma}$.

A pair ${(L,\vdash)}$ consisting of a set ${L}$ together with a consequence relation ${\vdash}$ on ${L}$ will be known as a logic. We will often just use ${L}$ to denote the logic ${(L,\vdash)}$.
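The model-theoretic reading of ${\vdash}$ described above can be made concrete. In the sketch below, each model is simply the set of elements of ${L}$ that it makes true, and ${\Gamma\vdash a}$ means that every model making all of ${\Gamma}$ true also makes ${a}$ true; the propositions and models are arbitrary choices. Instances of reflexivity, weakening and cut can then be checked directly.

```python
L = ["p", "q", "r"]
models = [             # each model is the set of propositions it makes true
    {"p"}, {"p", "q"}, {"q", "r"}, {"p", "q", "r"}, set(),
]

def entails(gamma, a):
    # Gamma |- a: every model making all of Gamma true also makes a true
    return all(a in m for m in models if set(gamma) <= m)

# Reflexivity: Gamma, a |- a
print(entails({"p", "q"}, "q"))
# Weakening: {"p"} |- p remains valid after adding the premise r
print(entails({"p"}, "p") and entails({"p", "r"}, "p"))
# Cut: with Gamma = {"r"} and Delta = {"q"}, we have Gamma |- x for all x
# in Delta, and Gamma, Delta |- r, so the cut rule requires Gamma |- r.
gamma, delta = {"r"}, {"q"}
print(all(entails(gamma, x) for x in delta)
      and entails(gamma | delta, "r") and entails(gamma, "r"))
```

Semantic consequence relations of this kind automatically satisfy all three axioms of Definition 1, whichever collection of models is used.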

Properties of the Dual Projections

In the previous post I introduced the definitions of the dual optional and predictable projections, firstly for processes of integrable variation and, then, generalised to processes which are only required to be locally (or prelocally) of integrable variation. We did not look at the properties of these dual projections beyond the fact that they exist and are uniquely defined, although these are significant and important statements in their own right.

To recap, recall that an IV process, A, is right-continuous and such that its variation

 $\displaystyle V_t\equiv \lvert A_0\rvert+\int_0^t\,\lvert dA\rvert$ (1)

is integrable at time ${t=\infty}$, so that ${{\mathbb E}[V_\infty] < \infty}$. The dual optional projection is defined for processes which are prelocally IV. That is, A has a dual optional projection ${A^{\rm o}}$ if it is right-continuous and its variation process is prelocally integrable, so that there exists a sequence ${\tau_n}$ of stopping times increasing to infinity with ${1_{\{\tau_n > 0\}}V_{\tau_n-}}$ integrable. More generally, A is a raw FV process if it is right-continuous with almost-surely finite variation over finite time intervals, so ${V_t < \infty}$ (a.s.) for all ${t\in{\mathbb R}^+}$. Then, if a jointly measurable process ${\xi}$ is A-integrable on finite time intervals, we use

$\displaystyle \xi\cdot A_t\equiv\xi_0A_0+\int_0^t\xi\,dA$

to denote the integral of ${\xi}$ with respect to A over the interval ${[0,t]}$, which takes into account the value of ${\xi}$ at time 0 (unlike the integral ${\int_0^t\xi\,dA}$ which, implicitly, is defined on the interval ${(0,t]}$). In what follows, whenever we state that ${\xi\cdot A}$ has any properties, such as being IV or prelocally IV, we are also including the statement that ${\xi}$ is A-integrable so that ${\xi\cdot A}$ is a well-defined process. Also, whenever we state that a process has a dual optional projection, then we are also implicitly stating that it is prelocally IV.

From theorem 3 of the previous post, the dual optional projection ${A^{\rm o}}$ is the unique prelocally IV process satisfying

$\displaystyle {\mathbb E}[\xi\cdot A^{\rm o}_\infty]={\mathbb E}[{}^{\rm o}\xi\cdot A_\infty]$

for all measurable processes ${\xi}$ with optional projection ${{}^{\rm o}\xi}$ such that ${\xi\cdot A^{\rm o}}$ and ${{}^{\rm o}\xi\cdot A}$ are IV. Equivalently, ${A^{\rm o}}$ is the unique optional FV process such that

$\displaystyle {\mathbb E}[\xi\cdot A^{\rm o}_\infty]={\mathbb E}[\xi\cdot A_\infty]$

for all optional ${\xi}$ such that ${\xi\cdot A}$ is IV, in which case ${\xi\cdot A^{\rm o}}$ is also IV so that the expectations in this identity are well-defined.

I now look at the elementary properties of dual optional projections, as well as the corresponding properties of dual predictable projections. The most important property is that, according to the definition just stated, the dual projection exists and is uniquely defined. By comparison, the properties considered in this post are elementary and relatively easy to prove. So, I will simply state a theorem consisting of a list of all the properties under consideration, and will then run through their proofs. Starting with the dual optional projection, the main properties are listed below as Theorem 1.

Note that the first three statements are saying that the dual projection is indeed a linear projection from the prelocally IV processes onto the linear subspace of optional FV processes. As explained in the previous post, by comparison with the discrete-time setting, the dual optional projection can be expressed, in a non-rigorous sense, as taking the optional projection of the infinitesimal increments,

 $\displaystyle dA^{\rm o}={}^{\rm o}dA.$ (2)

As ${dA}$ is interpreted via the Lebesgue-Stieltjes integral ${\int\cdot\,dA}$, it is a random measure rather than a real-valued process. So, the optional projection of ${dA}$ appearing in (2) does not really make sense. However, Theorem 1 does allow us to make sense of (2) in certain restricted cases. For example, if A is differentiable so that ${dA=\xi\,dt}$ for a process ${\xi}$, then (9) below gives ${dA^{\rm o}={}^{\rm o}\xi\,dt}$. This agrees with (2) so long as ${{}^{\rm o}(\xi\,dt)}$ is interpreted to mean ${{}^{\rm o}\xi\,dt}$. Also, restricting to the jump component of the increments, ${\Delta A=A-A_-}$, (2) reduces to (11) below.

We defined the dual projection via expectations of integrals ${\xi\cdot A}$ with the restriction that this is IV. An alternative approach is to first define the dual projections for IV processes, as was done in theorems 1 and 2 of the previous post, and then extend to (pre)locally IV processes by localisation of the projection. That this is consistent with our definitions follows from the fact that (pre)localisation commutes with the dual projection, as stated in (10) below.

Theorem 1

1. A raw FV process A is optional if and only if ${A^{\rm o}}$ exists and is equal to A.
2. If the dual optional projection of A exists then,
 $\displaystyle (A^{\rm o})^{\rm o}=A^{\rm o}.$ (3)
3. If the dual optional projections of A and B exist, and ${\lambda}$, ${\mu}$ are ${\mathcal F_0}$-measurable random variables then,
 $\displaystyle (\lambda A+\mu B)^{\rm o}=\lambda A^{\rm o}+\mu B^{\rm o}.$ (4)
4. If the dual optional projection ${A^{\rm o}}$ exists then ${{\mathbb E}[\lvert A_0\rvert\,\vert\mathcal F_0]}$ is almost-surely finite and
 $\displaystyle A^{\rm o}_0={\mathbb E}[A_0\,\vert\mathcal F_0].$ (5)
5. If U is a random variable and ${\tau}$ is a stopping time, then ${U1_{[\tau,\infty)}}$ is prelocally IV if and only if ${{\mathbb E}[1_{\{\tau < \infty\}}\lvert U\rvert\,\vert\mathcal F_\tau]}$ is almost surely finite, in which case
 $\displaystyle \left(U1_{[\tau,\infty)}\right)^{\rm o}={\mathbb E}[1_{\{\tau < \infty\}}U\,\vert\mathcal F_\tau]1_{[\tau,\infty)}.$ (6)
6. If the prelocally IV process A is nonnegative and increasing then so is ${A^{\rm o}}$ and,
 $\displaystyle {\mathbb E}[\xi\cdot A^{\rm o}_\infty]={\mathbb E}[{}^{\rm o}\xi\cdot A_\infty]$ (7)

for all nonnegative measurable ${\xi}$ with optional projection ${{}^{\rm o}\xi}$. If A is merely increasing then so is ${A^{\rm o}}$ and (7) holds for nonnegative measurable ${\xi}$ with ${\xi_0=0}$.

7. If A has dual optional projection ${A^{\rm o}}$ and ${\xi}$ is an optional process such that ${\xi\cdot A}$ is prelocally IV then, ${\xi}$ is ${A^{\rm o}}$-integrable and,
 $\displaystyle (\xi\cdot A)^{\rm o}=\xi\cdot A^{\rm o}.$ (8)
8. If A is an optional FV process and ${\xi}$ is a measurable process with optional projection ${{}^{\rm o}\xi}$ such that ${\xi\cdot A}$ is prelocally IV then, ${{}^{\rm o}\xi}$ is A-integrable and,
 $\displaystyle (\xi\cdot A)^{\rm o}={}^{\rm o}\xi\cdot A.$ (9)
9. If A has dual optional projection ${A^{\rm o}}$ and ${\tau}$ is a stopping time then,
 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle(A^{\tau})^{\rm o}=(A^{\rm o})^{\tau},\smallskip\\ &\displaystyle(A^{\tau-})^{\rm o}=(A^{\rm o})^{\tau-}. \end{array}$ (10)
10. If the dual optional projection ${A^{\rm o}}$ exists, then its jump process is the optional projection of the jump process of A,
 $\displaystyle \Delta A^{\rm o}={}^{\rm o}\!\Delta A.$ (11)
11. If A has dual optional projection ${A^{\rm o}}$ then
 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle{\mathbb E}\left[\xi_0\lvert A^{\rm o}_0\rvert + \int_0^\infty\xi\,\lvert dA^{\rm o}\rvert\right]\le{\mathbb E}\left[{}^{\rm o}\xi_0\lvert A_0\rvert + \int_0^\infty{}^{\rm o}\xi\,\lvert dA\rvert\right],\smallskip\\ &\displaystyle{\mathbb E}\left[\xi_0(A^{\rm o}_0)_+ + \int_0^\infty\xi\,(dA^{\rm o})_+\right]\le{\mathbb E}\left[{}^{\rm o}\xi_0(A_0)_+ + \int_0^\infty{}^{\rm o}\xi\,(dA)_+\right],\smallskip\\ &\displaystyle{\mathbb E}\left[\xi_0(A^{\rm o}_0)_- + \int_0^\infty\xi\,(dA^{\rm o})_-\right]\le{\mathbb E}\left[{}^{\rm o}\xi_0(A_0)_- + \int_0^\infty{}^{\rm o}\xi\,(dA)_-\right], \end{array}$ (12)

for all nonnegative measurable ${\xi}$ with optional projection ${{}^{\rm o}\xi}$.

12. Let ${\{A^n\}_{n=1,2,\ldots}}$ be a sequence of right-continuous processes with variation

$\displaystyle V^n_t=\lvert A^n_0\rvert + \int_0^t\lvert dA^n\rvert.$

If ${\sum_n V^n}$ is prelocally IV then,

 $\displaystyle \left(\sum\nolimits_n A^n\right)^{\rm o}=\sum\nolimits_n\left(A^n\right)^{\rm o}.$ (13)

Dual Projections

The optional and predictable projections of stochastic processes have corresponding dual projections, which are the subject of this post. I will be concerned with their initial construction here, and show that they are well-defined. The study of their properties will be left until later. In the discrete time setting, the dual projections are relatively straightforward, and can be constructed by applying the optional and predictable projection to the increments of the process. In continuous time, we no longer have discrete time increments along which we can define the dual projections. In some sense, they can still be thought of as projections of the infinitesimal increments so that, for a process A, the increments of the dual projections ${A^{\rm o}}$ and ${A^{\rm p}}$ are determined from the increments ${dA}$ of A as

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle dA^{\rm o}={}^{\rm o}(dA),\smallskip\\ &\displaystyle dA^{\rm p}={}^{\rm p}(dA). \end{array}$ (1)

Unfortunately, these expressions are difficult to make sense of in general. In specific cases, (1) can be interpreted in a simple way. For example, when A is differentiable with derivative ${\xi}$, so that ${dA=\xi dt}$, then the dual projections are given by ${dA^{\rm o}={}^{\rm o}\xi dt}$ and ${dA^{\rm p}={}^{\rm p}\xi dt}$. More generally, if A is right-continuous with finite variation, then the infinitesimal increments ${dA}$ can be interpreted in terms of Lebesgue-Stieltjes integrals. However, as the optional and predictable projections are defined for real-valued processes, and ${dA}$ is viewed as a stochastic measure, the right-hand side of (1) is still problematic. This can be rectified by multiplying by an arbitrary process ${\xi}$, and making use of the transitivity property ${{\mathbb E}[\xi\,{}^{\rm o}(dA)]={\mathbb E}[({}^{\rm o}\xi)dA]}$. Integrating over time gives the more meaningful expressions

$\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle {\mathbb E}\left[\int_0^\infty \xi\,dA^{\rm o}\right]={\mathbb E}\left[\int_0^\infty{}^{\rm o}\xi\,dA\right],\smallskip\\ &\displaystyle{\mathbb E}\left[\int_0^\infty \xi\,dA^{\rm p}\right]={\mathbb E}\left[\int_0^\infty{}^{\rm p}\xi\,dA\right]. \end{array}$

In contrast to (1), these equalities can be used to give mathematically rigorous definitions of the dual projections. As usual, we work with respect to a complete filtered probability space ${(\Omega,\mathcal F,\{\mathcal F_t\}_{t\ge0},{\mathbb P})}$, and processes are identified whenever they are equal up to evanescence. The terminology ‘raw IV process’ will be used to refer to any right-continuous integrable process whose variation on the whole of ${{\mathbb R}^+}$ has finite expectation. The use of the word ‘raw’ here is just to signify that we are not requiring the process to be adapted. Next, to simplify the expressions, I will use the notation ${\xi\cdot A}$ for the integral of a process ${\xi}$ with respect to another process A,

$\displaystyle \xi\cdot A_t\equiv\xi_0A_0+\int_0^t\xi\,dA.$

Note that, whereas the integral ${\int_0^t\xi\,dA}$ is implicitly taken over the range ${(0,t]}$ and does not involve the time-zero value of ${\xi}$, I have included the time-zero values of the processes in the definition of ${\xi\cdot A}$. This is not essential, and could be excluded, so long as we were to restrict to processes starting from zero. The existence and uniqueness (up to evanescence) of the dual projections is given by the following result.
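The discrete-time analogue mentioned above can be made completely explicit. In the sketch below (a two-period coin-toss model, chosen arbitrarily), the dual optional projection is obtained by conditioning each increment on the current ${\sigma}$-algebra, so that ${\Delta A^{\rm o}_n={\mathbb E}[\Delta A_n\,\vert\mathcal F_n]}$ (with ${A^{\rm o}_0={\mathbb E}[A_0\,\vert\mathcal F_0]}$), and the duality ${{\mathbb E}[\xi\cdot A^{\rm o}_\infty]={\mathbb E}[{}^{\rm o}\xi\cdot A_\infty]}$ then follows from the tower property of conditional expectations.

```python
import itertools
import numpy as np

omega = list(itertools.product([0, 1], repeat=2))   # outcomes of two coin tosses
prob = {w: 0.25 for w in omega}                     # uniform probability

# partitions generating F_0 (trivial), F_1 (first toss) and F_2 (everything)
partitions = [
    [set(omega)],
    [{w for w in omega if w[0] == i} for i in (0, 1)],
    [{w} for w in omega],
]

def cond_exp(f, n):
    # conditional expectation E[f | F_n], as a function on omega
    out = {}
    for block in partitions[n]:
        pb = sum(prob[w] for w in block)
        avg = sum(f[w] * prob[w] for w in block) / pb
        for w in block:
            out[w] = avg
    return out

def expect(f):
    return sum(f[w] * prob[w] for w in omega)

rng = np.random.default_rng(1)
# a raw (non-adapted) process: its increments dA[n] may "see the future"
dA = [{w: rng.normal() for w in omega} for n in range(3)]   # dA[0] plays the role of A_0
xi = [{w: rng.normal() for w in omega} for n in range(3)]   # a bounded measurable process

dAo = [cond_exp(dA[n], n) for n in range(3)]   # dual projection: condition the increments
oxi = [cond_exp(xi[n], n) for n in range(3)]   # optional projection: condition the values

lhs = sum(expect({w: xi[n][w] * dAo[n][w] for w in omega}) for n in range(3))
rhs = sum(expect({w: oxi[n][w] * dA[n][w] for w in omega}) for n in range(3))
print(abs(lhs - rhs) < 1e-10)   # E[xi . A^o] = E[oxi . A]
```

In continuous time there are no increments to condition, which is why the duality relation itself is taken as the definition.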

Theorem 1 (Dual Projections) Let A be a raw IV process. Then,

• There exists a unique raw IV process ${A^{\rm o}}$ satisfying
 $\displaystyle {\mathbb E}\left[\xi\cdot A^{\rm o}_\infty\right]={\mathbb E}\left[{}^{\rm o}\xi\cdot A_\infty\right]$ (2)

for all bounded measurable processes ${\xi}$. We refer to ${A^{\rm o}}$ as the dual optional projection of A.

• There exists a unique raw IV process ${A^{\rm p}}$ satisfying
 $\displaystyle {\mathbb E}\left[\xi\cdot A^{\rm p}_\infty\right]={\mathbb E}\left[{}^{\rm p}\xi\cdot A_\infty\right]$ (3)

for all bounded measurable processes ${\xi}$. We refer to ${A^{\rm p}}$ as the dual predictable projection of A.

Furthermore, if A is nonnegative and increasing then so are ${A^{\rm o}}$ and ${A^{\rm p}}$.