Proof of the Measurable Projection and Section Theorems

The aim of this post is to give a direct proof of the theorems of measurable projection and measurable section. These are generally regarded as rather difficult results, and proofs often use ideas from descriptive set theory such as analytic sets. I did previously post a proof along those lines on this blog. However, the results can be obtained in a more direct way, which is the purpose of this post. Here, I present relatively self-contained proofs which do not require knowledge of any advanced topics beyond basic probability theory.

The projection theorem states that if {(\Omega,\mathcal F,{\mathbb P})} is a complete probability space, then the projection of a measurable subset of {{\mathbb R}\times\Omega} onto {\Omega} is measurable. To be precise, the condition is that the set {S\subseteq{\mathbb R}\times\Omega} is in the product sigma-algebra {\mathcal B({\mathbb R})\otimes\mathcal F}, where {\mathcal B({\mathbb R})} denotes the Borel sets in {{\mathbb R}}, and the projection map is denoted

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle\pi_\Omega\colon{\mathbb R}\times\Omega\rightarrow\Omega,\smallskip\\ &\displaystyle\pi_\Omega(t,\omega)=\omega. \end{array}

Then, measurable projection states that {\pi_\Omega(S)\in\mathcal{F}}. Although it looks like a very basic property of measurable sets, maybe even obvious, measurable projection is a surprisingly difficult result to prove. In fact, the requirement that the probability space is complete is necessary and, if it is dropped, then {\pi_\Omega(S)} need not be measurable. Counterexamples exist for commonly used measurable spaces such as {\Omega= {\mathbb R}} and {\mathcal F=\mathcal B({\mathbb R})}. This suggests that there is something deeper going on here than basic manipulations of measurable sets.

By definition, if {S\subseteq{\mathbb R}\times\Omega} then, for every {\omega\in\pi_\Omega(S)}, there exists a {t\in{\mathbb R}} such that {(t,\omega)\in S}. The measurable section theorem — also known as measurable selection — says that this choice can be made in a measurable way. That is, if S is in {\mathcal B({\mathbb R})\otimes\mathcal F} then there is a measurable section,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle\tau\colon\pi_\Omega(S)\rightarrow{\mathbb R},\smallskip\\ &\displaystyle(\tau(\omega),\omega)\in S. \end{array}

It is convenient to extend {\tau} to the whole of {\Omega} by setting {\tau=\infty} outside of {\pi_\Omega(S)}.

Figure 1: A section of a measurable set

The graph of {\tau} is

\displaystyle  [\tau]=\left\{(t,\omega)\in{\mathbb R}\times\Omega\colon t=\tau(\omega)\right\}.

The condition that {(\tau(\omega),\omega)\in S} whenever {\tau < \infty} can alternatively be expressed by stating that {[\tau]\subseteq S}. This also ensures that {\{\tau < \infty\}} is a subset of {\pi_\Omega(S)}, and {\tau} is a section of S on the whole of {\pi_\Omega(S)} if and only if {\{\tau < \infty\}=\pi_\Omega(S)}.
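
To make these definitions concrete, here is a small illustrative sketch (my addition, not part of the proofs): a finite toy stand-in for S as a set of pairs {(t,\omega)}, its projection onto {\Omega}, and a section obtained by picking the smallest available t for each {\omega}, mirroring the debut used in lemma 1 below. All names and values are made up for illustration.

```python
# Illustration only: a finite toy version of the projection and a section.
import math

Omega = ["w1", "w2", "w3"]
S = {(0.5, "w1"), (2.0, "w1"), (1.0, "w3")}   # made-up pairs (t, omega)

# Projection of S onto Omega: the omegas appearing in some pair.
proj_S = {w for (t, w) in S}                  # {'w1', 'w3'}

# A section tau: the smallest t paired with omega, or infinity outside proj_S.
def tau(w):
    times = [t for (t, v) in S if v == w]
    return min(times) if times else math.inf

for w in Omega:
    print(w, tau(w))   # (tau(w), w) is in S whenever tau(w) < infinity
```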

The results described here can also be used to prove the optional and predictable section theorems which, at first appearances, also seem to be quite basic statements. The section theorems are fundamental to the powerful and interesting theory of optional and predictable projection which is, consequently, generally considered to be a hard part of stochastic calculus. In fact, the projection and section theorems are really not that hard to prove.

Let us consider how one might try and approach a proof of the projection theorem. As with many statements regarding measurable sets, we could try and prove the result first for certain simple sets, and then generalise to measurable sets by use of the monotone class theorem or similar. For example, let {\mathcal S} denote the collection of all {S\subseteq{\mathbb R}\times\Omega} for which {\pi_\Omega(S)\in\mathcal F}. It is straightforward to show that any finite union of sets of the form {A\times B}, for {A\in\mathcal B({\mathbb R})} and {B\in\mathcal F}, is in {\mathcal S}. If it could be shown that {\mathcal S} is closed under taking limits of increasing and decreasing sequences of sets, then the result would follow from the monotone class theorem. Increasing sequences are easily handled: if {S_n} is a sequence of subsets of {{\mathbb R}\times\Omega} then, from the definition of the projection map,

\displaystyle  \pi_\Omega\left(\bigcup\nolimits_n S_n\right)=\bigcup\nolimits_n\pi_\Omega\left(S_n\right).

If {S_n\in\mathcal S} for each n, this shows that the union {\bigcup_nS_n} is again in {\mathcal S}. Unfortunately, decreasing sequences are much more problematic. If {S_n\subseteq S_m} for all {n\ge m} then we would like to use something like

\displaystyle  \pi_\Omega\left(\bigcap\nolimits_n S_n\right)=\bigcap\nolimits_n\pi_\Omega\left(S_n\right). (1)

However, this identity does not hold in general. For example, consider the decreasing sequence {S_n=(n,\infty)\times\Omega}. Then, {\pi_\Omega(S_n)=\Omega} for all n, but {\bigcap_nS_n} is empty, contradicting (1). There is some interesting history involved here. In a paper published in 1905, Henri Lebesgue claimed that the projection of a Borel subset of {{\mathbb R}^2} onto {{\mathbb R}} is itself measurable. This was based upon mistakenly applying (1). The error was spotted in around 1917 by Mikhail Suslin, who realised that the projection need not be Borel, and it led him to develop the theory of analytic sets.

Actually, there is at least one situation where (1) can be shown to hold. Suppose that for each {\omega\in\Omega}, the slices

\displaystyle  S_n(\omega)\equiv\left\{t\in{\mathbb R}\colon(t,\omega)\in S_n\right\} (2)

are compact. The inclusion {\pi_\Omega\left(\bigcap\nolimits_n S_n\right)\subseteq\bigcap\nolimits_n\pi_\Omega(S_n)} always holds, so only the reverse inclusion needs to be established. For each {\omega\in\bigcap_n\pi_\Omega(S_n)}, the slices {S_n(\omega)} form a decreasing sequence of nonempty compact sets which, by Cantor’s intersection theorem, has nonempty intersection. So, letting S be the intersection {\bigcap_nS_n}, the slice {S(\omega)=\bigcap_nS_n(\omega)} is nonempty. Hence, {\omega\in\pi_\Omega(S)}, and (1) follows.

The starting point for our proof of the projection and section theorems is to consider certain special subsets of {{\mathbb R}\times\Omega} where the compactness argument, as just described, can be used. The notation {\mathcal A_\delta} is used to represent the collection of countable intersections, {\bigcap_{n=1}^\infty A_n}, of sets {A_n} in {\mathcal A}.

Lemma 1 Let {(\Omega,\mathcal F)} be a measurable space, and {\mathcal A} be the collection of subsets of {{\mathbb R}\times\Omega} which are finite unions {\bigcup_kC_k\times E_k} of products of compact intervals {C_k\subseteq{\mathbb R}} and sets {E_k\in\mathcal F}. Then, for any {S\in\mathcal A_\delta}, we have {\pi_\Omega(S)\in\mathcal F}, and the debut

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle \tau\colon\Omega\rightarrow{\mathbb R}\cup\{\infty\},\smallskip\\ &\displaystyle \omega\mapsto\inf\left\{t\in{\mathbb R}\colon (t,\omega)\in S\right\} \end{array}

is a measurable map with {[\tau]\subseteq S} and {\{\tau < \infty\}=\pi_\Omega(S)}.

Proof: Noting that {\mathcal F} and the collection of compact intervals in {{\mathbb R}} are both closed under pairwise intersections, the same is true of {\mathcal A}, since intersections distribute across the finite unions. Then, for {S\in\mathcal A_\delta} there exists, by definition, a sequence {S_n\in\mathcal A} such that {S=\bigcap_nS_n}. Replacing {S_n} by {\bigcap_{m\le n}S_m} if necessary, we may suppose that {S_n} is a decreasing sequence.

Now, the slices {S_n(\omega)} defined by (2) are finite unions of compact intervals, so are compact. The compactness argument explained above implies that

\displaystyle  \pi_\Omega(S)=\bigcap\nolimits_n\pi_\Omega(S_n). (3)

Each {S_n} is a finite union {\bigcup_kC_k\times E_k} with {E_k\in\mathcal F} and, discarding any empty terms, we may take the {C_k} to be nonempty, so that the projection {\pi_\Omega(S_n)=\bigcup_kE_k} is in {\mathcal F}. Then, (3) shows that {\pi_\Omega(S)} is also in {\mathcal F}.

If {\tau} is the debut of S, then {\tau(\omega)=\inf S(\omega)}. This immediately implies {\{\tau < \infty\}=\pi_\Omega(S)} and, as nonempty compact sets contain their infimum, {[\tau]\subseteq S}. For every {t\in{\mathbb R}}, the set {((-\infty,t]\times\Omega)\cap S} is in {\mathcal A_\delta}, since intersecting each compact interval {C_k} with {(-\infty,t]} again gives a compact interval, and,

\displaystyle  \{\tau\le t\}=\pi_\Omega\left(((-\infty,t]\times\Omega)\cap S\right)\in\mathcal F,

showing that {\tau} is measurable. ⬜
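
To illustrate lemma 1 (a toy sketch only, with a finite set standing in for {\Omega} and its power set as the sigma-algebra), a set in {\mathcal A} can be stored as a finite list of (compact interval, event) pairs; its projection is then the union of the events, and the debut of each slice is the smallest left endpoint among the applicable intervals. The data below is hypothetical.

```python
# Toy sketch of lemma 1: a set in the class A, stored as a finite list of
# (compact interval, event) pairs, with its projection and debut.
import math

Omega = {0, 1, 2, 3}
# Hypothetical example: ([1,2] x {0,1}) union ([0.5,3] x {2}).
A = [((1.0, 2.0), {0, 1}), ((0.5, 3.0), {2})]

def projection(A):
    # Union of the events E_k (all intervals here are nonempty).
    return set().union(*(E for _, E in A))

def debut(A, w):
    # Infimum of the slice A(w): smallest left endpoint over pairs whose
    # event contains w, or infinity if the slice is empty.
    lefts = [a for (a, _b), E in A if w in E]
    return min(lefts) if lefts else math.inf

print(projection(A))                             # {0, 1, 2}
print({w: debut(A, w) for w in sorted(Omega)})   # {0: 1.0, 1: 1.0, 2: 0.5, 3: inf}
# The event {debut <= t} is the projection of ((-inf, t] x Omega) intersected with A:
print({w for w in Omega if debut(A, w) <= 0.75})   # {2}
```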

When dealing with more general subsets of {{\mathbb R}\times\Omega}, it will not necessarily be the case that the projection onto {\Omega} is measurable. For that reason, we extend the probability measure to more general subsets of {\Omega}. For a probability space {(\Omega,\mathcal F,{\mathbb P})}, define an outer measure on the power set {\mathcal P(\Omega)} by approximating {A\subseteq\Omega} from above by measurable sets,

\displaystyle  {\mathbb P}^*(A)=\inf\left\{{\mathbb P}(B)\colon B\in\mathcal F,A\subseteq B\right\}. (4)
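
As a concrete (purely illustrative) picture of definition (4), the following sketch computes {{\mathbb P}^*} by brute force on a four-point set standing in for {\Omega}, with a sigma-algebra generated by a two-block partition; all numbers are made up.

```python
# Toy sketch of the outer measure P*: minimise P(B) over measurable B
# containing A.  F is the tiny sigma-algebra generated by the partition
# {{0,1},{2,3}} of a four-point Omega.
Omega = frozenset({0, 1, 2, 3})
F = [frozenset(), frozenset({0, 1}), frozenset({2, 3}), Omega]
P = {frozenset(): 0.0, frozenset({0, 1}): 0.3,
     frozenset({2, 3}): 0.7, Omega: 1.0}

def P_star(A):
    return min(P[B] for B in F if A <= B)   # A <= B is the subset test

print(P_star(frozenset({0})))       # 0.3 : best measurable cover is {0,1}
print(P_star(frozenset({0, 2})))    # 1.0 : only Omega covers it
print(P_star(frozenset({0, 1})))    # 0.3 : agrees with P on F
```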

The outer measure has the following basic properties.

Lemma 2 For a probability space {(\Omega,\mathcal F,{\mathbb P})}, the outer measure {{\mathbb P}^*} is increasing and continuous along increasing sequences. That is, {{\mathbb P}^*(A)\le{\mathbb P}^*(B)} for {A\subseteq B}, and {{\mathbb P}^*(A_n)\rightarrow{\mathbb P}^*(A)} for sequences {A_n\subseteq\Omega} increasing to a limit A.

Furthermore, for any {A\subseteq\Omega}, there exists {B\supseteq A} in {\mathcal F} with {{\mathbb P}(B)={\mathbb P}^*(A)}.

Proof: The fact that {{\mathbb P}^*} is increasing is immediate from the definition. Now, let {A_n\subseteq\Omega} be increasing to the limit A. By the definition of {{\mathbb P}^*(A_n)}, there exists {B_n\supseteq A_n} in {\mathcal F} with

\displaystyle  {\mathbb P}(B_n)\le{\mathbb P}^*(A_n)+1/n.

Replacing {B_n} by {\bigcap_{m\ge n}B_m} if necessary, we may suppose that {B_n} is an increasing sequence. Then, {B=\bigcup_nB_n\supseteq A} is in {\mathcal F} and, by monotone convergence,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle{\mathbb P}^*(A)\le{\mathbb P}(B)&\displaystyle=\lim_n{\mathbb P}(B_n)\smallskip\\ &\displaystyle\le\lim_n({\mathbb P}^*(A_n)+1/n)\smallskip\\ &\displaystyle=\lim_n{\mathbb P}^*(A_n)\le{\mathbb P}^*(A). \end{array}

So, {{\mathbb P}^*(A)=\lim_n{\mathbb P}^*(A_n)} as required. Incidentally, this also shows that there is a {B\supseteq A} in {\mathcal F} with {{\mathbb P}(B)={\mathbb P}^*(A)}. ⬜

I now move on to the main component of the proof of the projection and section theorems. This will allow us to approximate measurable subsets of {{\mathbb R}\times\Omega} from below by sets in {\mathcal A_\delta}, as defined in lemma 1 above. While the statement of theorem 3 is simple enough, the proof can get a bit tricky. The method used here is elementary and, although the argument is a bit intricate, no advanced mathematics is required. In the statement below, {\mathcal{\bar A}} denotes the closure of {\mathcal A} under limits of increasing and of decreasing sequences; equivalently, it is the minimal collection of subsets of X which contains {\mathcal A} and is closed under taking such limits. I refer to the result as the `capacitability theorem’ as it is a version of Choquet’s capacitability theorem although, here, we do not involve the concept of analytic sets. A set {A\subseteq X} can be called capacitable if, for each {K < I(A)}, there exists a decreasing sequence {A_n\in\mathcal A} with {K < I(A_n)} and {\bigcap_nA_n\subseteq A}. So, theorem 3 is saying that all sets in {\mathcal{\bar A}} are capacitable.

Theorem 3 (Capacitability Theorem) Let X be a set, {\mathcal A\subseteq\mathcal P(X)} be closed under pairwise intersections, and {I\colon\mathcal P(X)\rightarrow{\mathbb R}^+} be increasing and continuous along increasing sequences. Denote the closure of {\mathcal A} under limits of increasing and of decreasing sequences by {\mathcal{\bar A}}.

Then, for any {A\in\mathcal{\bar A}} and {K\in{\mathbb R}} with {I(A) > K}, there exists a decreasing sequence {A_n\in\mathcal A} with {\bigcap_nA_n\subseteq A} and {I(A_n) > K} for all n.

Proof: Fixing {K\in{\mathbb R}}, let {\mathcal C} denote the collection of all {A\subseteq X} with {I(A) > K}. The assumptions on I mean that, for any {A\in\mathcal C}, every {B\supseteq A} is in {\mathcal C} and, for any sequence {A_n\subseteq X} increasing to A, we have {A_n\in\mathcal C} for all sufficiently large n.

The proof of the theorem amounts to finding a collection {\mathcal B\subseteq\mathcal P(X)} containing {\mathcal A} and closed under taking limits of increasing and decreasing sequences, such that, for every {A\in\mathcal B\cap\mathcal C}, we can construct a decreasing sequence {A_n\in\mathcal A\cap\mathcal C} with {\bigcap_nA_n\subseteq A}. In that case, every {A\in\mathcal{\bar A}} will also be in {\mathcal B}, and the claimed result will follow.

The main difficulty in the proof is to describe a collection {\mathcal B} with the required properties. One way of doing this, which can be described in terms of a game, is as follows. For {A\in\mathcal C}, consider the following infinite game played between two players, who take turns choosing sets from {\mathcal C}. Starting with {T_0=A}, at rounds {n=1,2,\ldots}, the players make the following moves.

  1. Player 1 chooses an {S_n\subseteq T_{n-1}} in {\mathcal C}.
  2. Player 2 chooses a {T_n\subseteq S_n} in {\mathcal C}.

At each round, both players can, at least, make a valid move. For example, player 1 can set {S_n=T_{n-1}} and player 2 can set {T_n=S_n}. We say that player 2 wins the game if, once completed, she is able to find a sequence {A_n\supseteq T_n} in {\mathcal A} with {\bigcap_nA_n\subseteq A}.

For any {A\subseteq X}, denote the game described above by {\mathbb G_A}. A strategy (for player 2) is just a sequence of functions {f_n\colon\mathcal C^n\rightarrow\mathcal C} satisfying

\displaystyle  f_n(S_1,\ldots,S_n) \subseteq S_n. (5)

The idea is that {f_n(S_1,\ldots,S_n)} represents player 2’s choice for {T_n} at round n, given that player 1 has chosen {S_1,\ldots,S_n} so far. It is a winning strategy if, for any sequence {S_1,S_2,\ldots\in\mathcal C} satisfying

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle S_1\subseteq A,\smallskip\\ &\displaystyle S_{n+1}\subseteq f_n(S_1,\ldots,S_n) \end{array} (6)

for each {n\ge 1}, then there exists a sequence {A_n\in\mathcal A} with

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle f_n(S_1,\ldots,S_n)\subseteq A_n,\smallskip\\ &\displaystyle\bigcap_{n=1}^\infty A_n\subseteq A. \end{array} (7)

We note that, combining (5) and (6) shows that {S_n} must be a decreasing sequence of subsets of A.

Now, let {\mathcal B} be the collection of {A\subseteq X} for which the game {\mathbb G_A} has a winning strategy. The case with {A\in\mathcal A} is easy. Any strategy is a winning strategy simply by taking {A_n=A} in (7). For {f_n} we may as well take {f_n(S_1,\ldots,S_n)=S_n}, which is a valid strategy.

Now, consider a sequence {A_k\in\mathcal B} and let {\{f^k_n\}_{n=1,2,\ldots}} be winning strategies for {\mathbb G_{A_k}}. Construct a winning strategy for {\mathbb G_A}, with {A=\bigcap_kA_k}, as follows. Choose a bijection {\theta\colon{\mathbb N}^2\rightarrow{\mathbb N}} such that {\theta(r,s)} is increasing in s. For example, take {\theta(r,s)=(2r-1)2^{s-1}}. Then for {n=\theta(r,s)} and {S_1,\ldots,S_n\in\mathcal C}, define

\displaystyle  f_n(S_1,\ldots,S_n)=f^r_s(S_{\theta(r,1)},\ldots,S_{\theta(r,s)})\subseteq S_n.

It can be seen that this is a winning strategy. If (6) is satisfied then, writing {S^r_s\equiv S_{\theta(r,s)}}, we use the fact that the sequence {S_n} is decreasing and {\theta(r,s+1)\ge\theta(r,s)+1} to write

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{ll} \displaystyle S^r_s\subseteq A\subseteq A_r,\smallskip\\ \displaystyle S^r_{s+1}\subseteq S_{\theta(r,s)+1}&\displaystyle\subseteq f_n(S_1,\ldots,S_n)\smallskip\\ &\displaystyle=f^r_s(S^r_1,\ldots,S^r_s) \end{array}

for any {n=\theta(r,s)}. So, (6) is also satisfied for the sequence {S^r_1,S^r_2,\ldots} (for the strategy {f^r_\cdot} and game {\mathbb G_{A_r}}). As {f^r_\cdot} is a winning strategy for {\mathbb G_{A_r}}, there exists {A_{rs}\supseteq f^r_s(S^r_1,\ldots,S^r_s)} in {\mathcal A} satisfying {\bigcap_sA_{rs}\subseteq A_r}. In particular, writing {B_{\theta(r,s)}=A_{rs}} gives

\displaystyle  \bigcap_nB_n=\bigcap_r\bigcap_s A_{rs}\subseteq\bigcap_rA_r=A

so (7) is satisfied, and {A\in\mathcal B}.

If {A_k} is increasing, construct a winning strategy for {A=\bigcup_kA_k} as follows. For any {S_1,\ldots,S_n\in\mathcal C} with {S_1\subseteq A}, the sequence {S_1\cap A_k} increases to {S_1}. Hence, there is a minimum r such that {S_1\cap A_r\in\mathcal C}. Set,

\displaystyle  f_n(S_1,\ldots,S_n)=f^r_n(S_1\cap A_r,S_2,\ldots,S_n).

If {S_1\not\subseteq A} then the value of {f_n} does not matter, so we can just take {f_n(S_1,\ldots,S_n)=S_n}. This clearly gives a valid strategy. To see that it is a winning strategy, suppose that (6) is satisfied. Setting {S^\prime_1=S_1\cap A_r} and {S^\prime_n=S_n} for {n > 1}, we see that (6) is also satisfied with {S^\prime_n} in place of {S_n} and {f^r_n} in place of {f_n}. So, as {f^r_n} is a winning strategy for the game {\mathbb G_{A_r}}, there exists a sequence {B_n\in\mathcal A} with

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle f_n(S_1,\ldots,S_n)=f^r_n(S^\prime_1,\ldots,S^\prime_n)\subseteq B_n,\smallskip\\ &\displaystyle\bigcap_{n=1}^\infty B_n\subseteq A_r\subseteq A. \end{array}

So, {\{f_n\}} is a winning strategy for {\mathbb G_{A}} and, hence, {A\in\mathcal B}.

We have shown that {\mathcal B} contains {\mathcal A} and is closed under taking limits of increasing and decreasing sequences and, so, contains {\mathcal{\bar A}}. Finally, for any {A\in\mathcal{\bar A}\cap\mathcal C}, let {\{f_n\}_{n=1,2,\ldots}} be a winning strategy for {\mathbb G_A} and define a sequence {S_n\in\mathcal C} by {S_1=A} and

\displaystyle  S_{n+1}=f_n(S_1,\ldots,S_n)

for all {n\ge1}. As {\{f_n\}} is a winning strategy, there exists a sequence {A_n\in\mathcal A} satisfying (7). Replacing {A_n} by {\bigcap_{m\le n}A_m} if required, we can suppose that the sequence is decreasing. Then, as {S_{n+1}\subseteq A_n}, we have {A_n\in\mathcal C} as required. ⬜
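
As an aside, the pairing function {\theta(r,s)=(2r-1)2^{s-1}} used to interleave the strategies in the proof is easy to sanity-check numerically. The following Python sketch (illustration only, not part of the argument) verifies, on a test range, that {\theta} is injective, attains every positive integer, and is increasing in s.

```python
# Check the pairing function theta(r, s) = (2r - 1) * 2**(s - 1) used above:
# injective, onto the positive integers, and strictly increasing in s.
def theta(r, s):
    return (2 * r - 1) * 2 ** (s - 1)

N = 20
vals = [theta(r, s) for r in range(1, N + 1) for s in range(1, N + 1)]
assert len(vals) == len(set(vals))                        # no collisions
assert set(range(1, N + 1)) <= set(vals)                  # 1..N all attained
assert all(theta(2, s + 1) > theta(2, s) for s in range(1, N))
print("theta behaves as claimed on the tested range")
```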

The argument above is along similar lines to the `rabotages de Sierpinski’ used by Dellacherie, Ensembles aléatoires II (1969). Although the description of the collection {\mathcal B} in terms of winning strategies of the games {\mathbb G_A} may not seem like an obvious approach, it is really quite natural. As a first attempt to prove the result, we could try defining {\mathcal B} to be the collection of sets for which the conclusion of the theorem holds. That is, the sets A for which there is a decreasing sequence {A_n\in\mathcal A\cap\mathcal C} with {\bigcap_nA_n\subseteq A}. We would then have to show that {\mathcal B} is closed under taking limits of increasing and decreasing sequences. While increasing sequences are easy to deal with, decreasing ones are problematic. Suppose that {A_n} decreases to A and that, for each n, there is a decreasing sequence {\{A_{nk}\}_{k=1,2,\ldots}\in\mathcal A\cap\mathcal C} with {\bigcap_kA_{nk}\subseteq A_n}. To construct a sequence of sets {B_n\in\mathcal A\cap\mathcal C} we could try to do the following. Reorder the doubly-indexed sequence {A_{nk}} into a singly-indexed one, {A_{n_1k_1},A_{n_2k_2},\ldots} and set {B_r=A_{n_rk_r}}. Then, it is clear that {B_r\in\mathcal A\cap\mathcal C} and {\bigcap_rB_r\subseteq A}. However, {B_r} is not decreasing. We could try and ensure that it is decreasing by setting

\displaystyle  B_r=A_{n_1k_1}\cap\cdots\cap A_{n_rk_r}.

Unfortunately, it is no longer necessarily true that {B_r} is in {\mathcal C}. Once we start taking intersections, the sets {A_{n_rk_r}\cap A_{n_sk_s}} need no longer be in {\mathcal C}. The easiest way around this, it seems, is to allow the choice of {A_{n_rk_r}} to depend on the previous choices of {A_{n_sk_s}}. That is, the choice of {A_{n_rk_r}} should depend on {B_{r-1}}, so as to enforce the condition that {B_r=B_{r-1}\cap A_{n_rk_r}} is in {\mathcal C}. This leads, essentially, to the requirement of winning strategies for the games {\mathbb G_{A_n}} as described in the proof of theorem 3.

We use theorem 3 to show that measurable subsets of {{\mathbb R}\times\Omega} can be approximated from below by sets in {\mathcal A_\delta}.

Corollary 4 Let {(\Omega,\mathcal F,{\mathbb P})} be a probability space and {\mathcal A} be the collection of subsets of {{\mathbb R}\times\Omega} given in lemma 1. Then, for any {S\in\mathcal B({\mathbb R})\otimes\mathcal F} and {\epsilon > 0}, there exists {A\subseteq S} in {\mathcal A_\delta} satisfying

\displaystyle  {\mathbb P}\left(\pi_\Omega(A)\right)\ge{\mathbb P}^*\left(\pi_\Omega(S)\right)-\epsilon.

Proof: Setting {X={\mathbb R}\times\Omega}, define

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle I\colon\mathcal P(X)\rightarrow{\mathbb R}^+\smallskip\\ &\displaystyle A\mapsto{\mathbb P}^*(\pi_\Omega(A)). \end{array}

This is clearly increasing. Also, if {A_n\subseteq X} is increasing to a limit A then {\pi_\Omega(A_n)} increases to {\pi_\Omega(A)}. Lemma 2 implies that {I(A_n)\rightarrow I(A)}, and I is continuous along increasing sequences.

As the complement of a compact interval in {{\mathbb R}} is a countable union of compact intervals, the complement of any {A\in\mathcal A} is a countable union of sets in {\mathcal A}. The monotone class theorem then shows that the closure of {\mathcal A} under limits of increasing and decreasing sequences is the entire sigma-algebra generated by {\mathcal A}. Hence,

\displaystyle  \mathcal{\bar A}=\mathcal B({\mathbb R})\otimes\mathcal F.

We apply theorem 3. For {S\in\mathcal B({\mathbb R})\otimes\mathcal F} and {\epsilon > 0}, setting {K=I(S)-\epsilon}, there exists a decreasing sequence {A_n\in\mathcal A} with {\bigcap_nA_n\subseteq S} and {I(A_n) > K}. Take {A=\bigcap_nA_n}, which is in {\mathcal A_\delta}. As in the proof of lemma 1, {\pi_\Omega(A_n)\in\mathcal F} decreases to {\pi_\Omega(A)} and, since these projections are measurable, {{\mathbb P}(\pi_\Omega(A_n))=I(A_n) > K}. By monotone convergence,

\displaystyle  {\mathbb P}(\pi_\Omega(A))=\lim_n{\mathbb P}(\pi_\Omega(A_n))\ge K

as required. ⬜

Combining this result with the statement, in lemma 1, of measurable projection for sets in {\mathcal A_\delta} gives the measurable projection theorem.

Theorem 5 (Measurable Projection) Let {(\Omega,\mathcal F,{\mathbb P})} be a complete probability space, and {S\in\mathcal B({\mathbb R})\otimes\mathcal F}. Then, {\pi_\Omega(S)\in\mathcal F}.

Proof: By corollary 4, for each positive integer n, there is an {A_n\subseteq S} in {\mathcal A_\delta} with

\displaystyle  {\mathbb P}(\pi_\Omega(A_n))\ge{\mathbb P}^*(\pi_\Omega(S))-1/n. (8)

We know from lemma 1 that each {\pi_\Omega(A_n)} is measurable, so {A\equiv\bigcup_n\pi_\Omega(A_n)} is in {\mathcal F}, is contained in {\pi_\Omega(S)} and, by (8), satisfies {{\mathbb P}(A)={\mathbb P}^*(\pi_\Omega(S))}. Lemma 2 states that there is a {B\supseteq\pi_\Omega(S)} in {\mathcal F} satisfying {{\mathbb P}(B)={\mathbb P}^*(\pi_\Omega(S))}.

We have constructed sets {A\subseteq\pi_\Omega(S)\subseteq B} with A and B in {\mathcal F} satisfying {{\mathbb P}(A)={\mathbb P}(B)}. By definition, this means that {\pi_\Omega(S)} is in the completion of {\mathcal F} and, as the probability space is complete, it is in {\mathcal F}. ⬜

In a similar way, corollary 4 combined with the statement of measurable section for sets in {\mathcal A_\delta}, given by lemma 1, gives the measurable section theorem.

Theorem 6 (Measurable Section) Let {(\Omega,\mathcal F,{\mathbb P})} be a probability space and {S\in\mathcal B({\mathbb R})\otimes\mathcal F}. Then, there exists a measurable {\tau\colon\Omega\rightarrow{\mathbb R}\cup\{\infty\}}, such that {[\tau]\subseteq S} and {\pi_\Omega(S)\setminus\{\tau < \infty\}} is {{\mathbb P}}-null.

Proof: As in the proof of theorem 5, there is a sequence {A_n\subseteq S} in {\mathcal A_\delta} satisfying (8). Replacing {A_n} by {\bigcup_{m\le n}A_m} if necessary, we suppose that the sequence {A_n} is increasing (note that {\mathcal A_\delta} is closed under finite unions, as {\mathcal A} is). Let {\tau_n} be the debut of {A_n}, which, by lemma 1, is measurable with {[\tau_n]\subseteq A_n}. Define a random time {\tau} by,

\displaystyle  \tau(\omega)=\begin{cases} \tau_n(\omega),&{\rm\ for\ }\omega\in\pi_\Omega(A_n)\setminus\pi_\Omega(A_{n-1})\\ \infty,&{\rm\ for\ }\omega\in\Omega\setminus\bigcup_n\pi_\Omega(A_n) \end{cases}

(I am using {A_0=\emptyset}). This is measurable with graph {[\tau]} contained in S and,

\displaystyle  {\mathbb P}(\tau < \infty)={\mathbb P}\left(\bigcup\nolimits_n\pi_\Omega(A_n)\right)\ge{\mathbb P}^*(\pi_\Omega(S)).

By lemma 2, there exists {B\in\mathcal F} containing {\pi_\Omega(S)} with {{\mathbb P}(B)={\mathbb P}^*(\pi_\Omega(S))}. So, {B\setminus\{\tau < \infty\}} has zero probability and contains {\pi_\Omega(S)\setminus\{\tau < \infty\}}, which is {{\mathbb P}}-null as required. ⬜
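
The pasting construction in this proof can also be pictured with a small toy sketch (illustration only, reusing the brute-force debut from the earlier snippets): {\tau} uses the debut of {A_n} on the part of the projection that is first covered at stage n.

```python
# Toy sketch of the pasting in theorem 6: an increasing sequence A_1, A_2, A_3
# of finite sets of (t, omega) pairs, with tau(omega) given by the debut of
# the first A_n whose projection contains omega.
import math

Omega = ["a", "b", "c", "d"]
A = [set(),                                             # A_0 = empty set
     {(1.0, "a")},
     {(1.0, "a"), (2.0, "b")},
     {(1.0, "a"), (2.0, "b"), (0.5, "c")}]

def proj(S):
    return {w for (t, w) in S}

def debut(S, w):
    times = [t for (t, v) in S if v == w]
    return min(times) if times else math.inf

def tau(w):
    for n in range(1, len(A)):
        if w in proj(A[n]) and w not in proj(A[n - 1]):
            return debut(A[n], w)
    return math.inf

print({w: tau(w) for w in Omega})
# {'a': 1.0, 'b': 2.0, 'c': 0.5, 'd': inf}; the graph of tau lies inside A_3
```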

Finally, we state the theorem for complete probability spaces, in which case the section is defined on all of {\pi_\Omega(S)}, and not just up to a {{\mathbb P}}-null set.

Theorem 7 (Measurable Section) Let {(\Omega,\mathcal F,{\mathbb P})} be a complete probability space and {S\in\mathcal B({\mathbb R})\otimes\mathcal F}. Then, there exists a measurable {\tau\colon\Omega\rightarrow{\mathbb R}\cup\{\infty\}}, such that {[\tau]\subseteq S} and {\{\tau < \infty\}=\pi_\Omega(S)}.

Proof: By theorem 6 there exists a measurable map {\tau_0\colon\Omega\rightarrow{\mathbb R}\cup\{\infty\}} such that {[\tau_0]\subseteq S} and {\pi_\Omega(S)\setminus\{\tau_0 < \infty\}} is {{\mathbb P}}-null. Define {\tau\colon\Omega\rightarrow{\mathbb R}\cup\{\infty\}} by

\displaystyle  \displaystyle\tau(\omega)=\begin{cases} \tau_0(\omega),&{\rm\ if\ }\tau_0(\omega) < \infty,\\ \infty,&{\rm\ if\ }\omega\not\in\pi_\Omega(S),\\ t\in S(\omega),&{\rm\ if\ }\omega\in\pi_\Omega(S)\setminus\{\tau_0 < \infty\}. \end{cases}

Here, {S(\omega)} represents the slice of S defined as in (2). We do not care about which t is chosen in the third case but, as {S(\omega)} is nonempty on {\pi_\Omega(S)}, a choice does exist. By construction, {[\tau]\subseteq S}, {\{\tau < \infty\}=\pi_\Omega(S)}, and {\tau=\tau_0} almost surely. As {\tau_0} is measurable, completeness of the probability space implies that {\tau} is also measurable. ⬜

33 thoughts on “Proof of the Measurable Projection and Section Theorems”

  1. Hi the formalization of \tau(\omega) is a little confusing to me. Do you mean something like the following ?
    For \forall S \in \mathcal F \otimes \mathcal B (\mathcal R) we set :
    \tau : \pi_S(\Omega)\in \Omega \to \mathcal B (\mathbb R )
    \omega  \to \tau(\omega), such that (\omega,\tau(\omega) )\in S
    In which case the “measurability” of \tau is not completely clear to me.
    Moreover the graph seems also a little ambiguous as what arrows point to are (\omega, \tau(\omega)) and not \tau(\omega) (which are parts of \mathcal B (\mathbb R) unless mistaken).

    1. I don’t follow what you are saying. In the second paragraph, \tau is a map from \pi_\Omega(S) (which is a subset of \Omega) to \mathbb R. So, \tau(\omega)\in\mathbb R and (\omega,\tau(\omega))\in\Omega\times\mathbb R. You seem to be suggesting that \pi_\Omega(S) is an element of \Omega, rather than a subset, and that \tau(\omega) is a subset of \mathbb R instead of an element.

      1. On reflection, maybe you are not suggesting that \pi_\Omega(S) is an element of \Omega and it is just a typo in your comment. However, it still looks like you are suggesting that \tau(\omega) is a subset of \mathbb R, which is not the case.

        1. As you spotted, it’s a typo, I meant \subset instead of \in, sorry about that. Coming back to my point, maybe I was confused about this quote:
          “…hat is, if S is in \mathcal F\otimes\mathcal B({\mathbb R}) then there is a measurable section, \tau\colon\pi_\Omega(S)\rightarrow\Omega\times{\mathbb R}
          So shouldn’t you write instead (as I understand your answer to my comment) ?
          \tau\colon\pi_\Omega(S)\rightarrow \mathbb R
          My second point in this regard is pointless and you can delete my other post. I also note that you make clear a “language abuse” shortly after all this when you write :
          “For brevity, the statement (\omega,\tau(\omega))\in S above will also be expressed by writing \tau\in S.” But I think it’s a bit early at this stage in your post to use this convention.
          Last, let me correct you on one thing. It’s definitely not you who is happy to see me back (but I feel honored about that nevertheless, so thanks), it is me indeed who is happy to see more posts from you on this amazing blog… I have seen guys on the MO forum who do not dare to quote and refer to this blog in their papers, as it is not “OK” to refer to a blog, but who can’t find in the literature equivalent theorems claimed and proved in such a clear and self-contained manner… that says it all.

        2. Ah, I fixed the typo which caused your confusion, but it probably occurs elsewhere, so will fix properly later. I’ll also reread through and consider your suggestion regarding the notation when I have some time to properly edit this. Thanks!

        3. Last comment: maybe it would be simpler to switch the axes in your graph illustrating \tau, as it’s a function from \pi_\Omega (S)\subset \Omega to \mathbb R and not the other way around. Regards

        4. Rather than changing the graph, maybe it would be better to change the order of the Cartesian products throughout. \mathbb R\times\Omega instead of \Omega\times\mathbb R. That would be consistent with earlier stochastic calculus posts.

        5. Another point is the ambiguity in the notation of the sets S_n in the counterexample after equation (2), used to illustrate the missing property needed for the application of the MCT. In some cases it’s a set in \mathbb R (the S_n(\omega)) and shortly after it is a subset of \mathbb R \times \Omega, unless mistaken. Regards

        6. Sorry a few more remarks (I am reading your post very slowly as you can notice;-) ):
          -In the end I think that it would be nice to formalize the notion of “section” \tau by a fully fledged “definition 1”.

          -Using the S_n in your definition (2) of S_n(\omega) is a bit hard to follow for me, as it is easy to forget that it’s only a compact subset of \mathbb R when \omega is used and a subset of \mathbb R \times\Omega when \omega is dropped.

          -You say that a decreasing sequence of nonempty compact sets has nonempty intersection; that’s, unless mistaken, a theorem of Cantor, which could be worth mentioning to be self-contained: https://en.wikipedia.org/wiki/Cantor%27s_intersection_theorem

          -The end of the argumentation for the “compact” example could be detailed a little bit more, I think; I quote this part:
          “For each {\omega\in\bigcap_n\pi_\Omega(S_n)}, the slices {S_n(\omega)} give a decreasing sequence of nonempty compact sets, so has nonempty intersection. So, letting S be the intersection {\bigcap_n S_n}, the slice {S(\omega)=\bigcap_n S_n(\omega)} is nonempty. Hence, {\omega\in\pi_\Omega(S)}, and (1) follows.”
          So you proved that, for all \omega \in \bigcap_n \pi_\Omega(S_n), the slice of the intersection S={\bigcap_n S_n}, namely S(\omega), is nonempty (a subset of \mathbb R) in the first part, and this is OK for me. But then the fact that this proves that \omega \in \pi_\Omega(S) still needs a little more clarification, even if it might seem trivial to you. So for your last claim to be true, I think you need to prove the following property:
          For all nonempty S \in \mathcal B(\mathbb R)\otimes\mathcal F and \forall \omega \in \Omega, we have:
          S(\omega) \neq \emptyset \Leftrightarrow \omega \in \pi_\Omega(S).
          Proof:
          \Rightarrow: by definition of a slice, if it is not empty then, for t \in S(\omega), we have (t,\omega) \in S, so that \omega \in \pi_\Omega(S).
          \Leftarrow: let’s take a look at the “contrapositive” (i.e. non \Rightarrow): if \omega \not\in \pi_\Omega(S) then there is no t\in \mathbb R such that (t,\omega)\in S, so S(\omega)=\emptyset and we are done. End of proof. Does that seem ok to you?

        7. Hi, I am reading the proof of theorem 3 and I was wondering about the fact that maybe, in f_n(S_1,\ldots,S_n)=f^r_s(S_{\theta(r,1)},\ldots,S_{\theta(r,s)}), at a fixed r, some of the sets S_{\theta(r,i)}, i =1,\ldots, s might not be valid, in the sense that even though the sequence f^r_s is a winning strategy, it is only so for admissible sets S_s for the game \mathbb G_{A_r}. I don’t really see right now why this has to be the case. Regards

        8. Hi, I realized something quite trivial but still confusing about the conditions of applicability of theorem 3. If I(A) = 0 for all A \in \mathcal A then, by the properties of I, it is also true that I(A) = 0 for all A \in \bar {\mathcal A}. But then there exists no A \in \bar {\mathcal A} for which the conclusion of the theorem is applicable. I can’t figure out if that means that the theorem holds for such a case or not. My intuition is that it still holds, because if the implicit “if” at the beginning of the conclusion of the theorem is not fulfilled, the rest of the claim is vacuous, which means that the theorem still holds true in full generality. Another, less elegant, way is to discard such badly behaved collections \mathcal A in the conditions of the theorem. Regards

        9. Hi, discussing the proof with a friend, he showed me an elementary argument under the conditions of the capacitability theorem; here it is. First, \forall A \in \mathcal A \cap \mathcal C the assertion is trivial by taking a constant sequence equal to A. Now if A \in \bar {\mathcal A} is the limit of a decreasing sequence in \mathcal A then it is also trivial, as the sequence A_n is in \mathcal C and its intersection (i.e. its limit) is equal to A. Last, if A \in \bar {\mathcal A} is the limit of an increasing sequence in \mathcal A then, by continuity of the capacity I, for N big enough we have I(A_N)>K, so if we take the sequence {A^\prime_n =A_N} we are done unless mistaken, as we have exhibited a decreasing sequence of elements of \mathcal A, included in A and in \mathcal C. I think we might have missed something but we couldn’t see where we went wrong. Regards

        10. Hi again, I am now almost sure that it has to do with the definition of “closure under increasing and decreasing sequences”. In the argument above it is assumed that these are two properties considered one by one but not together, so that every set of the closure is the limit of a monotone sequence of sets in \mathcal A. Under this assumption I think that the closure is not an “idempotent” operator, which would be a bit odd (even if I don’t have an explicit counterexample). If we read the “and” as meaning that monotone sequences taken from the closure itself also have their limits in the closure, then it would be possible to define it as the minimal collection with the property that it includes \mathcal A and is “stable” under monotone sequences. Regards

        11. Yes, that is correct. The argument you gave does not apply to decreasing limits of increasing limits, or increasing limits of decreasing limits of increasing limits of sets in \mathcal A, etc. In fact, by results on the Borel Hierarchy of sets, if \mathcal A is the compact subsets of the reals, then it is not idempotent. In fact, the operation of taking increasing and decreasing limits does not stabilise until you get to the first uncountable ordinal.

        12. Thanks for your kind reply, I must confess that I begin to feel like a Russian troll farm on a reddit forum here …
          Anyway, your point then applies to my remark above for the case of I(A) = 0 for all A \in \mathcal A (which was not so lame after all): for it to be true for all A \in \bar{\mathcal A}, which I thought was simple, we would in fact need transfinite induction to get the result, as the monotone limits of sets A_n \in \mathcal A are not enough to conclude that I(A) = 0 for all A \in \bar{\mathcal A}. At last, I would be really pleased to have a reference that shows that we have strict inclusion between the collection of monotone limits of sets in \mathcal A and the closure \bar{\mathcal A}. Once again, best regards

  2. Dear George,
    thank you for the awesome blogpost. Since we can apply Theorem 5 for every probability measure in Theorem 6 we can actually say that the projection \pi(S) of S lies in the universal completion, i.e. the intersection of the completions w.r.t. all probability distributions. Do you know (or have any reference) if we then still have a universally measurable section (similar to Theorem 7)? If so could we then directly start with an S in the universal completion of the product space and then have a universally measurable projection and corresponding section?
    If you had any comments or pointers on this that would be great.

      1. Dear George,
        thank you very much for your answer.

        For everyone reading this I also found a reference for an even slightly more general version in Bogachev’s Measure Theory Corollary 6.9.12 (due to Leese):

        X a Souslin space (eg Polish), \Omega any measurable space and S an analytic/Souslin-B subset of \Omega \times X (eg any measurable set). Then:
        \pi(S) is Souslin-B in \Omega (thus universally measurable) and there exists a section that is measurable wrt the \sigma-algebra generated by Souslin-B sets (in particular universally measurable).

        It would be nice if one could get rid of all the topological conditions and just work with universally measurable spaces, maps and subsets.

        George, do you see any way to generalize further?

        Thank you for your great blog.

        1. Hi Rudolph. Regarding the more general versions of measurable section, there are a few points worth mentioning. First, allowing X to be Souslin is not much more general, as there will exist an onto Borel map f\colon\mathbb R\to X, allowing you to reduce the problem to the case X=\mathbb R. Similarly, if S is Souslin-B, then it will be the projection of a measurable B\subseteq\Omega\times X\times\mathbb R. This allows you to transfer the problem to the Borel set B. Then, as above, you can replace the Souslin space X\times\mathbb R by \mathbb R. So, extending from measurable subsets of \Omega\times\mathbb R to Souslin-B subsets of \Omega\times X for Souslin spaces X is not a big step.

          However, the fact that the section can be chosen measurable with respect to the sigma-algebra generated by the Souslin-B sets does seem to be a significant strengthening compared to the sigma-algebra of universally measurable sets. I do not know of any applications of this though.

          The suggestion in your first comments that maybe S could be taken in the universal completion of \mathcal F\otimes\mathcal B(\mathbb R) does seem unlikely to be true. This is significantly harder to deal with than measurable or Souslin-B sets. Would the measurable section result hold even for S assumed to be the complement of a Souslin-B set, for example? I doubt it, but constructing counterexamples is difficult. The uncountable axiom of choice would probably be needed — at least if the base space \Omega is the reals together with the Borel sigma-algebra — as there are versions of set theory in which countable dependent choice holds but every subset of the reals is universally measurable (Solovay model). Unfortunately non-universally measurable sets (Vitali sets) constructed using the AOC are going to be difficult to describe as projections of universally measurable sets, precisely because they were constructed using the AOC. Actually, I would not be surprised if such statements turn out to be dependent on the underlying logical axioms used, and may be independent of ZFC or of ZF+dependent choice.

        2. In fact, the answer to the following math.stackexchange question is relevant here, “Which sets are lebesgue measurable in ZFC?”.
          In particular “From 𝖬𝖠 (plus ¬𝖢𝖧) it follows that every \Sigma^1_2 set of reals is Lebesgue measurable”. This suggests that, to show that the projection of a co-analytic set is Lebesgue measurable, requires Martin’s Axiom and that the Continuum Hypothesis is false.

          Also, you cannot go much further up the projective hierarchy without requiring theories which are stronger than ZFC — “Nevertheless, you cannot go much further by restricting to 𝖹𝖥𝖢 consistencywise: Shelah showed that the measurability of the \Sigma^1_3 sets implies the existence of inaccessible cardinals in 𝐿.”

  3. Dear George Lowther,

    Excellent post, interesting stuff. I have a suggestion on the arrangement of the proof of theorem 3 (Capacitability Theorem). As you correctly mention after it, the problem with the decreasing sequences can be dealt with if we are more strict and demand that the choice of B_r depend on B_{r-1}. In order to model this we make the following definitions:
    moves := \{f\colon \mathcal{C} \rightarrow \mathcal{C} \colon f(C) \subseteq C {\rm\ for\ all\ } C \in \mathcal{C}\}
    \mathcal{B} := \{D \subseteq X\colon \exists \{f_n\}_{n \geq 1} \subseteq moves such that, for every \{C_n\}_{n \geq 1} \subseteq \mathcal{C} with C_1 \subseteq D and C_n \subseteq f_{n - 1}(C_{n - 1}) for all n \geq 2, \exists \{A_n\}_{n \geq 1} \subseteq \mathcal{A} with C_n \subseteq A_n for all n \geq 1 and \bigcap_{n \geq 1} A_n \subseteq D\}.

    The rest of the proof can easily be transformed analogously with what you do; I think this approach makes it a bit tidier, what do you think?

      1. Dear George,

        You see, the problem with the natural approach, as you mentioned, was specifically that when we merge the doubly-indexed sequence A_{nk} into a singly-indexed one with the help of the function \theta, the new sequence stops being decreasing. But with the definition that I gave for \mathcal{B} we can do the same process just fine for the corresponding functions f_{nk} in order to construct the new sequence f_n that we need for the set intersection. The reason of course is that there is no decreasing requirement for the sequences \{f_n\}_{n \geq 1} that appear in the definition of \mathcal{B}.

        1. Dear George,

          Sorry for the consecutive comments but I didn’t have time to write everything in one go. You mention after the proof of theorem 3 that we can begin our attempts by defining \mathcal{B} := \{A \subseteq X\colon \exists \{A_n\}_{n \geq 1} \subseteq \mathcal{A} \cap \mathcal{C} with \bigcap_{n \geq 1}A_n \subseteq A\}. But this definition is problematic from the start, because there is no guarantee that \mathcal{A} \setminus \mathcal{C} \subseteq \mathcal{B}. We need a condition in the form of <> or equivalent <>. Thinking like this we quickly arrive at the following intuitive definition: \mathcal{B} := \{A \subseteq X\colon for every decreasing \{C_n\}_{n \geq 1} \subseteq \mathcal{C} with C_1 \subseteq A, there exists \{A_n\}_{n \geq 1} such that C_n \subseteq A_n for all n \geq 1 and \bigcap_{n \geq 1}A_n \subseteq A\}. Unfortunately this doesn’t work either, though here the problem lies in the increasing sequences instead of the decreasing ones. The usual merging process works fine with the decreasing case; however, in the increasing case we want something to guarantee that the subsequence \{C_n\}_{n \geq 1} of the union falls fast enough so that we can see it as a subsequence of one of its members. This is the purpose of the functions \{f_n\}_{n \geq 1} in my first comment, they are “speed” conditions.

          P.S.
          There is a typo in the definition of \mathcal{B} in my first comment, the sequence \{C_n\}_{n \geq 1} should be a subset of \mathcal{C} \cup \{X\} instead of just \mathcal{C}.

      2. Dear George,

        It seems I made a mistake in the use of latex in my last comment; it would be nice if you could fix it so it would be readable. Also, in the P.S. I confused the notation: what I wanted to say is that \{A_n\}_{n \geq 1} should be a subset of \mathcal{A} \cup \{X\}.
