The aim of this post is to motivate the idea of representing probability spaces as states on a commutative algebra. We will consider how this abstract construction relates directly to classical probabilities.
In the standard axiomatization of probability theory, due to Kolmogorov, the central construct is a probability space $(\Omega,\mathcal F,\mathbb P)$. This consists of a state space $\Omega$, an event space $\mathcal F$, which is a sigma-algebra of subsets of $\Omega$, and a probability measure $\mathbb P$. The measure is defined as a map $\mathbb P\colon\mathcal F\to\mathbb R^+$ satisfying countable additivity and normalised as $\mathbb P(\Omega)=1$.
A measure space allows us to define integrals of real-valued measurable functions or, in the language of probability, expectations of random variables. We construct the set $b\mathcal F$ of all bounded measurable functions $X\colon\Omega\to\mathbb R$. This is a real vector space and, as it is closed under multiplication, is an algebra. Expectation, by definition, is the unique linear map $b\mathcal F\to\mathbb R$, $X\mapsto\mathbb E[X]$, satisfying $\mathbb E[1_A]=\mathbb P(A)$ for $A\in\mathcal F$ and monotone convergence: if $X_n\in b\mathcal F$ is a nonnegative sequence increasing to a bounded limit $X$, then $\mathbb E[X_n]$ tends to $\mathbb E[X]$.
In the opposite direction, any nonnegative linear map $p\colon b\mathcal F\to\mathbb R$ satisfying monotone convergence and $p(1)=1$ defines a probability measure by $\mathbb P(A)=p(1_A)$. This is the unique measure with respect to which expectation agrees with the linear map, $\mathbb E[X]=p(X)$. So, probability measures are in one-to-one correspondence with such linear maps, and they can be viewed as one and the same thing. The Kolmogorov definition of a probability space can be thought of as representing the expectation on the subset of $b\mathcal F$ consisting of indicator functions $1_A$. In practice, it is often more convenient to start with a different subset of $b\mathcal F$. For example, probability measures $\mu$ on $\mathbb R^+$ can be defined via their Laplace transform, $\mathcal L_\mu(a)=\int e^{-ax}\,d\mu(x)$, which represents the expectation on exponential functions $x\mapsto e^{-ax}$. Generalising to complex-valued random variables, probability measures on $\mathbb R$ are often represented by their characteristic function $\varphi(a)=\int e^{iax}\,d\mu(x)$, which is just the expectation of the complex exponentials $x\mapsto e^{iax}$. In fact, by the monotone class theorem, we can uniquely represent probability measures on $(\Omega,\mathcal F)$ by the expectations on any subset $\mathcal K\subseteq b\mathcal F$ which is closed under taking products and generates the sigma-algebra $\mathcal F$.
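The correspondence between measures and expectation maps can be sketched on a finite state space, where monotone convergence is automatic. The die example and the names `expectation`, `indicator` and `measure_from` are illustrative assumptions, not part of the original discussion.

```python
from fractions import Fraction

# Hypothetical finite example: a fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]
prob = {w: Fraction(1, 6) for w in omega}

def expectation(X):
    """Linear map sending a function X on omega to its expectation."""
    return sum(X(w) * prob[w] for w in omega)

def indicator(A):
    """Indicator function 1_A of an event A."""
    return lambda w: 1 if w in A else 0

def measure_from(p):
    """Recover the measure of events from a linear map p via P(A) = p(1_A)."""
    return lambda A: p(indicator(A))

P = measure_from(expectation)
even = {2, 4, 6}
print(P(even))        # 1/2
print(P(set(omega)))  # 1
```

Here the measure is recovered from the linear map alone, which is the sense in which the two descriptions carry the same information.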
A simple corollary of the monotone class theorem states that there is a one-to-one correspondence between sigma-algebras $\mathcal F$ on a set $\Omega$ and algebras of bounded functions $\Omega\to\mathbb R$ closed under monotone convergence, with the correspondence given by $\mathcal F\mapsto b\mathcal F$.
On the other hand, in quantum mechanics, we start with a Hilbert space $\mathcal H$, and observables are represented as self-adjoint operators. Restricting our consideration to bounded observables, these generate a subalgebra of $B(\mathcal H)$, the space of bounded linear maps on $\mathcal H$. A pure state is represented by an element $\xi\in\mathcal H$ normalised so that $\lVert\xi\rVert=1$, and the expectation of an observable $A$ is $\langle\xi,A\xi\rangle$. This is a nonnegative linear map from a sub-algebra of $B(\mathcal H)$ to $\mathbb C$.
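As a small illustration of the quantum setting, the sketch below takes the Hilbert space to be $\mathbb C^2$, so that observables are 2x2 self-adjoint matrices. The Pauli matrices and the particular unit vector are assumed examples, and `matmul` and `expect` are hypothetical helper names.

```python
# 2x2 matrices as nested lists; all names here are illustrative only.
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expect(xi, A):
    """<xi, A xi>, with the inner product conjugate-linear in its first slot."""
    A_xi = [sum(A[i][j] * xi[j] for j in range(2)) for i in range(2)]
    return sum(xi[i].conjugate() * A_xi[i] for i in range(2))

sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]
xi = [1 + 0j, 0j]  # a unit vector in C^2

# The two observables do not commute.
print(matmul(sigma_x, sigma_z) == matmul(sigma_z, sigma_x))  # False

# Expectation of an observable, and of the square of an observable.
print(expect(xi, sigma_z))
print(expect(xi, matmul(sigma_x, sigma_x)))
```

The expectation of a squared self-adjoint operator is $\lVert A\xi\rVert^2$, which is why the map is nonnegative.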
All of this suggests that it would be useful to consider an alternative approach to probability. Instead of a measurable space $(\Omega,\mathcal F)$, we have an algebra $\mathcal A$. Instead of a probability measure $\mathbb P$, we have a positive linear map from $\mathcal A$ to $\mathbb R$ or $\mathbb C$. The underlying state space $\Omega$ is not required at all; it is a pointless approach to probability, as we no longer include the points in the representation of the probability space. As multiplication of real (and complex) numbers is commutative, $xy=yx$, algebras of the form $b\mathcal F$ are commutative. Hence, classical probability spaces will correspond to commutative algebras, with the generalisation to non-commutative algebras incorporating quantum probability.
As this post is primarily intended to motivate the algebraic approach to probability, rather than go into technical details, I will not give proofs of all theorems quoted here and, instead, will refer to the literature. We start with the definition of an algebra.
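Definition 1 Let $K$ be a field. Then, a $K$-algebra, or algebra over $K$, is a $K$-vector space $\mathcal A$ equipped with a binary product, $(a,b)\mapsto ab$, and identity element $1\in\mathcal A$ satisfying the following for all $a,b,c\in\mathcal A$.
- Associativity: $(ab)c=a(bc)$.
- Compatibility with scalars: $\lambda(ab)=(\lambda a)b=a(\lambda b)$ for all $\lambda\in K$.
- Distributivity: $a(b+c)=ab+ac$ and $(a+b)c=ac+bc$.
- Identity: $1a=a1=a$.
If, furthermore, $ab=ba$ for all $a,b\in\mathcal A$ then the algebra is said to be commutative.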
Strictly speaking, this defines a unital associative algebra. Sometimes, the axiom of associativity is dropped, although I do not look at such non-associative algebras here. Similarly, the existence of the identity is sometimes dropped along with its corresponding axiom. In this post, whenever the unqualified term ‘algebra’ is used, it refers to a structure satisfying definition 1, so is unital. Also, I will use the symbol $1$ to denote the identity element. This creates some ambiguity as to whether an expression of the form $1a$ refers to multiplication by the identity element or by the scalar $1\in K$. However, as they both evaluate to $a$, it should not cause any confusion.
A subset $S\subseteq\mathcal A$ is called commutative if $ab=ba$ for all $a,b\in S$. In particular, the algebra itself is commutative if and only if $\mathcal A$ is commutative as a set of elements. It is also easy to show that the sub-algebra of $\mathcal A$ generated by a commutative set $S$ (i.e., the smallest subalgebra containing $S$) is itself commutative. Note that this means that the subalgebra generated by a single element is commutative.
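The last remark can be checked concretely in the non-commutative algebra of 2x2 integer matrices: polynomials in a single matrix commute with each other, even though two unrelated matrices need not. This is only a sketch, and the matrices and helper names are arbitrary choices made for illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_eval(coeffs, A):
    """Evaluate c0 + c1 A + c2 A^2 + ... using Horner's rule."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c  # add c times the identity
    return result

a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]

f_a = poly_eval([1, 0, 2], a)      # 1 + 2 a^2
g_a = poly_eval([3, -1, 0, 1], a)  # 3 - a + a^3

print(matmul(f_a, g_a) == matmul(g_a, f_a))  # True
print(matmul(a, b) == matmul(b, a))          # False
```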
Examples of algebras abound in mathematics. A small set of examples is:
- Polynomial rings $K[X_1,\ldots,X_n]$ are commutative $K$-algebras.
- For a set $S$, the collection of functions $S\to K$ is a commutative $K$-algebra, where the operations of addition, scalar multiplication, and multiplication are defined point-wise.
- For a measurable space $(\Omega,\mathcal F)$, the collection of $\mathcal F$-measurable functions $\Omega\to\mathbb R$ is an $\mathbb R$-algebra.
- For a normed real vector space $V$, the collection of bounded linear maps $V\to V$ is an $\mathbb R$-algebra.
We define the notion of a state on a commutative real algebra.
Definition 2 Let $\mathcal A$ be a commutative real algebra. Then, a linear map $p\colon\mathcal A\to\mathbb R$ is
- positive if $p(a^2)\ge0$ for all $a\in\mathcal A$.
- a state if it is positive and $p(1)=1$.
Correspondence with classical probabilities
As discussed above, a classical probability space determines a commutative real algebra, consisting of the bounded random variables, and a state on this algebra given by expectation. The question is, can this process be inverted? When can a state $p$ on a commutative real algebra $\mathcal A$ be represented as an expectation on a set of random variables on some probability space? We start by considering a single element $a\in\mathcal A$. This defines a map $\mathbb R[X]\to\mathcal A$ taking any polynomial $f$ to its evaluation $f(a)$, and the image is the sub-algebra generated by $a$. For $a$ to be considered as a random variable on a probability space, its distribution must be a probability measure $\mu$ on $\mathbb R$ satisfying
$$p(f(a))=\int f(x)\,d\mu(x)\qquad (1)$$
for all polynomials $f\in\mathbb R[X]$. By linearity, (1) holds whenever it holds on monomials $X^n$. That is, we require
$$p(a^n)=\int x^n\,d\mu(x)$$
for all positive integers $n$ and, for this to make sense, $\mu$ must have finite moments. This is the classical moment problem, to construct a probability measure from its moments. In the one factor case, it is known that the positivity of $p$ ensures existence of a solution.
Theorem 3 Let $p$ be a state on a commutative real algebra $\mathcal A$. Then, for any $a\in\mathcal A$, there exists a probability measure $\mu$ on $\mathbb R$ satisfying (1).
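To see what positivity means in terms of moments, write $g=\sum_i c_iX^i$, so that $p(g(a)^2)=\sum_{i,j}c_ic_j\,p(a^{i+j})$ is a quadratic form in the Hankel matrix of moments. The sketch below evaluates this form for an assumed finitely supported measure, and for an assumed sequence that cannot be a moment sequence. The helper names are hypothetical.

```python
from fractions import Fraction

def moments_of(points, weights, max_order):
    """Moments m_k = sum_i w_i x_i^k of a finitely supported measure."""
    return [sum(w * x**k for x, w in zip(points, weights))
            for k in range(max_order + 1)]

def state_on_square(moments, coeffs):
    """p(g(a)^2) for g = sum_i c_i X^i, using p(a^k) = moments[k]."""
    return sum(ci * cj * moments[i + j]
               for i, ci in enumerate(coeffs)
               for j, cj in enumerate(coeffs))

# Assumed example: atoms at -1, 0, 2 with weights 1/2, 1/4, 1/4.
m = moments_of([-1, 0, 2], [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)], 6)
print(state_on_square(m, [1, -2, 0, 1]) >= 0)  # True

# m_0 = 1, m_1 = 2, m_2 = 1 would need variance 1 - 4 < 0,
# and positivity fails on g = 2 - X.
bad = [1, 2, 1]
print(state_on_square(bad, [2, -1]))  # -3
```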
The existence of a measure with specified moments is known as the Hamburger problem. Unfortunately, uniqueness need not hold, as there do exist distinct probability measures on $\mathbb R$ with the same moments. As an example, consider the log-normal distribution on the nonnegative reals, and a perturbation of this,
$$d\mu(x)=\frac{1}{x\sqrt{2\pi}}e^{-\frac12(\log x)^2}\,dx,\qquad d\nu(x)=\bigl(1+\sin(2\pi\log x)\bigr)\,d\mu(x).$$
These measures have all the same moments,
$$\int x^n\,d\mu(x)=\int x^n\,d\nu(x)=e^{n^2/2},$$
and therefore generate the same state on the algebra $\mathbb R[X]$. On the other hand, it is not difficult to show that the distribution of a bounded random variable is uniquely determined by its moments. This follows from the Stone–Weierstrass theorem, which states that the polynomials are dense in the space of continuous functions on any closed bounded interval. Furthermore, the distribution will be supported by an interval $[-K,K]$ if $p(a^{2n})\le K^{2n}$ for all positive $n$. It is possible to relax this condition to bounds on the growth of the moments, such as Carleman’s condition (2).
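The equality of moments can be checked numerically. The sketch below assumes the standard form of the example, with $\mu$ the standard log-normal distribution and $d\nu(x)=(1+\sin(2\pi\log x))\,d\mu(x)$, and substitutes $x=e^z$ so that the integrals are taken against the standard normal density. The integration range, step count and function names are choices made here for illustration.

```python
import math

def integrate(func, lo, hi, steps):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps):
        total += func(lo + i * h)
    return total * h

def normal_density(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def moment(n, perturbed):
    """n-th moment of exp(Z) under mu (perturbed=False) or nu (perturbed=True)."""
    def integrand(z):
        weight = 1 + math.sin(2 * math.pi * z) if perturbed else 1.0
        return math.exp(n * z) * weight * normal_density(z)
    return integrate(integrand, -12.0, 16.0, 28000)

# Columns: n, moment under mu, moment under nu, exp(n^2 / 2).
for n in range(4):
    print(n, moment(n, False), moment(n, True), math.exp(n * n / 2))
```

The three numerical columns should agree to many digits, since the sine term integrates to zero against each shifted Gaussian.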
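Theorem 4 Let $p$ be a state on a commutative real algebra $\mathcal A$. If $a\in\mathcal A$ satisfies
$$\sum_{n=1}^\infty p(a^{2n})^{-\frac{1}{2n}}=\infty\qquad (2)$$
then there exists a unique probability measure $\mu$ on $\mathbb R$ satisfying (1).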
This result goes back to T. Carleman, Les fonctions quasi analytiques, Gauthier–Villars, Paris, 1926. A proof of this result, and also of the Hamburger problem, is given in the lecture notes, The classical moment problem, by Sasha Sodin, 2019. See Theorem 3.1 and Corollary 2.12.
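As a rough numerical illustration of Carleman's condition, the sketch below compares partial sums of $\sum_n m_{2n}^{-1/(2n)}$ for two assumed moment sequences: the standard normal, with $m_{2n}=(2n)!/(2^nn!)$, and the standard log-normal, with $m_{2n}=e^{2n^2}$. Logarithms of the moments are used to avoid overflow, and the function names are hypothetical.

```python
import math

def carleman_partial_sum(log_even_moment, terms):
    """Sum over n = 1..terms of m_{2n}^(-1/(2n)), given n -> log m_{2n}."""
    return sum(math.exp(-log_even_moment(n) / (2 * n))
               for n in range(1, terms + 1))

def log_gaussian_even_moment(n):
    # Standard normal: m_{2n} = (2n)! / (2^n n!)
    return math.lgamma(2 * n + 1) - n * math.log(2) - math.lgamma(n + 1)

def log_lognormal_even_moment(n):
    # Standard log-normal: m_{2n} = exp((2n)^2 / 2) = exp(2 n^2)
    return 2.0 * n * n

# Columns: number of terms, Gaussian partial sum, log-normal partial sum.
for terms in (10, 100, 1000, 10000):
    print(terms,
          carleman_partial_sum(log_gaussian_even_moment, terms),
          carleman_partial_sum(log_lognormal_even_moment, terms))
```

The Gaussian partial sums keep growing, roughly like the square root of the number of terms, consistent with (2). The log-normal terms are $e^{-n}$, so the sum converges to $1/(e-1)$ and the condition fails, in line with the non-uniqueness example.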
Moving to the multifactor situation, where we have a sequence $a_1,\ldots,a_n\in\mathcal A$, the aim is to find a probability measure $\mu$ on $\mathbb R^n$ satisfying
$$p(f(a_1,\ldots,a_n))=\int f(x)\,d\mu(x)\qquad (3)$$
for all polynomials $f\in\mathbb R[X_1,\ldots,X_n]$ and, as in the single factor case, $\mu$ must have finite moments for this to make sense. Unlike the single factor case, this is not always possible, so theorem 3 does not generalise to $n>1$. The reason for this is that there exist multifactor polynomials which are positive on all of $\mathbb R^n$, yet cannot be expressed as a sum of squares. Consider
$$f(X,Y)=X^4Y^2+X^2Y^4-3X^2Y^2+1.$$
The AM-GM inequality shows that $f\ge0$ everywhere on $\mathbb R^2$. However, it is not possible to express $f$ as $\sum_ig_i^2$ for a finite sequence of polynomials $g_i$. This means that the definition of positivity for a state $p$ on $\mathbb R[X,Y]$ is insufficient to ensure that $p(f)\ge0$, and the hyperplane separation theorem implies the existence of states with $p(f)<0$. No such state can arise from the expectation under a probability measure.
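The nonnegativity claim is easy to probe numerically. The sketch below assumes the polynomial in question is the Motzkin polynomial $X^4Y^2+X^2Y^4-3X^2Y^2+1$, for which AM-GM applied to the three terms $x^4y^2$, $x^2y^4$ and $1$ gives a lower bound of $3x^2y^2$. The grid is an arbitrary choice, and a grid search is of course not a proof.

```python
def motzkin(x, y):
    return x**4 * y**2 + x**2 * y**4 - 3 * x**2 * y**2 + 1

# Evaluate on a grid over [-3, 3]^2 with spacing 0.05.
grid = [i / 20 for i in range(-60, 61)]
lowest = min(motzkin(x, y) for x in grid for y in grid)

print(lowest)         # smallest value found on the grid
print(motzkin(1, 1))  # 0
```

The minimum value of zero is attained at the points $(\pm1,\pm1)$, where the three terms in the AM-GM bound are equal.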
Fortunately, if sufficient bounds are imposed on the growth of the moments $p(a_1^{r_1}\cdots a_n^{r_n})$, then it is possible to show that a unique measure $\mu$ exists satisfying (3). Again, in the case that $p(a_i^{2r})\le K^{2r}$ for all $i$ and $r$ and some real $K$, then the Stone–Weierstrass theorem can be used to show uniqueness of $\mu$, which must be supported on $[-K,K]^n$, with the Riesz representation theorem providing existence. These conditions can be weakened considerably and, in fact, it is known that Carleman’s condition for each of the individual elements $a_i$ is sufficient to guarantee existence and uniqueness.
Theorem 5 Let $p$ be a state on a commutative real algebra $\mathcal A$. If $a_1,\ldots,a_n\in\mathcal A$ each satisfy Carleman’s condition (2), then there exists a unique probability measure $\mu$ on $\mathbb R^n$ satisfying (3).
This result originates from Nussbaum, A. E., Quasi-analytic vectors, Arkiv för Matematik. 6 (1965), no. 2, 179–191.
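Taking the idea a step further, we can consider infinite subsets $\{a_i\}_{i\in I}$ of $\mathcal A$. Let $\mathbb R^I$ be the space of functions $x\colon I\to\mathbb R$ and denote the coordinate map $X_i\colon\mathbb R^I\to\mathbb R$, $X_i(x)=x_i$. Let $\mathcal F$ be the sigma-algebra on $\mathbb R^I$ generated by $\{X_i\}_{i\in I}$. That is, $\mathcal F$ is the smallest sigma-algebra on $\mathbb R^I$ with respect to which each $X_i$ is measurable. In particular, it is generated by the sets $X_i^{-1}(S)$ for Borel $S\subseteq\mathbb R$. The collection $\{X_i\}_{i\in I}$ generates an algebra of random variables, which can be expressed as real polynomials $\mathbb R[X_i\colon i\in I]$ in $X_i$. Evaluating the polynomials at the values $X_i=a_i$ gives an algebra homomorphism $\mathbb R[X_i\colon i\in I]\to\mathcal A$, $f\mapsto f(a)$. The aim is to find a probability measure $\mu$ on $(\mathbb R^I,\mathcal F)$ satisfying
$$p(f(a))=\int f\,d\mu\qquad (4)$$
for all $f\in\mathbb R[X_i\colon i\in I]$ which, to make sense, requires $f$ to be integrable.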
Theorem 6 Let $p$ be a state on a commutative real algebra $\mathcal A$, and let $\{a_i\}_{i\in I}\subseteq\mathcal A$ each satisfy Carleman’s condition (2). Then, there exists a unique probability measure $\mu$ on $(\mathbb R^I,\mathcal F)$ satisfying (4).
If we choose $\{a_i\}_{i\in I}$ to be a generating set for $\mathcal A$, so that the smallest subalgebra of $\mathcal A$ containing every $a_i$ is all of $\mathcal A$, then we obtain a representation of $\mathcal A$ as an algebra of random variables on a probability space together with the expectation operator.
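Proof: For each finite subset $J\subseteq I$, theorem 5 uniquely determines a probability measure $\mu_J$ on $\mathbb R^J$ satisfying
$$p(f(a))=\int f\,d\mu_J\qquad (5)$$
for all polynomials $f\in\mathbb R[X_j\colon j\in J]$.
Define the projection map $\pi_J\colon\mathbb R^I\to\mathbb R^J$ by $\pi_J(x)_j=x_j$ for all $x\in\mathbb R^I$ and $j\in J$. For a probability measure $\mu$ on $(\mathbb R^I,\mathcal F)$, the pushforward measure $\pi_J^*\mu$ on $\mathbb R^J$ is defined by $(\pi_J^*\mu)(S)=\mu(\pi_J^{-1}(S))$. As every polynomial involves only finitely many of the $X_i$, condition (4) is then,
$$p(f(a))=\int f\,d(\pi_J^*\mu)$$
for all finite $J\subseteq I$ and $f\in\mathbb R[X_j\colon j\in J]$ or, equivalently, $\pi_J^*\mu=\mu_J$. The measures $\mu_J$ are consistent, since for finite $J\subseteq J^\prime$ the projection of $\mu_{J^\prime}$ onto $\mathbb R^J$ satisfies (5) and so equals $\mu_J$ by uniqueness. Existence and uniqueness of $\mu$ follow from Kolmogorov’s extension theorem. ⬜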
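The pushforward under a coordinate projection, and the consistency between projections that Kolmogorov's extension theorem relies on, can be sketched for a measure with finite support. The coin-flip measure and the function name `pushforward` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def pushforward(mu, J):
    """Pushforward of a measure on tuples under projection onto coordinates J."""
    out = {}
    for x, mass in mu.items():
        key = tuple(x[j] for j in J)
        out[key] = out.get(key, 0) + mass
    return out

# Hypothetical measure on {0,1}^3: three independent fair coin flips.
mu = {x: Fraction(1, 8) for x in product((0, 1), repeat=3)}

mu_01 = pushforward(mu, (0, 1))  # marginal on coordinates 0 and 1
mu_0 = pushforward(mu, (0,))     # marginal on coordinate 0

# Consistency: projecting in two stages agrees with projecting directly.
print(pushforward(mu_01, (0,)) == mu_0)  # True
```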