The Rademacher distribution is probably the simplest nontrivial probability distribution that you can imagine. This is a discrete distribution taking only the two possible values $\{1,-1\}$, each occurring with equal probability. A random variable X has the Rademacher distribution if
$$\mathbb{P}(X=1)=\mathbb{P}(X=-1)=1/2.$$
A Rademacher sequence is an IID sequence of Rademacher random variables,
$$X=(X_1,X_2,X_3,\ldots).$$
Recall that the partial sums $S_N=\sum_{n=1}^NX_n$ of a Rademacher sequence form a simple random walk. Generalizing a bit, we can consider scaling by a sequence of real weights $a_1,a_2,\ldots$, so that $S_N=\sum_{n=1}^Na_nX_n$. I will concentrate on infinite sums, as N goes to infinity, which will clearly include the finite Rademacher sums as the subset with only finitely many nonzero weights.
Rademacher series serve as simple prototypes of more general IID series, but also have applications in various areas. Results include concentration and anti-concentration inequalities, and the Khintchine inequality, which imply various properties of $L^p$ spaces and of linear maps between them. For example, in my notes constructing the stochastic integral starting from a minimal set of assumptions, the $L^0$ version of the Khintchine inequality was required. Rademacher series are also interesting in their own right, and a source of some very simple statements which are nevertheless quite difficult to prove, some of which are still open problems. See, for example, Some explorations on two conjectures about Rademacher sequences by Hu, Lan and Sun. As I would like to look at some of these problems in the blog, I include this post to outline the basic constructions.
One intriguing aspect of Rademacher series is the way that they mix discrete distributions, with their combinatorial aspects, and continuous distributions. On the one hand, by the central limit theorem, Rademacher series can often be approximated well by a Gaussian distribution but, on the other hand, they depend on the discrete set of signs of the individual variables in the sum.
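In order to guarantee that the series are convergent, we will restrict the weights $a=(a_1,a_2,\ldots)$ to lie in the space $\ell^2$ of square summable (real) sequences. That is, $\sum_na_n^2$ is finite. As is standard, $\ell^2$ is a Hilbert space with norm and inner product,
$$\lVert a\rVert_2=\Big(\sum_{n=1}^\infty a_n^2\Big)^{1/2},\qquad\langle a,b\rangle=\sum_{n=1}^\infty a_nb_n.$$
The series will be denoted using 'dot product' notation, $a\cdot X=\sum_{n=1}^\infty a_nX_n$, which is indeed a convergent sum. Throughout, I am assuming that we have an underlying probability space $(\Omega,\mathcal F,\mathbb P)$ on which the required random variables are defined.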
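Lemma 1 Let X be a Rademacher sequence and $a\in\ell^2$. Then, the series
$$a\cdot X=\sum_{n=1}^\infty a_nX_n$$
converges both in $L^2$ and almost surely. Furthermore, $a\cdot X$ is a square integrable random variable with mean zero and variance $\lVert a\rVert_2^2$.
Proof: Convergence in $L^2$ is straightforward. If we write $S_N=\sum_{n=1}^Na_nX_n$ for the partial sums then these have zero mean and, using independence,
$$\mathbb E\big[(S_N-S_M)^2\big]=\sum_{n=M+1}^Na_n^2$$
for $N\ge M$. This vanishes as M, N go to infinity, so the sequence is $L^2$-Cauchy and hence converges to a limit in $L^2$. Similarly,
$$\mathbb E\big[S_N^2\big]=\sum_{n=1}^Na_n^2\to\lVert a\rVert_2^2$$
as N goes to infinity so, by $L^2$ convergence, $a\cdot X$ has zero mean and variance $\lVert a\rVert_2^2$.
It remains to show that $S_N\to a\cdot X$ almost surely. As we know that it converges in $L^2$ and, hence, in probability, it is sufficient to show that $S_N$ is almost surely Cauchy convergent.
We already know that $S_N$ converges almost surely, from martingale convergence. However, I give an alternative proof using the simpler Doob $L^2$-inequality. For each positive integer N, this says that
$$\Big\lVert\sup_{n\ge N}\lvert S_n-S_N\rvert\Big\rVert_2\le2\sup_{n\ge N}\lVert S_n-S_N\rVert_2=2\Big(\sum_{n>N}a_n^2\Big)^{1/2}\to0$$
as N goes to infinity. Hence, for positive integer L,
$$\Big\lVert\sup_{m,n\ge L}\lvert S_m-S_n\rvert\Big\rVert_2\le2\Big\lVert\sup_{n\ge L}\lvert S_n-S_L\rvert\Big\rVert_2\to0$$
as L goes to infinity. As $\sup_{m,n\ge L}\lvert S_m-S_n\rvert$ is a decreasing sequence with $L^2$ norm going to zero, it converges almost surely to zero, so $S_N$ is almost surely Cauchy convergent. ⬜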
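As a quick sanity check of the mean and variance claims in lemma 1, the following sketch enumerates every sign pattern of a finite partial sum and computes its moments exactly. The weights $a_n=1/n$, the truncation at ten terms and the function name `partial_sum_moments` are my own illustrative choices, not part of the argument above.

```python
from fractions import Fraction
from itertools import product

def partial_sum_moments(weights):
    """Exact mean and variance of S_N = sum_n a_n X_n over all 2^N sign patterns."""
    weights = [Fraction(w) for w in weights]
    n_patterns = 2 ** len(weights)
    values = [
        sum(w * s for w, s in zip(weights, signs))
        for signs in product((1, -1), repeat=len(weights))
    ]
    mean = sum(values) / n_patterns
    variance = sum(v * v for v in values) / n_patterns - mean ** 2
    return mean, variance

# Illustrative weights a_n = 1/n, truncated at N = 10 (an assumption for the demo).
weights = [Fraction(1, n) for n in range(1, 11)]
mean, variance = partial_sum_moments(weights)
sum_of_squares = sum(w * w for w in weights)
print(mean, variance == sum_of_squares)
```

Exact rational arithmetic is used so that the comparison of the variance with $\sum_{n\le N}a_n^2$ is an equality rather than a floating point approximation.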
As $a\cdot X$ has zero mean and variance $\lVert a\rVert_2^2$, we immediately obtain that $a\mapsto a\cdot X$ is an isometry.
Corollary 2 For a Rademacher sequence X, the map
$$\ell^2\to L^2,\qquad a\mapsto a\cdot X$$
is an isometry.
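If $a$ has only finitely many nonzero terms, then $a\cdot X$ is clearly a finite Rademacher sum, so has a discrete distribution supported by a finite subset of $\mathbb R$. On the other hand, if $a$ has infinitely many nonzero terms then it is also intuitively clear that $a\cdot X$ must have zero probability of equalling any given real value, and this is not hard to prove.
Lemma 3 For a Rademacher sequence X and $a\in\ell^2$ with infinitely many nonzero terms, the distribution of $a\cdot X$ is continuous. That is, $\mathbb P(a\cdot X=x)=0$ for all $x\in\mathbb R$.
Proof: Suppose to the contrary that $p=\mathbb P(a\cdot X=x)>0$. Note that, for any positive integer n, flipping the sign of $X_n$ does not impact the distribution of $a\cdot X$, but it changes its value by $-2a_nX_n$. Therefore, $\mathbb P(a\cdot X\in\{x-2a_n,x+2a_n\})\ge p$. As $a_n$ tends to zero, but has infinitely many nonzero terms, we can find a subsequence $a_{n_k}$ with distinct nonzero absolute values, so that the sets $\{x-2a_{n_k},x+2a_{n_k}\}$ are disjoint. Hence,
$$1\ge\sum_{k=1}^\infty\mathbb P\big(a\cdot X\in\{x-2a_{n_k},x+2a_{n_k}\}\big)\ge\sum_{k=1}^\infty p=\infty,$$
a contradiction. ⬜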
A more difficult question is whether a Rademacher series has an absolutely continuous distribution — that is, whether it has a density with respect to the Lebesgue measure. Alternatively, can it have a singular distribution, so is supported on a set of zero Lebesgue measure? In fact, as the following example shows, both behaviours occur.
Example 1 For a Rademacher sequence X and $a\in\ell^2$,
- if $a_n=2^{-n}$ then $a\cdot X$ has the uniform distribution on $[-1,1]$, so is absolutely continuous.
- if $a_n=3^{-n}$ then $a\cdot X$ is uniformly distributed on the Cantor middle thirds set on $[-1/2,1/2]$, so is singular.
For the first example with $a_n=2^{-n}$ then $(1+X_n)/2$ is an IID sequence, each taking the values $\{0,1\}$ with probability 1/2. So,
$$\frac{1+a\cdot X}{2}=\sum_{n=1}^\infty2^{-n}\,\frac{1+X_n}{2}$$
which is a binary expansion in which each digit independently takes each of the values 0 and 1 with probability 1/2, so is uniformly distributed on the unit interval.
The Cantor middle thirds set on the unit interval can be defined as the set of real numbers in $[0,1]$ which have a ternary (base 3) expansion containing only the digits 0 and 2. This is well known to have zero Lebesgue measure, so any distribution supported on this set is singular. For the second example with $a_n=3^{-n}$, then $1+X_n$ is an IID sequence taking each of the values 0 and 2 with probability 1/2. So,
$$\frac12+a\cdot X=\sum_{n=1}^\infty3^{-n}(1+X_n)$$
is a ternary expansion where each digit independently has value 0 or 2, both with probability 1/2. This is the uniform distribution on the Cantor set.
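The two expansions in example 1 can be checked on finite truncations. The sketch below enumerates all sign patterns of the first N terms: for weights $2^{-n}$ the shifted and halved sums should hit every point $k/2^N$ exactly once, and for weights $3^{-n}$ the shifted sums should have only the ternary digits 0 and 2. The truncation level N = 6 and the helper names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def shifted_partial_sums(base, n_terms):
    """Sorted values of sum_{n<=N} base^{-n} (1 + X_n) over all sign patterns."""
    return sorted(
        sum(Fraction(1 + s, base ** n) for n, s in enumerate(signs, start=1))
        for signs in product((1, -1), repeat=n_terms)
    )

def ternary_digits(k, n_digits):
    """The n_digits lowest base-3 digits of the nonnegative integer k."""
    digits = []
    for _ in range(n_digits):
        k, d = divmod(k, 3)
        digits.append(d)
    return digits

N = 6  # truncation level, chosen only to keep the enumeration small

# Weights 2^{-n}: halving the shifted sum gives an N-digit binary expansion.
binary = [v / 2 for v in shifted_partial_sums(2, N)]
is_uniform_grid = binary == [Fraction(k, 2 ** N) for k in range(2 ** N)]

# Weights 3^{-n}: the shifted sum is an N-digit ternary expansion with digits 0 or 2.
ternary = shifted_partial_sums(3, N)
only_0_and_2 = all(
    (v * 3 ** N).denominator == 1
    and set(ternary_digits(int(v * 3 ** N), N)) <= {0, 2}
    for v in ternary
)
print(is_uniform_grid, only_0_and_2, len(set(ternary)) == 2 ** N)
```

The first check is the finite version of the uniform distribution on the unit interval, and the second shows that the truncated sums land on the endpoints used in the N-th stage of the Cantor set construction.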
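The characteristic function of a Rademacher series is straightforward to compute.
Lemma 4 The Rademacher series $a\cdot X$ has characteristic function
$$\mathbb E\big[e^{i\lambda a\cdot X}\big]=\prod_{n=1}^\infty\cos(\lambda a_n)\qquad(1)$$
which is valid for all $\lambda\in\mathbb C$.
Proof: Letting $S_N=\sum_{n=1}^Na_nX_n$ then, using independence,
$$\mathbb E\big[e^{i\lambda S_N}\big]=\prod_{n=1}^N\mathbb E\big[e^{i\lambda a_nX_n}\big]=\prod_{n=1}^N\cos(\lambda a_n).$$
We only need to show that we can commute the limit $N\to\infty$ with the expectation on the LHS. In the case that $\lambda$ is real, this is bounded convergence. More generally, when $\lambda$ is complex, we need to show that the sequence $e^{i\lambda S_N}$ is uniformly integrable. We use the following simple inequality, valid for real $\mu$,
$$\mathbb E\big[e^{\mu S_N}\big]\le e^{\frac12\mu^2\lVert a\rVert_2^2}.$$
I prove this in lemma 5 below. In particular, taking $\mu=-2\,{\rm Im}(\lambda)$ shows that $e^{i\lambda S_N}$ is $L^2$-bounded, so is uniformly integrable. ⬜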
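For a finite set of weights, the product formula for the characteristic function, $\mathbb E[e^{i\lambda S_N}]=\prod_{n\le N}\cos(\lambda a_n)$, can be compared directly with the expectation computed by brute force. This is only a numerical sketch: the weights $a_n=1/n$, the test values of $\lambda$ (including one complex value) and the function names are all my own choices.

```python
import cmath
from itertools import product

def char_fn_by_enumeration(weights, lam):
    """E[exp(i*lam*S_N)], averaging over all 2^N equally likely sign patterns."""
    total = 0j
    for signs in product((1, -1), repeat=len(weights)):
        s = sum(w * x for w, x in zip(weights, signs))
        total += cmath.exp(1j * lam * s)
    return total / 2 ** len(weights)

def char_fn_by_product(weights, lam):
    """Product of cos(lam * a_n); cmath.cos also accepts complex lam."""
    result = 1 + 0j
    for w in weights:
        result *= cmath.cos(lam * w)
    return result

weights = [1 / n for n in range(1, 9)]  # illustrative weights a_n = 1/n, N = 8
gaps = []
for lam in (0.7, 2.5, 1.0 + 0.5j):
    lhs = char_fn_by_enumeration(weights, lam)
    rhs = char_fn_by_product(weights, lam)
    gaps.append(abs(lhs - rhs))
    print(lam, gaps[-1])
```

The two computations should agree up to floating point rounding, for complex as well as real $\lambda$.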
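In the proof above, in order to take the limit when $\lambda$ is not real, we made use of the following inequality.
Lemma 5 For $\lambda\in\mathbb R$, the Rademacher series $a\cdot X$ satisfies the inequality,
$$\mathbb E\big[e^{\lambda a\cdot X}\big]\le e^{\frac12\lambda^2\lVert a\rVert_2^2}.\qquad(2)$$
Proof: Using $S_N=\sum_{n=1}^Na_nX_n$, independence gives,
$$\mathbb E\big[e^{\lambda S_N}\big]=\prod_{n=1}^N\mathbb E\big[e^{\lambda a_nX_n}\big]=\prod_{n=1}^N\cosh(\lambda a_n).$$
However, $\cosh(x)\le e^{x^2/2}$ (for example, compare power series terms). So,
$$\mathbb E\big[e^{\lambda S_N}\big]\le\prod_{n=1}^Ne^{\frac12\lambda^2a_n^2}\le e^{\frac12\lambda^2\lVert a\rVert_2^2}.$$
Letting N go to infinity and applying Fatou’s lemma on the left gives the result. ⬜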
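For completeness, here is one way to carry out the power series comparison behind the bound $\cosh(x)\le e^{x^2/2}$ used in the proof of lemma 5. It is a standard argument, written out as a sketch rather than taken from the proof above.

```latex
\begin{aligned}
\cosh(x) &= \sum_{k=0}^\infty \frac{x^{2k}}{(2k)!},
\qquad
e^{x^2/2} = \sum_{k=0}^\infty \frac{x^{2k}}{2^k\,k!},\\
(2k)! &= k!\,(k+1)(k+2)\cdots(2k) \;\ge\; k!\,2^k
\quad\Longrightarrow\quad
\frac{x^{2k}}{(2k)!} \le \frac{x^{2k}}{2^k\,k!}\ \text{ for every } k\ge 0 .
\end{aligned}
```

Each of the k factors $(k+1),\ldots,(2k)$ is at least 2 when $k\ge1$, which gives the middle inequality, and summing over k gives the bound.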
Just as an aside, computing the characteristic function of the uniform distribution in example 1 and comparing with (1) gives the following interesting infinite product trigonometric relation,
$$\frac{\sin\lambda}{\lambda}=\prod_{n=1}^\infty\cos\big(2^{-n}\lambda\big).$$
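The relation obtained here, $\sin(\lambda)/\lambda=\prod_{n\ge1}\cos(\lambda/2^n)$, is easy to check numerically. In the sketch below the infinite product is truncated at 60 factors, which is an arbitrary choice made so that the neglected factors are indistinguishable from 1 in double precision for the moderate values of $\lambda$ used.

```python
import math

def cosine_product(lam, n_terms=60):
    """Truncated product of cos(lam / 2^n) for n = 1..n_terms."""
    result = 1.0
    for n in range(1, n_terms + 1):
        result *= math.cos(lam / 2 ** n)
    return result

for lam in (0.5, 1.0, 3.0, 10.0):
    print(lam, cosine_product(lam), math.sin(lam) / lam)
```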
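As noted above, the distribution of a Rademacher series can be approximated by a Gaussian. Denoting the supremum norm by $\lVert a\rVert_\infty=\sup_n\lvert a_n\rvert$, then this approximation works best when $\lVert a\rVert_\infty$ is small in comparison to $\lVert a\rVert_2$. I use the notation $N(\mu,\sigma^2)$ for the Gaussian distribution of mean $\mu$ and variance $\sigma^2$.
Lemma 6 Let $a^1,a^2,\ldots$ be sequences in $\ell^2$ with $\lVert a^n\rVert_\infty\to0$ and $\lVert a^n\rVert_2\to\sigma\ge0$. Then, the distributions of the Rademacher series $a^n\cdot X$ converge in distribution to $N(0,\sigma^2)$.
Proof: I will make use of the standard result that a sequence of probability measures converges (weakly) in distribution to a limit if and only if the characteristic functions converge. Using $\varphi_n(\lambda)=\mathbb E[e^{i\lambda a^n\cdot X}]$ for the characteristic function, we have
$$\varphi_n(\lambda)=\prod_{m=1}^\infty\cos(\lambda a^n_m).$$
In particular, if $\lvert\lambda\rvert\lVert a^n\rVert_\infty<\pi/2$ then each of the terms on the right is positive, and we can take logarithms,
$$\log\varphi_n(\lambda)=-\frac12\lambda^2\lVert a^n\rVert_2^2+\sum_{m=1}^\infty\Big(\log\cos(\lambda a^n_m)+\frac12\lambda^2(a^n_m)^2\Big).$$
By Taylor expansion, the terms inside the summation are of order $\lambda^4(a^n_m)^4$. That is, there are positive reals $K$ and $\epsilon<\pi/2$ such that the terms are bounded by $K\lambda^4(a^n_m)^4$ whenever $\lvert\lambda a^n_m\rvert\le\epsilon$. So, for $\lvert\lambda\rvert\lVert a^n\rVert_\infty\le\epsilon$,
$$\Big\lvert\log\varphi_n(\lambda)+\frac12\lambda^2\lVert a^n\rVert_2^2\Big\rvert\le K\lambda^4\sum_{m=1}^\infty(a^n_m)^4\le K\lambda^4\lVert a^n\rVert_\infty^2\lVert a^n\rVert_2^2.$$
Hence, if we let $\lVert a^n\rVert_\infty$ go to zero and $\lVert a^n\rVert_2$ go to $\sigma$, then $\varphi_n(\lambda)$ tends to the normal characteristic function $e^{-\frac12\sigma^2\lambda^2}$. ⬜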
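In the opposite direction, when $\lVert a\rVert_\infty/\lVert a\rVert_2$ is large, then the Rademacher series is affected by a small number of relatively large terms, so behaves more like a discrete distribution rather than a Gaussian. As $\lVert a\rVert_\infty\le\lVert a\rVert_2$, the extreme case is $\lVert a\rVert_\infty=\lVert a\rVert_2$, in which case $a$ has only a single nonzero term and $a\cdot X$ takes the values $\pm\lVert a\rVert_2$, each with probability $1/2$.
One direct consequence of lemma 6 is the following special case of the central limit theorem. I use $\xrightarrow{d}$ to denote convergence in distribution.
Corollary 7 If X is a Rademacher sequence and $a_1,a_2,\ldots$ is a bounded sequence of real numbers satisfying $\sum_na_n^2=\infty$ then, using $s_N=\big(\sum_{n=1}^Na_n^2\big)^{1/2}$,
$$\frac1{s_N}\sum_{n=1}^Na_nX_n\xrightarrow{d}N(0,1)$$
as N goes to infinity.
Proof: As the sequence is bounded, suppose that $\lvert a_n\rvert\le K$. For each N large enough that $s_N>0$, define $b^N\in\ell^2$ by $b^N_n=a_n/s_N$ for $n\le N$ and $b^N_n=0$ otherwise. Then, $b^N\cdot X=s_N^{-1}\sum_{n=1}^Na_nX_n$. Furthermore, $\lVert b^N\rVert_2=1$ and $\lVert b^N\rVert_\infty\le K/s_N$, which vanishes as N goes to infinity. So, lemma 6 gives $b^N\cdot X\xrightarrow{d}N(0,1)$. ⬜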
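To see the convergence of corollary 7 in the simplest case, take all weights equal to 1, so that the normalized sum is $S_N/\sqrt N$ for a simple random walk $S_N$. Its distribution function can be computed exactly from binomial coefficients and compared with the standard normal one. The grid of evaluation points and the function names below are illustrative assumptions.

```python
import math

def walk_cdf(n_steps, x):
    """Exact P(S_N / sqrt(N) <= x) for S_N = X_1 + ... + X_N (all weights 1)."""
    # S_N = 2*H - N, where H ~ Binomial(N, 1/2) counts the +1 signs.
    max_heads = math.floor((x * math.sqrt(n_steps) + n_steps) / 2)
    max_heads = min(max_heads, n_steps)
    favourable = sum(math.comb(n_steps, h) for h in range(0, max_heads + 1))
    return favourable / 2 ** n_steps

def normal_cdf(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def max_gap(n_steps, grid=(-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0)):
    """Largest difference between the two distribution functions on the grid."""
    return max(abs(walk_cdf(n_steps, x) - normal_cdf(x)) for x in grid)

for n_steps in (10, 100, 1000):
    print(n_steps, max_gap(n_steps))
```

The gap shrinks roughly like $1/\sqrt N$, which is the size of the individual atoms of the discrete distribution.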
Above, in lemma 1, we showed that square summability of the sequence is sufficient to guarantee convergence of the Rademacher series. This is, in fact, also a necessary condition. If the sequence is not square summable, then the Rademacher series will diverge in probability and, hence, also diverge almost surely.
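Lemma 8 If X is a Rademacher sequence and $a_1,a_2,\ldots$ is a sequence of real numbers satisfying $\sum_na_n^2=\infty$ then, for each $c>0$,
$$\mathbb P\Big(\Big\lvert\sum_{n=1}^Na_nX_n\Big\rvert\le c\Big)\to0$$
as N goes to infinity.
Proof: Write $S_N=\sum_{n=1}^Na_nX_n$ and let $s_N=\big(\sum_{n=1}^Na_n^2\big)^{1/2}$. In the case that $\lvert a_n\rvert$ is bounded by some value K then we will apply corollary 7, so let Z be a standard normal random variable (on some probability space). Then, for each $\epsilon>0$, using the condition that $s_N\to\infty$, we have $c/s_N\le\epsilon$ for large N so,
$$\liminf_{N\to\infty}\mathbb P\big(\lvert S_N\rvert>c\big)\ge\liminf_{N\to\infty}\mathbb P\big(\lvert S_N/s_N\rvert>\epsilon\big)\ge\mathbb P\big(\lvert Z\rvert>\epsilon\big).$$
Letting $\epsilon$ go to zero, the right hand side goes to one, giving the result.
It remains to prove the result when $a_n$ is an unbounded sequence, in which case we can apply a similar argument to that used in lemma 3. Then there exists a subsequence $a_{n_k}$ satisfying $\lvert a_{n_{k+1}}\rvert>\lvert a_{n_k}\rvert+c$ for all $k$. For each $n_k\le N$, flipping the sign of $X_{n_k}$ does not impact the distribution of $S_N$, but it shifts its value by $-2a_{n_k}X_{n_k}$ so, if $\lvert S_N\rvert\le c$ then it will be shifted to lie in the set $A_k=\{x\in\mathbb R\colon\big\lvert\lvert x\rvert-2\lvert a_{n_k}\rvert\big\rvert\le c\}$. Hence,
$$\mathbb P(S_N\in A_k)\ge\mathbb P\big(\lvert S_N\rvert\le c\big).$$
The sets $A_k$ are disjoint so,
$$1\ge\sum_{n_k\le N}\mathbb P(S_N\in A_k)\ge\mathbb P\big(\lvert S_N\rvert\le c\big)\sum_{n_k\le N}1.$$
As the sum on the right tends to infinity when N goes to infinity, we have $\mathbb P(\lvert S_N\rvert\le c)\to0$. ⬜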
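The divergence in probability can be seen explicitly in the simplest non square summable case, where every weight equals 1 and the partial sums form a simple random walk. The sketch below computes the exact probability that the walk lies within a fixed distance of zero after N steps; the bound 2 and the step counts are arbitrary choices for illustration.

```python
import math

def prob_within(n_steps, c):
    """Exact P(|S_N| <= c) for the simple random walk S_N = X_1 + ... + X_N."""
    # S_N = 2*H - N with H ~ Binomial(N, 1/2), so |S_N| <= c
    # exactly when (N - c)/2 <= H <= (N + c)/2.
    lo = max(0, math.ceil((n_steps - c) / 2))
    hi = min(n_steps, math.floor((n_steps + c) / 2))
    favourable = sum(math.comb(n_steps, h) for h in range(lo, hi + 1))
    return favourable / 2 ** n_steps

for n_steps in (4, 100, 10_000):
    print(n_steps, prob_within(n_steps, 2))
```

The probabilities decrease towards zero at the rate $1/\sqrt N$ suggested by the Gaussian approximation.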
Usually, we are concerned with the distribution of a Rademacher series $a\cdot X$, rather than its particular representation as a random variable on a specific probability space. However, different values for $a$ can lead to the same distribution. For example, flipping the signs of some elements of $a$ or permuting its elements has no effect on the distribution of $a\cdot X$. This is because applying such operations has the same effect on $a\cdot X$ as applying the operations to the terms of X which, again, do not affect its distribution. Alternatively, we can see that the characteristic function (1) is unaffected by such operations.
I will use $\ell^2_+$ to denote the sequences $a\in\ell^2$ which are nonnegative and decreasing,
$$\ell^2_+=\big\{a\in\ell^2\colon a_1\ge a_2\ge a_3\ge\cdots\ge0\big\}.$$
Any $a\in\ell^2$ can be put in standard form by flipping the sign of any negative terms and then arranging in decreasing order. Using $a^*$ for this standard form, then $a^*\in\ell^2_+$ and the terms $a^*_n$ are the same as $\lvert a_n\rvert$ when arranged in decreasing order. Also,
$$a^*\cdot X\overset{d}{=}a\cdot X$$
where $\overset{d}{=}$ denotes equality in distribution. So, when looking at the distributions of Rademacher series, we can always assume that the weights are in $\ell^2_+$. Once we do this, then there is no remaining degeneracy.
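For finitely many weights, the reduction to standard form and the resulting equality in distribution can be verified exactly. In this sketch, `standard_form` and `distribution` are hypothetical helper names and the example weights are arbitrary; the distribution is computed by enumerating all sign patterns.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def standard_form(weights):
    """Absolute values of the weights, arranged in decreasing order."""
    return sorted((abs(w) for w in weights), reverse=True)

def distribution(weights):
    """Exact law of the finite Rademacher sum, as a map value -> probability."""
    counts = Counter(
        sum(w * s for w, s in zip(weights, signs))
        for signs in product((1, -1), repeat=len(weights))
    )
    total = 2 ** len(weights)
    return {value: Fraction(count, total) for value, count in counts.items()}

weights = [Fraction(-1, 3), Fraction(1, 2), Fraction(-1, 5), Fraction(1, 3)]
print(standard_form(weights))
print(distribution(weights) == distribution(standard_form(weights)))
```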
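Lemma 9 Let X be a Rademacher sequence. Then, any $a\in\ell^2_+$ is uniquely determined by the distribution of $a\cdot X$.
Proof: From expression (1) for the characteristic function $\varphi(\lambda)=\mathbb E[e^{i\lambda a\cdot X}]$, we can express $a_n$ in terms of $\varphi$ and $a_1,\ldots,a_{n-1}$, so that the result follows by induction on n. Factoring
$$\varphi(\lambda)=\varphi_n(\lambda)\prod_{m=1}^{n-1}\cos(\lambda a_m),\qquad\varphi_n(\lambda)=\prod_{m=n}^\infty\cos(\lambda a_m),$$
then $a_n$ is equal to $\pi/(2\lambda_n)$ with $\lambda_n$ equal to the smallest positive zero of $\varphi_n$, or is equal to zero if $\varphi_n$ is everywhere nonzero. ⬜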
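The first step of this recovery can be tried numerically: for weights in decreasing order, the largest weight is $\pi/(2\lambda_1)$, where $\lambda_1$ is the smallest positive zero of the product of cosines. The sketch below uses a simple grid scan followed by bisection. The weights, step size and tolerance are illustrative assumptions, and the scan relies on a sign change, so it would miss a zero of even multiplicity (for example when the two largest weights are equal).

```python
import math

def char_fn(weights, lam):
    """Product of cos(lam * a_n) over a finite list of weights."""
    result = 1.0
    for w in weights:
        result *= math.cos(lam * w)
    return result

def smallest_positive_zero(f, step=1e-3, upper=100.0, tol=1e-12):
    """First sign change of f on (0, upper], refined by bisection.

    Assumes f(0) > 0. Returns None if no sign change is found."""
    left = 0.0
    while left < upper:
        right = left + step
        if f(right) <= 0.0:
            # Here f(left) > 0 >= f(right), so a zero lies in (left, right].
            while right - left > tol:
                mid = 0.5 * (left + right)
                if f(mid) > 0.0:
                    left = mid
                else:
                    right = mid
            return 0.5 * (left + right)
        left = right
    return None

weights = [0.5, 0.3, 0.1]  # example weights, already in decreasing order
lam_1 = smallest_positive_zero(lambda lam: char_fn(weights, lam))
recovered_a1 = math.pi / (2 * lam_1)
print(lam_1, recovered_a1)
```

Recovering the later terms would require dividing out the factors already found, which is numerically delicate near their zeros, so only the first term is attempted here.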
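Another benefit of restricting to $\ell^2_+$ is that we can bound the individual terms of the sequence.
Lemma 10 If $a\in\ell^2_+$ then
$$a_n\le n^{-1/2}\lVert a\rVert_2.$$
Proof: As $a_m\ge a_n$ for $m\le n$, we have
$$na_n^2\le\sum_{m=1}^na_m^2\le\lVert a\rVert_2^2.$$
⬜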
Now consider the map from a sequence $a\in\ell^2$ to the distribution of its Rademacher series. Using $\mathcal P(\mathbb R)$ to denote the set of Borel probability measures on $\mathbb R$,
$$\ell^2\to\mathcal P(\mathbb R),\qquad a\mapsto\mu_a,\qquad\mu_a(A)=\mathbb P(a\cdot X\in A).$$
We can use convergence in distribution (weak topology) on $\mathcal P(\mathbb R)$, under which a sequence of measures $\mu_n$ tends to a limit $\mu$ iff $\int f\,d\mu_n\to\int f\,d\mu$ for all continuous bounded functions $f\colon\mathbb R\to\mathbb R$. Restricting the domain to $\ell^2_+$, lemma 9 says that $a\mapsto\mu_a$ is one-to-one. By corollary 2, it is also continuous with respect to the norm topology on $\ell^2$ and the weak topology on $\mathcal P(\mathbb R)$. However, the norm topology is too strong for many purposes. For example, in the context of lemma 6, the sequence $a^n$ is not norm convergent when $\sigma>0$, but does converge weakly to zero. I will finish this post by considering how we can extend the idea of Rademacher series to naturally incorporate limits such as that given by lemma 6 into the domain of the map.
The weak topology on $\ell^2$, by definition, is the weakest topology making the maps $a\mapsto\langle a,b\rangle$ continuous for each fixed $b\in\ell^2$. In particular, a sequence $a^n\in\ell^2$ converges weakly to the limit $a$ if and only if $\langle a^n,b\rangle\to\langle a,b\rangle$ for each $b\in\ell^2$. On any bounded set, it can be seen that weak convergence is equivalent to the pointwise convergence of the terms in the sequences (or, the product topology).
Looking again at lemma 6, we see that the sequence $a^n$ is weakly convergent, but that $a^n\cdot X$ converges in distribution to a Gaussian. This suggests completing the space of Rademacher series distributions by adding in a Gaussian term. I do this as follows. First, let $\bar\ell^2_+$ consist of the pairs $(a,\sigma)$ for $a\in\ell^2_+$ and $\sigma\in\mathbb R$ with $\sigma\ge\lVert a\rVert_2$. Weak convergence in this space is the same as weak convergence separately for $a$ and $\sigma$, which is the same as pointwise convergence of $\sigma$ and the terms of $a$. Next, let X be a Rademacher sequence and Y be an independent normal random variable with zero mean and unit variance. Then, define the map
$$\bar\ell^2_+\to\mathcal P(\mathbb R),\qquad(a,\sigma)\mapsto\mu_{a,\sigma}=\text{the distribution of }a\cdot X+\sqrt{\sigma^2-\lVert a\rVert_2^2}\,Y.\qquad(3)$$
In the case that $\sigma=\lVert a\rVert_2$, this is just the distribution of the Rademacher series $a\cdot X$. However, we are allowing an additional normal term by taking $\sigma>\lVert a\rVert_2$, and it is straightforward to see that $\mu_{a,\sigma}$ has variance $\sigma^2$.
Lemma 11 The map $(a,\sigma)\mapsto\mu_{a,\sigma}$ is weakly continuous and one-to-one, and is a weak homeomorphism onto its image.
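Proof: Using (1), we can compute the characteristic function,
$$\varphi_{a,\sigma}(\lambda)=\int e^{i\lambda x}\,d\mu_{a,\sigma}(x)=e^{-\frac12(\sigma^2-\lVert a\rVert_2^2)\lambda^2}\prod_{m=1}^\infty\cos(\lambda a_m).$$
To see that this is one-to-one, we need to show that $(a,\sigma)$ can be recovered from knowledge of the characteristic function. This is straightforward: $\sigma^2$ is equal to the variance, and $a$ can be recovered in exactly the same way as in the proof of lemma 9, since the Gaussian factor has no zeros.
We next show continuity. So, suppose that $(a^n,\sigma_n)$ converges weakly to $(a,\sigma)$. Writing $\varphi_n=\varphi_{a^n,\sigma_n}$ and $\varphi=\varphi_{a,\sigma}$, we need to show that $\varphi_n(\lambda)\to\varphi(\lambda)$ for each fixed $\lambda\in\mathbb R$. First, as it converges, $\sigma_n$ is bounded above by some $K>0$. Then, by lemma 10, $a^n_m\le m^{-1/2}\lVert a^n\rVert_2\le m^{-1/2}K$ so, fixing positive $\epsilon<\pi/2$, for sufficiently large $M$ we have $\lvert\lambda\rvert a^n_m\le\epsilon$ for all n and all $m\ge M$. As $\cos(\lambda a^n_m)$ is strictly positive over $m\ge M$, we can express the characteristic function as
$$\varphi_n(\lambda)=e^{-\frac12\big(\sigma_n^2-\sum_{m<M}(a^n_m)^2\big)\lambda^2}\prod_{m<M}\cos(\lambda a^n_m)\,\exp\Big(\sum_{m\ge M}\Big(\log\cos(\lambda a^n_m)+\frac12\lambda^2(a^n_m)^2\Big)\Big).$$
The terms on the right hand side manifestly converge as $n\to\infty$, except for the term in the exponential where we need to commute the limit with the infinite sum. As $\log\cos(x)+\frac12x^2$ is of order $x^4$, it is bounded by $cx^4$ over $\lvert x\rvert\le\epsilon$ for some positive $c$. So, the terms inside the sum on the right hand side are bounded by $c\lambda^4K^4m^{-2}$, which has finite sum over $m$. Hence, by dominated convergence, we can commute the limit with the infinite sum and we obtain
$$\varphi_n(\lambda)\to e^{-\frac12\big(\sigma^2-\sum_{m<M}a_m^2\big)\lambda^2}\prod_{m<M}\cos(\lambda a_m)\,\exp\Big(\sum_{m\ge M}\Big(\log\cos(\lambda a_m)+\frac12\lambda^2a_m^2\Big)\Big)=\varphi(\lambda)$$
as required.
It just remains to show that the inverse of $(a,\sigma)\mapsto\mu_{a,\sigma}$ is weakly continuous. For each fixed $K>0$, the space of $(a,\sigma)\in\bar\ell^2_+$ with $\sigma\le K$ is compact (by Tychonoff’s theorem) so, restricting to such bounded sets, the result is immediate: every continuous one-to-one map from a compact space to a Hausdorff space has continuous inverse. Hence, we just need to show that if $(a^n,\sigma_n)$ is a sequence such that $\mu_{a^n,\sigma_n}$ converges weakly, then $\sigma_n$ is a bounded sequence. I will use the fact that if $\lambda_n$ is a sequence tending to zero then $\varphi_n(\lambda_n)\to1$.
First, we can show that $a^n_1$ is a bounded sequence. If not then, by passing to a subsequence, we can suppose that $a^n_1\to\infty$. Setting $\lambda_n=\pi/(2a^n_1)$ gives $\varphi_n(\lambda_n)=0$, a contradiction. Next, we show that $\sigma_n$ is bounded. If not then, by passing to a subsequence, we can suppose that $\sigma_n\to\infty$. Taking $\lambda_n=1/\sigma_n$ then $\lambda_na^n_m\le\lambda_na^n_1<\pi/2$ for large n. Using the inequality $\cos(x)\le e^{-x^2/2}$ over $\lvert x\rvert\le\pi/2$, we obtain the contradiction
$$\varphi_n(\lambda_n)\le e^{-\frac12(\sigma_n^2-\lVert a^n\rVert_2^2)\lambda_n^2}\prod_{m=1}^\infty e^{-\frac12\lambda_n^2(a^n_m)^2}=e^{-\frac12\sigma_n^2\lambda_n^2}=e^{-1/2}<1.$$
⬜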
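In fact, for any $(a,\sigma)\in\bar\ell^2_+$ (that is, $a\in\ell^2_+$ and $\sigma\ge\lVert a\rVert_2$), it is not difficult to construct a sequence $a^n\in\ell^2_+$ converging weakly to $a$ such that $\lVert a^n\rVert_2\to\sigma$. Then, $(a^n,\lVert a^n\rVert_2)$ tends weakly to $(a,\sigma)$ and, by the above result,
$$\mu_{a^n}=\mu_{a^n,\lVert a^n\rVert_2}\to\mu_{a,\sigma}.$$
Hence, the definition above given for $\mu_{a,\sigma}$ is the unique continuous extension of the Rademacher series distribution to all of $\bar\ell^2_+$. Furthermore, we see that the set of measures
$$\big\{\mu_{a,\sigma}\colon(a,\sigma)\in\bar\ell^2_+\big\}$$
is the weak closure of the distributions of Rademacher series, and is weakly homeomorphic to $\bar\ell^2_+$.
One benefit of using the map (3) together with the weak topology is that the unit ball of $\ell^2$ (and, hence, of $\ell^2_+$) is weakly compact. Using $B=\{a\in\ell^2_+\colon\lVert a\rVert_2\le1\}$ to denote the unit ball, we obtain a continuous map
$$B\to\mathcal P(\mathbb R),\qquad a\mapsto\mu_{a,1}.$$
This is equal to the distribution of $a\cdot X$ when $\lVert a\rVert_2=1$, and is a weak homeomorphism onto its image.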