The Rademacher distribution is probably the simplest nontrivial probability distribution that you can imagine. This is a discrete distribution taking only the two possible values $\{1,-1\}$, each occurring with equal probability. A random variable X has the Rademacher distribution if
$$\mathbb P(X=1)=\mathbb P(X=-1)=1/2.$$
A Rademacher sequence is an IID sequence of Rademacher random variables,
$$X=(X_1,X_2,X_3,\ldots).$$
Recall that the partial sums $S_N=\sum_{n=1}^NX_n$ of a Rademacher sequence form a simple random walk. Generalizing a bit, we can consider scaling by a sequence of real weights $a_1,a_2,a_3,\ldots$, so that $S_N=\sum_{n=1}^Na_nX_n$. I will concentrate on infinite sums, as N goes to infinity, which will clearly include the finite Rademacher sums as the subset with only finitely many nonzero weights.
Rademacher series serve as simple prototypes of more general IID series, but also have applications in various areas. Results include concentration and anti-concentration inequalities, and the Khintchine inequality, which imply various properties of $L^p$ spaces and of linear maps between them. For example, in my notes constructing the stochastic integral starting from a minimal set of assumptions, the $L^0$ version of the Khintchine inequality was required. Rademacher series are also interesting in their own right, and a source of some very simple statements which are nevertheless quite difficult to prove, some of which are still open problems. See, for example, Some explorations on two conjectures about Rademacher sequences by Hu, Lan and Sun. As I would like to look at some of these problems in the blog, I include this post to outline the basic constructions.
One intriguing aspect of Rademacher series is the way that they mix discrete distributions and their combinatorial aspects with continuous distributions. On the one hand, by the central limit theorem, Rademacher series can often be approximated well by a Gaussian distribution but, on the other hand, they depend on the discrete set of signs of the individual variables in the sum.
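In order to guarantee that the series are convergent, we will restrict the weights $a=(a_1,a_2,\ldots)$ to lie in the space $\ell^2$ of square summable (real) sequences. That is,
$$\lVert a\rVert_2^2=\sum_{n=1}^\infty a_n^2$$
is finite. As is standard, $\ell^2$ is a Hilbert space with norm and inner product,
$$\lVert a\rVert_2=\Big(\sum_{n=1}^\infty a_n^2\Big)^{1/2},\qquad\langle a,b\rangle=\sum_{n=1}^\infty a_nb_n.$$
The series will be denoted using 'dot product' notation,
$$a\cdot X=\sum_{n=1}^\infty a_nX_n,$$
which is indeed a convergent sum. Throughout, I am assuming that we have an underlying probability space $(\Omega,\mathcal F,\mathbb P)$ on which the required random variables are defined.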
Lemma 1 Let X be a Rademacher sequence and $a\in\ell^2$. Then, the series $a\cdot X=\sum_na_nX_n$ converges both in $L^2$ and almost surely. Furthermore, $a\cdot X$ is a square integrable random variable with mean zero and variance $\lVert a\rVert_2^2$.
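Proof: Convergence in $L^2$ is straightforward. If we write $S_N=\sum_{n=1}^Na_nX_n$ for the partial sums then these have zero mean and, using independence,
$$\mathbb E\big[(S_N-S_M)^2\big]=\sum_{n=M+1}^Na_n^2$$
for $N\ge M$. This vanishes as M, N go to infinity, so the sequence is $L^2$-Cauchy and hence converges to a limit $S_\infty$ in $L^2$. Similarly,
$$\mathbb E\big[S_N^2\big]=\sum_{n=1}^Na_n^2\to\lVert a\rVert_2^2$$
as N goes to infinity so, by $L^2$ convergence, $S_\infty$ has zero mean and variance $\lVert a\rVert_2^2$.
It remains to show that $S_N\to S_\infty$ almost surely. As we know that it converges in $L^2$ and, hence, in probability, it is sufficient to show that $S_N$ is almost surely Cauchy convergent.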
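We already know that $S_N$ converges almost surely, from martingale convergence. However, I give an alternative proof using the simpler Doob $L^2$-inequality. For each positive integer N, this says that
$$\mathbb E\Big[\sup_{n\ge N}(S_n-S_N)^2\Big]\le4\sum_{n>N}a_n^2\to0$$
as N goes to infinity. Hence, for positive integer L, using $(S_m-S_n)^2\le2(S_m-S_L)^2+2(S_n-S_L)^2$,
$$\mathbb E\Big[\sup_{m,n\ge L}(S_m-S_n)^2\Big]\le4\,\mathbb E\Big[\sup_{n\ge L}(S_n-S_L)^2\Big]\to0$$
as L goes to infinity. As $\sup_{m,n\ge L}\lvert S_m-S_n\rvert$ is a decreasing sequence with $L^2$ norm going to zero, it converges almost surely to zero, so $S_N$ is almost surely Cauchy convergent. ⬜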
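The mean and variance claims of lemma 1 can be checked exactly for a finite Rademacher sum by enumerating all sign patterns. The following is only a sketch: the weights $a_n=1/n$, the cutoff of 10 terms and the helper name `rademacher_sum_moments` are my own choices for illustration.

```python
from fractions import Fraction
from itertools import product

def rademacher_sum_moments(weights):
    """Exact mean and variance of sum_n a_n X_n over all 2^N sign patterns."""
    n_patterns = 2 ** len(weights)
    total = Fraction(0)
    total_sq = Fraction(0)
    for signs in product((1, -1), repeat=len(weights)):
        s = sum(a * x for a, x in zip(weights, signs))
        total += s
        total_sq += s * s
    mean = total / n_patterns
    variance = total_sq / n_patterns - mean ** 2
    return mean, variance

# Illustrative weights a_n = 1/n for n = 1..10 (an assumption, not from the text)
weights = [Fraction(1, n) for n in range(1, 11)]
mean, variance = rademacher_sum_moments(weights)
norm_sq = sum(a * a for a in weights)  # squared l^2 norm of the weights
print(mean, variance == norm_sq)
```

Exact rational arithmetic is used so that the comparison of the variance with the squared norm is an equality rather than a floating point approximation.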
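As $a\cdot X$ has zero mean and variance $\lVert a\rVert_2^2$, we immediately obtain that $a\mapsto a\cdot X$ is an isometry.
Corollary 2 For a Rademacher sequence X, the map
$$\ell^2\to L^2(\Omega,\mathcal F,\mathbb P),\qquad a\mapsto a\cdot X$$
is an isometry.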
If $a$ has only finitely many nonzero terms, then $a\cdot X$ is clearly a finite Rademacher sum, so has a discrete distribution supported by a finite subset of $\mathbb R$. On the other hand, if $a$ has infinitely many nonzero terms then it is also intuitively clear that $a\cdot X$ must have zero probability of equalling any given real value, and this is not hard to prove.
Lemma 3 For a Rademacher sequence X and $a\in\ell^2$ with infinitely many nonzero terms, the distribution of $a\cdot X$ is continuous. That is, $\mathbb P(a\cdot X=x)=0$ for all $x\in\mathbb R$.
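Proof: Suppose to the contrary that $p\equiv\mathbb P(a\cdot X=x)>0$ for some real $x$. Note that, for any positive integer n, flipping the sign of $X_n$ does not impact the distribution of $a\cdot X$, but it changes its value by $-2a_nX_n$. Therefore,
$$\mathbb P\big(a\cdot X\in\{x-2a_n,x+2a_n\}\big)\ge\mathbb P\big(a\cdot X-2a_nX_n=x\big)=p.$$
As $a_n$ tends to zero, but has infinitely many nonzero terms, we can find a subsequence $a_{n_k}$ with distinct nonzero absolute values, so that the sets $\{x-2a_{n_k},x+2a_{n_k}\}$ are pairwise disjoint. Hence,
$$1\ge\sum_{k=1}^\infty\mathbb P\big(a\cdot X\in\{x-2a_{n_k},x+2a_{n_k}\}\big)\ge\sum_{k=1}^\infty p=\infty,$$
a contradiction. ⬜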
A more difficult question is whether a Rademacher series has an absolutely continuous distribution — that is, whether it has a density with respect to the Lebesgue measure. Alternatively, can it have a singular distribution, so is supported on a set of zero Lebesgue measure? In fact, as the following example shows, both behaviours occur.
Example 1 For a Rademacher sequence X and $a\in\ell^2$,
- if $a_n=2^{-n}$ then $a\cdot X$ has the uniform distribution on $[-1,1]$, so is absolutely continuous.
- if $a_n=3^{-n}$ then $a\cdot X$ is uniformly distributed on the Cantor middle thirds set on $[-1/2,1/2]$, so is singular.
For the first example with $a_n=2^{-n}$, the terms $(1+X_n)/2$ form an IID sequence, each taking the values $\{0,1\}$ with probability 1/2. So,
$$\frac{1+a\cdot X}{2}=\sum_{n=1}^\infty2^{-n}\,\frac{1+X_n}{2}$$
which is a binary expansion in which each digit independently takes each of the values 0 and 1 with probability 1/2, so is uniformly distributed on the unit interval.
The Cantor middle thirds set on the unit interval can be defined as the set of real numbers in $[0,1]$ which have a ternary (base 3) expansion containing only the digits 0 and 2. This is well known to have zero Lebesgue measure, so any distribution supported on this set is singular. For the second example with $a_n=3^{-n}$, the terms $1+X_n$ form an IID sequence taking each of the values 0 and 2 with probability 1/2. So,
$$\frac12+a\cdot X=\sum_{n=1}^\infty3^{-n}(1+X_n)$$
is a ternary expansion where each digit independently has value 0 or 2, both with probability 1/2. This is the uniform distribution on the Cantor set.
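Both examples can be explored at finite depth by enumerating the partial sums exactly. This is a rough sketch under my own choices: the depth N = 6 and the helper names `partial_sum_values` and `ternary_digits` are illustrative. With weights $2^{-n}$ the $2^N$ values are evenly spaced, and with weights $3^{-n}$ the shifted values only ever use the ternary digits 0 and 2.

```python
from fractions import Fraction
from itertools import product

def partial_sum_values(base, n_terms):
    """Sorted values of sum_{n=1}^{N} base^{-n} x_n over all sign choices x_n = +-1."""
    weights = [Fraction(1, base ** n) for n in range(1, n_terms + 1)]
    return sorted(sum(a * x for a, x in zip(weights, signs))
                  for signs in product((1, -1), repeat=n_terms))

def ternary_digits(value, n_digits):
    """First n_digits base-3 digits of a Fraction lying in [0, 1)."""
    digits = []
    for _ in range(n_digits):
        value *= 3
        digit = int(value)
        digits.append(digit)
        value -= digit
    return digits

N = 6  # illustrative truncation depth
binary_vals = partial_sum_values(2, N)
# set of gaps between consecutive values of the base-2 partial sums
gaps = {b - a for a, b in zip(binary_vals, binary_vals[1:])}

ternary_vals = partial_sum_values(3, N)
# shifting by sum_{n<=N} 3^{-n} turns each term 3^{-n} x_n into 3^{-n} (1 + x_n)
shift = sum(Fraction(1, 3 ** n) for n in range(1, N + 1))
digit_set = {d for v in ternary_vals for d in ternary_digits(v + shift, N)}
print(gaps, digit_set)
```

A single gap size in the first case reflects the uniform limit, while the missing digit 1 in the second reflects the Cantor set structure.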
The characteristic function of a Rademacher series is straightforward to compute.
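Lemma 4 The Rademacher series $a\cdot X$ has characteristic function
$$\mathbb E\big[e^{i\lambda a\cdot X}\big]=\prod_{n=1}^\infty\cos(\lambda a_n)\qquad(1)$$
which is valid for all $\lambda\in\mathbb C$.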
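Proof: Letting $S_N=\sum_{n=1}^Na_nX_n$ then, using independence,
$$\mathbb E\big[e^{i\lambda S_N}\big]=\prod_{n=1}^N\mathbb E\big[e^{i\lambda a_nX_n}\big]=\prod_{n=1}^N\cos(\lambda a_n).$$
We only need to show that we can commute the limit with the expectation on the LHS. In case that $\lambda$ is real, this is bounded convergence. More generally, when $\lambda$ is complex, we need to show that the sequence $e^{i\lambda S_N}$ is uniformly integrable. We use the following simple inequality,
$$\mathbb E\big[\lvert e^{i\lambda S_N}\rvert^2\big]=\mathbb E\big[e^{-2\Im(\lambda)S_N}\big]\le e^{2\Im(\lambda)^2\lVert a\rVert_2^2}.$$
I prove this in lemma 5 below. In particular, $e^{i\lambda S_N}$ is $L^2$-bounded, so is uniformly integrable. ⬜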
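For a finite Rademacher sum, the product-of-cosines formula can be compared directly with a brute-force average over all sign patterns, for real and for complex arguments. This is a sketch only; the weights $1/n$ with 8 terms and the two test values of the argument are arbitrary choices of mine.

```python
import cmath
from itertools import product

def char_fn_enumerated(weights, lam):
    """E[exp(i*lam*S_N)] by averaging over all 2^N equally likely sign patterns."""
    total = 0j
    for signs in product((1, -1), repeat=len(weights)):
        s = sum(a * x for a, x in zip(weights, signs))
        total += cmath.exp(1j * lam * s)
    return total / 2 ** len(weights)

def char_fn_product(weights, lam):
    """Finite version of the product formula: prod_n cos(lam * a_n)."""
    result = 1 + 0j
    for a in weights:
        result *= cmath.cos(lam * a)
    return result

# Illustrative weights a_n = 1/n, n = 1..8 (an assumption, not from the text)
weights = [1 / n for n in range(1, 9)]
for lam in (1.3, 0.7 + 0.4j):
    gap = abs(char_fn_enumerated(weights, lam) - char_fn_product(weights, lam))
    print(lam, gap)
```

The two computations should agree up to floating point rounding, since the identity is exact for finite sums.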
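In the proof above, in order to take the limit when $\lambda$ is not real, we made use of the following inequality.
Lemma 5 For $a\in\ell^2$ and real $\lambda$, the Rademacher series $a\cdot X$ satisfies the inequality,
$$\mathbb E\big[e^{\lambda a\cdot X}\big]\le e^{\lambda^2\lVert a\rVert_2^2/2}.\qquad(2)$$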
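Proof: Using $S_N=\sum_{n=1}^Na_nX_n$, independence gives,
$$\mathbb E\big[e^{\lambda S_N}\big]=\prod_{n=1}^N\mathbb E\big[e^{\lambda a_nX_n}\big]=\prod_{n=1}^N\cosh(\lambda a_n).$$
However, $\cosh x\le e^{x^2/2}$ (for example, compare power series terms). So,
$$\mathbb E\big[e^{\lambda S_N}\big]\le\prod_{n=1}^Ne^{\lambda^2a_n^2/2}\le e^{\lambda^2\lVert a\rVert_2^2/2}.$$
Letting N go to infinity and applying Fatou's lemma on the left gives the result. ⬜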
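The bound can be sanity-checked numerically for finite sums, where the exponential moment is exactly a product of hyperbolic cosines. This is an illustrative sketch; the weights $1/n$ with 50 terms, the sample parameter values and the function names are my own assumptions.

```python
import math

def mgf_finite(weights, lam):
    """E[exp(lam * S_N)] for a finite Rademacher sum: prod_n cosh(lam * a_n)."""
    result = 1.0
    for a in weights:
        result *= math.cosh(lam * a)
    return result

def sub_gaussian_bound(weights, lam):
    """Right hand side of the inequality: exp(lam^2 * ||a||_2^2 / 2)."""
    return math.exp(lam ** 2 * sum(a * a for a in weights) / 2)

# Illustrative weights a_n = 1/n, n = 1..50 (an assumption, not from the text)
weights = [1 / n for n in range(1, 51)]
checks = [(lam, mgf_finite(weights, lam), sub_gaussian_bound(weights, lam))
          for lam in (-3.0, -0.5, 0.0, 1.0, 4.0)]
for lam, mgf, bound in checks:
    print(lam, mgf, bound, mgf <= bound)
```

Equality holds at zero, and the gap between the two sides grows as the parameter moves away from zero.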
Just as an aside, computing the characteristic function of the uniform distribution in example 1 and comparing with (1) gives the following interesting infinite product trigonometric relation,
$$\frac{\sin\lambda}{\lambda}=\prod_{n=1}^\infty\cos\big(2^{-n}\lambda\big).$$
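The relation in question equates $\sin(\lambda)/\lambda$ with the infinite product of $\cos(\lambda/2^n)$ over $n\ge1$, and is easy to test numerically. A minimal sketch, where the truncation at 60 factors and the sample points are arbitrary choices of mine:

```python
import math

def cosine_product(lam, n_terms=60):
    """Truncated product of cos(lam / 2^n) for n = 1..n_terms."""
    result = 1.0
    for n in range(1, n_terms + 1):
        result *= math.cos(lam / 2 ** n)
    return result

for lam in (0.5, 1.0, 3.0, 10.0):
    print(lam, cosine_product(lam), math.sin(lam) / lam)
```

The truncated product equals $\sin(\lambda)/(2^N\sin(\lambda/2^N))$ exactly, so 60 factors is far more than double precision can distinguish from the limit.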
As noted above, the distribution of a Rademacher series can be approximated by a Gaussian. Denoting the supremum norm by $\lVert a\rVert_\infty=\sup_n\lvert a_n\rvert$, this approximation works best when $\lVert a\rVert_\infty$ is small in comparison to $\lVert a\rVert_2$. I use the notation $N(\mu,\sigma^2)$ for the Gaussian distribution of mean $\mu$ and variance $\sigma^2$.
Lemma 6 Let $a^1,a^2,\ldots$ be sequences in $\ell^2$ with $\lVert a^n\rVert_\infty\to0$ and $\lVert a^n\rVert_2\to\sigma\ge0$. Then, the Rademacher series $a^n\cdot X$ converge in distribution to $N(0,\sigma^2)$.
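Proof: I will make use of the standard result that a sequence of probability measures converges (weakly) in distribution to a limit if and only if the characteristic functions converge. Using $\varphi_a(\lambda)=\mathbb E[e^{i\lambda a\cdot X}]$ for the characteristic function, we have
$$\varphi_a(\lambda)=\prod_{n=1}^\infty\cos(\lambda a_n).$$
In particular, if $\lvert\lambda\rvert\lVert a\rVert_\infty<\pi/2$ then each of the terms on the right is positive, and we can take logarithms,
$$\log\varphi_a(\lambda)+\frac12\lambda^2\lVert a\rVert_2^2=\sum_{n=1}^\infty\Big(\log\cos(\lambda a_n)+\frac12\lambda^2a_n^2\Big).$$
By Taylor expansion, the terms inside the summation are of order $\lambda^4a_n^4$. That is, there are positive reals $c$ and $\delta<\pi/2$ such that $\lvert\log\cos x+x^2/2\rvert$ is bounded by $cx^4$ whenever $\lvert x\rvert\le\delta$. So, for $\lvert\lambda\rvert\lVert a\rVert_\infty\le\delta$,
$$\Big\lvert\log\varphi_a(\lambda)+\frac12\lambda^2\lVert a\rVert_2^2\Big\rvert\le c\lambda^4\sum_{n=1}^\infty a_n^4\le c\lambda^4\lVert a\rVert_\infty^2\lVert a\rVert_2^2.$$
Hence, if we let $\lVert a\rVert_\infty$ go to zero and $\lVert a\rVert_2$ go to $\sigma$, then $\varphi_a(\lambda)$ tends to the normal characteristic function $e^{-\lambda^2\sigma^2/2}$. ⬜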
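As a concrete instance of lemma 6, take N equal weights $N^{-1/2}$, so that the $\ell^2$ norm is 1 and the supremum norm $N^{-1/2}$ tends to zero. The characteristic function is then $\cos(\lambda/\sqrt N)^N$, which should approach $e^{-\lambda^2/2}$. The following sketch uses my own choice of the sample point $\lambda=1.5$ and of the values of N.

```python
import math

def char_fn_equal_weights(lam, n_terms):
    """Characteristic function of N^{-1/2} (X_1 + ... + X_N): cos(lam / sqrt(N))^N."""
    return math.cos(lam / math.sqrt(n_terms)) ** n_terms

lam = 1.5  # illustrative sample point
gaussian = math.exp(-lam ** 2 / 2)  # standard normal characteristic function
errors = [abs(char_fn_equal_weights(lam, n) - gaussian) for n in (10, 100, 10_000)]
print(errors)
```

The error shrinks roughly in proportion to 1/N, in line with the fourth order Taylor bound used in the proof.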
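In the opposite direction, when $\lVert a\rVert_\infty/\lVert a\rVert_2$ is large, then the Rademacher series is affected by a small number of relatively large terms, so behaves more like a discrete distribution than a Gaussian. As $\lVert a\rVert_\infty\le\lVert a\rVert_2$, the extreme case is $\lVert a\rVert_\infty=\lVert a\rVert_2$, in which case $a$ has only a single nonzero term and $a\cdot X$ takes the values $\pm\lVert a\rVert_2$, each with probability $1/2$.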
One direct consequence of lemma 6 is the following special case of the central limit theorem. I use $\xrightarrow{d}$ to denote convergence in distribution.
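Corollary 7 If X is a Rademacher sequence and $a_1,a_2,\ldots$ is a bounded sequence of real numbers satisfying $\sum_na_n^2=\infty$ then, using $s_N^2=\sum_{n=1}^Na_n^2$,
$$\frac1{s_N}\sum_{n=1}^Na_nX_n\xrightarrow{d}N(0,1)$$
as N goes to infinity.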
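Proof: As the sequence is bounded, suppose that $\lvert a_n\rvert\le K$, and write $s_N^2=\sum_{n=1}^Na_n^2$. For each N large enough that $s_N>0$, define $b^N\in\ell^2$ by $b^N_n=a_n/s_N$ for $n\le N$ and $b^N_n=0$ otherwise. Then, $b^N\cdot X=s_N^{-1}\sum_{n=1}^Na_nX_n$. Furthermore, $\lVert b^N\rVert_2=1$ and $\lVert b^N\rVert_\infty\le K/s_N$, which vanishes as N goes to infinity. So, lemma 6 gives $b^N\cdot X\xrightarrow{d}N(0,1)$. ⬜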
Above, in lemma 1, we showed that square summability of the sequence is sufficient to guarantee convergence of the Rademacher series. This is, in fact, also a necessary condition. If the sequence is not square summable, then the Rademacher series will diverge in probability and, hence, also diverge almost surely.
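Lemma 8 If X is a Rademacher sequence and $a_1,a_2,\ldots$ is a sequence of real numbers satisfying $\sum_na_n^2=\infty$ then, for each $L>0$,
$$\mathbb P\Big(\Big\lvert\sum_{n=1}^Na_nX_n\Big\rvert\le L\Big)\to0$$
as N goes to infinity.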
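Proof: Write $S_N=\sum_{n=1}^Na_nX_n$, let $s_N^2=\sum_{n=1}^Na_n^2$, and fix $L>0$. In the case that $\lvert a_n\rvert$ is bounded by some value K then we will apply corollary 7, so let Z be a standard normal random variable (on some probability space). Then, for each $\epsilon>0$, using the condition that $s_N\to\infty$, we have $L/s_N<\epsilon$ for large N so,
$$\liminf_{N\to\infty}\mathbb P\big(\lvert S_N\rvert>L\big)\ge\lim_{N\to\infty}\mathbb P\big(\lvert S_N\rvert/s_N>\epsilon\big)=\mathbb P\big(\lvert Z\rvert>\epsilon\big).$$
Letting $\epsilon$ go to zero, the right hand side goes to one, giving the result.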
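It remains to prove the result when $a_n$ is an unbounded sequence, in which case we can apply a similar argument to that used in lemma 3. Fixing $L>0$, there exists a subsequence $a_{n_k}$ satisfying $\lvert a_{n_{k+1}}\rvert>\lvert a_{n_k}\rvert+L$ for all $k$. For each $k$ with $n_k\le N$, flipping the sign of $X_{n_k}$ does not impact the distribution of $S_N=\sum_{n=1}^Na_nX_n$, but it shifts its value by $-2a_{n_k}X_{n_k}$ so, if $\lvert S_N\rvert\le L$ then it will be shifted to lie in the set $A_k=\{x\in\mathbb R\colon\lvert\lvert x\rvert-2\lvert a_{n_k}\rvert\rvert\le L\}$. Hence,
$$\mathbb P(S_N\in A_k)\ge\mathbb P\big(\lvert S_N\rvert\le L\big).$$
The sets $A_k$ are disjoint so,
$$1\ge\sum_{k\colon n_k\le N}\mathbb P(S_N\in A_k)\ge\mathbb P\big(\lvert S_N\rvert\le L\big)\sum_{k\colon n_k\le N}1.$$
As the sum on the right tends to infinity when N goes to infinity, we have $\mathbb P(\lvert S_N\rvert\le L)\to0$. ⬜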
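The simplest case of this divergence is the simple random walk, with all weights equal to 1, where the probability of the partial sum lying in a fixed bounded window can be computed exactly from binomial coefficients. This is just an illustration under my own choices: the window half-width 2 and the values of N are arbitrary, and `prob_within` is a hypothetical helper name.

```python
from math import comb

def prob_within(n_steps, level):
    """Exact P(|S_N| <= level) for S_N = X_1 + ... + X_N (all weights equal to 1)."""
    # S_N = 2k - N when exactly k of the N signs are +1
    favourable = sum(comb(n_steps, k) for k in range(n_steps + 1)
                     if abs(2 * k - n_steps) <= level)
    return favourable / 2 ** n_steps

probs = [prob_within(n, 2) for n in (10, 100, 1000)]
print(probs)
```

The probabilities decrease towards zero as N grows, at a rate of order $N^{-1/2}$ as suggested by the central limit theorem.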
Usually, we are concerned with the distribution of a Rademacher series $a\cdot X$, rather than its particular representation as a random variable on a specific probability space. However, different values for $a$ can lead to the same distribution. For example, flipping the signs of some elements of $a$ or permuting its elements has no effect on the distribution of $a\cdot X$. This is because applying such operations to $a$ has the same effect on $a\cdot X$ as applying the operations to the terms of X which, again, does not affect its distribution. Alternatively, we can see that the characteristic function (1) is unaffected by such operations.
I will use $\ell^2_+$ to denote the sequences $a\in\ell^2$ which are nonnegative and decreasing,
$$a_1\ge a_2\ge a_3\ge\cdots\ge0.$$
Any $a\in\ell^2$ can be put in standard form by flipping the sign of any negative terms and then arranging in decreasing order. Using $a^*$ for this standard form, then $\lVert a^*\rVert_2=\lVert a\rVert_2$ and the terms $a^*_n$ are the same as $\lvert a_n\rvert$ when arranged in decreasing order. Also,
$$a^*\cdot X\overset{d}{=}a\cdot X$$
where $\overset{d}{=}$ denotes equality in distribution. So, when looking at the distributions of Rademacher series, we can always assume that the weights $a$ are in $\ell^2_+$. Once we do this, then there is no remaining degeneracy.
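For a finite list of weights, putting it in standard form is a one-liner, and the invariance of the distribution can be confirmed by exact enumeration. A small sketch, with an arbitrary example sequence and hypothetical helper names `standard_form` and `distribution`:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def standard_form(weights):
    """Absolute values of the weights, arranged in decreasing order."""
    return sorted((abs(a) for a in weights), reverse=True)

def distribution(weights):
    """Exact distribution of sum_n a_n X_n as a Counter: value -> number of sign patterns."""
    return Counter(sum(a * x for a, x in zip(weights, signs))
                   for signs in product((1, -1), repeat=len(weights)))

# Arbitrary illustrative weights with mixed signs and no particular order
a = [Fraction(-1, 2), Fraction(3, 4), Fraction(1, 4), Fraction(-1, 3)]
a_star = standard_form(a)
print(a_star, distribution(a) == distribution(a_star))
```

Each of the $2^N$ sign patterns is equally likely, so equal counters mean equal distributions.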
Lemma 9 Let X be a Rademacher sequence. Then, any $a\in\ell^2_+$ is uniquely determined by the distribution of $a\cdot X$.
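Proof: From expression (1) for the characteristic function $\varphi(\lambda)=\prod_n\cos(\lambda a_n)$, we can express $a$ in terms of $\varphi$. Factoring
$$\varphi(\lambda)=\varphi_n(\lambda)\prod_{k<n}\cos(\lambda a_k),\qquad\varphi_n(\lambda)=\prod_{k\ge n}\cos(\lambda a_k),$$
then $a_n$ is equal to $\pi/(2\lambda_n)$ with $\lambda_n$ equal to the smallest positive zero of $\varphi_n$, or is equal to zero if $\varphi_n$ is everywhere nonzero. This is because $a_n$ is the largest of the weights $a_k$ with $k\ge n$, so the first zero of $\varphi_n$ is the first zero of $\cos(\lambda a_n)$. As $\varphi_n$ is determined by $\varphi$ and $a_1,\ldots,a_{n-1}$, the terms of $a$ are recovered inductively. ⬜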
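Another benefit of restricting to $\ell^2_+$ is that we can bound the individual terms of the sequence.
Lemma 10 If $a\in\ell^2_+$ then
$$a_n\le n^{-1/2}\lVert a\rVert_2.$$
Proof: As $a_m\ge a_n$ for $m\le n$, we have
$$na_n^2\le\sum_{m=1}^na_m^2\le\lVert a\rVert_2^2.$$
⬜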
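The bound of lemma 10, that the n-th term of a nonnegative decreasing sequence is at most its $\ell^2$ norm divided by $\sqrt n$, can be checked numerically on a finite example. A minimal sketch, in which the weights $1/n$ truncated at 1000 terms are my own illustrative choice:

```python
import math

def term_bounds(weights):
    """Pairs (a_n, ||a||_2 / sqrt(n)) for a nonnegative decreasing finite sequence."""
    norm = math.sqrt(sum(a * a for a in weights))
    return [(a, norm / math.sqrt(n)) for n, a in enumerate(weights, start=1)]

# Illustrative weights a_n = 1/n, n = 1..1000 (an assumption, not from the text)
weights = [1 / n for n in range(1, 1001)]
bounds = term_bounds(weights)
print(all(a <= bound for a, bound in bounds))
```

The decreasing order matters here: the bound can fail for the same terms in a different order, since a large term could then appear at a large index.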
Now consider the map from a sequence $a\in\ell^2$ to the distribution of its Rademacher series. Using $\mathcal P(\mathbb R)$ to denote the set of Borel probability measures on $\mathbb R$,
$$\ell^2\to\mathcal P(\mathbb R),\qquad a\mapsto\mu_a,\qquad\mu_a(S)=\mathbb P(a\cdot X\in S).$$
We can use convergence in distribution (weak topology) on $\mathcal P(\mathbb R)$, under which a sequence of measures $\mu_n$ tends to a limit $\mu$ iff $\int f\,d\mu_n\to\int f\,d\mu$ for all continuous bounded functions $f\colon\mathbb R\to\mathbb R$. Restricting the domain to $\ell^2_+$, lemma 9 says that $a\mapsto\mu_a$ is one-to-one. By corollary 2, it is also continuous with respect to the norm topology on $\ell^2$ and the weak topology on $\mathcal P(\mathbb R)$. However, the norm topology is too strong for many purposes. For example, in the context of lemma 6, the sequence $a^n$ is not norm convergent when $\sigma>0$, but does converge weakly to zero. I will finish this post by considering how we can extend the idea of Rademacher series to naturally incorporate limits such as that given by lemma 6 into the domain of the map.
The weak topology on $\ell^2$, by definition, is the weakest topology making the maps $a\mapsto\langle a,b\rangle$ continuous for each fixed $b\in\ell^2$. In particular, a sequence $a^n\in\ell^2$ converges weakly to the limit $a$ if and only if $\langle a^n,b\rangle\to\langle a,b\rangle$ for each $b\in\ell^2$. On any bounded set, it can be seen that weak convergence is equivalent to the pointwise convergence of the terms in the sequences (or, the product topology).
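Looking again at lemma 6, we see that the sequence $a^n$ is weakly convergent, but that $a^n\cdot X$ converges in distribution to a Gaussian. This suggests completing the space of Rademacher series distributions by adding in a Gaussian term. I do this as follows. First, let $\bar\ell^2_+$ consist of the pairs $(a,\sigma)$ for $a\in\ell^2_+$ and $\sigma\in\mathbb R$ with $\sigma\ge\lVert a\rVert_2$. Weak convergence in this space is the same as weak convergence separately for $a$ and $\sigma$, which is the same as pointwise convergence of $\sigma$ and the terms of $a$. Next, let X be a Rademacher sequence and Y be an independent normal random variable with zero mean and unit variance. Then, define the map
$$\bar\ell^2_+\to\mathcal P(\mathbb R),\qquad(a,\sigma)\mapsto\mu_{(a,\sigma)},\qquad\mu_{(a,\sigma)}(S)=\mathbb P\Big(a\cdot X+\sqrt{\sigma^2-\lVert a\rVert_2^2}\,Y\in S\Big).\qquad(3)$$
In the case that $\sigma=\lVert a\rVert_2$, this is just the distribution of the Rademacher series $a\cdot X$. However, we are allowing an additional normal term by taking $\sigma>\lVert a\rVert_2$, and it is straightforward to see that $\mu_{(a,\sigma)}$ has variance $\sigma^2$.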
Lemma 11 The map $(a,\sigma)\mapsto\mu_{(a,\sigma)}$ defined by (3) is weakly continuous and one-to-one, and is a weak homeomorphism onto its image.
Proof: Using (1), we can compute the characteristic function,
$$\varphi_{(a,\sigma)}(\lambda)=e^{-\frac12(\sigma^2-\lVert a\rVert_2^2)\lambda^2}\prod_{m=1}^\infty\cos(a_m\lambda).$$
To see that this is one-to-one, we need to show that $(a,\sigma)$ can be recovered from knowledge of the characteristic function. This is straightforward: $\sigma^2$ is equal to the variance, and $a$ can be recovered in exactly the same way as in the proof of lemma 9.
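We next show continuity. So, suppose that $(a^n,\sigma_n)$ converges weakly to $(a,\sigma)$. We need to show that $\varphi_{(a^n,\sigma_n)}(\lambda)\to\varphi_{(a,\sigma)}(\lambda)$ for each fixed $\lambda\in\mathbb R$. First, as it converges, $\sigma_n$ is bounded above by some $K>0$. Then, by lemma 10,
$$a^n_m\le m^{-1/2}\lVert a^n\rVert_2\le m^{-1/2}K$$
so, fixing positive $\delta<\pi/2$, for sufficiently large $M$ we have $\lvert\lambda\rvert a^n_m\le\delta$ for all $n$ and all $m\ge M$. As $\cos x$ is nonnegative over $[-\pi/2,\pi/2]$, we can express the characteristic function as
$$\varphi_{(a^n,\sigma_n)}(\lambda)=e^{-\frac12\sigma_n^2\lambda^2}\prod_{m<M}\Big(e^{\frac12(a^n_m)^2\lambda^2}\cos(a^n_m\lambda)\Big)\exp\Big(\sum_{m\ge M}\Big(\log\cos(a^n_m\lambda)+\frac12(a^n_m)^2\lambda^2\Big)\Big).$$
The terms on the right hand side manifestly converge as $n\to\infty$, except for the term in the exponential where we need to commute the limit with the infinite sum. As $\log\cos x+x^2/2$ is of order $x^4$, it is bounded by $cx^4$ over $\lvert x\rvert\le\delta$ for some positive $c$. So, the terms inside the sum on the right hand side are bounded by $c\lambda^4K^4m^{-2}$, which has finite sum over $m$. Hence, by dominated convergence, we can commute the limit with the infinite sum and we obtain $\varphi_{(a^n,\sigma_n)}(\lambda)\to\varphi_{(a,\sigma)}(\lambda)$ as required.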
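It just remains to show that the inverse of the map is weakly continuous. For each fixed $K>0$, the space of $(a,\sigma)\in\bar\ell^2_+$ with $\sigma\le K$ is compact (by Tychonoff's theorem) so, restricting to such bounded sets, the result is immediate: every continuous one-to-one map from a compact space to a Hausdorff space has continuous inverse. Hence, we just need to show that if $(a^n,\sigma_n)\in\bar\ell^2_+$ is a sequence such that $\mu_{(a^n,\sigma_n)}$ converges weakly, then $\sigma_n$ is a bounded sequence. I will use the fact that if $\lambda_n$ is a sequence tending to zero then $\varphi_{(a^n,\sigma_n)}(\lambda_n)\to1$, which holds because the characteristic functions of a weakly convergent sequence converge uniformly on bounded intervals.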
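First, we can show that $\sigma_n^2-\lVert a^n\rVert_2^2$ is a bounded sequence. If not then, by passing to a subsequence, we can suppose that $\sigma_n^2-\lVert a^n\rVert_2^2\to\infty$. Setting $\lambda_n=(\sigma_n^2-\lVert a^n\rVert_2^2)^{-1/2}$ gives $\lvert\varphi_{(a^n,\sigma_n)}(\lambda_n)\rvert\le e^{-1/2}$, a contradiction. Next, we show that $\lVert a^n\rVert_2$ is bounded. If not then, by passing to a subsequence, we can suppose that $\lVert a^n\rVert_2\to\infty$. Taking $\lambda_n=1/\lVert a^n\rVert_2$ then $0\le a^n_m\lambda_n\le1$, so $0\le\cos(a^n_m\lambda_n)\le e^{-\frac12(a^n_m\lambda_n)^2}$ for large n, and we obtain the contradiction
$$\lvert\varphi_{(a^n,\sigma_n)}(\lambda_n)\rvert\le\prod_{m=1}^\infty e^{-\frac12(a^n_m\lambda_n)^2}=e^{-1/2}.$$
As both $\sigma_n^2-\lVert a^n\rVert_2^2$ and $\lVert a^n\rVert_2$ are bounded, so is $\sigma_n$. ⬜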
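In fact, for any $(a,\sigma)\in\bar\ell^2_+$, it is not difficult to construct a sequence $a^n\in\ell^2_+$ converging weakly to $a$ such that $\lVert a^n\rVert_2\to\sigma$. Then, $(a^n,\lVert a^n\rVert_2)$ tends weakly to $(a,\sigma)$ and, by the above result,
$$\mu_{a^n}=\mu_{(a^n,\lVert a^n\rVert_2)}\to\mu_{(a,\sigma)}.$$
Hence, the definition above given for $\mu_{(a,\sigma)}$ is the unique continuous extension of the Rademacher series distribution to all of $\bar\ell^2_+$. Furthermore, we see that the set of measures $\{\mu_{(a,\sigma)}\colon(a,\sigma)\in\bar\ell^2_+\}$ is the weak closure of the distributions of Rademacher series, and is weakly homeomorphic to $\bar\ell^2_+$.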
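One benefit of using the map (3) together with the weak topology is that the unit ball of $\ell^2$ (and, hence, of $\bar\ell^2_+$) is weakly compact. Using $B=\{(a,\sigma)\in\bar\ell^2_+\colon\sigma\le1\}$ to denote the unit ball, we obtain a continuous map
$$B\to\mathcal P(\mathbb R),\qquad(a,\sigma)\mapsto\mu_{(a,\sigma)}.$$
This is equal to the distribution of $a\cdot X$ when $\sigma=\lVert a\rVert_2$, and is a weak homeomorphism onto its image.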