
The famous Riemann zeta function was studied by Riemann, as a function of a complex variable, in order to describe the distribution of the prime numbers. It is defined by the infinite sum
(1)  $\zeta(s)=\sum_{n=1}^\infty n^{-s}$
which is absolutely convergent for all complex s with real part greater than one. One of the first properties of this function is that, as shown by Riemann, it extends to an analytic function on the entire complex plane, other than a simple pole at $s=1$. By the theory of analytic continuation this extension is necessarily unique, so the importance of the result lies in showing that an extension exists. One way of doing this is to find an alternative expression for the zeta function which is well defined everywhere. For example, it can be expressed as an absolutely convergent integral, as performed by Riemann himself in his original 1859 paper on the subject. This leads to an explicit expression for the zeta function, scaled by an analytic prefactor, as the integral of $x^s$ multiplied by a function of x over the range $x > 0$. In fact, this can be done in a way such that the function of x is a probability density function, and hence expresses the Riemann zeta function over the entire complex plane in terms of the generating function $s\mapsto\mathbb{E}[X^s]$ of a positive random variable X. The probability distributions involved here are not the standard ones taught to students of probability theory, so may be new to many people. Although these distributions are intimately related to the Riemann zeta function they also, intriguingly, turn up in seemingly unrelated contexts involving Brownian motion.
In this post, I derive two probability distributions related to the extension of the Riemann zeta function, and describe some of their properties. I also show how they can be constructed as the sum of a sequence of gamma distributed random variables. For motivation, some examples are given of where they show up in apparently unrelated areas of probability theory, although I do not give proofs of these statements here. For more information, see the 2001 paper Probability laws related to the Jacobi theta and Riemann zeta functions, and Brownian excursions by Biane, Pitman, and Yor.
The functional equation discovered by Riemann relates the values of $\zeta(s)$ to those of $\zeta(1-s)$. Note that, as the series (1) is only defined on $\Re(s) > 1$, there are no values of s for which it gives well-defined values of both $\zeta(s)$ and $\zeta(1-s)$. So, extending the zeta function is a prerequisite. The functional equation is often expressed in terms of the xi function, defined as
$\xi(s)=\frac12 s(s-1)\pi^{-s/2}\Gamma(s/2)\zeta(s).$
Here, $\Gamma$ is the gamma function, which is an analytic function defined by
$\Gamma(s)=\int_0^\infty x^{s-1}e^{-x}\,dx$
over $\Re(s) > 0$. A simple application of integration by parts shows that it satisfies the functional equation
$\Gamma(s+1)=s\Gamma(s)$
and, hence, extends to an analytic function on the complex plane, other than for simple poles at the non-positive integers. The xi function extends to an analytic function on the entire complex plane, and the Riemann functional equation is
$\xi(s)=\xi(1-s).$
It can be shown that $\xi$ has no zeros in the half-plane $\Re(s) > 1$, for example by using product expansions of the gamma and zeta functions which are valid on this range. So, the functional equation shows that all of its zeros lie in the critical strip $0\le\Re(s)\le1$. In fact, there are infinitely many zeros in this range which, according to the long unsolved Riemann hypothesis, all lie on the line of symmetry $\Re(s)=1/2$.
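I will now outline a method of extending the zeta and xi functions and of deriving the functional equation, showing how it involves a probability distribution. We start with a simple rearrangement of a particular integral. Fixing a positive constant $a$, we evaluate the following using the substitution $y=ax^2$,
(2)  $\int_0^\infty x^se^{-ax^2}\,\frac{dx}{x}=\frac12a^{-s/2}\int_0^\infty y^{s/2}e^{-y}\,\frac{dy}{y}=\frac12a^{-s/2}\Gamma(s/2).$
Hence, replacing $a$ by $an^2$ and summing over n, the definition of the zeta function over $\Re(s) > 1$ gives,
$\frac12a^{-s/2}\Gamma(s/2)\zeta(s)=\sum_{n=1}^\infty\int_0^\infty x^se^{-an^2x^2}\,\frac{dx}{x}=\frac12\int_0^\infty x^s\sum_{n\ne0}e^{-an^2x^2}\,\frac{dx}{x}.$
Before taking this further, I mention the Jacobi identity, which applies to the summation inside the integral above. This is,
$\sum_ne^{-\pi n^2x^2}=x^{-1}\sum_ne^{-\pi n^2/x^2}$
for all real $x > 0$. For brevity, where I write a summation over n with no limits specified, it should be understood to be the sum over all integers, from $-\infty$ to $\infty$. This is a special case of the following identity at $y=0$,
(3)  $\sum_ne^{-\pi(n+y)^2x^2}=x^{-1}\sum_ne^{-\pi n^2/x^2}e^{2\pi iny}.$
Note that the left hand side is periodic in y with period 1, and its Fourier series expansion can be computed, giving the right hand side. In particular, setting $\theta(x)=\sum_ne^{-\pi n^2x^2}$, the Jacobi identity is $\theta(x)=x^{-1}\theta(x^{-1})$. Hence, we take $a=\pi$ above to obtain,
$\pi^{-s/2}\Gamma(s/2)\zeta(s)=\int_0^\infty x^{s-1}\left(\theta(x)-1\right)dx.$
Now, multiply through by $s(s-1)$ and use the fact that $s(s-1)x^{s-2}$ is the second derivative of $x^s$ with respect to $x$ to obtain,
$2\xi(s)=\int_0^\infty s(s-1)x^{s-2}\,x\left(\theta(x)-1\right)dx=\int_0^\infty\frac{d^2x^s}{dx^2}\,x\left(\theta(x)-1\right)dx=\int_0^\infty x^s\frac{d^2}{dx^2}\big(x\theta(x)\big)\,dx.$
The third equality is just integration by parts, applied twice. The boundary terms vanish for $\Re(s) > 1$ since, by the Jacobi identity, $x(\theta(x)-1)=\theta(x^{-1})-x$ and its derivative are bounded near zero. So we have obtained
(4)  $\xi(s)=\int_0^\infty x^s\Psi(x)\,\frac{dx}{x}$
where $\Psi$ is defined by
(5)  $\Psi(x)=\frac x2\frac{d^2}{dx^2}\big(x\theta(x)\big)=\sum_{n=1}^\infty\left(4\pi^2n^4x^4-6\pi n^2x^2\right)e^{-\pi n^2x^2}.$
We note that this vanishes quickly as x goes to infinity.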
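As a quick numerical sanity check of the two ingredients used so far, the sketch below compares both sides of the integral identity (2) and of the Jacobi identity. It assumes the forms $\int_0^\infty x^{s-1}e^{-ax^2}dx=\frac12a^{-s/2}\Gamma(s/2)$ and $\theta(x)=x^{-1}\theta(1/x)$ with $\theta(x)=\sum_ne^{-\pi n^2x^2}$. The function names, the truncation level and the midpoint rule are illustrative choices, not part of the argument.

```python
import math

def theta(x, terms=60):
    """Truncated theta(x) = sum over all integers n of exp(-pi n^2 x^2)."""
    tail = sum(math.exp(-math.pi * n * n * x * x) for n in range(1, terms + 1))
    return 1.0 + 2.0 * tail

def jacobi_gap(x):
    """theta(x) - theta(1/x)/x, which the Jacobi identity says is zero."""
    return theta(x) - theta(1.0 / x) / x

def gaussian_moment(s, a, upper=12.0, steps=60000):
    """Midpoint-rule estimate of the integral of x^(s-1) exp(-a x^2) over (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (s - 1.0) * math.exp(-a * x * x)
    return total * h

def gaussian_moment_exact(s, a):
    """Right hand side of identity (2): Gamma(s/2) / (2 a^(s/2))."""
    return 0.5 * a ** (-s / 2.0) * math.gamma(s / 2.0)

if __name__ == "__main__":
    for x in (0.3, 0.7, 1.0, 2.5):
        print(x, theta(x), jacobi_gap(x))
    print(gaussian_moment(3.0, math.pi), gaussian_moment_exact(3.0, math.pi))
```

The truncation at 60 terms is far more than needed for the values of x used here, since the terms decay like a Gaussian in n.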
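Lemma 1 The following asymptotic limit holds as x tends to infinity,
(6)  $\Psi(x)\sim4\pi^2x^4e^{-\pi x^2}.$
Proof: Definition (5) gives
$\frac{\Psi(x)}{4\pi^2x^4e^{-\pi x^2}}=\sum_{n=1}^\infty\left(n^4-\frac{3n^2}{2\pi x^2}\right)e^{-\pi(n^2-1)x^2}$
and, by dominated convergence, we can take the limit inside the summation. The $n=1$ term tends to 1 and the other terms tend to zero, giving the result. ⬜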
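In particular, $\Psi(x)$ vanishes faster than any power of x as x goes to infinity. Differentiating the Jacobi identity $x\theta(x)=\theta(x^{-1})$ twice gives the identity,
(7)  $\Psi(x)=x^{-1}\Psi(x^{-1}).$
So, $\Psi(x)$ also vanishes faster than any power of x as x goes to zero. Hence, the integral (4) is well-defined and gives an analytic function of s over the complex plane. This proves the analytic extension of $\xi(s)$ and, hence, of $\zeta(s)$. Using the substitution $y=1/x$ in (4) and applying identity (7) gives the functional equation,
$\xi(1-s)=\int_0^\infty x^{1-s}\Psi(x)\,\frac{dx}{x}=\int_0^\infty y^{s-1}\Psi(y^{-1})\,\frac{dy}{y}=\int_0^\infty y^s\Psi(y)\,\frac{dy}{y}=\xi(s).$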
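The symmetry of $\Psi$ and its decay rate are easy to check numerically. The following is a minimal sketch, assuming that $\Psi$ is given by the series $\sum_{n\ge1}(4\pi^2n^4x^4-6\pi n^2x^2)e^{-\pi n^2x^2}$ and that the symmetry reads $\Psi(x)=x^{-1}\Psi(1/x)$. The helper names are hypothetical.

```python
import math

def psi(x, terms=80):
    """Truncated Psi(x) = sum_{n>=1} (4 pi^2 n^4 x^4 - 6 pi n^2 x^2) exp(-pi n^2 x^2)."""
    total = 0.0
    for n in range(1, terms + 1):
        u = math.pi * n * n * x * x
        total += (4.0 * u * u - 6.0 * u) * math.exp(-u)
    return total

def symmetry_gap(x):
    """Psi(x) - Psi(1/x)/x, which should vanish by the Jacobi-type identity."""
    return psi(x) - psi(1.0 / x) / x

def asymptotic_ratio(x):
    """Psi(x) divided by its claimed leading behaviour 4 pi^2 x^4 exp(-pi x^2)."""
    return psi(x) / (4.0 * math.pi ** 2 * x ** 4 * math.exp(-math.pi * x * x))

if __name__ == "__main__":
    for x in (0.4, 0.8, 1.5, 3.0):
        print(x, psi(x), symmetry_gap(x))
    print(asymptotic_ratio(3.0))
```

The direct series is only used here for moderate x. For very small x it would need many more terms, which is one practical reason to evaluate $\Psi$ through the symmetry instead.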
Looking at the function $\Psi$ appearing in (4), we can show that it is positive and that $2x^{-1}\Psi(x)$ has integral equal to one, so is a probability density.
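Theorem 2 The function $2x^{-1}\Psi(x)$ defines a probability density over the positive reals, and any random variable X with this density satisfies,
(8)  $\mathbb{E}[X^s]=2\xi(s)$ for all $s\in\mathbb{C}$.
Proof: For all $x\ge1$ and nonzero integer n then,
$4\pi^2n^4x^4-6\pi n^2x^2=2\pi n^2x^2\left(2\pi n^2x^2-3\right) > 0$
and, hence, (5) gives $\Psi(x) > 0$. By the Jacobi identity (7), $\Psi(x)=x^{-1}\Psi(x^{-1}) > 0$ for $x < 1$, showing that $\Psi$ is positive everywhere. To show that it is a probability density, it must have integral equal to 1. Applying (4) and the functional equation,
$\int_0^\infty2x^{-1}\Psi(x)\,dx=2\xi(0)=2\xi(1)=\lim_{s\to1}s(s-1)\pi^{-s/2}\Gamma(s/2)\zeta(s)=\pi^{-1/2}\Gamma(1/2)\lim_{s\to1}(s-1)\zeta(s)=1.$
The identity $\Gamma(1/2)=\sqrt\pi$ is standard, as is the fact that $(s-1)\zeta(s)\to1$ as $s\to1$. This shows that $2x^{-1}\Psi(x)$ has unit integral as required. Finally, (8) is just a different way of expressing (4). ⬜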
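The normalisation and the first few moments can be confirmed by direct numerical integration. This sketch assumes the density is $2x^{-1}\Psi(x)$ with $\Psi$ as in (5), and that the moments are $\mathbb{E}[X^s]=s(s-1)\pi^{-s/2}\Gamma(s/2)\zeta(s)$, so that $\mathbb{E}[X]=1$ and $\mathbb{E}[X^2]=\pi/3$. The function names and the integration grid are illustrative.

```python
import math

def psi_series(x, terms=12):
    """Series (5) for Psi; a dozen terms suffice once x >= 1."""
    total = 0.0
    for n in range(1, terms + 1):
        u = math.pi * n * n * x * x
        total += (4.0 * u * u - 6.0 * u) * math.exp(-u)
    return total

def psi(x):
    """Psi(x), using the series for x >= 1 and the symmetry Psi(x) = Psi(1/x)/x below 1."""
    return psi_series(x) if x >= 1.0 else psi_series(1.0 / x) / x

def density(x):
    """Candidate probability density 2 Psi(x) / x."""
    return 2.0 * psi(x) / x

def moment(s, upper=8.0, steps=16000):
    """Midpoint-rule estimate of E[X^s] for real s."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** s * density(x)
    return total * h

if __name__ == "__main__":
    print(moment(0.0), moment(1.0), moment(2.0), math.pi / 3.0)
```

Evaluating $\Psi$ through the symmetry for $x < 1$ avoids the heavy cancellation that the raw series suffers near zero.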
Equation (8) describes the moments of a random variable X with probability density $2x^{-1}\Psi(x)$. In particular, using the special values
and
,
As an example of an apparently unrelated situation where this distribution occurs, consider a standard Brownian bridge $\{B_t\}_{t\in[0,1]}$. This is standard Brownian motion conditioned on hitting zero at time $t=1$. Look at the difference between its maximum and minimum values, $\sup_tB_t-\inf_tB_t$.
Up to a constant scaling factor, this has probability density $2x^{-1}\Psi(x)$. To be precise, if we scale it by $\sqrt{2/\pi}$, then its probability density is $2x^{-1}\Psi(x)$. As a little bit of trivia, suppose that now B is a standard Brownian motion. Then, $B_t-tB_1$ is a Brownian bridge over $0\le t\le1$. Applying the result just stated together with (8) for the moments gives
At the time of writing, these are the equations on the notepad in the banner image of this site.
Another example is given by normalised Brownian excursions, which can be constructed as a Brownian bridge conditioned on being nonnegative. The maximum value of the excursion, again after scaling by $\sqrt{2/\pi}$, has probability density $2x^{-1}\Psi(x)$.
Riemann’s functional equation can alternatively be expressed as a symmetry of the probability distribution introduced above.
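Theorem 3 (The Functional Equation) Let X be a positive random variable with probability density $2x^{-1}\Psi(x)$. Then,
(9)  $\mathbb{E}[f(X)]=\mathbb{E}[Xf(1/X)]$ for all measurable functions $f\colon(0,\infty)\to[0,\infty)$.
Proof: We can define a new measure $\mathbb{Q}$ on the underlying probability space by $d\mathbb{Q}=X\,d\mathbb{P}$, so that
$\mathbb{E}_{\mathbb{Q}}[f(1/X)]=\mathbb{E}[Xf(1/X)]$
for all measurable functions $f\colon(0,\infty)\to[0,\infty)$. Under this measure, $1/X$ has generating function,
$\mathbb{E}_{\mathbb{Q}}[X^{-s}]=\mathbb{E}[X^{1-s}]=2\xi(1-s)=2\xi(s)=\mathbb{E}[X^s]$
for real s. Taking $s=0$ gives $\mathbb{Q}(\Omega)=1$, so that it is a probability measure. Then, as the distribution of a positive random variable is uniquely determined by its generating function, $1/X$ has the same distribution under $\mathbb{Q}$ as X has under $\mathbb{P}$, giving (9). ⬜
Actually, identity (9) is equivalent to the functional equation. By linearity, it extends to any complex-valued $f$ such that $f(X)$ is integrable. Taking $f(x)=x^s$ gives,
$2\xi(s)=\mathbb{E}[X^s]=\mathbb{E}[X^{1-s}]=2\xi(1-s).$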
Interestingly, we have seen the symmetry (9) previously in these notes for an entirely different distribution. Lemma 10 of the post on the normal distribution states that it holds for any lognormal random variable X of mean 1.
Another Distribution
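Instead of starting with the usual series (1) defining the Riemann zeta function, we can instead consider a similar series which has alternating signs, giving the Dirichlet eta function
$\eta(s)=\sum_{n=1}^\infty(-1)^{n-1}n^{-s}.$
This has the benefit of converging for all $\Re(s) > 0$ which, importantly, includes the critical strip. The new function can be expressed in terms of the original zeta function,
$\eta(s)=\left(1-2^{1-s}\right)\zeta(s).$
In a similar fashion to the derivation above, we apply identity (2) with $a=\pi n^2$,
$\pi^{-s/2}\Gamma(s/2)\eta(s)=2\sum_{n=1}^\infty(-1)^{n-1}\int_0^\infty x^se^{-\pi n^2x^2}\,\frac{dx}{x}=\int_0^\infty x^s\left(1-\phi(x)\right)\frac{dx}{x}$
where we set
(10)  $\phi(x)=\sum_n(-1)^ne^{-\pi n^2x^2}.$
Applying integration by parts, we have obtained
(11)  $s\pi^{-s/2}\Gamma(s/2)\eta(s)=\int_0^\infty x^s\Phi(x)\,\frac{dx}{x}$
where I define
$\Phi(x)=x\phi^\prime(x)=4\pi x^2\sum_{n=1}^\infty(-1)^{n-1}n^2e^{-\pi n^2x^2}.$
A Jacobi style identity is also satisfied by $\phi$, as can be seen either by taking $y=1/2$ in (3), or by using (10) to express $\phi$ in terms of $\theta$ and applying the Jacobi identity for $\theta$,
(12)  $\phi(x)=2\theta(2x)-\theta(x)=x^{-1}\sum_ne^{-\pi(n+1/2)^2/x^2}.$
Hence, $\Phi$ satisfies the identity,
(13)  $\Phi(x)=x^{-3}\sum_n\left(2\pi(n+1/2)^2-x^2\right)e^{-\pi(n+1/2)^2/x^2}.$
These expressions show that $\Phi(x)$ is asymptotic to $4\pi x^2e^{-\pi x^2}$ as x goes to infinity, and $\pi x^{-3}e^{-\pi/(4x^2)}$ as x goes to zero. In either case, $\Phi(x)$ vanishes faster than any power of x so the integral in (11) is defined for all $s\in\mathbb{C}$, giving an alternative extension of $\zeta(s)$ to the complex plane. Also, using the expressions above for $\phi$ and $\Psi$ in terms of $\theta$ gives the identity,
$\frac{d}{dx}\big(x\Phi(x)\big)=4\Psi(2x)-2\Psi(x).$
This is a differential equation for $\Phi$ in terms of $\Psi$, which can be integrated to give
(14)  $\Phi(x)=\frac2x\int_x^{2x}\Psi(y)\,dy.$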
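The relation between the eta and zeta functions is simple to test numerically for real $s > 1$, where both series converge. The sketch below assumes the relation $\eta(s)=(1-2^{1-s})\zeta(s)$ and uses a plain partial sum for eta and a partial sum plus an Euler-Maclaurin tail estimate for zeta. Both helpers are illustrative, not a general purpose implementation.

```python
import math

def eta(s, terms=200000):
    """Partial sum of the alternating series sum_{n>=1} (-1)^(n-1) n^(-s)."""
    total = 0.0
    for n in range(terms, 0, -1):  # add the small terms first
        term = n ** -s
        total += term if n % 2 else -term
    return total

def zeta(s, terms=200000):
    """Partial sum of (1) plus an Euler-Maclaurin estimate of the tail, for real s > 1."""
    head = sum(n ** -s for n in range(terms, 0, -1))
    tail = terms ** (1.0 - s) / (s - 1.0) - 0.5 * terms ** -s
    return head + tail

def relation_gap(s):
    """eta(s) - (1 - 2^(1-s)) zeta(s), expected to be numerically zero."""
    return eta(s) - (1.0 - 2.0 ** (1.0 - s)) * zeta(s)

if __name__ == "__main__":
    for s in (2.0, 3.0):
        print(s, eta(s), zeta(s), relation_gap(s))
```

For $s=2$ this reproduces the classical values $\zeta(2)=\pi^2/6$ and $\eta(2)=\pi^2/12$ to within the stated tolerances.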
This leads us to another probability density.
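Theorem 4 The function $x^{-1}\Phi(x)$ defines a probability density over the positive reals, and any random variable X with this density satisfies,
(15)  $\mathbb{E}[X^s]=\frac{2^{1-s}-1}{1-s}\,2\xi(s)$ for all $s\in\mathbb{C}$.
Proof: Equation (14) together with positivity of $\Psi$ shows that $\Phi$ is positive. To be a probability density, it must integrate to 1. Applying (11),
$\int_0^\infty x^{-1}\Phi(x)\,dx=\lim_{s\to0}s\pi^{-s/2}\Gamma(s/2)\eta(s)=2\eta(0)=1$
as required. Finally, (15) follows from (11). ⬜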
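The second density can be checked in the same way as the first. The sketch below assumes the density is $x^{-1}\Phi(x)=\phi^\prime(x)$ with $\phi(x)=\sum_n(-1)^ne^{-\pi n^2x^2}$, evaluated through the defining series for $x\ge1$ and through the Jacobi-style form for $x < 1$. Under the moment formula $\frac{2^{1-s}-1}{1-s}2\xi(s)$, the mean should be $\log2$ and the second moment $\pi/6$. Names and grid sizes are illustrative.

```python
import math

def phi_prime_large(x, terms=20):
    """phi'(x) from term-by-term differentiation of sum_n (-1)^n exp(-pi n^2 x^2)."""
    total = sum((-1) ** (n - 1) * n * n * math.exp(-math.pi * n * n * x * x)
                for n in range(1, terms + 1))
    return 4.0 * math.pi * x * total

def phi_prime_small(x, terms=20):
    """phi'(x) from differentiating x^{-1} sum_n exp(-pi (n + 1/2)^2 / x^2)."""
    total = 0.0
    for n in range(terms):  # n and -n-1 give equal terms, hence the factor 2 below
        c = math.pi * (n + 0.5) ** 2
        total += (2.0 * c - x * x) * math.exp(-c / (x * x))
    return 2.0 * total / x ** 4

def density(x):
    """Candidate density x^{-1} Phi(x) = phi'(x)."""
    return phi_prime_large(x) if x >= 1.0 else phi_prime_small(x)

def moment(s, upper=8.0, steps=16000):
    """Midpoint-rule estimate of E[X^s] for real s."""
    h = upper / steps
    return h * sum(((k + 0.5) * h) ** s * density((k + 0.5) * h) for k in range(steps))

if __name__ == "__main__":
    print(moment(0.0), moment(1.0), math.log(2.0), moment(2.0), math.pi / 6.0)
```

Agreement of the two series at $x=1$ is itself a check of the Jacobi-style identity for $\phi$.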
As an example of where this second probability distribution appears in a context apparently unrelated to the Riemann zeta function, consider an IID sequence of random variables $X_1,X_2,\ldots$ with continuous distribution function $F$. We also consider the sample approximations to this given by the proportion of the first n values of the $X_k$ which are less than x,
$F_n(x)=\frac1n\sum_{k=1}^n1_{\{X_k < x\}}.$
The maximum discrepancy between this sample distribution function and the true underlying one is,
$D_n=\sup_x\lvert F_n(x)-F(x)\rvert.$
This is a random quantity which, after scaling by the factor $\sqrt{2n/\pi}$, converges in distribution as $n\to\infty$ to the probability measure with density $x^{-1}\Phi(x)$.
Another example is given by a standard Brownian bridge B. The scaled absolute maximum $\sqrt{2/\pi}\,\sup_t\lvert B_t\rvert$ can be shown to have probability density $x^{-1}\Phi(x)$. Additionally, the ‘sample standard deviation’
$\left(\int_0^1\Big(B_t-\int_0^1B_u\,du\Big)^2dt\right)^{1/2}$
after scaling by a factor $\sqrt{2\pi}$, also has probability density $x^{-1}\Phi(x)$. A proof of this is given below, in lemma 12.
A further example is provided by a Brownian meander $\{B_t\}_{t\in[0,1]}$. This is standard Brownian motion conditioned on being nonnegative over the interval $[0,1]$, and it is known that $\sup_tB_t/\sqrt{2\pi}$ has probability density $x^{-1}\Phi(x)$.
Properties of the Distributions
I will now describe and prove some of the properties of the distributions, including computing their cumulative distribution and moment generating functions. I also show how they can be realized as a sum of gamma distributed random variables. First, the two probability densities are related in the following simple but surprising way.
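Lemma 5 Let X have probability density $2x^{-1}\Psi(x)$ and, independently, let Y be uniformly distributed on the interval $[1,2]$. Then, $X/Y$ has probability density $x^{-1}\Phi(x)$.
Proof: The generating function of $X/Y$ can be computed,
$\mathbb{E}[(X/Y)^s]=\mathbb{E}[X^s]\,\mathbb{E}[Y^{-s}]=2\xi(s)\int_1^2y^{-s}\,dy=\frac{2^{1-s}-1}{1-s}\,2\xi(s)$
which, according to (15), is the generating function corresponding to a variable with probability density $x^{-1}\Phi(x)$. ⬜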
Less surprisingly, the cumulative distribution functions can be computed directly from the definitions above.
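Lemma 6 If X has probability density $2x^{-1}\Psi(x)$, then its distribution function is,
(16)  $\mathbb{P}(X\le x)=\frac{d}{dx}\big(x\theta(x)\big)=\sum_n\left(1-2\pi n^2x^2\right)e^{-\pi n^2x^2}.$
If it has probability density $x^{-1}\Phi(x)$, then its distribution function is,
(17)  $\mathbb{P}(X\le x)=\phi(x)=\sum_n(-1)^ne^{-\pi n^2x^2}=x^{-1}\sum_ne^{-\pi(n+1/2)^2/x^2}.$
Proof: For the first equation, if X has probability density $2x^{-1}\Psi(x)=\frac{d^2}{dx^2}\big(x\theta(x)\big)$ then, as $\frac{d}{dx}\big(x\theta(x)\big)=\frac{d}{dx}\theta(x^{-1})$ vanishes at zero, the cumulative distribution is given by
$\mathbb{P}(X\le x)=\int_0^x\frac{d^2}{dy^2}\big(y\theta(y)\big)\,dy=\frac{d}{dx}\big(x\theta(x)\big).$
For equation (17), equality of each of the three expressions on the right hand side is given by (10) and (12), and differentiating the defining series of $\phi$ and of (12) term by term recovers the density $x^{-1}\Phi(x)=\phi^\prime(x)$. In particular, (12) shows that $\phi$ vanishes at zero, and the distribution function is given by integrating,
$\mathbb{P}(X\le x)=\int_0^x\phi^\prime(y)\,dy=\phi(x).$
⬜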
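The relation between the two distributions given by lemma 5 can be seen directly at the level of distribution functions. This is a sketch under the assumption that the distribution functions are $F(x)=\sum_n(1-2\pi n^2x^2)e^{-\pi n^2x^2}$ for the first density and $G(x)=\sum_n(-1)^ne^{-\pi n^2x^2}$ for the second. If Y is uniform on $[1,2]$ and independent of X, then $\mathbb{P}(X/Y\le x)=\int_1^2F(xy)\,dy$, which should agree with $G(x)$. Function names are hypothetical.

```python
import math

def cdf_first(x, terms=60):
    """F(x) = sum over all integers n of (1 - 2 pi n^2 x^2) exp(-pi n^2 x^2)."""
    total = 1.0  # the n = 0 term
    for n in range(1, terms + 1):
        u = math.pi * n * n * x * x
        total += 2.0 * (1.0 - 2.0 * u) * math.exp(-u)
    return total

def cdf_second(x, terms=60):
    """G(x) = sum over all integers n of (-1)^n exp(-pi n^2 x^2)."""
    total = 1.0  # the n = 0 term
    for n in range(1, terms + 1):
        total += 2.0 * (-1) ** n * math.exp(-math.pi * n * n * x * x)
    return total

def cdf_ratio(x, steps=2000):
    """P(X / Y <= x) for Y uniform on [1, 2]: midpoint estimate of the integral of F(x y) over [1, 2]."""
    h = 1.0 / steps
    return h * sum(cdf_first(x * (1.0 + (k + 0.5) * h)) for k in range(steps))

if __name__ == "__main__":
    for x in (0.5, 1.0, 1.7):
        print(x, cdf_ratio(x), cdf_second(x))
```

The truncated series are only used for $x\ge0.5$ here, where 60 terms are ample.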
The calculation of the moment generating functions will require the following integral identity due to Lévy.
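Lemma 7 For any positive reals $a,b$, the integral identity
(18)  $\int_0^\infty e^{-a^2x^2-b^2x^{-2}}\,dx=\frac{\sqrt\pi}{2a}e^{-2ab}$
holds.
Proof: Using the substitution $y=b/(ax)$,
$a^2x^2+b^2x^{-2}=a^2y^2+b^2y^{-2},\qquad dx=-\frac{b}{ay^2}\,dy.$
The integral in the lemma can be split into the ranges $x\le\sqrt{b/a}$ and $x\ge\sqrt{b/a}$. Applying the substitution above over $x\le\sqrt{b/a}$ gives,
$\int_0^\infty e^{-a^2x^2-b^2x^{-2}}\,dx=\int_{\sqrt{b/a}}^\infty e^{-a^2x^2-b^2x^{-2}}\left(1+\frac{b}{ax^2}\right)dx.$
Next, apply the substitution $z=ax-b/x$ so that,
$z^2=a^2x^2+b^2x^{-2}-2ab,\qquad dz=a\left(1+\frac{b}{ax^2}\right)dx$
which transforms the integral to,
$\frac1ae^{-2ab}\int_0^\infty e^{-z^2}\,dz=\frac{\sqrt\pi}{2a}e^{-2ab}$
as required. ⬜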
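The integral identity can be checked numerically. The sketch below assumes it takes the standard form $\int_0^\infty e^{-a^2x^2-b^2/x^2}dx=\frac{\sqrt\pi}{2a}e^{-2ab}$ for positive a and b, which is one common way of writing this identity. The truncation point and step count are arbitrary illustrative choices.

```python
import math

def levy_integral(a, b, steps=40000):
    """Midpoint-rule estimate of the integral of exp(-a^2 x^2 - b^2 / x^2) over (0, 10/a)."""
    upper = 10.0 / a  # beyond this the integrand is below exp(-100)
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += math.exp(-a * a * x * x - b * b / (x * x))
    return total * h

def levy_closed_form(a, b):
    """Claimed value sqrt(pi) / (2 a) * exp(-2 a b)."""
    return math.sqrt(math.pi) / (2.0 * a) * math.exp(-2.0 * a * b)

if __name__ == "__main__":
    for a, b in ((1.3, 0.7), (0.5, 2.0), (2.0, 0.1)):
        print(a, b, levy_integral(a, b), levy_closed_form(a, b))
```

The integrand is flat to all orders at the origin, so the midpoint rule converges quickly despite the singular exponent.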
We now compute the moment generating functions for squares of random variables with the distributions above. The results are surprisingly simple, and it is not clear why this should be so.
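Lemma 8 If X has probability density $2x^{-1}\Psi(x)$ then,
$\mathbb{E}\left[e^{-\lambda X^2}\right]=\left(\frac{\sqrt{\pi\lambda}}{\sinh\sqrt{\pi\lambda}}\right)^2$
for all real $\lambda > 0$. If X has probability density $x^{-1}\Phi(x)$ then,
$\mathbb{E}\left[e^{-\lambda X^2}\right]=\frac{\sqrt{\pi\lambda}}{\sinh\sqrt{\pi\lambda}}.$
Proof: As the exponential term inside the expectations tends to 1 as X goes to zero, rather than vanishing, we use the form for the probability densities which explicitly vanish at zero, to ensure integrability of all the terms. If X has probability density $x^{-1}\Phi(x)=\phi^\prime(x)$, we use expression (12) for $\phi$, and integration by parts,
$\mathbb{E}\left[e^{-\lambda X^2}\right]=\int_0^\infty e^{-\lambda x^2}\phi^\prime(x)\,dx=2\lambda\int_0^\infty xe^{-\lambda x^2}\phi(x)\,dx=2\lambda\sum_n\int_0^\infty e^{-\lambda x^2-\pi(n+1/2)^2x^{-2}}\,dx=\sqrt{\pi\lambda}\sum_ne^{-\sqrt{\pi\lambda}\lvert2n+1\rvert}=\frac{\sqrt{\pi\lambda}}{\sinh\sqrt{\pi\lambda}}$
as required. Here, (18) was used to perform the integral.
Next, suppose that X has probability density $2x^{-1}\Psi(x)$. We use the form,
$2x^{-1}\Psi(x)=\frac{d^2}{dx^2}\theta(x^{-1})=\frac{d^2}{dx^2}\sum_{n\ne0}e^{-\pi n^2/x^2}.$
Using integration by parts, and the notation $\partial_\lambda$ for the derivative with respect to $\lambda$,
$\mathbb{E}\left[e^{-\lambda X^2}\right]=\int_0^\infty\frac{d^2e^{-\lambda x^2}}{dx^2}\sum_{n\ne0}e^{-\pi n^2/x^2}\,dx=-\left(4\lambda^2\partial_\lambda+2\lambda\right)\sum_{n\ne0}\int_0^\infty e^{-\lambda x^2-\pi n^2x^{-2}}\,dx=-\left(4\lambda^2\partial_\lambda+2\lambda\right)\frac{\sqrt{\pi/\lambda}}{e^{2\sqrt{\pi\lambda}}-1}=\left(\frac{\sqrt{\pi\lambda}}{\sinh\sqrt{\pi\lambda}}\right)^2$
as required. ⬜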
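Both moment generating functions can be compared against direct numerical integration of the densities. The sketch assumes the densities $2x^{-1}\Psi(x)$ and $x^{-1}\Phi(x)$ built from the theta-type series above, and the closed forms $(\sqrt{\pi\lambda}/\sinh\sqrt{\pi\lambda})^2$ and $\sqrt{\pi\lambda}/\sinh\sqrt{\pi\lambda}$ for $\mathbb{E}[e^{-\lambda X^2}]$. All helper names are illustrative.

```python
import math

def density_first(x, terms=12):
    """2 Psi(x) / x, using the series for x >= 1 and Psi(x) = Psi(1/x)/x otherwise."""
    y = x if x >= 1.0 else 1.0 / x
    total = 0.0
    for n in range(1, terms + 1):
        u = math.pi * n * n * y * y
        total += (4.0 * u * u - 6.0 * u) * math.exp(-u)
    psi = total if x >= 1.0 else total / x
    return 2.0 * psi / x

def density_second(x, terms=12):
    """phi'(x) for phi(x) = sum_n (-1)^n exp(-pi n^2 x^2), switching series at x = 1."""
    if x >= 1.0:
        return 4.0 * math.pi * x * sum(
            (-1) ** (n - 1) * n * n * math.exp(-math.pi * n * n * x * x)
            for n in range(1, terms + 1))
    total = 0.0
    for n in range(terms):
        c = math.pi * (n + 0.5) ** 2
        total += (2.0 * c - x * x) * math.exp(-c / (x * x))
    return 2.0 * total / x ** 4

def laplace_of_square(density, lam, upper=8.0, steps=16000):
    """Midpoint-rule estimate of E[exp(-lam X^2)] for X with the given density."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += math.exp(-lam * x * x) * density(x)
    return total * h

def closed_form(lam, power):
    """(sqrt(pi lam) / sinh(sqrt(pi lam)))^power."""
    u = math.sqrt(math.pi * lam)
    return (u / math.sinh(u)) ** power

if __name__ == "__main__":
    for lam in (0.5, 2.0):
        print(lam, laplace_of_square(density_first, lam), closed_form(lam, 2))
        print(lam, laplace_of_square(density_second, lam), closed_form(lam, 1))
```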
Knowing the moment generating functions opens up a world of possibilities. For example, we find the following simple relation between the two distributions.
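Lemma 9 Let X and Y be independent random variables with probability density $x^{-1}\Phi(x)$. Then, $Z=\sqrt{X^2+Y^2}$ has probability density $2x^{-1}\Psi(x)$.
Proof: The moment generating function of $Z^2$ can be computed using lemma 8.
$\mathbb{E}\left[e^{-\lambda Z^2}\right]=\mathbb{E}\left[e^{-\lambda X^2}\right]\mathbb{E}\left[e^{-\lambda Y^2}\right]=\left(\frac{\sqrt{\pi\lambda}}{\sinh\sqrt{\pi\lambda}}\right)^2$
Applying lemma 8 again and using the fact that the distribution of a nonnegative random variable is uniquely determined by its moment generating function shows that Z has probability density $2x^{-1}\Psi(x)$. ⬜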
We now have lemma 5 which gives us a method of converting a random variable with density $2x^{-1}\Psi(x)$ to one with density $x^{-1}\Phi(x)$. In the other direction, lemma 9 goes from a pair of independent variables with density $x^{-1}\Phi(x)$ to one with density $2x^{-1}\Psi(x)$. Combining these gives a characterization of these distributions.
Corollary 10 Let X,Y,U,V be independent random variables with U,V uniform on $[1,2]$.
- If X,Y have probability density $x^{-1}\Phi(x)$, then so does $U^{-1}\sqrt{X^2+Y^2}$.
- If X,Y have probability density $2x^{-1}\Psi(x)$, then so does $\sqrt{(X/U)^2+(Y/V)^2}$.
Proof: For the first statement, $\sqrt{X^2+Y^2}$ has density $2x^{-1}\Psi(x)$ by lemma 9, so the result follows from lemma 5.
For the second statement, $X/U$ and $Y/V$ are independent and have density $x^{-1}\Phi(x)$ by lemma 5, so the result follows from lemma 9. ⬜
In fact, it is not difficult to show that the properties given by corollary 10 uniquely determine the probability distributions, up to a constant scaling factor. This can be done by iteratively applying the statements to approximate the distributions in terms of a sum over products of inverse squares of uniform random variables. I do not go through this here though, and instead show how we can construct random variables with the given densities as a sum of gamma distributed random variables.
Recall that, for real $a > 0$, the $\Gamma(a)$ distribution on the positive reals is given by the probability density $\Gamma(a)^{-1}x^{a-1}e^{-x}$.
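Theorem 11 Let $a > 0$ be constant and $X_1,X_2,\ldots$ be an IID sequence of nonnegative random variables with the $\Gamma(a)$ distribution, and set
$S=\frac2{\pi^2}\sum_{n=1}^\infty\frac{X_n}{n^2}.$
This has moment generating function
(19)  $\mathbb{E}\left[e^{-\lambda S}\right]=\left(\frac{\sqrt{2\lambda}}{\sinh\sqrt{2\lambda}}\right)^a$
for real $\lambda > 0$. In particular,
- if $a=1$ then $\sqrt{\pi S/2}$ has probability density $x^{-1}\Phi(x)$.
- if $a=2$ then $\sqrt{\pi S/2}$ has probability density $2x^{-1}\Psi(x)$.
Proof: As it has the gamma distribution with parameter $a$, $X_n$ has moment generating function
$\mathbb{E}\left[e^{-\lambda X_n}\right]=\frac1{\Gamma(a)}\int_0^\infty x^{a-1}e^{-(1+\lambda)x}\,dx=(1+\lambda)^{-a}.$
So, using independence of the $X_n$, we compute
$\mathbb{E}\left[e^{-\lambda S}\right]=\prod_{n=1}^\infty\mathbb{E}\left[e^{-2\lambda X_n/(\pi^2n^2)}\right]=\prod_{n=1}^\infty\left(1+\frac{2\lambda}{\pi^2n^2}\right)^{-a}.$
Substituting in the product expansion $\sinh z=z\prod_{n=1}^\infty\left(1+z^2/(\pi^2n^2)\right)$ with $z=\sqrt{2\lambda}$ gives (19) as required.
Finally, setting $Y=\sqrt{\pi S/2}$ then, by what we have shown above,
$\mathbb{E}\left[e^{-\lambda Y^2}\right]=\mathbb{E}\left[e^{-\pi\lambda S/2}\right]=\left(\frac{\sqrt{\pi\lambda}}{\sinh\sqrt{\pi\lambda}}\right)^a.$
Using the fact that the distribution is uniquely determined by the moment generating function, lemma 8 says that Y has density $x^{-1}\Phi(x)$ if $a=1$ and $2x^{-1}\Psi(x)$ if $a=2$. ⬜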
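The key step in the proof is the infinite product for the hyperbolic sine. As a rough numerical illustration, the sketch below assumes that the sum is normalised as $\frac{2}{\pi^2}\sum_nX_n/n^2$ with $X_n$ having the $\Gamma(a)$ distribution, so that each factor of the moment generating function is $(1+2\lambda/(\pi^2n^2))^{-a}$, and compares a truncated product with the closed form $(\sqrt{2\lambda}/\sinh\sqrt{2\lambda})^a$. The names and the truncation level are illustrative.

```python
import math

def mgf_product(lam, a, terms=200000):
    """Truncated product over n of (1 + 2 lam / (pi^2 n^2))^(-a), computed via logarithms."""
    log_total = sum(math.log1p(2.0 * lam / (math.pi ** 2 * n * n))
                    for n in range(1, terms + 1))
    return math.exp(-a * log_total)

def mgf_closed_form(lam, a):
    """(sqrt(2 lam) / sinh(sqrt(2 lam)))^a."""
    z = math.sqrt(2.0 * lam)
    return (z / math.sinh(z)) ** a

if __name__ == "__main__":
    for lam, a in ((0.5, 1.0), (1.5, 2.0), (3.0, 0.5)):
        print(lam, a, mgf_product(lam, a), mgf_closed_form(lam, a))
```

The product converges slowly, with relative error of order $2a\lambda/(\pi^2N)$ after N factors, so the comparison is only expected to hold to a few decimal places.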
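Theorem 11 enables us to describe the distribution of the sample standard deviation of a Brownian bridge in terms of the probability density $x^{-1}\Phi(x)$, as was promised above.
Lemma 12 Let B be a standard Brownian bridge with sample mean $\bar B=\int_0^1B_t\,dt$ and sample variance
$\sigma^2=\int_0^1\left(B_t-\bar B\right)^2dt.$
Then, $\sqrt{2\pi}\,\sigma$ has probability density $x^{-1}\Phi(x)$.
Proof: By the Fourier expansion of the Brownian bridge,
$B_t-\bar B=\sum_{n=1}^\infty\frac{1}{\sqrt2\,\pi n}\left(U_n\cos(2\pi nt)+V_n\sin(2\pi nt)\right)$
for IID standard normals $U_1,V_1,U_2,V_2,\ldots$. By Parseval’s theorem,
$\sigma^2=\sum_{n=1}^\infty\frac{U_n^2+V_n^2}{4\pi^2n^2}=\frac14\cdot\frac2{\pi^2}\sum_{n=1}^\infty\frac{(U_n^2+V_n^2)/2}{n^2}.$
As $(U_n^2+V_n^2)/2$ are IID $\Gamma(1)$ random variables, the result is given by theorem 11 with $a=1$ and $S=4\sigma^2$. ⬜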
A Family of Distributions
In light of theorem 11, we see that the distributions introduced above with densities $x^{-1}\Phi(x)$ and $2x^{-1}\Psi(x)$ are just two out of an infinite family, one for each positive real number $a$. I briefly look at this family, and show how they appear as integrals of Bessel bridges.
Definition 13 For real $a > 0$, a nonnegative random variable X will be said to have distribution $\mathcal{S}_a$ if
$\mathbb{E}\left[e^{-\lambda X}\right]=\left(\frac{\sqrt{2\lambda}}{\sinh\sqrt{2\lambda}}\right)^a$
for all real $\lambda > 0$.
See the 2001 paper, Infinitely divisible laws associated with hyperbolic functions by Pitman and Yor for a study of this and other related families of distributions. Using this definition, theorem 11 states the following.
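Lemma 14 A nonnegative random variable X
- has distribution $\mathcal{S}_1$ if and only if $\sqrt{\pi X/2}$ has density $x^{-1}\Phi(x)$.
- has distribution $\mathcal{S}_2$ if and only if $\sqrt{\pi X/2}$ has density $2x^{-1}\Psi(x)$.
Expressing this back in terms of the Riemann zeta function, (8) gives the moments
$\mathbb{E}\left[X^{s/2}\right]=\left(\frac2\pi\right)^{s/2}2\xi(s)$
for $s\in\mathbb{C}$ and X with distribution $\mathcal{S}_2$. Similarly, equation (15) gives
$\mathbb{E}\left[X^{s/2}\right]=\left(\frac2\pi\right)^{s/2}\frac{2^{1-s}-1}{1-s}\,2\xi(s)$
when X has distribution $\mathcal{S}_1$.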
Theorem 11 also realizes the distributions as sums of gamma distributed random variables.
Lemma 15 Let $a > 0$ be constant and $X_1,X_2,\ldots$ be an IID sequence of nonnegative random variables with the $\Gamma(a)$ distribution. Then, $\frac2{\pi^2}\sum_{n=1}^\infty X_n/n^2$ has distribution $\mathcal{S}_a$.
It is standard that the moment generating function of the sum of a pair of independent nonnegative random variables is equal to the product of their moment generating functions. So, definition 13 immediately gives the following result.
Lemma 16 Let X and Y be independent with distributions $\mathcal{S}_a$ and $\mathcal{S}_b$ respectively. Then $X+Y$ has distribution $\mathcal{S}_{a+b}$.
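An example of the occurrence of the law $\mathcal{S}_{1/2}$ is provided by the integral of a squared Brownian bridge.
Lemma 17 Let B be a Brownian bridge. Then $\int_0^1B_t^2\,dt$ has distribution $\mathcal{S}_{1/2}$.
Proof: We use the sine series representation of the Brownian bridge,
$B_t=\sum_{n=1}^\infty\frac{\sqrt2\sin(\pi nt)}{\pi n}Z_n$
where $Z_1,Z_2,\ldots$ is an IID sequence of standard normals. By Parseval’s theorem,
$\int_0^1B_t^2\,dt=\sum_{n=1}^\infty\frac{Z_n^2}{\pi^2n^2}=\frac2{\pi^2}\sum_{n=1}^\infty\frac{Z_n^2/2}{n^2}.$
However, $Z_n^2/2$ are IID $\Gamma(1/2)$ random variables, so the result follows from lemma 15. ⬜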
Finally, I show how all of the laws $\mathcal{S}_a$ occur as integrals of squared Bessel bridges. Recall that for positive integer a, the sum of the squares of a independent Brownian motions is a squared Bessel process of order a. This generalizes to noninteger $a > 0$, and such processes are denoted as $\mathrm{BESQ}^a$. I will not be concerned with a detailed description here, but note that they have the properties:
- If B is standard Brownian motion, then $B^2$ is a $\mathrm{BESQ}^1$ process.
- If independent processes X,Y are respectively $\mathrm{BESQ}^a$ and $\mathrm{BESQ}^b$, then $X+Y$ is a $\mathrm{BESQ}^{a+b}$ process.
Bessel bridges are Bessel processes conditioned on hitting zero at time $t=1$. Again, I will not be concerned with a full description, and just require the following properties. Using the terminology ‘$\mathrm{BESQ}^a$ bridge’ for a squared Bessel bridge of order a:
- If B is a Brownian bridge, then $B^2$ is a $\mathrm{BESQ}^1$ bridge.
- If independent processes X,Y are respectively $\mathrm{BESQ}^a$ and $\mathrm{BESQ}^b$ bridges, then $X+Y$ is a $\mathrm{BESQ}^{a+b}$ bridge.
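Integrals of squared Bessel bridges give random variables in the family $\mathcal{S}_a$.
Theorem 18 For any $a > 0$, let X be a $\mathrm{BESQ}^a$ bridge. Then $\int_0^1X_t\,dt$ has distribution $\mathcal{S}_{a/2}$.
Proof: Fixing $\lambda > 0$, define $f\colon(0,\infty)\to(0,1]$ by
$f(a)=\mathbb{E}\left[\exp\left(-\lambda\int_0^1X_t\,dt\right)\right]$
where X is a $\mathrm{BESQ}^a$ bridge, defined on some probability space. As the sum of independent $\mathrm{BESQ}^a$ and $\mathrm{BESQ}^b$ bridges is a $\mathrm{BESQ}^{a+b}$ bridge, we have $f(a+b)=f(a)f(b)$. As f is bounded by 1, and hence decreasing, the only solutions to this functional equation on the positive reals are of the form $f(a)=c^a$ for constant $c=f(1)$.
As the square of a Brownian bridge is a $\mathrm{BESQ}^1$ bridge, lemma 17 gives $f(1)=\left(\sqrt{2\lambda}/\sinh\sqrt{2\lambda}\right)^{1/2}$. We have shown that, for a $\mathrm{BESQ}^a$ bridge X,
$\mathbb{E}\left[\exp\left(-\lambda\int_0^1X_t\,dt\right)\right]=f(1)^a=\left(\frac{\sqrt{2\lambda}}{\sinh\sqrt{2\lambda}}\right)^{a/2}$
as required. ⬜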