The Riemann Zeta Function and Probability Distributions

The famous Riemann zeta function was first introduced by Riemann in order to describe the distribution of the prime numbers. It is defined by the infinite sum

 $\displaystyle \begin{aligned} \zeta(s) &=1+2^{-s}+3^{-s}+4^{-s}+\cdots\\ &=\sum_{n=1}^\infty n^{-s}, \end{aligned}$ (1)

which is absolutely convergent for all complex s with real part greater than one. One of the first properties of this is that, as shown by Riemann, it extends to an analytic function on the entire complex plane, other than a simple pole at ${s=1}$. By the theory of analytic continuation this extension is necessarily unique, so the importance of the result lies in showing that an extension exists. One way of doing this is to find an alternative expression for the zeta function which is well defined everywhere. For example, it can be expressed as an absolutely convergent integral, as performed by Riemann himself in his original 1859 paper on the subject. This leads to an explicit expression for the zeta function, scaled by an analytic prefactor, as the integral of ${x^s}$ multiplied by a function of x over the range ${ x > 0}$. In fact, this can be done in a way such that the function of x is a probability density function, and hence expresses the Riemann zeta function over the entire complex plane in terms of the generating function ${{\mathbb E}[X^s]}$ of a positive random variable X. The probability distributions involved here are not the standard ones taught to students of probability theory, so may be new to many people. Although these distributions are intimately related to the Riemann zeta function they also, intriguingly, turn up in seemingly unrelated contexts involving Brownian motion.

In this post, I derive two probability distributions related to the extension of the Riemann zeta function, and describe some of their properties. I also show how they can be constructed as the sum of a sequence of gamma distributed random variables. For motivation, some examples are given of where they show up in apparently unrelated areas of probability theory, although I do not give proofs of these statements here. For more information, see the 2001 paper Probability laws related to the Jacobi theta and Riemann zeta functions, and Brownian excursions by Biane, Pitman, and Yor.

The Riemann Zeta Function and the Functional Equation

For these initial posts of this blog, I will look at one of the most fascinating objects in mathematics, the Riemann zeta function. This is defined by the infinite sum

 $\displaystyle \zeta(s)=\sum_{n=1}^\infty n^{-s},$ (1)

which can be shown to be absolutely convergent for all ${s\in{\mathbb C}}$ with ${\Re(s) > 1}$. It was Bernhard Riemann who showed that it can be analytically continued to the entire complex plane with a single pole at ${s=1}$, derived a functional equation, and showed how its zeros are closely linked to the distribution of the prime numbers. Riemann’s seminal 1859 paper is still an excellent introduction to this subject, an English translation of which (On the Number of Prime Numbers less than a Given Quantity) can be found on the Clay Mathematics Institute website, who included the conjecture that the ‘non-trivial’ zeros all lie on the line ${\Re(s)=1/2}$ among their million dollar millennium problems.

In this post I will give a brief introduction to the zeta function and look at its functional equation. In particular, the functional equation can be generalized and reinterpreted as an identity of Mellin transforms, which links the additive Fourier transform on ${{\mathbb R}}$ with the multiplicative Mellin transform on the nonzero reals ${{\mathbb R}^*}$. The aim is to prove the generalized functional equation and some properties of the zeta function, working from first principles, and discuss at a high level how this relates to the ideas in Tate’s thesis. Some standard complex analysis and Fourier transform theory will be used, but no prior understanding of the Riemann zeta function is assumed.

The zeta function has a long history, going back to the Basel problem which was posed by Pietro Mengoli in 1644. This asked for the exact value of the sum of the reciprocals of the square numbers or, equivalently, the value of ${\zeta(2)}$. This was eventually solved by Leonhard Euler in 1734 who discovered the famous identity

 $\displaystyle \zeta(2)=\frac{\pi^2}{6}.$

Euler found the values of ${\zeta}$ at all positive even numbers, although I will not be concerned with this here. More pertinent to the current discussion is the product expression also found by Euler,

 $\displaystyle \zeta(s)=\prod_p(1-p^{-s})^{-1}.$ (2)

The product is taken over all prime numbers ${p}$, and converges on ${\Re(s) > 1}$. Proving (2) is straightforward. The formula for summing a geometric series gives

 $\displaystyle (1-p^{-s})^{-1}=1+p^{-s}+p^{-2s}+p^{-3s}+\cdots.$

Substituting this into (2) and expanding the product gives an infinite sum over terms of the form

 $\displaystyle (p_1^{r_1}p_2^{r_2}\cdots p_k^{r_k})^{-s}$

for ${k\ge0}$, primes ${p_1 < p_2 < \cdots < p_k}$, and integers ${r_i > 0}$. Using the fact that every positive integer has a unique expression as a product of powers of distinct primes, we see that the Euler product expands as a sum of terms of the form ${n^{-s}}$ as ${n}$ ranges over the positive integers. This is just the right hand side of (1) and shows that the Euler product converges and is equal to ${\zeta(s)}$ whenever the sum (1) is absolutely convergent.
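
As a quick numerical sanity check of (2), and not as part of the proof, the following Python sketch compares a truncated Euler product with a partial sum of (1) at ${s=2}$. The function names and the cut-off values are my own illustrative choices, and it only handles real ${s > 1}$.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: return all primes p <= limit (limit >= 1)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

def euler_product(s, limit):
    """Truncated Euler product (2) over primes p <= limit, for real s > 1."""
    product = 1.0
    for p in primes_up_to(limit):
        product /= 1.0 - p ** (-s)
    return product

def zeta_partial_sum(s, terms):
    """Partial sum of the series (1) with the given number of terms."""
    return sum(n ** (-s) for n in range(1, terms + 1))
```

Both `euler_product(2, 10000)` and `zeta_partial_sum(2, 10000)` should land close to ${\pi^2/6}$, with the product converging noticeably faster than the sum.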

The Euler product provides a link between the zeta function and the prime numbers, with far-reaching consequences. For example, the prime number theorem describing the asymptotic distribution of the prime numbers was originally proved using the Euler product, and the strongest known error terms available for this theorem still rely on the link between the prime numbers and the zeta function given by (2). Euler used the fact that (1) diverges at ${s=1}$ to argue that (2) also diverges at ${s=1}$. From this, it is immediately deduced that there are infinitely many primes and, more specifically, the reciprocals of the primes sum to infinity.

The Euler product can also be expressed in terms of the logarithm of the zeta function. Using the Taylor series expansion of ${\log(1-p^{-s})}$, we obtain

 $\displaystyle \log\zeta(s)=\sum_p\sum_{k=1}^\infty \frac{p^{-ks}}{k}.$ (3)

As the terms on the right hand side are bounded by ${\lvert n^{-s}\rvert}$ as ${n}$ runs through the subset of the natural numbers consisting of prime powers, it will be absolutely convergent whenever (1) is. In particular, (3) converges on the half-plane ${\Re(s) > 1}$. Although the complex logarithm is generally only defined up to integer multiples of ${2\pi i}$, (3) gives the unique continuous version of ${\log\zeta(s)}$ over ${\Re(s) > 1}$ which takes real values on the real line.

We will first look at the zeta functional equation as described by Riemann. This involves the gamma function defined on ${\Re(s) > 0}$ by the absolutely convergent integral

 $\displaystyle \Gamma(s)=\int_0^\infty x^{s-1}e^{-x}\,dx.$ (4)

This is easily evaluated at ${s=1}$ to get ${\Gamma(1)=1}$, and an integration by parts gives the functional equation of ${\Gamma}$,

 $\displaystyle \Gamma(s+1)=s\Gamma(s).$

This can be used to evaluate the gamma function at the positive integers, ${\Gamma(n)=(n-1)!}$. Also, by expressing ${\Gamma(s)}$ in terms of ${\Gamma(s+1)}$, it allows us to extend ${\Gamma(s)}$ to a meromorphic function on ${\Re(s) > -1}$ with a single simple pole at ${s=0}$. Repeatedly applying this idea extends ${\Gamma(s)}$ as a meromorphic function on the entire complex plane with a simple pole at each non-positive integer. Furthermore, it is known that ${\Gamma(s)}$ is non-zero everywhere on ${{\mathbb C}}$.
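
The extension procedure just described can be sketched in a few lines of Python for real arguments. This is only an illustration: `gamma_by_recurrence` is a hypothetical helper of mine, it uses the standard library `math.gamma` on ${s > 0}$, and it applies ${\Gamma(s)=\Gamma(s+1)/s}$ repeatedly to step to the left.

```python
import math

def gamma_recurrence_defect(s):
    """Return Gamma(s+1) - s*Gamma(s) for real s > 0, which should be ~0."""
    return math.gamma(s + 1) - s * math.gamma(s)

def gamma_by_recurrence(s):
    """Extend Gamma to real s <= 0 via Gamma(s) = Gamma(s+1)/s.

    At the poles s = 0, -1, -2, ... this raises ZeroDivisionError.
    """
    if s > 0:
        return math.gamma(s)
    return gamma_by_recurrence(s + 1) / s
```

For example, `gamma_by_recurrence(-0.5)` should agree with the value ${\Gamma(-1/2)=-2\sqrt{\pi}}$.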

The functional equation can now be stated as follows.

Theorem 1 (Riemann) The function ${\zeta(s)}$ defined by (1) uniquely extends to a meromorphic function on ${{\mathbb C}}$ with a single simple pole at ${s=1}$ of residue ${1}$. Setting

 $\displaystyle \Lambda(s)=\pi^{-s/2}\Gamma(s/2)\zeta(s)$ (5)

this satisfies the identity

 $\displaystyle \Lambda(s)=\Lambda(1-s).$ (6)

Riemann actually gave two independent proofs of this, the first using contour integration and the second using an identity of Jacobi. Many alternative proofs have been discovered since, with Titchmarsh listing seven (The Theory of the Riemann Zeta Function, 1986, second edition). I will not replicate these here, but will show an alternative formulation as an identity of Mellin transforms, from which Riemann’s functional equation (6) follows as a special case.

As an example of the use of the functional equation to derive properties of the zeta function on the left half-plane ${\Re(s)\le0}$, we evaluate ${\zeta(0)}$. Using the special value ${\Gamma(1/2)=\sqrt{\pi}}$ and the fact that ${\zeta(s)}$ has a pole of residue ${1}$ at ${s=1}$, we see that ${\Lambda(s)\sim1/(s-1)}$ as ${s}$ approaches ${1}$. Similarly, using the fact that the gamma function has a pole of residue ${1}$ at ${s=0}$, we see that ${\Lambda(s)\sim2\zeta(0)/s}$ as ${s}$ approaches ${0}$. Putting these limits into the functional equation gives

 $\displaystyle \zeta(0)=-1/2.$

Similarly, the functional equation expresses the values of ${\zeta}$ at negative odd integers in terms of its values at positive even integers. For example, taking ${s=-1}$,

 $\displaystyle \pi^{1/2}\Gamma(-1/2)\zeta(-1)=\pi^{-1}\Gamma(1)\zeta(2).$

Plugging in the values ${\Gamma(-1/2)=-2\sqrt{\pi}}$, ${\Gamma(1)=1}$ and Euler’s value of ${\zeta(2)=\pi^2/6}$,

 $\displaystyle \zeta(-1)=-\frac{1}{12}.$
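
The arithmetic in this last step is easy to check numerically. The sketch below, with names of my own choosing, evaluates the prefactor ${\pi^{-s/2}\Gamma(s/2)}$ from (5) at ${s=2}$ and ${s=-1}$ and solves the functional equation (6) for ${\zeta(-1)}$, taking Euler's value of ${\zeta(2)}$ as given.

```python
import math

def gamma_factor(s):
    """The factor pi^(-s/2) * Gamma(s/2) from (5), for real s away from the poles."""
    return math.pi ** (-s / 2) * math.gamma(s / 2)

# Lambda(-1) = Lambda(2), solved for zeta(-1) using zeta(2) = pi^2/6
zeta_2 = math.pi ** 2 / 6
zeta_minus_1 = gamma_factor(2) * zeta_2 / gamma_factor(-1)
```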

Via a process known as zeta function regularization, these special values of ${\zeta}$ are sometimes written as the famous, but rather confusing, expressions

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle 1+1+1+1+\cdots = -\frac12,\smallskip\\ &\displaystyle 1+2+3+4+\cdots = -\frac1{12}. \end{array}$

Next, using standard properties of the gamma function, theorem 1 can be used to investigate the zeros of the zeta function. The Euler product implies that ${\zeta(s)}$ has no zeros on ${\Re(s) > 1}$, as I will show below, and it is well known that the gamma function has no zeros at all. So, ${\Lambda(s)}$ has no zeros or poles anywhere on ${\Re(s) > 1}$, and the functional equation extends this statement to ${\Re(s) < 0}$. It follows that, on ${\Re(s) < 0}$, the zeros of the zeta function must cancel with the poles of ${\Gamma(s/2)}$, which are at the negative even integers.

On the strip ${0\le\Re(s)\le1}$, the precise locations of the zeros of ${\zeta(s)}$ are not known. However, as ${\Gamma(s/2)}$ has no poles or zeros on this domain (other than at ${s=0}$), they must coincide with the zeros of ${\Lambda(s)}$. From the definition (1) of the zeta function, it satisfies ${\zeta(\bar s)=\overline{\zeta(s)}}$ (using a bar to denote complex conjugation). So, its zeros are preserved by reflection ${s\mapsto\bar s}$ about the real line. Also, by the functional equation, they are preserved by the map ${s\mapsto1-s}$ in the aforementioned strip. We have arrived at the following.

Lemma 2 The function ${\zeta(s)}$ has zeros at the (strictly) negative even integers. The only remaining zeros lie in the vertical strip ${0\le\Re(s)\le1}$ and are preserved by the maps

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle s\mapsto \bar s,\\ &\displaystyle s\mapsto 1-s. \end{array}$ (7)

The zeros at the negative even integers are called the trivial zeros of ${\zeta}$, with the remaining ones referred to as the non-trivial zeros. The domain ${0\le\Re(s)\le1}$ is known as the critical strip. So, the non-trivial zeros of the Riemann zeta function are precisely those lying in the critical strip, and are the same as the zeros of the function ${\Lambda(s)}$ defined by (5). The vertical line ${\Re(s)=\frac12}$ lying along the center of the critical strip is called the critical line. Then, (7) says that the non-trivial Riemann zeta zeros are symmetric under reflection about both the real line and the critical line. The Riemann hypothesis, as originally conjectured by Riemann in his 1859 paper, states that the non-trivial zeros all lie on the critical line. This, however, remains unknown and is one of the great open problems of mathematics.

We will now move on to the alternative interpretation of the functional equation relating additive Fourier transforms with multiplicative transforms. We will use the following convention for the Fourier transform of a function ${f\colon{\mathbb R}\rightarrow{\mathbb C}}$,

 $\displaystyle \hat f(y)=\int_{-\infty}^\infty e^{-2\pi ixy}f(x)\,dx.$ (8)

For this to make sense, it should at least be required that ${f}$ is integrable. I will restrict to the particularly nice class of Schwartz functions. These are the infinitely differentiable functions from ${{\mathbb R}}$ to ${{\mathbb C}}$ which vanish faster than polynomially at infinity, along with their derivatives to all orders. That is, ${x^rf^{(s)}(x)\rightarrow0}$ as ${\lvert x\rvert\rightarrow\infty}$, for all integers ${r,s\ge0}$. Denote the space of Schwartz functions by ${\mathcal S}$. Schwartz functions are integrable and it is known that their Fourier transforms are again in ${\mathcal{S}}$. Then, for any ${f\in\mathcal S}$, its Fourier transform is inverted by

 $\displaystyle f(x)=\int_{-\infty}^\infty e^{2\pi ixy}\hat f(y)\,dy.$

I’ll explain now why the Fourier transform (8) is an additive transform of ${f}$. For each fixed ${y}$, the map ${x\mapsto e^{2\pi ixy}}$ is a continuous homomorphism from the additive group of real numbers to the multiplicative group of nonzero complex numbers ${{\mathbb C}^*}$,

 $\displaystyle e^{2\pi i(x_1+x_2)y}=e^{2\pi ix_1y}e^{2\pi ix_2y}.$

So, ${x\mapsto e^{2\pi ixy}}$ is a character of the reals under addition. Furthermore, integration is invariant under additive translation,

 $\displaystyle \int_{-\infty}^\infty f(x)\,dx=\int_{-\infty}^\infty f(x+a)\,dx.$

That is, the standard (Riemann or Lebesgue) integral is the Haar measure of the additive group of reals, and the Fourier transform (8) is the integral of ${f(x)}$ against additive characters with respect to the additive Haar measure.

The Mellin transform of ${f\colon{\mathbb R}\rightarrow{\mathbb C}}$ is

 $\displaystyle M(f,s)=\int_{-\infty}^\infty f(x)\lvert x\rvert^{s-1}\,dx,$ (9)

which is defined for any ${s\in{\mathbb C}}$ for which the integral is absolutely convergent. (This differs slightly from the usual definition where a lower limit of ${0}$ is used for the integral. See the note on Mellin transforms at the end of this post.) Now, the map ${x\mapsto\lvert x\rvert^{s}}$ is a continuous homomorphism from the multiplicative group of nonzero reals ${{\mathbb R}^*}$ to ${{\mathbb C}^*}$,

 $\displaystyle \lvert x_1x_2\rvert^s=\lvert x_1\rvert^s\lvert x_2\rvert^s.$

Denoting ${d^*x=dx/\lvert x\rvert}$, integration with respect to ${d^*x}$ is invariant under multiplicative rescaling by any ${a\in{\mathbb R}^*}$,

 $\displaystyle \int_{-\infty}^{\infty} f(ax)\,d^*x=\int_{-\infty}^\infty f(x)\,d^*x.$

That is, ${\int\cdot\,d^*x}$ is the multiplicative Haar measure on ${{\mathbb R}^*}$, and the Mellin transform is the integral of ${f(x)}$ against multiplicative characters with respect to the multiplicative Haar measure. This explains why the Fourier transform (8) is additive and the Mellin transform (9) is multiplicative.

For an explicit example of a Mellin transform, consider ${f(x)=e^{-\pi x^2}}$,

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle M(f,s)&\displaystyle=\int_{-\infty}^\infty \lvert x\rvert^{s-1}e^{-\pi x^2}\,dx\smallskip\\ &\displaystyle=\int_0^\infty \pi^{-s/2}y^{s/2-1}e^{-y}\,dy. \end{array}$

Here, the substitution ${y=\pi x^2}$ was used. Comparing with the definition (4) of the gamma function,

 $\displaystyle M(f,s)=\pi^{-s/2}\Gamma(s/2).$ (10)
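
Identity (10) can be checked numerically for real ${s}$. The following is a rough sketch using a plain midpoint rule, where the truncation point `upper` and the number of `steps` are arbitrary choices of mine. I restrict to ${s\ge1}$ so that the integrand is bounded near the origin.

```python
import math

def mellin_gaussian(s, upper=8.0, steps=100_000):
    """Midpoint-rule approximation of (9) for f(x) = exp(-pi x^2), real s >= 1."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (s - 1) * math.exp(-math.pi * x * x)
    # The integrand is even in x, so double the integral over x > 0
    return 2.0 * h * total

def mellin_gaussian_closed_form(s):
    """Right hand side of (10)."""
    return math.pi ** (-s / 2) * math.gamma(s / 2)
```

At ${s=1}$ both sides reduce to the familiar Gaussian integral ${\int e^{-\pi x^2}dx=1}$.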

For Schwartz functions, the integral defining the Mellin transform is absolutely convergent on the right half-plane ${\Re(s) > 0}$, and can be analytically continued to the entire complex plane.

Theorem 3 If ${f\in\mathcal{S}}$, then ${M(f,s)}$ is well defined over ${\Re(s) > 0}$, and uniquely extends to a meromorphic function on ${{\mathbb C}}$ with only simple poles at the non-positive even numbers ${-2n}$, with residue

 $\displaystyle {\rm Res}({M(f,\cdot)},-2n)=2\frac {f^{(2n)}(0)}{(2n)!}.$

I’ll give a proof of theorem 3 below. For now, we will move straight on to the statement of the functional equation relating the Mellin transform of ${f}$ to that of its Fourier transform ${\hat f}$.

Theorem 4 If ${f\in\mathcal{S}}$ then ${M(f,s)\zeta(s)}$ extends to a meromorphic function with only simple poles at ${0}$ and ${1}$ of residue ${-f(0)}$ and ${\hat f(0)}$ respectively. The functional equation

 $\displaystyle M(f,s)\zeta(s)=M(\hat f,1-s)\zeta(1-s)$ (11)

holds everywhere.

A proof of this will be given further down. It can be shown that the specific case ${f(x)=e^{-\pi x^2}}$ is equal to its own Fourier transform, ${\hat f=f}$. So, using expression (10) for the Mellin transform ${M(f,s)}$, we see that Riemann’s functional equation follows directly from (11).

Above, we discussed how Riemann’s functional equation allows values of ${\zeta(s)}$ to be determined on the left half-plane ${\Re(s)\le0}$ and restricts the locations of its zeros to be as described in lemma 2. This made use of properties of the gamma function, specifically the locations of its poles and the fact that it has no zeros. These arguments can be made by instead using version (11) of the functional equation, and the gamma function need not be referred to at all. For an arbitrary smooth function ${f}$, theorem 3 gives the poles of ${M(f,s)}$. Also, by choosing ${f}$ with compact support in ${{\mathbb R}^*}$, ${M(f,s)}$ will be well-defined everywhere by (9) and is analytic. It is also easy to choose ${f}$ such that the Mellin transform does not vanish at any specified point, which is enough to apply the arguments above.

I now briefly consider the relation between the functional equation in the form (11) and the ideas of John Tate’s 1950 thesis. Theorem 4 can be viewed primarily as relating the Mellin transform of the Fourier transform to the Mellin transform of the original function. The zeta function plays more of an ancillary role as a multiplicative factor in this identity. This treatment of the Mellin transform as the primary object of interest was taken much further in Tate’s thesis. Tate refocussed attention from the rational and real number fields (or algebraic number field) to the larger ring of adeles, ${\mathbb A}$. This is outside of the scope of this post, but the important properties are that, just like the embedding of the rationals inside the reals, ${{\mathbb Q}}$ embeds in ${\mathbb A}$, and the theory of Fourier and Mellin transforms extends to functions defined on the adeles. In the adelic case, he obtained the functional equation

 $\displaystyle M(f,s)=M(\hat f,1-s).$ (12)

Now, the zeta function does not appear at all! Digging a bit deeper, the ring of adeles over the rational numbers can be expressed as a product of the reals and the ring of ‘finite’ adeles,

 $\displaystyle \mathbb A ={\mathbb R}\times\mathbb A_f.$

That is, every element ${x}$ of the adelic ring can be expressed as a pair ${(x_\infty,x_f)}$ consisting of a real number ${x_\infty}$ and a finite adele ${x_f}$. For a function ${f\colon\mathbb A\rightarrow{\mathbb C}}$ which is a product of the real and finite parts, ${f(x)=g(x_\infty)h(x_f)}$, the Mellin and Fourier transforms are also products of the transforms of the individual components.

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle \hat f(x)=\hat g(x_\infty)\hat h(x_f),\smallskip\\ &\displaystyle M(f,s)=M(g,s)M(h,s). \end{array}$

Applying this to the adelic functional equation (12),

 $\displaystyle M(g,s)M(h,s)=M(\hat g,1-s)M(\hat h,1-s).$

Just as the special case ${g(x)=e^{-\pi x^2}}$ led to the gamma factor in Riemann’s functional equation, choosing a particular example for the function ${h}$ on the finite adeles which equals its own Fourier transform leads to the appearance of the zeta function in (5) and (11). This places the gamma term and the zeta term in the functional equation on a roughly equal footing.

We can go further. The finite adeles can be broken down into a restricted product of fields corresponding to each prime number — the ${p}$-adic numbers,

 $\displaystyle \mathbb A_f = {\prod_p}^\prime{\mathbb Q}_p.$

In the case where the function ${h(x_f)}$ factors into a product of functions on the ${p}$-adic components, ${h(x_f)=\prod_ph_p(x_p)}$, then the Mellin transform commutes with this factorisation,

 $\displaystyle M(h,s)=\prod_pM(h_p,s).$

For a particular choice of ${h}$, specifically the indicator function of the integral adeles, this is just the Euler product described above. From this viewpoint, the gamma factor in Riemann’s functional equation, the Mellin transform appearing in (11), the Riemann zeta function, and the ${(1-p^{-s})^{-1}}$ terms in the Euler product, are all just manifestations of factorizations of the Mellin transform on the adeles, which itself satisfies the functional equation (12).

Finally, we note that there is an intimate connection between additive and multiplicative structures pervading the above discussion. The natural numbers, which are generated under addition by the unit element ${1}$, are also generated under multiplication by the prime numbers. This is reflected in the definition of the zeta function as a sum over the natural numbers (1), which is equivalent to the multiplicative definition given by the Euler product (2). Then, the functional equation (11) ties together the additive Fourier transform over ${{\mathbb R}}$ with the multiplicative Mellin transform over ${{\mathbb R}^*}$.

Elementary Inequalities

Above, I stated a few results but, now, let’s move on and actually prove a few things regarding the Riemann zeta function. As this post is not assuming any prior understanding of ${\zeta(s)}$, I start at a very basic level and will derive a few elementary inequalities. By elementary, I mean things which can be proved straight from the definition (1) of the zeta function. These will be rather basic and far from optimal — especially in the critical strip — but are easy to prove and give some understanding of what the zeta function looks like.

First, for any positive real ${s}$, the function ${x\mapsto x^{-s}}$ is decreasing, giving

 $\displaystyle (n+1)^{-s} < x^{-s} < n^{-s}$

for any positive integer ${n}$ with ${n < x < n +1}$. Integrating over ${x\ge1}$ and substituting in the definition of ${\zeta(s)}$ for ${s > 1}$,

 $\displaystyle \zeta(s)-1=\sum_{n=1}^\infty(n+1)^{-s} < \int_1^\infty x^{-s}\,dx < \sum_{n=1}^\infty n^{-s}=\zeta(s).$

Substituting in the value ${1/(s-1)}$ for the integral gives the following bounds.

Lemma 5 The sum (1) converges absolutely at all real ${s > 1}$, and satisfies the bound

 $\displaystyle \frac1{s-1} < \zeta(s) < \frac s{s-1}.$ (13)

In particular, ${\zeta(s)\sim1/(s-1)}$ as ${s}$ approaches ${1}$ from above.
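
To see how tight the bounds (13) are, here is a small Python sketch that approximates ${\zeta(s)}$ for real ${s > 1}$. The approximation scheme is my own choice rather than anything used in the proof: it sums the first terms directly and estimates the tail by its integral plus half of the first omitted term.

```python
def zeta_real(s, cutoff=1000):
    """Approximate zeta(s) for real s > 1: partial sum plus a tail estimate."""
    head = sum(n ** (-s) for n in range(1, cutoff))
    # Tail sum over n >= cutoff, estimated by the integral of x^(-s)
    # from cutoff to infinity plus half of the term at n = cutoff
    tail = cutoff ** (1 - s) / (s - 1) + 0.5 * cutoff ** (-s)
    return head + tail

def satisfies_lemma_5(s):
    """Check the two-sided bound (13) at a real s > 1."""
    return 1 / (s - 1) < zeta_real(s) < s / (s - 1)
```

Near ${s=1}$ the two bounds differ by exactly ${1}$ while both are of size ${1/(s-1)}$, which is why they pin down the asymptotic behaviour.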

Moving on to ${s\in{\mathbb C}}$, we can use the identity ${\lvert x^s\rvert=x^\sigma}$, where ${\sigma}$ is the real part of ${s}$, to write,

 $\displaystyle \left\lvert\zeta(s)-1\right\rvert\le\sum_{n=2}^\infty\lvert n^{-s}\rvert =\sum_{n=2}^\infty n^{-\sigma}.$

Comparing the right hand side with the definition of ${\zeta}$, we get,

Lemma 6 The sum (1) converges absolutely at all ${s\in{\mathbb C}}$ with ${\Re(s) > 1}$ and satisfies the bound

 $\displaystyle \lvert\zeta(s)-1\rvert\le\zeta(\sigma)-1 < \frac1{\sigma-1}$

where ${\sigma=\Re(s)}$.

The right-hand inequality here is just an application of (13). In particular, lemma 6 implies that ${\zeta(s)}$ is uniformly bounded on the half-plane ${\Re(s)\ge\sigma_0}$, any ${\sigma_0 > 1}$, with the bound ${\sigma_0/(\sigma_0-1)}$. It also shows that ${\zeta(s)\rightarrow1}$ uniformly as ${\Re(s)\rightarrow\infty}$.

Next, the Euler product expansion (2) can be utilized to show that ${\zeta}$ has no zeros on the open right half-plane ${\Re(s) > 1}$. Applying the inequality ${\lvert 1-x\rvert^{-1} > 1-\lvert x\rvert}$, which applies for all ${0 < \lvert x\rvert < 1}$,

 $\displaystyle \lvert\zeta(s)\rvert > \prod_p(1-\lvert p^{-s}\rvert)=\prod_p(1-p^{-\sigma})$

with ${\sigma=\Re(s)}$. Noting that the right hand side is just the reciprocal of the Euler product of ${\zeta(\sigma)}$ gives a lower bound.

Lemma 7 The zeta function ${\zeta(s)}$ is nonzero everywhere on the domain ${\Re(s) > 1}$ and satisfies the lower bound,

 $\displaystyle \lvert\zeta(s)\rvert > \zeta(\sigma)^{-1} > 1-\sigma^{-1}$ (14)

with ${\sigma=\Re(s)}$.

The right-hand inequality here is another direct application of (13). Lemma 7 shows that ${\zeta(s)}$ is uniformly bounded away from zero on the half-plane ${\Re(s)\ge\sigma_0}$, for any ${\sigma_0 > 1}$.

Expression (3) for the logarithm of the zeta function can be used to obtain further bounds. On the half-plane ${\Re(s)=\sigma > 1}$, we use ${\lvert p^{-ks}\rvert=p^{-k\sigma}}$,

 $\displaystyle \lvert\log\zeta(s)\rvert\le\sum_p\sum_{k=1}^\infty\frac{p^{-k\sigma}}{k}=\log\zeta(\sigma).$

So, we have obtained the following.

Lemma 8 The logarithm of the zeta function over ${\Re(s) = \sigma > 1}$ satisfies the bound

 $\displaystyle \lvert\log\zeta(s)\rvert\le\log\zeta(\sigma) < \log(1-\sigma^{-1})^{-1}.$

Applying this bound to ${-\log\lvert\zeta(s)\rvert}$ gives (14) as a special case.

Using ${\lfloor\cdot\rfloor}$ to denote the floor function, we can use the equality ${\lfloor x\rfloor^{-s}=n^{-s}}$ for ${n\le x < n+1}$ and each positive integer ${n}$ to rewrite the summation (1) as an integral

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \zeta(s)&\displaystyle=\int_1^\infty\lfloor x\rfloor^{-s}\,dx\smallskip\\ &\displaystyle=\int_1^\infty x^{-s}\,dx +\int_1^\infty(\lfloor x\rfloor^{-s}-x^{-s})\,dx. \end{array}$

Substituting in the value ${1/(s-1)}$ for the first integral on the right hand side gives the following expression for the zeta function,

 $\displaystyle \zeta(s)=\frac1{s-1}+\int_1^\infty\left(\lfloor x\rfloor^{-s}-x^{-s}\right)\,dx.$ (15)

The idea is that the integrand here is small in comparison to ${x^{-s}}$, so that we can expect it to converge on a larger domain than the sum (1). In fact, as we will show, it is absolutely integrable on ${\Re(s) > 0}$. As uniform limits of analytic functions are analytic, this will extend ${\zeta(s)-1/(s-1)}$ to an analytic function on this domain.

To bound the integrand in (15), note that ${x^{-s}}$ has the derivative ${-sx^{-s-1}}$ with respect to ${x}$. Using ${\sigma=\Re(s)}$, this has norm ${\lvert s\rvert x^{-\sigma-1}}$ and, as it is decreasing in ${x}$, is bounded above by ${\lvert s\rvert\lfloor x\rfloor^{-\sigma-1}}$. Hence, the mean value theorem gives the inequality

 $\displaystyle \left\lvert\lfloor x\rfloor^{-s}-x^{-s}\right\rvert\le\lvert s\rvert\lfloor x\rfloor^{-\sigma-1}(x-\lfloor x\rfloor),$

which will be strict whenever ${x}$ is not an integer. For any positive integer ${n}$, we have ${\lfloor x\rfloor=n}$ on the interval ${[n,n+1)}$ and,

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \int_n^{n+1}\left\lvert\lfloor x\rfloor^{-s}-x^{-s}\right\rvert\,dx &\displaystyle < \lvert s\rvert n^{-\sigma-1}\int_n^{n+1}(x-n)\,dx\smallskip\\ &\displaystyle=\frac12\lvert s\rvert n^{-\sigma-1}. \end{array}$

Summing over ${n}$ and comparing with the definition of ${\zeta(\sigma+1)}$ gives a finite value, so we have proved the following.

Lemma 9 The zeta function ${\zeta(s)}$ extends to a meromorphic function on ${\Re(s) > 0}$ with a simple pole of residue ${1}$ at ${s=1}$, and is given by the absolutely convergent integral (15). Furthermore, on this domain, it satisfies the bound

 $\displaystyle \left\lvert\zeta(s)-\frac1{s-1}\right\rvert < \frac12\lvert s\rvert\zeta(\sigma+1) < \frac12\lvert s\rvert(1+\sigma^{-1})$ (16)

with ${\sigma=\Re(s)}$.

The final inequality here is yet another application of (13). So, we have a bound for ${\zeta(s)}$ of size ${O(\lvert s\rvert)}$ on the right half-plane ${\Re(s)\ge\sigma_0}$, for any ${\sigma_0 > 0}$. This is by no means optimal, and it can be improved to ${O(\lvert s\rvert^{1/2})}$, with even better bounds given by the — as yet — unproven Lindelöf hypothesis.

Applying inequality (16) for real ${s}$ in the interval ${0 < s < 1}$ shows that ${\zeta(s)}$ does not vanish,

 $\displaystyle 2\zeta(s) < \frac2{s-1}+s(1+s^{-1})=\frac{-1-s^2}{1-s}.$

Corollary 10 ${\zeta(s) < -1/2}$, so is nonzero, on the real line segment ${0 < s < 1}$.

Interestingly, this bound is optimal, as ${\zeta(0)=-1/2}$.
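
Expression (15) can also be used to compute ${\zeta(s)}$ inside the critical strip, at least crudely. The sketch below handles real ${0 < s < 1}$ only and truncates the integral at a finite `cutoff`, which is an arbitrary choice of mine. By the estimate used to prove lemma 9, the truncation error is at most ${\frac12{\rm cutoff}^{-s}}$, so convergence is slow for small ${s}$.

```python
def zeta_via_integral(s, cutoff=200_000):
    """Approximate zeta(s) for real 0 < s < 1 by truncating the integral (15).

    The integral is taken over [1, cutoff + 1) instead of [1, infinity).
    """
    total = 1.0 / (s - 1.0)
    for n in range(1, cutoff + 1):
        # On [n, n+1) the floor of x is n, so the integrand integrates exactly to
        # n^(-s) - ((n+1)^(1-s) - n^(1-s)) / (1 - s)
        total += n ** (-s) - ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s)
    return total
```

At ${s=1/2}$ this gives a value close to ${-1.46}$, comfortably below the bound ${-1/2}$ of corollary 10.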

Elementary Extension of the Zeta Function

I will now describe an elementary method of analytically continuing the Riemann zeta function to the entire complex plane. Nothing in this section is required for the results discussed above, so can be skipped if required. The reason for including it here is to gain an intuitive understanding of why the definition (1) of ${\zeta(s)}$ given on the half-plane ${\Re(s) > 1}$ should continue to all of ${{\mathbb C}}$, without using any ‘magic’ formulas such as Poisson summation or the functional equation. Instead, we can use the Euler-Maclaurin formula. Rather than just stating and applying this equation, I will derive it, as it is straightforward to do and gives a better understanding of why the zeta function necessarily extends to the complex plane.

We can apply a similar idea to that which was used to express ${\zeta(s)}$ over ${\Re(s) > 0}$ by identity (15). To do this in more generality, I will look at a sum ${\sum_nf(n)}$ for a smooth function ${f}$. Some assumptions will be required on ${f}$ in order that the sums and integrals converge, so suppose that it is smooth and that its derivatives to all orders are integrable over ${[1,\infty)}$. This is the case for the Riemann zeta function where ${f(x)=x^{-s}}$.

For a differentiable function ${u}$ defined on the interval ${[0,1]}$, an integration by parts gives

 $\displaystyle u(0)=\int\limits_0^1u(x)\,dx+(1+c)(u(0)-u(1))+\int\limits_0^1(x+c)u^\prime(x)\,dx,$

for any constant ${c}$. The idea is to replace ${u(x)}$ by ${f(n+x)}$ in this identity and sum over ${n}$. In order that ${x+c}$ has average value ${0}$ over the unit interval, we will take ${c=-1/2}$. So, setting ${p_1(x)=x-1/2}$,

 $\displaystyle \sum_{n=1}^\infty f(n)=\int\limits_1^\infty f(x)\,dx+p_1(1)f(1)+\int\limits_1^\infty p_1(\{x\})f^\prime(x)\,dx$ (17)

with ${\{x\}}$ denoting the fractional part of ${x}$. That is, ${\{x\}=x-n}$ on the interval ${n\le x < n+1}$. The hope here is that ${f^\prime}$ is sufficiently smaller than ${f}$, so that the right-hand integral converges even when the sum on the left diverges.

We take this a step further and express the integral over ${f^\prime}$ as an integral over ${f^{\prime\prime}}$. Again, consider a function ${u}$ defined on the unit interval and, choosing ${p_2}$ to be the integral of ${p_1}$, another integration by parts gives

 $\displaystyle \int\limits_0^1p_1(x)u(x)\,dx=p_2(1)u(1)-p_2(0)u(0)-\int\limits_0^1p_2(x)u^\prime(x)\,dx.$

As ${p_1}$ has zero integral, ${p_2(1)=p_2(0)}$, so replacing ${u(x)}$ by ${f^\prime(x+n)}$ and summing over ${n}$,

 $\displaystyle \int\limits_1^\infty p_1(\{x\})f^\prime(x)\,dx=-p_2(1)f^{\prime}(1)-\int\limits_1^\infty p_2(\{x\})f^{\prime\prime}(x)\,dx.$

Again, ${p_2}$ is only defined up to an arbitrary constant, so can be chosen to have zero integral over the unit interval.

We repeat this procedure ${r}$ times and substitute into (17),

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \sum_{n=1}^\infty f(n)=&\displaystyle\int\limits_1^\infty f(x)\,dx+p_1(1)f(1)-p_2(1)f^\prime(1)+\cdots\smallskip\\ &\displaystyle\quad-(-1)^rp_r(1)f^{(r-1)}(1)-(-1)^r\int\limits_1^\infty p_r(\{x\})f^{(r)}(x)\,dx. \end{array}$ (18)

Here, ${p_{k+1}}$ is defined as the integral of ${p_k}$ with constant of integration chosen such that it has zero integral over the unit interval,

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle p_1(x)=x-\frac12,\smallskip\\ &\displaystyle p_{k+1}(x)=\int_0^1yp_k(y)\,dy-\int_x^1p_k(y)\,dy. \end{array}$

From this definition, it can be seen that ${p_k(x)=B_k(x)/k!}$ where ${B_k(x)}$ are the Bernoulli polynomials, ${p_k(1)=B_k/k!}$ for Bernoulli numbers ${B_k}$ (using the convention ${B_1=1/2}$), and (18) is the Euler-Maclaurin formula.
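The recursion for ${p_k}$ is easy to run with exact rational arithmetic. The sketch below stores each polynomial as a list of coefficients, lowest degree first, and applies the definition above directly. The function names `next_p` and `evaluate` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

def next_p(p):
    """Given the coefficients of p_k, return those of p_{k+1}.

    Implements p_{k+1}(x) = int_0^1 y p_k(y) dy - int_x^1 p_k(y) dy.
    """
    # Antiderivative P of p_k, normalised so that P(0) = 0.
    antideriv = [Fraction(0)] + [c / (i + 1) for i, c in enumerate(p)]
    # int_0^1 y p_k(y) dy, computed term by term.
    first_moment = sum(c / (i + 2) for i, c in enumerate(p))
    # int_x^1 p_k = P(1) - P(x), so p_{k+1}(x) = first_moment - P(1) + P(x).
    antideriv[0] = first_moment - sum(antideriv)
    return antideriv

def evaluate(p, x):
    """Evaluate the polynomial with coefficient list p at x."""
    return sum(c * x ** i for i, c in enumerate(p))

p = [Fraction(-1, 2), Fraction(1)]  # p_1(x) = x - 1/2
values_at_one = {1: evaluate(p, 1)}
for k in range(2, 7):
    p = next_p(p)
    values_at_one[k] = evaluate(p, 1)

for k, v in values_at_one.items():
    print(k, v)
```

The values at ${x=1}$ can be compared with ${B_k/k!}$, for example ${B_2/2!=1/12}$ and ${B_4/4!=-1/720}$.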

In particular, the derivatives of ${f(x)=x^{-s}}$ can be computed as

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle f^{(k)}(x)&\displaystyle=(-1)^ks(s+1)\cdots(s+k-1)x^{-s-k}\smallskip\\ &\displaystyle=(-1)^ks^{\overline k}x^{-s-k} \end{array}$

with ${s^{\overline k}}$ denoting the rising factorial, which is just a polynomial in ${s}$.

Taking ${f(x)=x^{-s}}$ for ${\Re(s) > 1}$, the left hand side of identity (18) is ${\zeta(s)}$ by definition (1), and the first integral on the right is ${1/(s-1)}$. Substituting the derivatives of ${f}$ computed above,

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \zeta(s)=&\displaystyle\frac1{s-1}+p_1(1)s^{\overline 0}+p_2(1)s^{\overline 1}+\cdots\smallskip\\ &\displaystyle\quad+p_r(1)s^{\overline{r-1}}-s^{\overline r}\int\limits_1^\infty p_r(\{x\})x^{-s-r}\,dx. \end{array}$ (19)

As the integral on the right converges absolutely on ${\Re(s) > 1-r}$, and ${r}$ is an arbitrary positive integer, we have the analytic extension.

Theorem 11 The function ${\zeta(s)}$ defined on ${\Re(s) > 1}$ by (1) continues to a meromorphic function on ${{\mathbb C}}$ with a single simple pole at ${s=1}$ of residue ${1}$.

Furthermore, the integral on the right of (19) is uniformly bounded over ${\Re(s)\ge\alpha}$, for any ${\alpha > 1-r}$, and the ${s^{\overline k}}$ terms are polynomials, so we have the following bound on the growth of the zeta function.

Lemma 12 For every real number ${\alpha}$, there exists an ${A}$ such that

 $\displaystyle \zeta(s)=O\left(\lvert s\rvert^A\right)$

over ${\Re(s)\ge\alpha}$, as ${\lvert s\rvert\rightarrow\infty}$.

I purposefully did not put in any specific value for ${A}$ here, as the point is that the zeta function is polynomially bounded on each right half-plane and, in any case, there are sharper values available from applying the functional equation.
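Identity (19) also gives exact values of the continued zeta function at the non-positive integers. For ${s=-m}$ with integer ${m\ge0}$, taking ${r=m+2}$ keeps the integral convergent, since ${\Re(s)+r > 1}$, while the prefactor ${s^{\overline r}}$ contains the factor ${s+m=0}$, so the integral term drops out. The following Python sketch evaluates the remaining finite sum exactly. The names `bernoulli_plus`, `rising` and `zeta_at_minus` are hypothetical helpers, and the Bernoulli numbers are computed from a standard recurrence in the ${B_1=1/2}$ convention rather than from anything stated in this post.

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_plus(m):
    """Bernoulli numbers B_0..B_m with the convention B_1 = +1/2."""
    B = []
    for k in range(m + 1):
        # sum_{j=0}^{k} C(k+1, j) B_j = k + 1 determines B_k from earlier values.
        partial = sum(comb(k + 1, j) * B[j] for j in range(k))
        B.append((Fraction(k + 1) - partial) / (k + 1))
    return B

def rising(s, k):
    """Rising factorial s (s+1) ... (s+k-1)."""
    out = Fraction(1)
    for j in range(k):
        out *= s + j
    return out

def zeta_at_minus(m):
    """zeta(-m) for integer m >= 0, from (19) with r = m + 2."""
    s, r = -m, m + 2
    B = bernoulli_plus(r)
    total = Fraction(1, s - 1)
    for k in range(1, r + 1):
        # p_k(1) = B_k / k! multiplies the rising factorial of order k - 1.
        total += B[k] / factorial(k) * rising(s, k - 1)
    # The integral term is multiplied by the rising factorial of order r,
    # which vanishes here because it contains the factor s + m = 0.
    assert rising(s, r) == 0
    return total

for m in range(6):
    print(m, zeta_at_minus(m))
```

For instance, ${m=1}$ gives ${-\frac12+\frac12-\frac1{12}=-\frac1{12}}$.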

Mellin Transforms

We show that the Mellin transform of a Schwartz function ${f\in\mathcal S}$ can be continued from the region ${\Re(s) > 0}$ to the entire complex plane, proving theorem 3. Choosing a positive integer ${N}$, write

 $\displaystyle R_f(x) = f(x)-1_{\{\lvert x\rvert < 1\}}\sum_{n=0}^{N-1} \frac{f^{(n)}(0)}{n!}x^n.$

This is bounded, and for ${\lvert x\rvert < 1}$ is just the remainder term in the Taylor polynomial approximation of ${f}$. By Taylor’s theorem, ${R_f(x)=O(\lvert x\rvert^N)}$ as ${x}$ approaches ${0}$. So, ${R_f(x)\lvert x\rvert^{s-1}}$ is absolutely integrable on ${\Re(s) > -N}$ and, hence, ${M(R_f,s)}$ is a well-defined analytic function on this domain. The transform of the polynomial term can be computed,

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle M(1_{\{\lvert x\rvert < 1\}}x^n,s) &\displaystyle= \int_{-1}^1 x^n\lvert x\rvert^{s-1}\,dx\smallskip\\ &\displaystyle= (1+(-1)^n)\int_0^1x^{s+n-1}\,dx\smallskip\\ &\displaystyle=\frac{1+(-1)^n}{s+n}. \end{array}$

The Mellin transform of ${f}$ is then,

 $\displaystyle M(f,s)=M(R_f,s)+2\sum_{n=0}^{N-1}1_{\{n{\rm\ is\ even}\}}\frac{f^{(n)}(0)}{n!}\frac1{s+n}.$

This statement holds for ${\Re(s) > 0}$ but, as the right hand side is a well-defined meromorphic function on ${\Re(s) > -N}$, it extends ${M(f,s)}$ to a meromorphic function on this domain. The poles arise from the ${1/(s+n)}$ terms with the residue stated in theorem 3. By choosing ${N}$ arbitrarily large, we have the extension to the complex plane.
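To see the continuation in action, consider the Gaussian ${f(x)=e^{-\pi x^2}}$, for which ${M(f,s)=\int f(x)\lvert x\rvert^{s-1}\,dx=\pi^{-s/2}\Gamma(s/2)}$ on ${\Re(s) > 0}$. The sketch below, in Python, takes ${N=2}$ and ${s=-1}$, evaluates ${M(R_f,s)}$ by numerical integration, adds the pole term, and compares the result with the closed form. The quadrature helper `simpson`, the cutoff at ${x=8}$, and the choice of test point are assumptions made for illustration only.

```python
import math

def simpson(g, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def f(x):
    return math.exp(-math.pi * x * x)

s, N = -1.0, 2

# With N = 2 the Taylor polynomial is the constant f(0) = 1, as f'(0) = 0.
# So R_f(x) = f(x) - 1 for |x| < 1 and R_f(x) = f(x) for |x| >= 1.
def inner(x):
    # R_f(x) |x|^(s-1) on [0, 1]; expm1 avoids cancellation near 0.
    if x == 0.0:
        return -math.pi  # limit of (exp(-pi x^2) - 1) / x^2, valid for s = -1
    return math.expm1(-math.pi * x * x) * x ** (s - 1)

def outer(x):
    return f(x) * x ** (s - 1)

# R_f is even, so the integral over the reals is twice that over x > 0.
# The tail beyond x = 8 is far below double precision.
M_R = 2 * (simpson(inner, 0.0, 1.0) + simpson(outer, 1.0, 8.0))

# Pole terms: only n = 0 is even in the range 0 <= n < N.
continued = M_R + 2 * f(0.0) / s
closed_form = math.pi ** (-s / 2) * math.gamma(s / 2)
print(continued, closed_form)
```

Both numbers should be close to ${-2\pi}$, since ${\Gamma(-1/2)=-2\sqrt\pi}$.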

Poisson Summation

The second proof of the functional equation given by Riemann in his 1859 paper made use of the following identity of Jacobi,

 $\displaystyle 2\sum_{n=1}^\infty e^{-n^2\pi x}+1=x^{-\frac12}\left(2\sum_{n=1}^\infty e^{-n^2\pi/x}+1\right).$

For the Mellin transform version of the functional equation, we make use of the Poisson summation formula. To avoid having to explicitly write limits everywhere, the notation ${\sum_n}$ is used to denote the sum as ${n}$ ranges over the integers ${{\mathbb Z}}$.

Theorem 13 If ${f\in \mathcal S}$ has Fourier transform ${\hat f}$ then,

 $\displaystyle \sum_n f(n)=\sum_n\hat f(n).$ (20)

Jacobi’s identity is just a special case of this using ${f(u)=e^{-u^2\pi x}}$. The Poisson summation formula can be proved using Fourier series. The idea is that, for any Schwartz function ${f}$, we can define a periodic ${g\colon{\mathbb R}\rightarrow{\mathbb C}}$ by

 $\displaystyle g(x)=\sum_nf(x+n)$ (21)

Since ${f}$ and its derivatives vanish rapidly at ${\infty}$, this sum is uniformly convergent, with smooth limit. Writing out its Fourier expansion,

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle g(x)=\sum_nc_ne^{2\pi inx}\smallskip\\ &\displaystyle c_n=\int_0^1 g(x)e^{-2\pi i nx}\,dx, \end{array}$

the Fourier coefficients can be evaluated,

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle c_n &\displaystyle=\sum_m\int_0^1f(x+m)e^{-2\pi i n x}dx\smallskip\\ &\displaystyle=\sum_m\int_m^{m+1}f(x)e^{-2\pi i n x}\,dx\smallskip\\ &\displaystyle=\int f(x)e^{-2\pi i nx}\,dx=\hat f(n). \end{array}$

Evaluating (21) and the Fourier expansion of ${g}$ at ${x=0}$ proves theorem 13,

 $\displaystyle \sum_nf(n)=g(0)=\sum_n c_n=\sum_n\hat f(n).$

In practice, it is convenient to express the Poisson summation formula in a slightly more general way. For each fixed ${x\in{\mathbb R}^*}$, the Fourier transform of ${y\mapsto f(yx)}$ is equal to ${\lvert x\rvert^{-1}\hat f(y/x)}$ and putting this in (20) gives the following alternative statement of Poisson summation.

Theorem 14 If ${f\in \mathcal S}$ has Fourier transform ${\hat f}$ then, for any ${x\in{\mathbb R}^*}$,

 $\displaystyle \sum_n f(nx)=\frac1{\lvert x\rvert}\sum_n\hat f(n/x)$ (22)
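Identity (22) can be checked numerically for the Gaussian ${f(u)=e^{-\pi u^2}}$, which is its own Fourier transform. This is Jacobi's identity with ${x}$ replaced by ${x^2}$. The following Python sketch compares the two sides for a few values of ${x}$. The function name `theta_sum` and the truncation at 50 terms are illustrative assumptions.

```python
import math

def theta_sum(x, terms=50):
    """Sum over integers n of exp(-pi (n x)^2), truncated to |n| <= terms."""
    positive = sum(math.exp(-math.pi * (n * x) ** 2) for n in range(1, terms + 1))
    return 1.0 + 2.0 * positive

for x in (0.5, 1.0, 2.0, 3.7):
    lhs = theta_sum(x)                  # sum_n f(n x)
    rhs = theta_sum(1.0 / x) / abs(x)   # |x|^{-1} sum_n fhat(n / x), with fhat = f
    print(x, lhs, rhs)
```

For small ${x}$ the left hand side converges slowly while the right hand side is dominated by its ${n=0}$ term, which is what makes the identity useful in practice.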

The Functional Equation

The proof of the functional equation starts with the following identity

 $\displaystyle \int f(nx)\lvert x\rvert^s\,d^*x = \lvert n\rvert^{-s}\int f(x)\lvert x\rvert^s\,d^*x.$

Here, ${n}$ is any nonzero integer, ${f}$ is a Schwartz function on the reals, ${\Re(s) > 0}$, and ${d^*x=dx/\lvert x\rvert}$ represents the Haar measure on the multiplicative group ${{\mathbb R}^*}$. The identity is achieved simply by substituting ${x}$ with ${x/n}$. Restricting to ${\Re(s) > 1}$, we can sum over ${n}$,

 $\displaystyle \int \sum_{n\not=0}f(nx)\lvert x\rvert^s\,d^*x = 2M(f,s)\zeta(s).$ (23)

What we would really like to do here is to simply substitute in (22) for the sum of ${f(nx)}$, substitute ${x}$ by ${x^{-1}}$ in the integral, and immediately derive the functional equation (11). Unfortunately this leads to divergent sums and integrals. Instead, start by rearranging (22) as

 $\displaystyle \sum_{n\not=0}f(nx)=\frac1{\lvert x\rvert}\sum_{n\not=0}\hat f(n/x) + \frac1{\lvert x\rvert}\hat f(0)-f(0).$

We will apply this to the integrand in (23), but only over the range with ${\lvert x\rvert < 1}$.

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \int\limits_{\lvert x\rvert < 1}\sum_{n\not=0}f(nx)\lvert x\rvert^s d^*x&\displaystyle=\int\limits_{\lvert x\rvert < 1}\left(\sum_{n\not=0}\hat f(n/x)\lvert x\rvert^{s-1}+\hat f(0)\lvert x\rvert^{s-1}-f(0)\lvert x\rvert^s\right)\,d^*x\smallskip\\ &\displaystyle=\int\limits_{\lvert x\rvert > 1}\sum_{n\not=0}\hat f(nx)\lvert x\rvert^{1-s}\,d^*x+2\frac{\hat f(0)}{s-1}-2\frac{f(0)}{s} \end{array}$

Here, we substituted ${x^{-1}}$ for ${x}$ in the first term on the right hand side, and used the exact value for the integral in the other two terms. Using this in (23),

 $\displaystyle \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle M(f,s)\zeta(s)=&\displaystyle\frac12\int\limits_{\lvert x\rvert > 1}\sum_{n\not=0}\left(f(nx)\lvert x\rvert^s+\hat f(nx)\lvert x\rvert^{1-s}\right)\,d^*x\smallskip\\ &\displaystyle\qquad+\frac{\hat f(0)}{s-1}-\frac{f(0)}{s}. \end{array}$ (24)

As ${f\in\mathcal S}$ vanishes faster than any power of ${x}$ at infinity, the sum ${\sum_{n\not=0}f(nx)}$ is absolutely convergent and also vanishes faster than any power of ${x}$. The same statement holds for ${\hat f}$, so the integral in (24) is defined for all ${s\in{\mathbb C}}$ and is analytic. This extends ${M(f,s)\zeta(s)}$ to a meromorphic function on the complex plane with poles and residues as stated in theorem 4. Finally, noting that the Fourier transform of ${\hat f(x)}$ is ${f(-x)}$, the right hand side of (24) is unchanged if ${s}$ is replaced by ${1-s}$ and ${f}$ is replaced by ${\hat f}$, proving the functional equation (11).
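As an illustration, the right hand side of (24) can be evaluated numerically for the Gaussian ${f(x)=e^{-\pi x^2}}$, which satisfies ${\hat f=f}$ and ${M(f,s)=\pi^{-s/2}\Gamma(s/2)}$. The Python sketch below checks the value at ${s=2}$ against ${\pi^{-1}\Gamma(1)\zeta(2)=\pi/6}$, and then reads off ${\zeta(-1)}$ from the same expression at ${s=-1}$. The helper names, the truncation of the sum at `n_max`, and the integration cutoff are assumptions chosen for this example.

```python
import math

def simpson(g, a, b, n=4000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def rhs_24(s, n_max=6, cutoff=6.0):
    """Right hand side of (24) for f(x) = exp(-pi x^2), where fhat = f."""
    def integrand(x):
        theta = sum(math.exp(-math.pi * (n * x) ** 2) for n in range(1, n_max + 1))
        # Summing over n != 0 and integrating over |x| > 1 each double the
        # contribution of positive n and x; with the factor 1/2 this leaves 2.
        # The measure d*x = dx / x lowers each exponent by one.
        return 2.0 * theta * (x ** (s - 1) + x ** (-s))
    # Here fhat(0) = f(0) = 1.
    return simpson(integrand, 1.0, cutoff) + 1.0 / (s - 1) - 1.0 / s

def mellin_gaussian(s):
    """M(f, s) = pi^(-s/2) Gamma(s/2) for the Gaussian."""
    return math.pi ** (-s / 2) * math.gamma(s / 2)

print(rhs_24(2.0), math.pi / 6)          # M(f,2) zeta(2) = pi / 6
zeta_minus_one = rhs_24(-1.0) / mellin_gaussian(-1.0)
print(zeta_minus_one)                    # compare with -1/12
```

Because ${\hat f=f}$ here, the expression is visibly symmetric under ${s\mapsto1-s}$, so the values at ${s=2}$ and ${s=-1}$ coincide.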

A Note on Mellin Transforms

The Mellin transform was defined above as an integral over the real numbers (9), which deviates slightly from the more usual definition as an integral over the positive reals,

 $\displaystyle M_{{\mathbb R}^+}(f,s)=\int_0^\infty f(x) x^{s-1}\,dx.$

The reason for using the alternative definition is that we were interested in the functional equation relating it to the Fourier transform defined over the reals, so also required the Mellin transform to be defined with the same domain of integration. However, in doing so, we lose some properties, such as the existence of an inversion formula which, for the usual Mellin transform, is

 $\displaystyle f(x)=\frac1{2\pi i}\int\limits_{c-i\infty}^{c+i\infty}M_{{\mathbb R}^+}(f,s) x^{-s}\,ds,$

for any fixed ${c}$ in the domain where the Mellin transform is absolutely integrable.

The transform defined by (9) is unchanged if ${f(x)}$ is replaced by ${f(-x)}$, so is not one-to-one and cannot be inverted. The best that can be done is to recover the even part of ${f}$,

 $\displaystyle f(x)+f(-x)=\frac1{2\pi i}\int\limits_{c-i\infty}^{c+i\infty}M(f,s)\lvert x\rvert^{-s}\,ds.$

An explanation for the non-invertibility of the Mellin transform defined over ${{\mathbb R}^*}$ is that we did not consider the full set of characters. We only looked at characters of the form ${x\mapsto\lvert x\rvert^{s}}$ but, for example, this excludes the function ${{\rm sgn}(\cdot)}$ mapping positive reals to ${1}$ and negative reals to ${-1}$. More generally, for any ${s\in{\mathbb C}}$ and ${\epsilon=\pm1}$, a character ${\chi_{\epsilon,s}}$ can be defined by

 $\displaystyle \chi_{\epsilon,s}(x)=\begin{cases} \lvert x\rvert^s,&{\rm if\ }\epsilon=1,\\ {\rm sgn}(x)\lvert x\rvert^s,&{\rm if\ }\epsilon=-1. \end{cases}$ (25)

It is immediate that this is a continuous map from the nonzero reals to ${{\mathbb C}^*}$ satisfying ${\chi_{\epsilon,s}(xy)=\chi_{\epsilon,s}(x)\chi_{\epsilon,s}(y)}$. In fact, it can be shown that (25) gives the full set of characters on ${{\mathbb R}^*}$. The characters given by ${\epsilon=1}$, which we made use of in the discussion above, are precisely those which are trivial on the roots of unity ${\{\pm1\}}$ and are called unramified characters. Those given by ${\epsilon=-1}$ are called ramified.

The Mellin transform with respect to an arbitrary character ${\chi\colon{\mathbb R}^*\rightarrow{\mathbb C}^*}$ is

 $\displaystyle M(f,\chi)=\int_{-\infty}^\infty f(x)\chi(x)\,\frac{dx}{\lvert x\rvert}.$

The transform defined using the full set of characters can be inverted,

 $\displaystyle f(x)=\frac{1}{4\pi i}\int\limits_{c-i\infty}^{c+i\infty}\sum_{\epsilon=\pm1}M(f,\chi_{\epsilon,s})\chi_{\epsilon,s}(x)^{-1}\,ds.$

The argument given above, including the proof of the functional equation (11), could have been carried out with the full set of characters. In that case, the zeta function is replaced by

 $\displaystyle \zeta(\epsilon,s)=\frac12\sum_{n\not=0}\chi_{\epsilon,s}(n)^{-1}.$

For the unramified characters, ${\epsilon=1}$, this is the usual Riemann zeta function. For ramified characters, ${\zeta}$ equals ${0}$ and the functional equation reduces to the trivial statement ${0=0}$. So, there was nothing to be gained by including ramified characters in the discussion.