An Unexpected Quartic Solution

Many years ago, while in high school, I tried my hand at solving cubic and quartic equations. Although there are entirely systematic approaches, using Galois theory, this was not something that I was familiar with at the time. I had just heard that it was possible. Here, ‘solving’ means finding an expression for the roots of the polynomial in terms of its coefficients, involving the standard arithmetical operations of addition, subtraction, multiplication and division, as well as extracting square roots, cube roots, etc.

The solution for cubics went very well. In class one day, the teacher wrote a specific example of a quartic on the blackboard, and proceeded to solve it by reducing to two easy quadratics. His example worked so easily because the coefficients formed a palindrome. That is, they were the same when written in reverse order. As an example, consider the equation,

\displaystyle  x^4+2x^3-x^2+2x+1=0.

If we divide through by {x^2} then, with a little rearranging, this gives,

\displaystyle  (x+1/x)^2+2(x+1/x)-3=0.

As a quadratic in {x+1/x}, this is easily solved. One solution is {x+1/x=-3}. Multiplying by x and rearranging gives a new quadratic,

\displaystyle  x^2+3x+1=0.

By the standard formula for quadratics, we obtain

\displaystyle  x=(-3\pm\sqrt{5})/2.

It can be checked that this does give two real solutions to the original quartic.
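
As a quick numerical sanity check of the example above, here is a minimal sketch in Python. The function name `quartic` and the list `roots` are just illustrative choices; the code only evaluates the quartic at the two claimed roots.

```python
import math

def quartic(x):
    """The palindromic quartic x^4 + 2x^3 - x^2 + 2x + 1 from the example."""
    return x**4 + 2 * x**3 - x**2 + 2 * x + 1

# Roots of x^2 + 3x + 1 = 0, coming from the branch x + 1/x = -3.
roots = [(-3 + math.sqrt(5)) / 2, (-3 - math.sqrt(5)) / 2]

for r in roots:
    # Each root should satisfy x + 1/x = -3 and make the quartic vanish,
    # up to floating-point rounding.
    print(r, r + 1 / r, quartic(r))
```

The other branch of the quadratic in {x+1/x} is {x+1/x=1}, which leads to {x^2-x+1=0} and so contributes the two non-real roots.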

Now, the approach that I attempted for the general quartic was to apply a substitution in order to simplify it, so that a similar method can be applied. Unfortunately, this resulted in a very messy equation, which seemed to be giving a sextic. That is, I went from the original fourth order polynomial, to what was looking like a sixth order one. This was complicating the problem, and getting further away from the goal than where I had started. I am not sure why I did not give up at that point, but I continued. Then, something amazing happened. Computing the coefficients of the sixth, fifth and fourth powers in this sextic, they all vanished! In fact, I had succeeded in reducing the quartic to a cubic, which can be solved. This still seems surprising, that such a messy looking expression should cancel out like this, in just the way that was needed. See equation (2) below for what I am talking about. As this was such a surprise at the time, and is still so now, I have decided to write it up in this post. It just demonstrates that, even if something seems hopeless, if you continue regardless then everything might just fall into place.

Logical Consequence

The aim of these notes is to achieve a basic understanding of the concepts of mathematical logic. The process of logical deduction is clearly a central theme of mathematics, where the idea is to prove a stated result by means of an argument which is broken down into small steps, each of which should be obviously valid to the intended audience. This is typically done in a slightly informal fashion, where the validity of each stage of the proof is supposed to be clear, but does not necessarily follow a fixed and clearly stated framework. The study of logic is meant to clarify how this process works, and give a clearly defined framework for logical reasoning. While the history of logic goes back thousands of years, a solid mathematical foundation was only developed in the late nineteenth and early twentieth centuries. For example, the Principia Mathematica published in 1910 by Whitehead and Russell was an attempt to show that all of mathematics can be derived from some precisely stated set of axioms together with inference rules for obtaining conclusions from lists of premises. From this standpoint, logic becomes another field of mathematical study. Just as rings, fields, measure spaces, etc., are objects of mathematical study, so is logic. As in any of these other areas, we use standard mathematical reasoning to prove results about logical systems. However, these results can then shed light on the process of mathematical reasoning itself, as well as providing justification for the underlying frameworks used for much of mathematics.

This has been a very successful endeavor, with theories such as Zermelo-Fraenkel set theory (ZF, or ZFC when the axiom of choice is included) providing the standard set of axioms used by mathematicians in many different fields of study. Many important ideas and results have arisen from the study of logic, such as the relative consistency of different theories and independence of certain statements. The axiom of choice, for example, is often considered to be intuitively obvious but, at other times, has been considered controversial. Thanks to the mathematical study of logic, it is now known to be independent of the other axioms of ZFC. Similarly, the continuum hypothesis, which Georg Cantor spent many years trying to prove, has also been shown to be independent of ZFC. It is also known that, although the axiom of choice can be used to construct sets which are not Lebesgue measurable, it is consistent with the axiom of dependent choice that all sets of real numbers are measurable. In the other direction, reverse mathematics has been successful in determining precisely which axioms are really required for various mathematical theorems. At a higher level, results such as the completeness theorem have put the theory on a solid footing, establishing the equivalence of semantic truth and syntactic provability in first order logics. The incompleteness theorem, on the other hand, shows that no recursively enumerable proof system can prove all true statements about the arithmetic of the natural numbers and, furthermore, no sufficiently strong proof system is able to prove its own consistency (unless it is actually inconsistent). For example, it is not possible to prove the consistency of ZFC just by using the axioms and rules of ZFC itself, although it is possible in the presence of additional large cardinal axioms. Similarly, Peano arithmetic is not able to prove its own consistency, but it is possible if the well-ordering property of the ordinal {\epsilon_0} is added.

The mathematical study of logic has also made clear the distinction between classical and intuitionistic or constructive logics where the law of the excluded middle does not hold. Another consequence of putting logic on a solid mathematical foundation is that it should be possible to check the validity of mathematical proofs in an entirely systematic way, so that they could, in theory, be checked by computer. There are various proof-checkers available, although they are currently much more difficult and tedious to use than writing out proofs for human readers, so tend not to be used for most mathematics. There is also a large intersection between mathematical logic and computer science. For example, the Curry-Howard correspondence gives a one-to-one relation between statements of intuitionistic implicational logic and the types of valid programs in simply typed lambda calculus, with the programs or lambda expressions playing the part of proofs. This correspondence has been used as the basis for various computer implementations of proof systems.

The starting point for most logical theories is a language in which statements can be formed according to rules for what constitutes a valid statement (wffs, or well-formed formulas). This generally includes certain special logical connectives which allow statements to be formed which express a logical connection between their component parts. For example, if `P’ and `Q’ are valid sentences, then `{P\rightarrow Q}’ is also valid (meaning, `if P then Q’). These logical connectives come with prescribed rules of inference and, together with a fixed list of axioms, we can prove theorems. This does, however, raise various questions. What is the `correct’ set of connectives, and what is the correct set of rules of inference that they should follow? It is possible that, with different connectives or rules of inference, an entirely separate set of theorems would result. In this post, I take a step back from such specific theories or rules. The idea is to first look at the most general concept of what a logic is, and only after that can we determine exactly what connectives or rules of inference are possible or desirable.

Consequence Relations

Possibly the most basic concept in logic is that of entailment or inference. We start with a collection of premises, which are statements in some language and, according to some rules, we establish a result. The starting point is a set {L}, which can be thought of as a set of well-formed statements in some language although, to be as general as possible, we just assume that {L} is a set with no specific restriction or interpretation assumed of its elements. We write

\displaystyle  a_1,a_2,a_3,\ldots,a_n\vdash b, (1)

for {a_i,b\in L}, to mean that {a_1,\ldots,a_n} entails {b} or, equivalently, that {b} is a logical consequence of {a_1,a_2,\ldots,a_n}. Quite what the relation {\vdash} really means is left open at this stage. For example, considering a collection of interpretations or models, each of which assigns truth values to the elements of {L}, (1) can be taken to mean that any model assigning the truth value 1 to the {a_i} also assigns the value 1 to {b}. Alternatively, (1) could mean that, with respect to some formal proof system, there exists a proof of {b} from the premises {a_i}. More generally,

\displaystyle  \Gamma\vdash a (2)

means that {a\in L} is a logical consequence of the set of premises {\Gamma\subseteq L}. For convenience, we often put a list of subsets or elements of {L} as the list of premises, so that

\displaystyle  \Gamma_1,\Gamma_2,\ldots,\Gamma_m,a_1,a_2,\ldots,a_n\vdash b,

for {\Gamma_i\subseteq L} and {a_i,b\in L}, is just another way of writing

\displaystyle  \Gamma_1\cup\Gamma_2\cup\ldots\cup\Gamma_m\cup\{a_1,a_2,\ldots,a_n\}\vdash b.

Similarly, the statement {\vdash a} with no premises is just another way of writing {\emptyset\vdash a}.

As a matter of terminology, expression (2) is a sequent, and the {\vdash} symbol is referred to as a turnstile. The left hand side, {\Gamma}, is the set of premises or the antecedent and {a} is the consequent, and we say that {\Gamma} entails {a} or that {a} is a consequence of {\Gamma}. Although we are being rather general here and not considering any particular interpretation of the set {L} or of the relation {\vdash}, there is a short list of `obvious’ properties which should hold for logical inference.

Definition 1 A consequence relation {\vdash} on {L} is a relation on {\mathcal{P}L\times L} satisfying

  1. {\Gamma\!,a\vdash a} (reflexivity/axiom of identity),
  2. if {\Gamma\vdash a} then {\Gamma\!,\Delta\vdash a} (weakening),
  3. if {\Gamma\vdash x} for all {x\in\Delta} and {\Gamma\!,\Delta\vdash a} then {\Gamma\vdash a} (transitivity/rule of cut),

for all {\Gamma,\Delta\subseteq L} and {a\in L}. We will say that {\vdash} is finitary if, whenever {\Gamma\vdash a} then {\Delta\vdash a} for some finite {\Delta\subseteq\Gamma}.

A pair {(L,\vdash)} consisting of a set {L} together with a consequence relation {\vdash} on {L} will be known as a logic. We will often just use {L} to denote the logic {(L,\vdash)}.
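
To make Definition 1 concrete, here is a minimal sketch in Python of the semantic reading of {\vdash} mentioned earlier: {\Gamma\vdash a} when every model making all of {\Gamma} true also makes {a} true. The three-element set `L`, the list `MODELS`, and the function names are all hypothetical choices made purely for illustration. The code checks the three properties of the definition by brute force over all subsets of `L`.

```python
from itertools import chain, combinations

# Hypothetical toy language: three abstract statements.
L = ("p", "q", "r")

# Hypothetical models: each model is the set of statements it makes true.
MODELS = [
    frozenset(),
    frozenset({"q"}),
    frozenset({"p", "q"}),
    frozenset({"p", "q", "r"}),
]

def entails(gamma, a):
    """Gamma |- a iff every model making all of gamma true also makes a true."""
    return all(a in m for m in MODELS if set(gamma) <= m)

def subsets(xs):
    """All subsets of xs, as frozensets."""
    sizes = range(len(xs) + 1)
    return [frozenset(c) for c in chain.from_iterable(combinations(xs, k) for k in sizes)]

ALL = subsets(L)

# Property 1: Gamma, a |- a.
reflexive = all(entails(g | {a}, a) for g in ALL for a in L)

# Property 2: if Gamma |- a then Gamma, Delta |- a.
weakening = all(
    entails(g | d, a) for g in ALL for d in ALL for a in L if entails(g, a)
)

# Property 3: if Gamma |- x for all x in Delta and Gamma, Delta |- a then Gamma |- a.
cut = all(
    entails(g, a)
    for g in ALL for d in ALL for a in L
    if all(entails(g, x) for x in d) and entails(g | d, a)
)

print(reflexive, weakening, cut)
print(entails({"p"}, "q"), entails({"q"}, "p"))
```

With these particular models, `p` entails `q` but not conversely, since one model makes `q` true without `p`. As `L` is finite here, the relation is trivially finitary.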



Welcome to Absolutely Sure! This is my new blog, and a companion to the already-existing Almost Sure, which is a `random mathematical blog’ concentrating on probability theory and stochastic calculus. In contrast, Absolutely Sure will focus on pure mathematics and, more generally, any mathematical content which does not fit into the category of probability theory.

While the potential scope of this new blog is quite wide, encapsulating all of mathematics, there are some subjects which I plan to kick off with.

  • The Riemann Zeta function, Dirichlet Series, and L-series.
  • The prime number theorem and Dirichlet’s theorem on primes in an arithmetic progression.
  • The Riemann Hypothesis.
  • p-adic numbers, Valuation Theory, and Adelic numbers.

In particular, I would like to look at approaches to the Riemann hypothesis, which is one of the great unsolved problems of mathematics. It has been solved in the case of zeta functions over function fields, or algebraic varieties over finite fields. We will look at some of the known proofs for function fields, which many researchers have tried to extend to the number field case. In particular, Enrico Bombieri’s proof for the function field case can be understood with just a knowledge of valuation theory, whereas other methods require some algebraic geometry. Although the Riemann hypothesis is unsolved, there are some partial results which we will look at, such as zero-free regions of the zeta function in the critical strip and the proof that a positive proportion of the zeros do lie on the critical line.

Besides the ideas suggested above, there are many different topics which could be covered here.

George Lowther

The Riemann Zeta Function and the Functional Equation

For these initial posts of this blog, I will look at one of the most fascinating objects in mathematics, the Riemann zeta function. This is defined by the infinite sum

\displaystyle  \zeta(s)=\sum_{n=1}^\infty n^{-s}, (1)

which can be shown to be absolutely convergent for all {s\in{\mathbb C}} with {\Re(s) > 1}. It was Bernhard Riemann who showed that it can be analytically continued to the entire complex plane with a single pole at {s=1}, derived a functional equation, and showed how its zeros are closely linked to the distribution of the prime numbers. Riemann’s seminal 1859 paper is still an excellent introduction to this subject, an English translation of which (On the Number of Prime Numbers less than a Given Quantity) can be found on the website of the Clay Mathematics Institute, who included the conjecture that the `non-trivial’ zeros all lie on the line {\Re(s)=1/2} among their million dollar millennium problems.

In this post I will give a brief introduction to the zeta function and look at its functional equation. In particular, the functional equation can be generalized and reinterpreted as an identity of Mellin transforms, which links the additive Fourier transform on {{\mathbb R}} with the multiplicative Mellin transform on the nonzero reals {{\mathbb R}^*}. The aim is to prove the generalized functional equation and some properties of the zeta function, working from first principles, and discuss at a high level how this relates to the ideas in Tate’s thesis. Some standard complex analysis and Fourier transform theory will be used, but no prior understanding of the Riemann zeta function is assumed.

The zeta function has a long history, going back to the Basel problem which was posed by Pietro Mengoli in 1644. This asked for the exact value of the sum of the reciprocals of the square numbers or, equivalently, the value of {\zeta(2)}. This was eventually solved by Leonhard Euler in 1734, who discovered the famous identity

\displaystyle  \zeta(2)=\frac{\pi^2}{6}.

Euler found the values of {\zeta} at all positive even numbers, although I will not be concerned with this here. More pertinent to the current discussion is the product expression also found by Euler,

\displaystyle  \zeta(s)=\prod_p(1-p^{-s})^{-1}. (2)

The product is taken over all prime numbers {p}, and converges on {\Re(s) > 1}. Proving (2) is straightforward. The formula for summing a geometric series gives

\displaystyle  (1-p^{-s})^{-1}=1+p^{-s}+p^{-2s}+p^{-3s}+\cdots.

Substituting this into (2) and expanding the product gives an infinite sum over terms of the form

\displaystyle  (p_1^{r_1}p_2^{r_2}\cdots p_k^{r_k})^{-s}

for {k\ge0}, primes {p_1 < p_2 < \cdots < p_k}, and integers {r_i > 0}. Using the fact that every positive integer has a unique expression as a product of powers of distinct primes, we see that the Euler product expands as a sum of terms of the form {n^{-s}} as {n} ranges over the positive integers. This is just the right hand side of (1) and shows that the Euler product converges and is equal to {\zeta(s)} whenever the sum (1) is absolutely convergent.
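
The agreement between the sum (1) and the Euler product (2) can be illustrated numerically. The following is a rough sketch in Python at {s=2}; the truncation point `N` and the helper `primes_up_to` are arbitrary choices made for this illustration, and both truncations only approximate {\zeta(2)=\pi^2/6}.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

s = 2.0
N = 100_000  # truncation point, an arbitrary choice

# Truncated Dirichlet series (1).
partial_sum = sum(n**-s for n in range(1, N + 1))

# Truncated Euler product (2), over the primes up to N.
partial_product = 1.0
for p in primes_up_to(N):
    partial_product *= 1.0 / (1.0 - p**-s)

print(partial_sum, partial_product, math.pi**2 / 6)
```

The truncated product includes every {n^{-s}} with {n} built from primes up to `N`, so it lies between the truncated sum and the full value of {\zeta(s)}.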

The Euler product provides a link between the zeta function and the prime numbers, with far-reaching consequences. For example, the prime number theorem describing the asymptotic distribution of the prime numbers was originally proved using the Euler product, and the strongest known error terms available for this theorem still rely on the link between the prime numbers and the zeta function given by (2). Euler used the fact that (1) diverges at {s=1} to argue that (2) also diverges at {s=1}. From this, it is immediately deduced that there are infinitely many primes and, more specifically, the reciprocals of the primes sum to infinity.

The Euler product can also be expressed in terms of the logarithm of the zeta function. Using the Taylor series expansion of {\log(1-p^{-s})}, we obtain

\displaystyle  \log\zeta(s)=\sum_p\sum_{k=1}^\infty \frac{p^{-ks}}{k}. (3)

As the terms on the right hand side are bounded by {\lvert n^{-s}\rvert} as {n} runs through the subset of the natural numbers consisting of prime powers, it will be absolutely convergent whenever (1) is. In particular, (3) converges on the half-plane {\Re(s) > 1}. Although the complex logarithm is generally only defined up to integer multiples of {2\pi i}, (3) gives the unique continuous version of {\log\zeta(s)} over {\Re(s) > 1} which takes real values on the real line.

We will first look at the zeta functional equation as described by Riemann. This involves the gamma function defined on {\Re(s) > 0} by the absolutely convergent integral

\displaystyle  \Gamma(s)=\int_0^\infty x^{s-1}e^{-x}\,dx. (4)

This is easily evaluated at {s=1} to get {\Gamma(1)=1}, and an integration by parts gives the functional equation of {\Gamma},

\displaystyle  \Gamma(s+1)=s\Gamma(s).

This can be used to evaluate the gamma function at the positive integers, {\Gamma(n)=(n-1)!}. Also, by expressing {\Gamma(s)} in terms of {\Gamma(s+1)}, it allows us to extend {\Gamma(s)} to a meromorphic function on {\Re(s) > -1} with a single simple pole at {s=0}. Repeatedly applying this idea extends {\Gamma(s)} as a meromorphic function on the entire complex plane with a simple pole at each non-positive integer. Furthermore, it is known that {\Gamma(s)} is non-zero everywhere on {{\mathbb C}}.
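
These properties are easy to spot-check with the `math.gamma` function from Python's standard library, which also accepts negative non-integer arguments. This is only a numerical illustration at a few arbitrarily chosen points, not a proof.

```python
import math

# Gamma(n) = (n-1)! at the positive integers.
for n in range(1, 8):
    print(n, math.gamma(n), math.factorial(n - 1))

# Functional equation Gamma(s+1) = s * Gamma(s) at some non-integer points,
# including negative ones reached through the meromorphic extension.
points = [0.5, 2.3, -0.5, -1.5]
for s in points:
    print(s, math.gamma(s + 1), s * math.gamma(s))
```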

The functional equation can now be stated as follows.

Theorem 1 (Riemann) The function {\zeta(s)} defined by (1) uniquely extends to a meromorphic function on {{\mathbb C}} with a single simple pole at {s=1} of residue {1}. Setting

\displaystyle  \Lambda(s)=\pi^{-s/2}\Gamma(s/2)\zeta(s) (5)

this satisfies the identity

\displaystyle  \Lambda(s)=\Lambda(1-s). (6)

Riemann actually gave two independent proofs of this, the first using contour integration and the second using an identity of Jacobi. Many alternative proofs have been discovered since, with Titchmarsh listing seven (The Theory of the Riemann Zeta Function, 1986, second edition). I will not replicate these here, but will show an alternative formulation as an identity of Mellin transforms, from which Riemann’s functional equation (6) follows as a special case.

As an example of the use of the functional equation to derive properties of the zeta function on the left half-plane {\Re(s)\le0}, we evaluate {\zeta(0)}. Using the special value {\Gamma(1/2)=\sqrt{\pi}} and the fact that {\zeta(s)} has a pole of residue {1} at {s=1}, we see that {\Lambda(s)\sim1/(s-1)} as {s} approaches {1}. Similarly, using the fact that the gamma function has a pole of residue {1} at {s=0}, we see that {\Lambda(s)\sim2\zeta(0)/s} as {s} approaches {0}. Putting these limits into the functional equation gives

\displaystyle  \zeta(0)=-1/2.
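
In more detail, and only restating the step above: as {s} approaches {0}, the argument {1-s} approaches {1}, so the two asymptotic expressions can be compared through (6),

\displaystyle  \frac{2\zeta(0)}{s}\sim\Lambda(s)=\Lambda(1-s)\sim\frac{1}{(1-s)-1}=-\frac{1}{s},

and equating the coefficients of {1/s} gives {2\zeta(0)=-1}.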

Similarly, the functional equation expresses the values of {\zeta} at negative odd integers in terms of its values at positive even integers. For example, taking {s=-1},

\displaystyle  \pi^{1/2}\Gamma(-1/2)\zeta(-1)=\pi^{-1}\Gamma(1)\zeta(2).

Plugging in the values {\Gamma(-1/2)=-2\sqrt{\pi}}, {\Gamma(1)=1} and Euler’s value of {\zeta(2)=\pi^2/6},

\displaystyle  \zeta(-1)=-\frac{1}{12}.
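
The arithmetic in this step can be reproduced in a few lines of Python. This is just a sketch of the calculation above, using `math.gamma` for the gamma values; the variable names are illustrative.

```python
import math

zeta_2 = math.pi**2 / 6                          # Euler's value of zeta(2)
lhs_factor = math.pi**0.5 * math.gamma(-0.5)     # pi^{1/2} Gamma(-1/2)
rhs = math.pi**-1 * math.gamma(1.0) * zeta_2     # pi^{-1} Gamma(1) zeta(2)

# Solve pi^{1/2} Gamma(-1/2) zeta(-1) = pi^{-1} Gamma(1) zeta(2) for zeta(-1).
zeta_minus_1 = rhs / lhs_factor
print(zeta_minus_1, -1 / 12)
```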

Via a process known as zeta function regularization, these special values of {\zeta} are sometimes written as the famous, but rather confusing, expressions

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle 1+1+1+1+\cdots = -\frac12,\smallskip\\ &\displaystyle 1+2+3+4+\cdots = -\frac1{12}. \end{array}

Next, using standard properties of the gamma function, theorem 1 can be used to investigate the zeros of the zeta function. The Euler product implies that {\zeta(s)} has no zeros on {\Re(s) > 1}, as I will show below, and it is well known that the gamma function has no zeros at all. So, {\Lambda(s)} has no zeros or poles anywhere on {\Re(s) > 1}, and the functional equation extends this statement to {\Re(s) < 0}. It follows that, on {\Re(s) < 0}, the zeros of the zeta function must cancel with the poles of {\Gamma(s/2)}, which are at the negative even integers.

On the strip {0\le\Re(s)\le1}, the precise locations of the zeros of {\zeta(s)} are not known. However, as {\Gamma(s/2)} has no poles or zeros on this domain (other than at {s=0}), they must coincide with the zeros of {\Lambda(s)}. From the definition (1) of the zeta function, it satisfies {\zeta(\bar s)=\overline{\zeta(s)}} (using a bar to denote complex conjugation). So, its zeros are preserved by reflection {s\mapsto\bar s} about the real line. Also, by the functional equation, they are preserved by the map {s\mapsto1-s} in the aforementioned strip. We have arrived at the following.

Lemma 2 The function {\zeta(s)} has zeros at the (strictly) negative even integers. The only remaining zeros lie in the vertical strip {0\le\Re(s)\le1} and are preserved by the maps

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle s\mapsto \bar s,\\ &\displaystyle s\mapsto 1-s. \end{array} (7)

The zeros at the negative even integers are called the trivial zeros of {\zeta}, with the remaining ones referred to as the non-trivial zeros. The domain {0\le\Re(s)\le1} is known as the critical strip. So, the non-trivial zeros of the Riemann zeta function are precisely those lying in the critical strip, and are the same as the zeros of the function {\Lambda(s)} defined by (5). The vertical line {\Re(s)=\frac12} lying along the center of the critical strip is called the critical line. Then, (7) says that the non-trivial Riemann zeta zeros are symmetric under reflection about both the real line and the critical line. The Riemann hypothesis, as originally conjectured by Riemann in his 1859 paper, states that the non-trivial zeros all lie on the critical line. This, however, remains unknown and is one of the great open problems of mathematics.

We will now move on to the alternative interpretation of the functional equation relating additive Fourier transforms with multiplicative transforms. We will use the following convention for the Fourier transform of a function {f\colon{\mathbb R}\rightarrow{\mathbb C}},

\displaystyle  \hat f(y)=\int_{-\infty}^\infty e^{-2\pi ixy}f(x)\,dx. (8)

For this to make sense, it should at least be required that {f} is integrable. I will restrict to the particularly nice class of Schwartz functions. These are the infinitely differentiable functions from {{\mathbb R}} to {{\mathbb C}} which vanish faster than polynomially at infinity, along with their derivatives to all orders. That is, {x^rf^{(s)}(x)\rightarrow0} as {\lvert x\rvert\rightarrow\infty}, for all integers {r,s\ge0}. Denote the space of Schwartz functions by {\mathcal S}. Schwartz functions are integrable and it is known that their Fourier transforms are again in {\mathcal{S}}. Then, for any {f\in\mathcal S}, its Fourier transform is inverted by

\displaystyle  f(x)=\int_{-\infty}^\infty e^{2\pi ixy}\hat f(y)\,dy.

I’ll explain now why the Fourier transform (8) is an additive transform of {f}. For each fixed {y}, the map {x\mapsto e^{2\pi ixy}} is a continuous homomorphism from the additive group of real numbers to the multiplicative group of nonzero complex numbers {{\mathbb C}^*},

\displaystyle  e^{2\pi i(x_1+x_2)y}=e^{2\pi ix_1y}e^{2\pi ix_2y}.

So, {x\mapsto e^{2\pi ixy}} is a character of the reals under addition. Furthermore, integration is invariant under additive translation,

\displaystyle  \int_{-\infty}^\infty f(x)\,dx=\int_{-\infty}^\infty f(x+a)\,dx.

That is, the standard (Riemann or Lebesgue) integral is the Haar measure of the additive group of reals, and the Fourier transform (8) is the integral of {f(x)} against additive characters with respect to the additive Haar measure.

The Mellin transform of {f\colon{\mathbb R}\rightarrow{\mathbb C}} is

\displaystyle  M(f,s)=\int_{-\infty}^\infty f(x)\lvert x\rvert^{s-1}\,dx, (9)

which is defined for any {s\in{\mathbb C}} for which the integral is absolutely convergent. (This differs slightly from the usual definition where a lower limit of {0} is used for the integral. See the note on Mellin transforms at the end of this post.) Now, the map {x\mapsto\lvert x\rvert^{s}} is a continuous homomorphism from the multiplicative group of nonzero reals {{\mathbb R}^*} to {{\mathbb C}^*},

\displaystyle  \lvert x_1x_2\rvert^s=\lvert x_1\rvert^s\lvert x_2\rvert^s.

Denoting {d^*x=dx/\lvert x\rvert}, integration with respect to {d^*x} is invariant under multiplicative rescaling by any {a\in{\mathbb R}^*},

\displaystyle  \int_{-\infty}^{\infty} f(ax)\,d^*x=\int_{-\infty}^\infty f(x)\,d^*x.

That is, {\int\cdot\,d^*x} is the multiplicative Haar measure on {{\mathbb R}^*}, and the Mellin transform is the integral of {f(x)} against multiplicative characters with respect to the multiplicative Haar measure. This explains why the Fourier transform (8) is additive and the Mellin transform (9) is multiplicative.

For an explicit example of a Mellin transform, consider {f(x)=e^{-\pi x^2}},

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle M(f,s)&\displaystyle=\int_{-\infty}^\infty \lvert x\rvert^{s-1}e^{-\pi x^2}\,dx\smallskip\\ &\displaystyle=\int_0^\infty \pi^{-s/2}y^{s/2-1}e^{-y}\,dy. \end{array}

Here, the substitution {y=\pi x^2} was used. Comparing with the definition (4) of the gamma function,

\displaystyle  M(f,s)=\pi^{-s/2}\Gamma(s/2). (10)
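
Identity (10) can be checked numerically for a few real values of {s}. The sketch below uses a plain midpoint rule, which is an arbitrary choice of quadrature, and restricts to {s\ge1} so that the integrand is bounded at the origin; the function name and the parameters `upper` and `steps` are hypothetical.

```python
import math

def mellin_gaussian(s, upper=8.0, steps=200_000):
    """Midpoint-rule estimate of the integral of |x|^(s-1) exp(-pi x^2) over the reals.

    By symmetry this is twice the integral over (0, upper); the tail beyond
    `upper` is negligible because of the Gaussian decay.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (s - 1) * math.exp(-math.pi * x * x)
    return 2.0 * total * h

for s in (1.0, 2.0, 3.5):
    # Numerical integral versus the closed form pi^{-s/2} Gamma(s/2).
    print(s, mellin_gaussian(s), math.pi ** (-s / 2) * math.gamma(s / 2))
```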

For Schwartz functions, the integral defining the Mellin transform is absolutely convergent on the right half-plane {\Re(s) > 0}, and can be analytically continued to the entire complex plane.

Theorem 3 If {f\in\mathcal{S}}, then {M(f,s)} is well defined over {\Re(s) > 0}, and uniquely extends to a meromorphic function on {{\mathbb C}} with only simple poles at the non-positive even numbers {-2n}, with residue

\displaystyle  {\rm Res}({M(f,\cdot)},-2n)=2\frac {f^{(2n)}(0)}{(2n)!}.

I’ll give a proof of theorem 3 below. For now, we will move straight on to the statement of the functional equation relating the Mellin transform of {f} to that of its Fourier transform {\hat f}.

Theorem 4 If {f\in\mathcal{S}} then {M(f,s)\zeta(s)} extends to a meromorphic function with only simple poles at {0} and {1} of residue {-f(0)} and {\hat f(0)} respectively. The functional equation

\displaystyle  M(f,s)\zeta(s)=M(\hat f,1-s)\zeta(1-s) (11)

holds everywhere.

A proof of this will be given further down. It can be shown that the specific case {f(x)=e^{-\pi x^2}} is equal to its own Fourier transform, {\hat f=f}. So, using expression (10) for the Mellin transform {M(f,s)}, we see that Riemann’s functional equation follows directly from (11).
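
The self-duality of the Gaussian under convention (8) can also be seen numerically. The following is a small sketch using a midpoint rule on a truncated interval; the function name and the parameters `half_width` and `steps` are illustrative assumptions, and the check is only approximate.

```python
import cmath
import math

def fourier_gaussian(y, half_width=8.0, steps=40_000):
    """Midpoint-rule estimate of the Fourier transform (8) of exp(-pi x^2) at y."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * x * y) * math.exp(-math.pi * x * x)
    return total * h

for y in (0.0, 0.5, 1.0, 2.0):
    # The transform should agree with exp(-pi y^2), with negligible imaginary part.
    print(y, fourier_gaussian(y), math.exp(-math.pi * y * y))
```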

Above, we discussed how Riemann’s functional equation allows values of {\zeta(s)} to be determined on the left half-plane {\Re(s)\le0} and restricts the locations of its zeros to be as described in lemma 2. This made use of properties of the gamma function, specifically the locations of its poles and the fact that it has no zeros. These arguments can be made by instead using version (11) of the functional equation, and the gamma function need not be referred to at all. For an arbitrary smooth function {f}, theorem 3 gives the poles of {M(f,s)}. Also, by choosing {f} with compact support in {{\mathbb R}^*}, {M(f,s)} will be well-defined everywhere by (9) and is analytic. It is also easy to choose {f} such that the Mellin transform does not vanish at any specified point, which is enough to apply the arguments above.

I now briefly consider the relation between the functional equation in the form (11) and the ideas of John Tate’s 1950 thesis. Theorem 4 can be viewed primarily as relating the Mellin transform of the Fourier transform to the Mellin transform of the original function. The zeta function plays more of an ancillary role as a multiplicative factor in this identity. This treatment of the Mellin transform as the primary object of interest was taken much further in Tate’s thesis. Tate refocussed attention from the rational and real number fields (or algebraic number field) to the larger ring of adeles, {\mathbb A}. This is outside of the scope of this post, but the important properties are that, just like the embedding of the rationals inside the reals, {{\mathbb Q}} embeds in {\mathbb A}, and the theory of Fourier and Mellin transforms extends to functions defined on the adeles. In the adelic case, he obtained the functional equation

\displaystyle  M(f,s)=M(\hat f,1-s). (12)

Now, the zeta function does not appear at all! Digging a bit deeper, the ring of adeles over the rational numbers can be expressed as a product of the reals and the ring of `finite’ adeles,

\displaystyle  \mathbb A ={\mathbb R}\times\mathbb A_f.

That is, every element {x} of the adelic ring can be expressed as a pair {(x_\infty,x_f)} consisting of a real number {x_\infty} and a finite adele {x_f}. For a function {f\colon\mathbb A\rightarrow{\mathbb C}} which is a product of the real and finite parts, {f(x)=g(x_\infty)h(x_f)}, the Mellin and Fourier transforms are also products of the transforms of the individual components.

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle \hat f(x)=\hat g(x_\infty)\hat h(x_f),\smallskip\\ &\displaystyle M(f,s)=M(g,s)M(h,s). \end{array}

Applying this to the adelic functional equation (12),

\displaystyle  M(g,s)M(h,s)=M(\hat g,1-s)M(\hat h,1-s).

Just as the special case {g(x)=e^{-\pi x^2}} led to the gamma factor in Riemann’s functional equation, choosing a particular example for the function {h} on the finite adeles which equals its own Fourier transform leads to the appearance of the zeta function in (5) and (11). This places the gamma term and the zeta term in the functional equation on a roughly equal footing.

We can go further. The finite adeles can be broken down into a restricted product of fields corresponding to each prime number — the {p}-adic numbers,

\displaystyle  \mathbb A_f = {\prod_p}^\prime{\mathbb Q}_p.

In the case where the function {h(x_f)} factors into a product of functions on the {p}-adic components, {h(x_f)=\prod_ph_p(x_p)}, then the Mellin transform commutes with this factorisation,

\displaystyle  M(h,s)=\prod_pM(h_p,s).

For a particular choice of {h}, specifically the indicator function of the integral adeles, this is just the Euler product described above. From this viewpoint, the gamma factor in Riemann’s functional equation, the Mellin transform appearing in (11), the Riemann zeta function, and the {(1-p^{-s})^{-1}} terms in the Euler product, are all just manifestations of factorizations of the Mellin transform on the adeles, which itself satisfies the functional equation (12).

Finally, we note that there is an intimate connection between additive and multiplicative structures pervading the above discussion. The natural numbers, which are generated under addition by the unit element {1}, are also generated under multiplication by the prime numbers. This is reflected in the definition of the zeta function as a sum over the natural numbers (1), which is equivalent to the multiplicative definition given by the Euler product (2). Then, the functional equation (11) ties together the additive Fourier transform over {{\mathbb R}} with the multiplicative Mellin transform over {{\mathbb R}^*}.

Elementary Inequalities

Above, I stated a few results but, now, let’s move on and actually prove a few things regarding the Riemann zeta function. As this post is not assuming any prior understanding of {\zeta(s)}, I start at a very basic level and will derive a few elementary inequalities. By elementary, I mean things which can be proved straight from the definition (1) of the zeta function. These will be rather basic and far from optimal — especially in the critical strip — but are easy to prove and give some understanding of what the zeta function looks like.

First, for any positive real {s}, the function {x\mapsto x^{-s}} is decreasing, giving

\displaystyle  (n+1)^{-s} < x^{-s} < n^{-s}

for any positive integer {n} with {n < x < n +1}. Integrating over {x\ge1} and substituting in the definition of {\zeta(s)} for {s > 1},

\displaystyle  \zeta(s)-1=\sum_{n=1}^\infty(n+1)^{-s} < \int_1^\infty x^{-s}\,dx < \sum_{n=1}^\infty n^{-s}=\zeta(s).

Substituting in the value {1/(s-1)} for the integral gives the following bounds.

Lemma 5 The sum (1) converges absolutely at all real {s > 1}, and satisfies the bound

\displaystyle  \frac1{s-1} < \zeta(s) < \frac s{s-1}. (13)

In particular, {\zeta(s)\sim1/(s-1)} as {s} approaches {1} from above.
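As a quick numerical sanity check of (13), the same integral comparison can be applied to the tail of the series, which traps {\zeta(s)} between two computable values. The following Python sketch does this for a few real {s > 1}. The function name `zeta_bracket` and the cutoff `N` are illustrative choices of mine, not part of the argument.

```python
import math

def zeta_bracket(s, N=100_000):
    """Return (lo, hi) with lo < zeta(s) < hi for real s > 1.

    The tail sum over n > N is compared with integrals of x**-s,
    in the same way as in the derivation of (13).
    """
    partial = math.fsum(n ** -s for n in range(1, N + 1))
    lo = partial + (N + 1) ** (1 - s) / (s - 1)  # tail exceeds the integral from N+1
    hi = partial + N ** (1 - s) / (s - 1)        # tail is below the integral from N
    return lo, hi

for s in (1.1, 1.5, 2.0, 3.0):
    lo, hi = zeta_bracket(s)
    # Lemma 5: 1/(s-1) < zeta(s) < s/(s-1)
    assert 1 / (s - 1) < lo and hi < s / (s - 1)
    print(f"s={s}: {1 / (s - 1):.4f} < [{lo:.6f}, {hi:.6f}] < {s / (s - 1):.4f}")
```

For {s} close to {1} the bracket sits close to the lower bound {1/(s-1)} in relative terms, in line with the asymptotic statement of the lemma.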

Moving on to {s\in{\mathbb C}}, we can use the identity {\lvert x^s\rvert=x^\sigma}, where {\sigma} is the real part of {s}, to write,

\displaystyle  \left\lvert\zeta(s)-1\right\rvert\le\sum_{n=2}^\infty\lvert n^{-s}\rvert =\sum_{n=2}^\infty n^{-\sigma}.

Comparing the right hand side with the definition of {\zeta}, we get,

Lemma 6 The sum (1) converges absolutely at all {s\in{\mathbb C}} with {\Re(s) > 1} and satisfies the bound

\displaystyle  \lvert\zeta(s)-1\rvert\le\zeta(\sigma)-1 < \frac1{\sigma-1}

where {\sigma=\Re(s)}.

The right-hand inequality here is just an application of (13). In particular, lemma 6 implies that {\zeta(s)} is uniformly bounded on the half-plane {\Re(s)\ge\sigma_0}, for any {\sigma_0 > 1}, with the bound {\sigma_0/(\sigma_0-1)}. It also shows that {\zeta(s)\rightarrow1} uniformly as {\Re(s)\rightarrow\infty}.

Next, the Euler product expansion (2) can be used to show that {\zeta} has no zeros on the open right half-plane {\Re(s) > 1}. Applying the inequality {\lvert 1-x\rvert^{-1} > 1-\lvert x\rvert}, which holds for all {0 < \lvert x\rvert < 1}, to each factor of the product,

\displaystyle  \lvert\zeta(s)\rvert > \prod_p(1-\lvert p^{-s}\rvert)=\prod_p(1-p^{-\sigma})

with {\sigma=\Re(s)}. Noting that the right hand side is just the reciprocal of the Euler product of {\zeta(\sigma)} gives a lower bound.

Lemma 7 The zeta function {\zeta(s)} is nonzero everywhere on the domain {\Re(s) > 1} and satisfies the lower bound,

\displaystyle  \lvert\zeta(s)\rvert > \zeta(\sigma)^{-1} > 1-\sigma^{-1} (14)

with {\sigma=\Re(s)}.

The right-hand inequality here is another direct application of (13). Lemma 7 shows that {\zeta(s)} is uniformly bounded away from zero on the half-plane {\Re(s)\ge\sigma_0}, for any {\sigma_0 > 1}.

Expression (3) for the logarithm of the zeta function can be used to obtain further bounds. On the half-plane {\Re(s)=\sigma > 1}, we use {\lvert p^{-ks}\rvert=p^{-k\sigma}},

\displaystyle  \lvert\log\zeta(s)\rvert\le\sum_p\sum_{k=1}^\infty\frac{p^{-k\sigma}}{k}=\log\zeta(\sigma).

So, we have obtained the following.

Lemma 8 The logarithm of the zeta function over {\Re(s) = \sigma > 1} satisfies the bound

\displaystyle  \lvert\log\zeta(s)\rvert\le\log\zeta(\sigma) < -\log(1-\sigma^{-1}).

Applying this bound to {-\log\lvert\zeta(s)\rvert} gives (14) as a special case.
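The bounds of lemmas 6, 7 and 8 are easy to test numerically at a few complex points. The sketch below simply truncates the sum (1), which is adequate for {\Re(s)\ge2}. The sample points, the cutoff `N` and the name `zeta_partial` are arbitrary choices for illustration. It also assumes that the principal branch of the logarithm agrees with expression (3) on this region, which is the case because lemma 6 gives {\lvert\zeta(s)-1\rvert < 1} there.

```python
import cmath
import math

def zeta_partial(s, N=100_000):
    """Partial sum of (1) up to N. For sigma = Re(s) >= 2 the neglected
    tail is smaller than N**(1 - sigma) / (sigma - 1) <= 1e-5."""
    return sum(n ** -s for n in range(1, N + 1))

for s in (2 + 3j, 2 + 14.1j, 3 - 7j):
    sigma = s.real
    z = zeta_partial(s)            # approximates zeta(s)
    z_sigma = zeta_partial(sigma)  # approximates zeta(sigma), a real number
    # Lemma 6
    assert abs(z - 1) <= z_sigma - 1 < 1 / (sigma - 1)
    # Lemma 7, inequality (14)
    assert abs(z) > 1 / z_sigma > 1 - 1 / sigma
    # Lemma 8, using the principal logarithm (valid as |zeta(s) - 1| < 1 here)
    assert abs(cmath.log(z)) <= math.log(z_sigma) < -math.log(1 - 1 / sigma)
    print(f"s={s}: |zeta(s)|={abs(z):.5f}, zeta(sigma)={z_sigma:.5f}")
```

At these points the inequalities hold with a margin far larger than the truncation error, so the crude partial sums are sufficient.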

Using {\lfloor\cdot\rfloor} to denote the floor function, we can use the equality {\lfloor x\rfloor^{-s}=n^{-s}} for {n\le x < n+1} and each positive integer {n} to rewrite the summation (1) as an integral

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \zeta(s)&\displaystyle=\int_1^\infty\lfloor x\rfloor^{-s}\,dx\smallskip\\ &\displaystyle=\int_1^\infty x^{-s}\,dx +\int_1^\infty(\lfloor x\rfloor^{-s}-x^{-s})\,dx. \end{array}

Substituting in the value {1/(s-1)} for the first integral on the right hand side gives the following expression for the zeta function,

\displaystyle  \zeta(s)=\frac1{s-1}+\int_1^\infty\left(\lfloor x\rfloor^{-s}-x^{-s}\right)\,dx. (15)

The idea is that the integrand here is small in comparison to {x^{-s}}, so that we can expect it to converge on a larger domain than the sum (1). In fact, as we will show, it is absolutely integrable on {\Re(s) > 0}. As uniform limits of analytic functions are analytic, this will extend {\zeta(s)-1/(s-1)} to an analytic function on this domain.

To bound the integrand in (15), note that {x^{-s}} has the derivative {-sx^{-s-1}} with respect to {x}. Using {\sigma=\Re(s)}, this has norm {\lvert s\rvert x^{-\sigma-1}} and, as it is decreasing in {x}, is bounded above by {\lvert s\rvert\lfloor x\rfloor^{-\sigma-1}}. Hence, the mean value theorem gives the inequality

\displaystyle  \left\lvert\lfloor x\rfloor^{-s}-x^{-s}\right\rvert\le\lvert s\rvert\lfloor x\rfloor^{-\sigma-1}(x-\lfloor x\rfloor),

which will be strict whenever {x} is not an integer. For any positive integer {n}, we have {\lfloor x\rfloor=n} on the interval {[n,n+1)} and,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \int_n^{n+1}\left\lvert\lfloor x\rfloor^{-s}-x^{-s}\right\rvert\,dx &\displaystyle < \lvert s\rvert n^{-\sigma-1}\int_n^{n+1}(x-n)\,dx\smallskip\\ &\displaystyle=\frac12\lvert s\rvert n^{-\sigma-1}. \end{array}

Summing over {n} and comparing with the definition of {\zeta(\sigma+1)} gives a finite value, so we have proved the following.

Lemma 9 The zeta function {\zeta(s)} extends to a meromorphic function on {\Re(s) > 0} with a simple pole of residue {1} at {s=1}, and is given by the absolutely convergent integral (15). Furthermore, on this domain, it satisfies the bound

\displaystyle  \left\lvert\zeta(s)-\frac1{s-1}\right\rvert < \frac12\lvert s\rvert\zeta(\sigma+1) < \frac12\lvert s\rvert(1+\sigma^{-1}) (16)

with {\sigma=\Re(s)}.

The final inequality here is yet another application of (13). So, we have a bound for {\zeta(s)} of size {O(\lvert s\rvert)} on the right half-plane {\Re(s)\ge\sigma_0}, for any {\sigma_0 > 0}. This is by no means optimal, and it can be improved to {O(\lvert s\rvert^{1/2})}, with even better bounds given by the — as yet — unproven Lindelöf hypothesis.

Applying inequality (16) for real {s} in the interval {0 < s < 1} shows that {\zeta(s)} does not vanish,

\displaystyle  2\zeta(s) < \frac2{s-1}+s(1+s^{-1})=\frac{-1-s^2}{1-s}.

Corollary 10 {\zeta(s) < -1/2}, so is nonzero, on the real line segment {0 < s < 1}.

Interestingly, this bound is optimal, as {\zeta(0)=-1/2}.
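Identity (15) can also be used to compute {\zeta(s)} numerically on {0 < s < 1}. On each interval {[n,n+1)} the integrand is positive for real {s > 0} and, by the estimate above, integrates to less than {\frac12sn^{-s-1}}, so truncating at {N} leaves a positive tail smaller than {\frac12N^{-s}}. The following Python sketch uses this to bracket {\zeta(s)}. The function name and the cutoff are my own illustrative choices.

```python
import math

def zeta_bracket_01(s, N=200_000):
    """Return (lo, hi) with lo < zeta(s) < hi for real 0 < s < 1, using (15).

    The integral in (15) is a sum of positive terms
        a_n = n**-s - (integral of x**-s over [n, n+1]),
    each smaller than s * n**(-s-1) / 2.
    """
    # a_1 + ... + a_N; the integrals telescope into a closed form.
    head = math.fsum(n ** -s for n in range(1, N + 1)) \
        - ((N + 1) ** (1 - s) - 1) / (1 - s)
    # sum over n > N of s*n**(-s-1)/2 is below (s/2) * integral_N^inf x**(-s-1) dx
    tail_bound = 0.5 * N ** -s
    lo = 1 / (s - 1) + head
    return lo, lo + tail_bound

for s in (0.25, 0.5, 0.75, 0.9):
    lo, hi = zeta_bracket_01(s)
    assert hi < -0.5  # Corollary 10
    print(f"s={s}: {lo:.6f} < zeta(s) < {hi:.6f}")
```

The upper end of the bracket turns out to be much the sharper of the two. At {s=1/2} it agrees with the known value {\zeta(1/2)\approx-1.4603545} to better than six decimal places.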

Elementary Extension of the Zeta Function

I will now describe an elementary method of analytically continuing the Riemann zeta function to the entire complex plane. Nothing in this section is required for the results discussed above, so it can be skipped if desired. The reason for including it here is to gain an intuitive understanding of why the definition (1) of {\zeta(s)} given on the half-plane {\Re(s) > 1} should continue to all of {{\mathbb C}}, without using any ‘magic’ formulas such as Poisson summation or the functional equation. Instead, we can use the Euler-Maclaurin formula. Rather than just stating and applying this equation, I will derive it, as it is straightforward to do and gives a better understanding of why the zeta function necessarily extends to the complex plane.

We can apply a similar idea to that which was used to express {\zeta(s)} over {\Re(s) > 0} by identity (15). To do this in more generality, I will look at a sum {\sum_nf(n)} for a smooth function {f}. Some assumptions will be required on {f} in order that the sums and integrals converge, so suppose that it is smooth and that its derivatives to all orders are integrable over {[1,\infty)}. This is the case for the Riemann zeta function where {f(x)=x^{-s}}.

For a differentiable function {u} defined on the interval {[0,1]}, an integration by parts gives

\displaystyle  u(0)=\int\limits_0^1u(x)\,dx+(1+c)(u(0)-u(1))+\int\limits_0^1(x+c)u^\prime(x)\,dx,

for any constant {c}. The idea is to replace {u(x)} by {f(n+x)} in this identity and sum over {n}. In order that {x+c} has average value {0} over the unit interval, we will take {c=-1/2}. So, setting {p_1(x)=x-1/2},

\displaystyle  \sum_{n=1}^\infty f(n)=\int\limits_1^\infty f(x)\,dx+p_1(1)f(1)+\int\limits_1^\infty p_1(\{x\})f^\prime(x)\,dx (17)

with {\{x\}} denoting the fractional part of {x}. That is, {\{x\}=x-n} on the interval {n\le x < n+1}. The hope here is that {f^\prime} is sufficiently smaller than {f}, so that the right-hand integral converges even when the sum on the left diverges.
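To see identity (17) in action, take {f(x)=x^{-2}}, for which the left hand side is the Basel sum {\pi^2/6}. The Python sketch below evaluates the right hand side, computing the integral of {p_1(\{x\})f^\prime(x)} over each interval {[n,n+1]} by the same integration by parts. The choice of {f}, the cutoff `N` and the helper names are for illustration only.

```python
import math

def f(x):
    return x ** -2.0

def correction(n):
    """Integral of p_1({x}) f'(x) over [n, n+1].

    Integrating by parts, this is (f(n) + f(n+1))/2 minus the integral
    of f over [n, n+1], which equals 1/n - 1/(n+1) for f(x) = x**-2.
    """
    return 0.5 * (f(n) + f(n + 1)) - (1.0 / n - 1.0 / (n + 1))

N = 2000
lhs = math.pi ** 2 / 6                      # known value of the sum of 1/n^2
integral_of_f = 1.0                         # integral of x**-2 over [1, infinity)
rhs = integral_of_f + 0.5 * f(1) + math.fsum(correction(n) for n in range(1, N + 1))
direct = math.fsum(f(n) for n in range(1, N + 1))

print(f"error of the direct partial sum: {lhs - direct:.2e}")
print(f"error using (17):                {lhs - rhs:.2e}")
assert abs(lhs - rhs) < 1e-9
```

With the same number of terms, the right hand side of (17) is accurate to many more digits than the direct partial sum, reflecting the faster decay of {f^\prime}.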

We take this a step further and express the integral over {f^\prime} as an integral over {f^{\prime\prime}}. Again, consider a function {u} defined on the unit interval and, choosing {p_2} to be the integral of {p_1}, another integration by parts gives

\displaystyle  \int\limits_0^1p_1(x)u(x)\,dx=p_2(1)u(1)-p_2(0)u(0)-\int\limits_0^1p_2(x)u^\prime(x)\,dx.

As {p_1} has zero integral, {p_2(1)=p_2(0)}, so replacing {u(x)} by {f^\prime(x+n)} and summing over {n},

\displaystyle  \int\limits_1^\infty p_1(\{x\})f^\prime(x)\,dx=-p_2(1)f^{\prime}(1)-\int\limits_1^\infty p_2(\{x\})f^{\prime\prime}(x)\,dx.

Again, {p_2} is only defined up to an arbitrary constant, so can be chosen to have zero integral over the unit interval.

We repeat this procedure {r} times and substitute into (17),

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \sum_{n=1}^\infty f(n)=&\displaystyle\int\limits_1^\infty f(x)\,dx+p_1(1)f(1)-p_2(1)f^\prime(1)+\cdots\smallskip\\ &\displaystyle\quad-(-1)^rp_r(1)f^{(r-1)}(1)-(-1)^r\int\limits_1^\infty p_r(\{x\})f^{(r)}(x)\,dx. \end{array} (18)

Here, {p_{k+1}} is defined as the integral of {p_k} with constant of integration chosen such that it has zero integral over the unit interval,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle p_1(x)=x-\frac12,\smallskip\\ &\displaystyle p_{k+1}(x)=\int_0^1yp_k(y)\,dy-\int_x^1p_k(y)\,dy. \end{array}

From this definition, it can be seen that {p_k(x)=B_k(x)/k!} where {B_k(x)} are the Bernoulli polynomials, {p_k(1)=B_k/k!} for Bernoulli numbers {B_k}, and (18) is the Euler-Maclaurin formula.
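The recursion for {p_k} is simple enough to run in exact rational arithmetic. The following Python sketch stores each polynomial as a list of coefficients, lowest degree first, and checks {p_k(1)=B_k/k!} against the first few Bernoulli numbers, using the convention {B_1=+1/2} which matches {p_1(1)=1/2}. The helper names are hypothetical and the table of Bernoulli numbers is typed in by hand.

```python
from fractions import Fraction
from math import factorial

def next_p(p):
    """Coefficients of p_{k+1}(x) = int_0^1 y p_k(y) dy - int_x^1 p_k(y) dy,
    given the coefficients of p_k (lowest degree first)."""
    # Antiderivative P of p_k with P(0) = 0.
    P = [Fraction(0)] + [c / (i + 1) for i, c in enumerate(p)]
    moment = sum(c / (i + 2) for i, c in enumerate(p))  # int_0^1 y p_k(y) dy
    P[0] = moment - sum(P)                              # sum(P) is P(1)
    return P

def p_polys(r):
    """Return the list [p_1, ..., p_r]."""
    polys = [[Fraction(-1, 2), Fraction(1)]]            # p_1(x) = x - 1/2
    while len(polys) < r:
        polys.append(next_p(polys[-1]))
    return polys

# Bernoulli numbers B_1, ..., B_6 with the convention B_1 = +1/2.
bernoulli = [Fraction(1, 2), Fraction(1, 6), Fraction(0),
             Fraction(-1, 30), Fraction(0), Fraction(1, 42)]

for k, (p, b) in enumerate(zip(p_polys(6), bernoulli), start=1):
    value_at_one = sum(p)                               # p_k(1)
    assert value_at_one == b / factorial(k)
    if k >= 2:
        assert p[0] == value_at_one                     # p_k(0) = p_k(1)
    print(f"p_{k}(1) = {value_at_one}")
```

For example, this gives {p_2(x)=x^2/2-x/2+1/12}, which is {B_2(x)/2!}.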

In particular, the derivatives of {f(x)=x^{-s}} can be computed as

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle f^{(k)}(x)&\displaystyle=(-1)^ks(s+1)\cdots(s+k-1)x^{-s-k}\smallskip\\ &\displaystyle=(-1)^ks^{\overline k}x^{-s-k} \end{array}

with {s^{\overline k}} denoting the rising factorial, which is just a polynomial in {s}.

Taking {f(x)=x^{-s}} for {\Re(s) > 1}, the left hand side of identity (18) is {\zeta(s)} and the first integral on the right is {1/(s-1)}. Applying this to definition (1) of the zeta function,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \zeta(s)=&\displaystyle\frac1{s-1}+p_1(1)s^{\overline 0}+p_2(1)s^{\overline 1}+\cdots\smallskip\\ &\displaystyle\quad+p_r(1)s^{\overline{r-1}}-s^{\overline r}\int\limits_1^\infty p_r(\{x\})x^{-s-r}\,dx. \end{array} (19)

As {p_r(\{x\})} is bounded, the integrand on the right is of size {O(x^{-\sigma-r})} with {\sigma=\Re(s)}, so the integral converges absolutely on {\Re(s) > 1-r}. As {r} is an arbitrary positive integer, we have the analytic extension.

Theorem 11 The function {\zeta(s)} defined on {\Re(s) > 1} by (1) continues to a meromorphic function on {{\mathbb C}} with a single simple pole at {s=1} of residue {1}.
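One pleasant consequence of (19) is that it gives the values of {\zeta} at the non-positive integers exactly. At {s=-m}, the rising factorial {s^{\overline k}} vanishes for all {k > m} so, taking {r=m+2}, the integral term drops out and only finitely many terms remain. The Python sketch below evaluates these in rational arithmetic. The function names are hypothetical, and the recursion for {p_k} is repeated so that the block is self-contained.

```python
from fractions import Fraction

def p_at_one(r):
    """Return [p_1(1), ..., p_r(1)] from the recursion defining p_{k+1}."""
    p = [Fraction(-1, 2), Fraction(1)]          # p_1(x) = x - 1/2, lowest degree first
    values = []
    for _ in range(r):
        values.append(sum(p))                   # p(1) is the sum of the coefficients
        P = [Fraction(0)] + [c / (i + 1) for i, c in enumerate(p)]
        P[0] = sum(c / (i + 2) for i, c in enumerate(p)) - sum(P)
        p = P
    return values

def rising(s, k):
    """Rising factorial s(s+1)...(s+k-1)."""
    out = Fraction(1)
    for j in range(k):
        out *= s + j
    return out

def zeta_nonpositive(m):
    """zeta(-m) for an integer m >= 0, from (19) with r = m + 2."""
    s = Fraction(-m)
    r = m + 2
    assert rising(s, r) == 0                    # the integral term of (19) vanishes
    # enumerate starts at k = 0, pairing p_{k+1}(1) with the k-th rising factorial
    return 1 / (s - 1) + sum(pk * rising(s, k) for k, pk in enumerate(p_at_one(r)))

for m in range(4):
    print(f"zeta({-m}) = {zeta_nonpositive(m)}")
```

This reproduces the value {\zeta(0)=-1/2} mentioned above, together with the well-known values {\zeta(-1)=-1/12}, {\zeta(-2)=0} and {\zeta(-3)=1/120}.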

Furthermore, the integral on the right of (19) is uniformly bounded over {\Re(s)\ge\alpha}, for any {\alpha > 1-r}, and the {s^{\overline k}} terms are polynomials, so we have the following bound on the growth of the zeta function.

Lemma 12 For every real number {\alpha}, there exists an {A} such that

\displaystyle  \zeta(s)=O\left(\lvert s\rvert^A\right)

over {\Re(s)\ge\alpha}, as {\lvert s\rvert\rightarrow\infty}.

I purposefully did not put in any specific value for {A} here, as the point is that the zeta function is polynomially bounded on each right half-plane and, in any case, there are better values available from applying the functional equation.

Mellin Transforms

We show that the Mellin transform of a Schwartz function {f\in\mathcal S} can be continued from the region {\Re(s) > 0} to the entire complex plane, proving theorem 3. Choosing a positive integer {N}, write

\displaystyle  R_f(x) = f(x)-1_{\{\lvert x\rvert < 1\}}\sum_{n=0}^{N-1} \frac{f^{(n)}(0)}{n!}x^n.

This is bounded, and for {\lvert x\rvert < 1} is just the remainder term in the Taylor polynomial approximation of {f}. By Taylor’s theorem, {R_f(x)=O(\lvert x\rvert^N)} as {x} approaches {0}. So, {R_f(x)\lvert x\rvert^{s-1}} is absolutely integrable on {\Re(s) > -N} and, hence, {M(R_f,s)} is a well-defined analytic function on this domain. The transform of the polynomial term can be computed,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle M(1_{\{\lvert x\rvert < 1\}}x^n,s) &\displaystyle= \int_{-1}^1 x^n\lvert x\rvert^{s-1}\,dx\smallskip\\ &\displaystyle= (1+(-1)^n)\int_0^1x^{s+n-1}\,dx\smallskip\\ &\displaystyle=\frac{1+(-1)^n}{s+n}. \end{array}

The Mellin transform of {f} is then,

\displaystyle  M(f,s)=M(R_f,s)+2\sum_{n=0}^{N-1}1_{\{n{\rm\ is\ even}\}}\frac{f^{(n)}(0)}{n!}\frac1{s+n}.

This statement holds for {\Re(s) > 0} but, as the right hand side is a well-defined meromorphic function on {\Re(s) > -N}, it extends {M(f,s)} to a meromorphic function on this domain. The poles arise from the {1/(s+n)} terms with the residue stated in theorem 3. By choosing {N} arbitrarily large, we have the extension to the complex plane.
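As a concrete check, consider the Gaussian {f(x)=e^{-\pi x^2}}, whose Mellin transform has the standard closed form {\pi^{-s/2}\Gamma(s/2)}. This closed form is an outside fact that I am assuming here rather than deriving. The expression above predicts poles at the non-positive even integers {s=-n}, with residue {2f^{(n)}(0)/n!}, and the Taylor series of the Gaussian gives {f^{(2k)}(0)/(2k)!=(-\pi)^k/k!}. The Python sketch below estimates each residue as {\epsilon M(f,-n+\epsilon)} for a small {\epsilon}. The function names and the value of `eps` are illustrative.

```python
import math

def mellin_gauss(s):
    """M(f, s) for f(x) = exp(-pi x^2), assuming the closed form pi^(-s/2) Gamma(s/2)."""
    return math.pi ** (-s / 2) * math.gamma(s / 2)

def residue_estimate(n, eps=1e-6):
    """Approximate the residue at s = -n by eps * M(f, -n + eps)."""
    return eps * mellin_gauss(-n + eps)

def predicted_residue(n):
    """2 f^(n)(0) / n! for even n, using f^(2k)(0)/(2k)! = (-pi)^k / k!."""
    k = n // 2
    return 2 * (-math.pi) ** k / math.factorial(k)

for n in (0, 2, 4, 6):
    est, pred = residue_estimate(n), predicted_residue(n)
    assert math.isclose(est, pred, rel_tol=1e-4)
    print(f"n={n}: estimated {est:.6f}, predicted {pred:.6f}")

# At odd n there is no pole, and the transform is finite.
print("M(f, -1) =", mellin_gauss(-1.0))
```

The alternating signs of the residues come from the factor {(-\pi)^k}.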

Poisson Summation

The second proof of the functional equation given by Riemann in his 1859 paper made use of the following identity of Jacobi,

\displaystyle  2\sum_{n=1}^\infty e^{-n^2\pi x}+1=x^{-\frac12}\left(2\sum_{n=1}^\infty e^{-n^2\pi/x}+1\right).
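Jacobi’s identity is easily tested numerically, as both sums converge extremely quickly for {x} away from {0}. Here is a minimal Python sketch, where the name `theta` and the truncation at 50 terms are my own choices.

```python
import math

def theta(x, terms=50):
    """1 + 2 * sum_{n >= 1} exp(-n^2 pi x), truncated after `terms` terms."""
    return 1 + 2 * math.fsum(math.exp(-n * n * math.pi * x)
                             for n in range(1, terms + 1))

for x in (0.5, 1.0, 2.0, 3.7):
    lhs = theta(x)
    rhs = theta(1 / x) / math.sqrt(x)
    assert math.isclose(lhs, rhs, rel_tol=1e-12)
    print(f"x={x}: lhs={lhs:.12f}, rhs={rhs:.12f}")
```

The identity exchanges small and large values of {x}, so whichever side has the larger argument needs only a handful of terms.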

For the Mellin transform version of the functional equation, we make use of the Poisson summation formula. To avoid having to explicitly write limits everywhere, the notation {\sum_n} is used to denote the sum as {n} ranges over the integers {{\mathbb Z}}.

Theorem 13 If {f\in \mathcal S} has Fourier transform {\hat f} then,

\displaystyle  \sum_n f(n)=\sum_n\hat f(n). (20)

Jacobi’s identity is just a special case of this using {f(u)=e^{-u^2\pi x}}. The Poisson summation formula can be proved using Fourier series. The idea is that, for any Schwartz function {f}, we can define a periodic {g\colon{\mathbb R}\rightarrow{\mathbb C}} by

\displaystyle  g(x)=\sum_nf(x+n) (21)

Since {f} and its derivatives vanish rapidly at {\infty}, this sum is uniformly convergent, with smooth limit. Writing out its Fourier expansion,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} &\displaystyle g(x)=\sum_nc_ne^{2\pi inx}\smallskip\\ &\displaystyle c_n=\int_0^1 g(x)e^{-2\pi i nx}\,dx, \end{array}

the Fourier coefficients can be evaluated,

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle c_n &\displaystyle=\sum_m\int_0^1f(x+m)e^{-2\pi i n x}dx\smallskip\\ &\displaystyle=\sum_m\int_m^{m+1}f(x)e^{-2\pi i n x}\,dx\smallskip\\ &\displaystyle=\int f(x)e^{-2\pi i nx}\,dx=\hat f(n). \end{array}

Evaluating (21) and its Fourier expansion at {x=0} proves theorem 13,

\displaystyle  \sum_nf(n)=g(0)=\sum_n c_n=\sum_n\hat f(n).

In practice, it is convenient to express the Poisson summation formula in a slightly more general way. For each fixed {x\in{\mathbb R}^*}, the Fourier transform of {y\mapsto f(yx)} is equal to {\lvert x\rvert^{-1}\hat f(y/x)} and, putting this in (20), gives the following alternative statement of Poisson summation.

Theorem 14 If {f\in \mathcal S} has Fourier transform {\hat f} then, for any {x\in{\mathbb R}^*},

\displaystyle  \sum_n f(nx)=\frac1{\lvert x\rvert}\sum_n\hat f(n/x) (22)
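To test (22) on a function which is not its own Fourier transform, we can use a shifted Gaussian {f(u)=e^{-\pi(u-a)^2}}. With the convention {\hat f(y)=\int f(u)e^{-2\pi iuy}\,du} used above, the shift rule gives {\hat f(y)=e^{-2\pi iay}e^{-\pi y^2}}. The Python sketch below compares the two sides of (22). The shift `A`, the truncation and the function names are arbitrary choices for illustration.

```python
import cmath
import math

A = 0.3  # hypothetical shift parameter a

def f(u):
    return math.exp(-math.pi * (u - A) ** 2)

def f_hat(y):
    # Fourier transform of f, by the shift rule applied to exp(-pi u^2)
    return cmath.exp(-2j * math.pi * A * y) * math.exp(-math.pi * y * y)

def poisson_sides(x, terms=60):
    """Return both sides of (22), with the sums truncated to |n| <= terms."""
    ns = range(-terms, terms + 1)
    lhs = math.fsum(f(n * x) for n in ns)
    rhs = sum(f_hat(n / x) for n in ns) / abs(x)
    return lhs, rhs

for x in (0.5, 1.0, 2.5, -1.7):
    lhs, rhs = poisson_sides(x)
    assert abs(rhs - lhs) < 1e-12   # also confirms the imaginary part cancels
    print(f"x={x}: lhs={lhs:.12f}, rhs={rhs.real:.12f}")
```

The right hand side is a sum of complex terms, but the contributions from {n} and {-n} are conjugate, so the total is real.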

The Functional Equation

The proof of the functional equation starts with the following identity

\displaystyle  \int f(nx)\lvert x\rvert^s\,d^*x = \lvert n\rvert^{-s}\int f(x)\lvert x\rvert^s\,d^*x.

Here, {n} is any nonzero integer, {f} is a Schwartz function on the reals, {\Re(s) > 0}, and {d^*x=dx/\lvert x\rvert} represents the Haar measure on the multiplicative group {{\mathbb R}^*}. The identity is achieved simply by substituting {x} with {x/n}. Restricting to {\Re(s) > 1}, we can sum over {n},

\displaystyle  \int \sum_{n\not=0}f(nx)\lvert x\rvert^s\,d^*x = 2M(f,s)\zeta(s). (23)

What we would really like to do here is to simply substitute in (22) for the sum of {f(nx)}, substitute {x} by {x^{-1}} in the integral, and immediately derive the functional equation (11). Unfortunately this leads to divergent sums and integrals. Instead, start by rearranging (22) as

\displaystyle  \sum_{n\not=0}f(nx)=\frac1{\lvert x\rvert}\sum_{n\not=0}\hat f(n/x) + \frac1{\lvert x\rvert}\hat f(0)-f(0).

We will apply this to the integrand in (23), but only over the range with {\lvert x\rvert < 1}.

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle \int\limits_{\lvert x\rvert < 1}\sum_{n\not=0}f(nx)\lvert x\rvert^s d^*x&\displaystyle=\int\limits_{\lvert x\rvert < 1}\left(\sum_{n\not=0}\hat f(n/x)\lvert x\rvert^{s-1}+\hat f(0)\lvert x\rvert^{s-1}-f(0)\lvert x\rvert^s\right)\,d^*x\smallskip\\ &\displaystyle=\int\limits_{\lvert x\rvert > 1}\sum_{n\not=0}\hat f(nx)\lvert x\rvert^{1-s}\,d^*x+2\frac{\hat f(0)}{s-1}-2\frac{f(0)}{s} \end{array}

Here, we substituted {x^{-1}} for {x} in the first term on the right hand side, and used the exact value for the integral in the other two terms. Using this in (23),

\displaystyle  \setlength\arraycolsep{2pt} \begin{array}{rl} \displaystyle M(f,s)\zeta(s)=&\displaystyle\frac12\int\limits_{\lvert x\rvert > 1}\sum_{n\not=0}\left(f(nx)\lvert x\rvert^s+\hat f(nx)\lvert x\rvert^{1-s}\right)\,d^*x\smallskip\\ &\displaystyle\qquad+\frac{\hat f(0)}{s-1}-\frac{f(0)}{s}. \end{array} (24)

As {f\in\mathcal S} vanishes faster than any power of {x} at infinity, the sum {\sum_{n\not=0}f(nx)} is absolutely convergent and also vanishes faster than any power of {x}. The same statement holds for {\hat f}, so the integral in (24) is defined for all {s\in{\mathbb C}} and is analytic. This extends {M(f,s)\zeta(s)} to a meromorphic function on the complex plane with poles and residues as stated in theorem 4. Finally, noting that the Fourier transform of {\hat f(x)} is {f(-x)}, the right hand side of (24) is unchanged if {s} is replaced by {1-s} and {f} is replaced by {\hat f}, proving the functional equation (11).
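Identity (24) also gives a practical way of computing {\zeta(s)} outside of the region where the sum (1) converges. Taking {f(x)=e^{-\pi x^2}}, which equals its own Fourier transform and has {f(0)=1}, the symmetry under {x\mapsto-x} and {n\mapsto-n} reduces the integral in (24) to {2\int_1^\infty\psi(x)(x^{s-1}+x^{-s})\,dx} with {\psi(x)=\sum_{n\ge1}e^{-\pi n^2x^2}}. The Python sketch below evaluates this by Simpson’s rule for real {s}, assuming the standard closed form {M(f,s)=\pi^{-s/2}\Gamma(s/2)}. The integration cutoff, the step count and the function names are my own choices.

```python
import math

def psi(x, terms=20):
    """sum_{n >= 1} exp(-pi n^2 x^2), truncated."""
    return math.fsum(math.exp(-math.pi * n * n * x * x)
                     for n in range(1, terms + 1))

def zeta_via_24(s, upper=8.0, steps=4000):
    """zeta(s) for real s, excluding s = 1 and s = 0, -2, -4, ...,
    from (24) with f(x) = exp(-pi x^2)."""
    def g(x):
        return psi(x) * (x ** (s - 1) + x ** (-s))

    # Composite Simpson rule on [1, upper]; the integrand is negligible beyond.
    h = (upper - 1.0) / steps
    total = g(1.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(1.0 + i * h)
    integral = total * h / 3

    mellin = math.pi ** (-s / 2) * math.gamma(s / 2)   # assumed closed form of M(f, s)
    return (2 * integral + 1 / (s - 1) - 1 / s) / mellin

assert abs(zeta_via_24(2.0) - math.pi ** 2 / 6) < 1e-8
assert abs(zeta_via_24(-1.0) + 1 / 12) < 1e-8
print("zeta(1/2) is approximately", zeta_via_24(0.5))
```

The integral and the two pole terms are identical at {s=2} and {s=-1}, which is the symmetry under {s\mapsto1-s} noted above, and only the factor {M(f,s)} differs between the two evaluations.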

A Note on Mellin Transforms

The Mellin transform was defined above as an integral over the real numbers (9), which deviates slightly from the more usual definition as an integral over the positive reals,

\displaystyle  M_{{\mathbb R}^+}(f,s)=\int_0^\infty f(x) x^{s-1}\,dx.

The reason for using the alternative definition is that we were interested in the functional equation relating it to the Fourier transform defined over the reals, so also required the Mellin transform to be defined with the same domain of integration. However, in doing so, we lose some properties, such as the existence of an inversion formula which, for the usual Mellin transform, is

\displaystyle  f(x)=\frac1{2\pi i}\int\limits_{c-i\infty}^{c+i\infty}M_{{\mathbb R}^+}(f,s) x^{-s}\,ds,

for any fixed {c} in the domain where the Mellin transform is absolutely integrable.

The transform defined by (9) is unchanged if {f(x)} is replaced by {f(-x)}, so is not one-to-one and cannot be inverted. The best that can be done is to recover the even part of {f},

\displaystyle  f(x)+f(-x)=\frac1{2\pi i}\int\limits_{c-i\infty}^{c+i\infty}M(f,s)\lvert x\rvert^{-s}\,ds.

An explanation for the non-invertibility of the Mellin transform defined over {{\mathbb R}^*} is that we did not consider the full set of characters. We only looked at characters of the form {x\mapsto\lvert x\rvert^{s}} but, for example, this excludes the function {{\rm sgn}(\cdot)} mapping positive reals to {1} and negative reals to {-1}. More generally, for any {s\in{\mathbb C}} and {\epsilon=\pm1}, a character {\chi_{\epsilon,s}} can be defined by

\displaystyle  \chi_{\epsilon,s}(x)=\begin{cases} \lvert x\rvert^s,&{\rm if\ }\epsilon=1,\\ {\rm sgn}(x)\lvert x\rvert^s,&{\rm if\ }\epsilon=-1. \end{cases} (25)

It is immediate that this is a continuous map from the nonzero reals to {{\mathbb C}^*} satisfying {\chi_{\epsilon,s}(xy)=\chi_{\epsilon,s}(x)\chi_{\epsilon,s}(y)}. In fact, it can be shown that (25) gives the full set of characters on {{\mathbb R}^*}. The characters given by {\epsilon=1}, which we made use of in the discussion above, are precisely those which are trivial on the roots of unity {\{\pm1\}} and are called unramified characters. Those given by {\epsilon=-1} are called ramified.

The Mellin transform with respect to an arbitrary character {\chi\colon{\mathbb R}^*\rightarrow{\mathbb C}^*} is

\displaystyle  M(f,\chi)=\int_{-\infty}^\infty f(x)\chi(x)\,\frac{dx}{\lvert x\rvert}.

The transform defined using the full set of characters can be inverted,

\displaystyle  f(x)=\frac{1}{4\pi i}\int\limits_{c-i\infty}^{c+i\infty}\sum_{\epsilon=\pm1}M(f,\chi_{\epsilon,s})\chi_{\epsilon,s}(x)^{-1}\,ds.

The argument given above, including the proof of the functional equation (11), could have been carried out with the full set of characters. In that case, the zeta function is replaced by

\displaystyle  \zeta(\epsilon,s)=\frac12\sum_{n\not=0}\chi_{\epsilon,s}(n)^{-1}.

For the unramified characters, {\epsilon=1}, this is the usual Riemann zeta function. For ramified characters, {\zeta} equals {0} and the functional equation reduces to the trivial statement {0=0}. So, there was nothing to be gained by including ramified characters in the discussion.