Proof of Dirichlet’s theorem, part 2

Section 7.7 Proof of Dirichlet’s theorem, part 2

In this section we’ll finish the proof of Dirichlet’s theorem. What is left is to show that \(L(1,\chi) \neq 0\) for every character \(\chi\text{.}\) We already know it for the principal character \(\chi_0\) (in fact \(L(1,\chi_0)=\infty\)), so we just have to show it for nonprincipal characters. This will happen in two steps: First we’ll show it for real nonprincipal characters. The Dirichlet hyperbola method makes an appearance here. Second, we’ll show it for nonreal nonprincipal characters, by a clever trick which hinges on complex conjugation.

We fix a positive integer \(q\text{.}\) For this part of the proof, the integer \(a\) (as in, the primes such that \(p \equiv a \pmod{q}\)) doesn’t play a role.

Subsection 7.7.1 Real nonprincipal characters

Definition 7.7.1.

A character \(\chi\) is called real if its values are in the real numbers.

Since the values of \(\chi\) are roots of unity (or zero), the only real values are \(\pm 1, 0\text{.}\) For example (we won’t use these examples in the proof):

The principal character is real.
If \(q\) is an odd prime, the Legendre symbol \(\chi(n) = \left(\frac{n}{q}\right)\) is a real character.
As an exercise, one can show that modulo an odd prime \(q\text{,}\) the principal character and the Legendre symbol are the only real characters.

Theorem 7.7.2.

Let \(\chi\) be a real nonprincipal Dirichlet character modulo \(q\text{.}\) Then \(L(1,\chi) \neq 0\text{.}\)

Proof.

We know that \(\chi\) is completely multiplicative.

Claim 7.7.3. Step 1.

Let \(r(n) = \sum_{d \mid n} \chi(d)\text{.}\) Then \(r(n) \geq 0\) for all \(n \in \bbN\) and \(r(n) \geq 1\) if \(n\) is a square.

Proof.

Since \(\chi\) is (completely) multiplicative and \(r = \chi * 1\) (Dirichlet convolution), it follows that \(r\) is multiplicative (but not necessarily completely multiplicative).

Let \(p\) be a prime and \(a \geq 1\text{.}\) Then

\begin{align} r(p^a) \amp = \sum_{d \mid p^a} \chi(d) \tag{7.7.1}\\ \amp = \chi(1) + \chi(p) + \chi(p^2) + \dotsb + \chi(p^a) \tag{7.7.2}\\ \amp = \chi(1) + \chi(p) + \chi(p)^2 + \dotsb + \chi(p)^a \tag{7.7.3}\\ \amp = \begin{cases} a+1, \amp \text{if } \chi(p) = 1; \\ 1, \amp \text{if } \chi(p) = 0; \\ 1, \amp \text{if } \chi(p) = -1 \text{ and } a=2k; \\ 0, \amp \text{if } \chi(p) = -1 \text{ and } a=2k+1. \end{cases} \tag{7.7.4} \end{align}

Let \(n = p_1^{a_1} \dotsm p_k^{a_k}\text{.}\) Then \(r(n) = r(p_1^{a_1}) \dotsm r(p_k^{a_k})\text{.}\) We saw that every \(r(p^a) \geq 0\text{,}\) hence \(r(n) \geq 0\text{.}\) If \(n\) is a perfect square, then every power \(a_i\) is even, so each \(r(p_i^{a_i}) \geq 1\text{.}\) It follows that \(r(n) \geq 1\text{.}\)
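
To see the case analysis in action, here is a small worked example. It assumes a specific character that the proof itself does not use: the nonprincipal character modulo \(4\text{,}\) with \(\chi(n) = 1\) for \(n \equiv 1 \pmod{4}\text{,}\) \(\chi(n) = -1\) for \(n \equiv 3 \pmod{4}\text{,}\) and \(\chi(n) = 0\) for even \(n\text{.}\)

\begin{align*} r(9) \amp = \chi(1) + \chi(3) + \chi(9) = 1 - 1 + 1 = 1, \\ r(5) \amp = \chi(1) + \chi(5) = 1 + 1 = 2, \\ r(3) \amp = \chi(1) + \chi(3) = 1 - 1 = 0, \\ r(45) \amp = r(9)\, r(5) = 1 \cdot 2 = 2, \\ r(15) \amp = r(3)\, r(5) = 0 \cdot 2 = 0. \end{align*}

As a check on multiplicativity, summing \(\chi\) directly over the divisors \(1, 3, 5, 9, 15, 45\) of \(45\) gives \(1 - 1 + 1 + 1 - 1 + 1 = 2\text{.}\)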

Claim 7.7.4. Step 2.

Let \(A(x) = \sum_{n \leq x} \frac{r(n)}{\sqrt{n}}\text{.}\) Then \(A(x) \to \infty\) as \(x \to \infty\text{.}\)

Proof.

We have

\begin{align*} A(x) \amp = \sum_{n \leq x} \frac{r(n)}{\sqrt{n}} \\ \amp \geq \sum_{n \leq x, n = m^2} \frac{1}{\sqrt{n}} \\ \amp = \sum_{m \leq \sqrt{x}} \frac{1}{m} \end{align*}

This is a partial sum of the harmonic series; as \(x \to \infty\) it diverges.
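
To get a feel for how slowly this lower bound grows, it can be made explicit. The sketch below uses the standard integral comparison \(\sum_{m \leq M} \frac{1}{m} \geq \log(M+1)\text{,}\) which is not stated in the text above, with \(M = \lfloor \sqrt{x} \rfloor\text{:}\)

\begin{equation*} A(x) \geq \sum_{m \leq \sqrt{x}} \frac{1}{m} \geq \int_1^{\lfloor \sqrt{x} \rfloor + 1} \frac{dt}{t} = \log\left(\lfloor \sqrt{x} \rfloor + 1\right) \gt \log \sqrt{x} = \frac{1}{2}\log x\text{.} \end{equation*}

For instance, with \(x = 16\) the squares \(1, 4, 9, 16\) already contribute at least \(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} = \frac{25}{12}\) to \(A(16)\text{.}\)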

Claim 7.7.5. Step 3.

\(A(x) = 2\sqrt{x} L(1,\chi) + O(1) \text{.}\)

Proof.

We will use the Dirichlet hyperbola method.

We have

\begin{align} A(x) \amp = \sum_{n \leq x} \frac{r(n)}{\sqrt{n}} \tag{7.7.5}\\ \amp = \sum_{n \leq x} \frac{1}{\sqrt{n}} \sum_{d \mid n} \chi(d) \tag{7.7.6}\\ \amp = \sum_{dk \leq x} \frac{\chi(d)}{\sqrt{dk}} \tag{7.7.7}\\ \amp = \sum_{dk \leq x} \frac{\chi(d)}{\sqrt{d}} \cdot \frac{1}{\sqrt{k}} \tag{7.7.8} \end{align}

This sum is over lattice points \((d,k)\) where \(d,k\) are positive integers such that \(dk \leq x\text{,}\) i.e., lying on or underneath the hyperbola \(dk = x\) in the \(dk\)-plane. We cover the region under the hyperbola by two overlapping parts and keep track of their overlap: the first part is defined by \(d \leq \sqrt{x}\text{,}\) the second is defined by \(k \leq \sqrt{x}\text{,}\) and the third is the intersection of the first two, so \(d, k \leq \sqrt{x}\) (both). Every lattice point with \(dk \leq x\) lies in at least one of the first two parts, since \(d \gt \sqrt{x}\) and \(k \gt \sqrt{x}\) together would give \(dk \gt x\text{.}\) Visually, if \(d\) is the horizontal coordinate and \(k\) is the vertical coordinate, then the region under the hyperbola with \(d \leq \sqrt{x}\) is the “upward” branch, the region under the hyperbola with \(k \leq \sqrt{x}\) is the “rightward” branch, and the region with both \(d,k \leq \sqrt{x}\) is a square. Define \(S_1,S_2,S_3\) as the sums over lattice points in these regions:

\begin{equation} S_1 = \sum_{dk \leq x, d \leq \sqrt{x}} \frac{\chi(d)}{\sqrt{d}} \frac{1}{\sqrt{k}}, \qquad S_2 = \sum_{dk \leq x, k \leq \sqrt{x}} \frac{\chi(d)}{\sqrt{d}} \frac{1}{\sqrt{k}}, \qquad S_3 = \sum_{dk \leq x, d \leq \sqrt{x}, k \leq \sqrt{x}} \frac{\chi(d)}{\sqrt{d}} \frac{1}{\sqrt{k}}.\tag{7.7.9} \end{equation}

We have

\begin{equation} A(x) = S_1 + S_2 - S_3.\tag{7.7.10} \end{equation}
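
The same inclusion-exclusion can be checked by hand on a simpler sum. As an illustration only (this is not part of the proof), replace each term \(\frac{\chi(d)}{\sqrt{dk}}\) by \(1\text{,}\) so that the three sums simply count lattice points, and take \(x = 10\text{,}\) so that \(\sqrt{x} \approx 3.16\text{:}\)

\begin{align*} \#\{dk \leq 10,\ d \leq 3\} \amp = \lfloor 10/1 \rfloor + \lfloor 10/2 \rfloor + \lfloor 10/3 \rfloor = 10 + 5 + 3 = 18, \\ \#\{dk \leq 10,\ k \leq 3\} \amp = 18 \quad \text{(by symmetry)}, \\ \#\{d \leq 3,\ k \leq 3\} \amp = 9, \\ 18 + 18 - 9 \amp = 27 = \sum_{d \leq 10} \lfloor 10/d \rfloor\text{.} \end{align*}

The last sum counts all lattice points under the hyperbola directly, column by column, and it agrees with the count obtained from the three regions.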

Now our goal is to approximate the \(S_i\text{.}\) Looking ahead, we can expect to encounter certain sums, so it will be helpful to recall the following approximations:

\begin{equation} \sum_{k \leq y} \frac{1}{k^s} = \frac{y^{-s+1}}{-s+1} + O(1)\tag{7.7.11} \end{equation}

and

\begin{equation} \sum_{d \leq z} \frac{\chi(d)}{d^s} = L(s,\chi) + O\left(\frac{1}{z^s}\right)\text{,}\tag{7.7.12} \end{equation}

the first holding for all \(0 \lt s \lt 1\) and the second for all \(s \gt 0\) (recall that \(\chi\) is nonprincipal). (The textbook mentions a more accurate approximation for the sum of \(1/k^s\) that can be found by using Euler summation.) We will use both for \(s = \frac{1}{2}\text{,}\) and the second also for \(s = 1\text{.}\) In order to avoid headaches about fractions, here are explicit statements for \(s = \frac{1}{2}\text{:}\)

\begin{equation} \sum_{k \leq y} \frac{1}{\sqrt{k}} = \frac{y^{1/2}}{1/2} + O(1) = 2\sqrt{y} + O(1)\tag{7.7.13} \end{equation}

and

\begin{equation} \sum_{d \leq z} \frac{\chi(d)}{\sqrt{d}} = L(1/2,\chi) + O\left(\frac{1}{\sqrt{z}}\right)\text{.}\tag{7.7.14} \end{equation}

We will apply these with various \(y\)’s and \(z\)’s; warning, sometimes we will use, for example, \(y = \sqrt{x}\) in which case \(\sqrt{y} = x^{1/4}\) (similar for \(z\)).
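
For readers who want to see where (7.7.12) comes from, here is a sketch by partial summation. The notation \(X(t) = \sum_{d \leq t} \chi(d)\) is introduced only for this sketch. It uses that \(\chi\) is nonprincipal, so the sum of \(\chi\) over a full period vanishes and hence \(|X(t)| \leq \phi(q)\) for all \(t\text{.}\) For \(s \gt 0\text{:}\)

\begin{align*} \left| \sum_{d \gt z} \frac{\chi(d)}{d^s} \right| \amp = \left| -\frac{X(z)}{z^s} + s \int_z^\infty \frac{X(t)}{t^{s+1}} \, dt \right| \\ \amp \leq \frac{\phi(q)}{z^s} + s\,\phi(q) \int_z^\infty \frac{dt}{t^{s+1}} \\ \amp = \frac{2\phi(q)}{z^s}\text{.} \end{align*}

So the series for \(L(s,\chi)\) converges for every \(s \gt 0\text{,}\) and its tail beyond \(z\) is \(O(1/z^s)\) with an implied constant depending only on \(q\text{.}\)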

Now, here are the approximations of the \(S_i\text{.}\) The easiest is \(S_3\text{:}\)

\begin{align} S_3 \amp = \left( \sum_{d \leq \sqrt{x}} \frac{\chi(d)}{\sqrt{d}} \right) \left( \sum_{k \leq \sqrt{x}} \frac{1}{\sqrt{k}} \right) \tag{7.7.15}\\ \amp = (L(1/2,\chi) + O(1/x^{1/4})) (2x^{1/4} + O(1)) \tag{7.7.16}\\ \amp = 2x^{1/4} L(1/2,\chi) + O(1) \tag{7.7.17} \end{align}

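Next, we approximate \(S_1\text{.}\) Here the approximation (7.7.13) is not accurate enough: with \(y = x/d\) its error term \(O(1)\) depends on \(d\text{,}\) so it cannot be pulled out of the sum over \(d\text{,}\) and bounding it term by term only gives a total error of \(O(x^{1/4})\text{.}\) Instead we use the more accurate approximation \(\sum_{k \leq y} \frac{1}{\sqrt{k}} = 2\sqrt{y} + C + O(1/\sqrt{y})\text{,}\) where \(C\) is a constant, which follows from Euler summation:

\begin{align} S_1 \amp = \sum_{d \leq \sqrt{x}} \frac{\chi(d)}{\sqrt{d}} \sum_{k \leq \frac{x}{d}} \frac{1}{\sqrt{k}} \tag{7.7.18}\\ \amp = \sum_{d \leq \sqrt{x}} \frac{\chi(d)}{\sqrt{d}} \left(2\sqrt{x/d} + C + O\left(\sqrt{d/x}\right) \right) \tag{7.7.19}\\ \amp = 2\sqrt{x} \sum_{d \leq \sqrt{x}} \frac{\chi(d)}{d} + C \sum_{d \leq \sqrt{x}} \frac{\chi(d)}{\sqrt{d}} + O\left( \sum_{d \leq \sqrt{x}} \frac{1}{\sqrt{d}} \sqrt{d/x} \right) \tag{7.7.20}\\ \amp = 2\sqrt{x} (L(1,\chi) + O(1/\sqrt{x})) + C (L(1/2,\chi) + O(1/x^{1/4})) + O(1) \tag{7.7.21}\\ \amp = 2 \sqrt{x} L(1,\chi) + O(1) \tag{7.7.22} \end{align}

In (7.7.20) the error terms add up to \(O(1)\) because each of them is \(O(1/\sqrt{x})\) and there are at most \(\sqrt{x}\) of them. (This is why the textbook uses the more accurate approximation here.) Finally, \(S_2\) is similar to \(S_1\text{,}\) but here (7.7.14) suffices as it stands: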

\begin{align} S_2 \amp = \sum_{k \leq \sqrt{x}} \frac{1}{\sqrt{k}} \sum_{d \leq \frac{x}{k}} \frac{\chi(d)}{\sqrt{d}} \tag{7.7.23}\\ \amp = \sum_{k \leq \sqrt{x}} \frac{1}{\sqrt{k}} (L(1/2,\chi) + O(1/\sqrt{x/k})) \tag{7.7.24}\\ \amp = L(1/2,\chi) \sum_{k \leq \sqrt{x}} \frac{1}{\sqrt{k}} + O(1/\sqrt{x}) \sum_{k \leq \sqrt{x}} \frac{1}{\sqrt{k}} O(\sqrt{k}) \tag{7.7.25}\\ \amp = L(1/2,\chi) (2x^{1/4} + O(1)) + O(1/\sqrt{x}) O(\sqrt{x}) \tag{7.7.26}\\ \amp = 2x^{1/4} L(1/2,\chi) + O(1) \tag{7.7.27} \end{align}

Putting it all together:

\begin{align} A(x) \amp = S_1 + S_2 - S_3 \tag{7.7.28}\\ \amp = (2 \sqrt{x} L(1,\chi) + O(1)) + (2 x^{1/4} L(1/2,\chi) + O(1)) - (2x^{1/4}L(1/2,\chi) + O(1)) \tag{7.7.29}\\ \amp = 2 \sqrt{x} L(1,\chi) + O(1) \tag{7.7.30} \end{align}

which proves the claim.

Claim 7.7.6. Conclusion.

\(L(1,\chi) \neq 0\text{.}\)

Proof.

If \(L(1,\chi)=0\) then \(A(x) = O(1)\text{,}\) contradicting that \(A(x) \to \infty\) as \(x \to \infty\text{.}\)
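
As a sanity check in a case where the value is known, consider the nonprincipal character modulo \(4\) (an illustrative choice, not used in the proof), with \(\chi(n) = 1\) for \(n \equiv 1 \pmod{4}\text{,}\) \(\chi(n) = -1\) for \(n \equiv 3 \pmod{4}\text{,}\) and \(\chi(n) = 0\) for even \(n\text{.}\) Its \(L\)-series at \(1\) is the Leibniz series:

\begin{equation*} L(1,\chi) = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \dotsb = \frac{\pi}{4} \neq 0\text{,} \qquad \text{so} \qquad A(x) = \frac{\pi}{2}\sqrt{x} + O(1)\text{.} \end{equation*}

In this case \(A(x)\) grows like a constant times \(\sqrt{x}\text{,}\) far faster than the logarithmic lower bound from Step 2 requires.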

Subsection 7.7.2 Nonreal nonprincipal characters

Theorem 7.7.7.

Let \(\chi\) be any nonprincipal Dirichlet character modulo \(q\text{.}\) Then \(L(1,\chi) \neq 0\text{.}\)

Proof.

Let \(N\) be the number of Dirichlet characters \(\chi\) modulo \(q\) with \(L(1,\chi) = 0\text{.}\) Our goal is to show that \(N = 0\text{.}\)

Recall that:

For the principal character \(\chi_0\text{,}\) \(\sum_{n \leq x} \frac{\chi_0(n)\Lambda(n)}{n} = \log x + O(1)\text{.}\)
For \(\chi \neq \chi_0\text{,}\) if \(L(1,\chi) \neq 0\text{,}\) then \(\sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} = O(1)\text{.}\)
For \(\chi \neq \chi_0\text{,}\) if \(L(1,\chi) = 0\text{,}\) then \(\sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} = -\log x + O(1)\text{.}\)

Adding this up for all characters modulo \(q\text{,}\) we get

\begin{equation} \sum_{\chi \mod q} \sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} = (1-N)\log x + O(1)\tag{7.7.31} \end{equation}

We get \(+\log x\) from the principal character, \(-N\log x\) from the nonprincipal characters with \(L(1,\chi)=0\text{,}\) and \(O(1)\) from each of the remaining characters.

We can use the orthogonality relations to simplify the left hand side of the equation:

\begin{align} \sum_{\chi \mod q} \sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} \amp = \sum_{n \leq x} \left( \sum_{\chi \mod q} \chi(n) \right) \frac{\Lambda(n)}{n} \tag{7.7.32}\\ \amp = \phi(q) \sum_{n \leq x, n \equiv 1 \pmod{q}} \frac{\Lambda(n)}{n} \tag{7.7.33}\\ \amp \geq 0 \tag{7.7.34} \end{align}

We used that \(\sum_{\chi \mod q} \chi(n)\) is equal to \(\phi(q)\) if \(n \equiv 1 \pmod{q}\text{,}\) or \(0\) otherwise. And we used that each term \(\frac{\Lambda(n)}{n} \geq 0\text{.}\)
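
A small example may make the orthogonality relation concrete. Take \(q = 5\) (an illustrative choice). Since \(2\) generates the units modulo \(5\text{,}\) a character is determined by \(\chi(2)\text{,}\) which must be a fourth root of unity; this gives the \(\phi(5) = 4\) characters with \(\chi(2) \in \{1, i, -1, -i\}\text{.}\) Using \(4 \equiv 2^2\) and \(3 \equiv 2^3 \pmod{5}\text{,}\) and listing the characters in that order:

\begin{align*} \sum_{\chi \mod 5} \chi(1) \amp = 1 + 1 + 1 + 1 = 4 = \phi(5), \\ \sum_{\chi \mod 5} \chi(2) \amp = 1 + i + (-1) + (-i) = 0, \\ \sum_{\chi \mod 5} \chi(4) \amp = 1 + (-1) + 1 + (-1) = 0, \\ \sum_{\chi \mod 5} \chi(3) \amp = 1 + (-i) + (-1) + i = 0\text{.} \end{align*}

Only the residue class of \(1\) survives the sum over characters, which is exactly what is used to pass from (7.7.32) to (7.7.33).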

The result is that

\begin{equation*} (1-N)\log x + O(1) \geq 0 \end{equation*}

If \(N \gt 1\) this is impossible: then \(1-N \leq -1\text{,}\) so the left hand side goes to \(-\infty\) as \(x \to \infty\text{.}\) Therefore \(N \leq 1\text{.}\)

Finally we rule out the possibility that \(N=1\text{.}\) If \(N=1\) it means that there is a nonprincipal character \(\chi\) with \(L(1,\chi) = 0\text{.}\) By Theorem 7.7.2, \(\chi\) must be nonreal, i.e., \(\chi(a) \notin \bbR\) for some \(a\text{.}\)

But then \(\overline{\chi}\) is also a character, distinct from \(\chi\) (\(\overline{\chi} \neq \chi\)), and we have

\begin{equation*} L(1,\overline{\chi}) = \sum \frac{\overline{\chi}(n)}{n} = \overline{\sum \frac{\chi(n)}{n}} = \overline{L(1,\chi)} = 0 \end{equation*}

which would force \(N \geq 2\text{.}\)

This contradiction shows that \(N=1\) is impossible. So it must be \(N=0\text{.}\) There are no Dirichlet characters \(\chi\) with \(L(1,\chi)=0\text{.}\)
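
To illustrate the conjugation step, here is a concrete nonreal character. It assumes \(q = 5\) and the character determined by \(\chi(2) = i\text{;}\) neither choice comes from the proof. Its values on \(1, 2, 3, 4\text{,}\) and those of its conjugate, are

\begin{equation*} (\chi(1),\chi(2),\chi(3),\chi(4)) = (1, i, -i, -1)\text{,} \qquad (\overline{\chi}(1),\overline{\chi}(2),\overline{\chi}(3),\overline{\chi}(4)) = (1, -i, i, -1)\text{.} \end{equation*}

The two characters differ at \(2\) and \(3\text{,}\) so they are distinct. Since \(L(1,\overline{\chi}) = \overline{L(1,\chi)}\text{,}\) a zero at one of them would always come paired with a zero at the other.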

Subsection 7.7.3 Conclusion: Proof of Dirichlet’s theorem

Theorem 7.7.8. Dirichlet’s theorem.

Let \(q,a\) be positive integers with \((a,q)=1\text{.}\) Then

\begin{equation*} \sum_{p \leq x, p \equiv a \pmod{q}} \frac{\log p}{p} = \frac{1}{\phi(q)} \log x + O(1)\text{.} \end{equation*}

In particular, there are infinitely many primes congruent to \(a\) modulo \(q\text{.}\)

Proof.

We have

\begin{align} \sum_{p \leq x, p \equiv a \pmod{q}} \frac{\log p}{p} \amp = \sum_{n \leq x, n \equiv a \pmod{q}} \frac{\Lambda(n)}{n} + O(1) \tag{7.7.35}\\ \amp = \frac{1}{\phi(q)} \sum_{\chi \mod q} \overline{\chi}(a) \sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} + O(1) \tag{7.7.36}\\ \amp = \frac{1}{\phi(q)} \log x + \frac{1}{\phi(q)} \sum_{\chi \mod q, \chi \neq \chi_0} \overline{\chi}(a) \sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} + O(1) \tag{7.7.37}\\ \amp = \frac{1}{\phi(q)} \log x + \frac{1}{\phi(q)} \sum_{\chi \mod q, \chi \neq \chi_0} \overline{\chi}(a) \cdot O(1) + O(1) \tag{7.7.38}\\ \amp = \frac{1}{\phi(q)} \log x + O(1) \tag{7.7.39} \end{align}

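as claimed. In (7.7.35) the prime powers \(p^k\) with \(k \geq 2\) contribute only \(O(1)\text{;}\) (7.7.36) is the orthogonality relation for characters; (7.7.37) uses \(\overline{\chi_0}(a) = 1\) (because \((a,q)=1\)) together with the estimate for the principal character; and (7.7.38) uses Theorem 7.7.7, which gives \(L(1,\chi) \neq 0\) and hence a bounded inner sum for every nonprincipal \(\chi\text{.}\)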
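
To make the statement concrete, take \(q = 4\) (an illustrative choice), so that \(\phi(4) = 2\) and the admissible residues are \(a = 1\) and \(a = 3\text{:}\)

\begin{equation*} \sum_{p \leq x, p \equiv 1 \pmod{4}} \frac{\log p}{p} = \frac{1}{2}\log x + O(1)\text{,} \qquad \sum_{p \leq x, p \equiv 3 \pmod{4}} \frac{\log p}{p} = \frac{1}{2}\log x + O(1)\text{.} \end{equation*}

Adding the two and including the single prime \(p = 2\) recovers Mertens’ estimate \(\sum_{p \leq x} \frac{\log p}{p} = \log x + O(1)\text{,}\) so in this weighted sense the primes are split evenly between the two residue classes.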
