Skip to main content

Section 7.6 Proof of Dirichlet’s theorem, part 1

In this section we’ll start to prove Dirichlet’s theorem. We’ll give essentially the whole proof except for the hardest part of the proof, showing that \(L(1,\chi) \neq 0\) for every character \(\chi\text{.}\) Showing that will be the second part of the proof, in the next section of notes.
As in the statement of Dirichlet’s theorem, we fix positive integers \(a,q\) such that \((a,q) = 1\text{.}\)

Subsection 7.6.1 Going from summing over primes to summing over an arithmetic progression

This step is similar to part of the proof of Mertens’s theorem.

Proof.

We have
\begin{align} \sum_{n \leq x, n \equiv a \pmod{q}} \frac{\Lambda(n)}{n} \amp = \sum_{p^k \leq x, p^k \equiv a \pmod{q}} \frac{\log p}{p^k} \tag{7.6.2}\\ \amp = \sum_{p \leq x, p \equiv a \pmod{q}} \frac{\log p}{p} + \sum_{p^k \leq x, k \geq 2, p^k \equiv a \pmod{q}} \frac{\log p}{p^k} . \tag{7.6.3} \end{align}
The last sum converges: we have
\begin{align} \sum_{p^k \leq x, k \geq 2, p^k \equiv a \pmod{q}} \frac{\log p}{p^k} \amp \leq \sum_n \log n \sum_{k \geq 2} \frac{1}{n^k} \tag{7.6.4}\\ \amp = \sum_n \frac{\log n}{n(n-1)} \tag{7.6.5} \end{align}
which converges by comparison with a p-series.

Subsection 7.6.2 Summing over all positive integers by using the orthogonality relations of Dirichlet characters

Our next step is to take the sum of \(\frac{\Lambda(n)}{n}\) from a sum over \(n \equiv a \pmod{q}\) into a sum over all positive \(n\text{.}\) We will use the orthogonality relations of Dirichlet characters so that the terms \(n \equiv a \pmod{q}\) remain in the sum, while all other values of \(n\) cancel out, up to some bounded quantity.

Proof.

Recall that for any integers \(k,m\text{,}\) if \((m,q)=1\text{,}\) we have the orthogonality relation
\begin{equation*} \sum_{\chi \mod q} \chi(k) \overline{\chi}(m) = \begin{cases} \phi(q), \amp \text{if } k \equiv m \pmod{q}, \\ 0, \amp \text{otherwise}. \end{cases} \end{equation*}
Using this,
\begin{align} \sum_{n \leq x, n \equiv a \pmod{q}} \frac{\Lambda(n)}{n} \amp = \sum_{n \leq x} \frac{\Lambda(n)}{n} \left( \frac{1}{\phi(q)} \sum_{\chi \mod q} \chi(n) \overline{\chi}(a) \right) \tag{7.6.9}\\ \amp = \frac{1}{\phi(q)} \sum_{\chi \mod q} \overline{\chi}(a) \sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} . \tag{7.6.10} \end{align}
This shows the first claim.
For the principal character, the summation of \(\frac{\chi_0(n)\Lambda(n)}{n} \) agrees with the summation of \(\frac{\Lambda(n)}{n} \) except for terms where \((n,q) \gt 1\text{.}\) Mertens’s theorem includes the result that
\begin{equation*} \sum_{n \leq x} \frac{\Lambda(n)}{n} = \log x + O(1)\text{.} \end{equation*}
Now consider the terms with \((n,q) \gt 1\text{.}\) There is an additional restriction: \(\Lambda(n) \neq 0\) only when \(n = p^k\) is a power of a prime. So the difference between the sum we are considering, and the sum that appeared in Mertens’s theorem, is just for terms where \(n = p^k\) is a power of a prime, for a prime \(p\) that divides the modulus \(q\text{.}\)
There are only finitely many such primes. Each such prime \(p\) contributes a difference of \(\sum_{p^k \leq x} \frac{\log p}{p^k}\text{.}\) We have seen already the convergence of the full series
\begin{equation*} \sum_{k=1}^{\infty} \frac{\log p}{p^k}\text{.} \end{equation*}
So the sum for \(p^k \leq x\) is a bounded amount. Adding up these bounded amounts for the finitely many primes that divide \(q\) gives a bounded amount, in other words an \(O(1)\text{.}\) This proves the second claim.
The third claim follows from the first one, separating out the terms for the principal and nonprincipal characters, and substituting the estimate for the principal character.
Combining this with step 1, we get
\begin{equation*} \sum_{p \leq x, p \equiv a \pmod{q}} \frac{\log p}{p} = \frac{1}{\phi(q)} \log x + \frac{1}{\phi(q)} \sum_{\chi \mod q, \chi \neq \chi_0} \overline{\chi}(a) \sum_{n \leq x} \frac{\chi(n) \Lambda(n)}{n} + O(1)\text{.} \end{equation*}
Our goal is to show that it is equal to \(\frac{1}{\phi(q)} \log x + O(1)\text{.}\) So, what is left is to show that the middle term is bounded (\(O(1)\)). In fact we will show that for each \(\chi \neq \chi_0\text{,}\) the sum \(\sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} \) is bounded. Since there are finitely many characters (\(\hat{G}\) is a finite group), then adding them all together gives a bounded quantity.

Remark 7.6.3.

This step is the first, last, and only place where the integer \(a\) (as in, the primes such that \(p \equiv a \pmod{q}\)) plays a role in the proof. Specifically, the integer \(a\) is used to provide the coefficients \(\overline{\chi}(a)\) used to combine the sums \(\sum_{n \leq x} \frac{\chi(n) \Lambda(n)}{n}\text{.}\)

Subsection 7.6.3 Each nonprincipal character contributes a bounded amount

Proof: Nonzero case.

Assume that \(L(1,\chi) \neq 0\text{.}\) Consider the sum
\begin{equation*} \sum_{n \leq x} \frac{\chi(n)\log n}{n}\text{.} \end{equation*}
By Corollary 7.5.3 this sum converges as \(x \to \infty\text{,}\) so it is bounded, i.e., \(O(1)\text{.}\)
On the other hand, we have
\begin{equation} \sum_{n \leq x} \frac{\chi(n)\log n}{n} = \sum_{n \leq x} \frac{\chi(n)}{n} \sum_{d \mid n} \Lambda(n) = \sum_{dk \leq x} \frac{\chi(dk)}{dk} \Lambda(d)\text{.}\tag{7.6.12} \end{equation}
The first equality follows from the identity \(\log n = \sum_{d \mid n} \Lambda(d)\) which we have seen before. The second equality is by substitution \(n=dk\text{.}\)
Since every Dirichlet character is completely multiplicative, we have \(\chi(dk) = \chi(d)\chi(k)\) (for all \(d,k\text{,}\) including if \((d,k) \gt 1\)). So the summand in the last sum can be rewritten as
\begin{equation} \frac{\chi(dk)}{dk} \Lambda(d) = \frac{\chi(d)\Lambda(d)}{d} \cdot \frac{\chi(k)}{k}\text{.}\tag{7.6.13} \end{equation}
First we will fix \(d\) and evaluate the sum over \(k \leq \frac{x}{d}\text{.}\) We know that for \(d \leq x\text{,}\)
\begin{equation} \sum_{k \leq \frac{x}{d}} \frac{\chi(k)}{k} = L(1,\chi) + O\left(\frac{d}{x}\right)\tag{7.6.14} \end{equation}
by Corollary 7.5.3. Therefore
\begin{align*} \sum_{dk \leq x} \frac{\chi(d)\Lambda(d)}{d} \cdot \frac{\chi(k)}{k} \amp = \sum_{d \leq x} \frac{\chi(d)\Lambda(d)}{d} \cdot \left( L(1,\chi) + O\left(\frac{d}{x}\right) \right) \\ \amp = L(1,\chi) \sum_{d \leq x} \frac{\chi(d)\Lambda(d)}{d} + \sum_{d \leq x} \frac{\chi(d)\Lambda(d)}{d} \cdot O\left(\frac{d}{x}\right) \\ \amp = L(1,\chi) \sum_{d \leq x} \frac{\chi(d)\Lambda(d)}{d} + \sum_{d \leq x} \chi(d)\Lambda(d) \cdot O\left(\frac{1}{x}\right) \end{align*}
We want to simplify that last sum. First, observe that we can bound the sum:
\begin{equation*} \left| \sum_{d \leq x} \chi(d)\Lambda(d) \right| \leq \sum_{d \leq x} |\Lambda(d)| = \sum_{d \leq x} \Lambda(d)\text{.} \end{equation*}
Here we are using that \(|\chi(d)| \leq 1\) because each \(\chi(d)\) is a root of unity, or zero. Every value of \(\Lambda(d)\) is either \(\log p\) or zero; and \(\log p \gt 0\) (we have \(\log 2 \lt 1\text{,}\) but still \(\gt 0\)). Recall that \(\sum_{d \leq x} \Lambda(d) = \psi(x)\text{,}\) Chebyshev’s \(\psi\) function, by definition. And recall that by Chebyshev’s estimates, \(\psi(x) = O(x)\text{.}\)
So, \(\sum_{d \leq x} \chi(d)\Lambda(d) = O(x)\text{.}\) Can we conclude that \(\sum_{d \leq x} \chi(d)\Lambda(d)O(\frac{1}{x}) = O(1)\text{?}\) Not immediately. The problem is that the implicit constant appearing in the \(O(1/x)\) factor might a priori depend on \(d\text{;}\) and it might grow as \(d\) grows.
Well, the \(O(1/x)\) factors are not arbitrary, independent things. They come from the same place: the tails of the sum \(\sum \frac{\chi(k)}{k}\text{.}\) Specifically, we have
\begin{equation} \left| \sum_{k \gt x} \frac{\chi(k)}{k} \right| \leq C \cdot \frac{1}{x}\tag{7.6.15} \end{equation}
for some constant \(C\) that doesn’t depend on \(x\text{.}\) In particular that means
\begin{equation*} \left| \sum_{k \gt \frac{x}{d}} \frac{\chi(k)}{k} \right| \leq C \cdot \frac{d}{x} \end{equation*}
where the constant \(C\) doesn’t depend on \(x\) or on \(d\text{.}\) So, the sum we’re trying to bound can now be bounded:
\begin{equation} \left| \sum_{d \leq x} \frac{\chi(d)\Lambda(d)}{d} O\left(\frac{d}{x}\right) \right| \leq \sum_{d \leq x} \left| \frac{\chi(d)\Lambda(d)}{d} \cdot C \cdot \frac{d}{x} \right| \leq \frac{C}{x} \sum_{d \leq x} \Lambda(d)\tag{7.6.16} \end{equation}
and again, the sum of \(\Lambda(d)\) is equal to \(\psi(x)\) which is \(O(x)\text{.}\)
In conclusion,
\begin{equation} L(1,\chi) \sum_{d \leq x} \frac{\chi(d) \Lambda(d)}{d} + O(1) = \sum_{n \leq x} \frac{\chi(n)\log(n)}{n} = O(1)\tag{7.6.17} \end{equation}
and so therefore
\begin{equation} L(1,\chi) \sum_{d \leq x} \frac{\chi(d) \Lambda(d)}{d} = O(1)\tag{7.6.18} \end{equation}
Finally, since \(L(1,\chi) \neq 0\text{,}\) we can divide to get that the sum is \(O(1)\) as claimed.
Remark 7.6.5.
It could have all gone wrong if it had been
\begin{equation*} \sum_{d \leq x} \chi(d)\Lambda(d) \cdot O\left(\frac{1}{x}\right) = \sum_{d \leq x} \chi(d)\Lambda(d) \cdot \frac{C_d}{x} \end{equation*}
with \(C_d \to \infty\text{.}\) This highlights the importance of knowing when the implicit constant in big O notation is dependent or independent of other variables.

Proof: Zero case.

Assume that \(L(1,\chi) = 0\text{.}\) We will show that \(\log x + \sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n}\) is bounded.
Recall the identities
\begin{equation*} \Lambda(n) = \sum_{d \mid n} \mu(d) \log \frac{n}{d} = - \sum_{d \mid n} \mu(d) \log d\text{.} \end{equation*}
(As a reminder: We have mentioned the identity \(\log n = \sum_{d \mid n} \Lambda(d)\text{.}\) Then Möbius inversion gives \(\Lambda = \mu * \log\text{,}\) which is the first equality. The second equality is algebraic simplification.)
Using these identities,
\begin{align} \sum_{d \mid n} \mu(d) \log \frac{x}{d} \amp = (\log x) \sum_{d \mid n} \mu(d) - \sum_{d \mid n} \mu(d) \log d \tag{7.6.19}\\ \amp = (\log x) \sum_{d \mid n} \mu(d) + \Lambda(n) \tag{7.6.20}\\ \amp = \begin{cases} \log x, \amp \text{if } n = 1 \\ \Lambda(n), \amp \text{if } n \gt 1 \end{cases}\tag{7.6.21} \end{align}
because if \(n = 1\) then \(\sum_{d \mid n} \mu(d) = 1\) and \(\Lambda(n) = 0\text{,}\) while if \(n \gt 1\) then \(\sum_{d \mid n} \mu(d) = 0\text{.}\)
Now we consider \(\log x + \sum_{n \leq x} \frac{\chi(n) \Lambda(n)}{n}\text{,}\) with the goal of showing it is bounded. Both of the expressions \(\log x\) and \(\Lambda(n)\) can be replaced with \(\sum_{d \mid n} \mu(d) \log \frac{x}{d}\text{.}\) In addition, when \(n=1\) we have \(\frac{\chi(n)}{n} = \frac{\chi(1)}{1} = 1\text{.}\) After we make this substitution, we will interchange the order of summation:
\begin{align} \log x + \sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} \amp = \sum_{n \leq x} \frac{\chi(n)}{n} \sum_{d \mid n} \mu(d) \log \frac{x}{d}\tag{7.6.22}\\ \amp = \sum_{d \leq x} \mu(d) \log \frac{x}{d} \sum_{k \leq \frac{x}{d}} \frac{\chi(dk)}{dk}\tag{7.6.23}\\ \amp = \sum_{d \leq x} \mu(d) \left( \log \frac{x}{d}\right) \frac{\chi(d)}{d} \sum_{k \leq \frac{x}{d}} \frac{\chi(k)}{k}\tag{7.6.24}\\ \amp = \sum_{d \leq x} \mu(d) \left( \log \frac{x}{d}\right) \frac{\chi(d)}{d} \left( L(1,\chi) + O\left(\frac{d}{x} \right) \right)\tag{7.6.25}\\ \amp = \sum_{d \leq x} \mu(d) \left( \log \frac{x}{d}\right) \frac{\chi(d)}{d} \cdot O\left(\frac{d}{x} \right)\tag{7.6.26}\\ \amp = O\left(\frac{1}{x}\right) \sum_{d \leq x} \mu(d) \left( \log \frac{x}{d} \right) \chi(d)\tag{7.6.27}\\ \amp = O\left(\frac{1}{x}\right) O\left(\sum_{d \leq x} \log \frac{x}{d} \right)\tag{7.6.28}\\ \amp = O(1) \quad \text{(!!) (Exercise)}\tag{7.6.29} \end{align}
Therefore \(\sum_{n \leq x} \frac{\chi(n)\Lambda(n)}{n} = -\log x + O(1)\text{.}\)