Average Order of Growth

Section 5.1 Average Order of Growth

Recall some of the previously defined arithmetic functions:

\(d(n)\text{:}\) number of positive divisors of \(n\)
\(\sigma(n)\text{:}\) sum of positive divisors of \(n\)

These can “oscillate” wildly: for example \(d(n)\) can be arbitrarily large (\(d(2^{k-1}) = k\)) and it also drops down to \(d(n)=2\) infinitely often (specifically, at primes). We can “smooth out the oscillations” by averaging. That is, for any \(x \gt 1\text{,}\) we consider the average value of \(d(n)\) for all \(n\) up to \(x\text{:}\)

\begin{equation*} \frac{1}{x} \sum_{n \leq x} d(n) \text{.} \end{equation*}

Similarly for \(\sigma\) and any other arithmetic function.

The book uses capital letters for averages. Thus if \(f\) is an arithmetic function, then \(F\) is the average,

\begin{equation*} F(x) = \frac{1}{x} \sum_{n \leq x} f(n) \text{.} \end{equation*}

Unless confusion seems likely, it may be preferable to use an overline, as in \(\overline{f}\text{:}\)

\begin{equation*} \overline{f}(x) = \frac{1}{x} \sum_{n \leq x} f(n) \text{.} \end{equation*}

Observe that the domain of the average function is the real numbers rather than the positive integers. At this point, this is a more or less arbitrary choice (“why not”).

Definition 5.1.1.

Let \(f\) be an arithmetic function. We say \(g\) is an average order of \(f\) if \(g\) is positive and

\begin{equation} \overline{f}(x) \sim \frac{1}{x} \sum_{n \leq x} g(n) \text{.}\tag{5.1.1} \end{equation}

Typically we require \(g\) to be monotone.

The average order is not unique. For example, if \(g\) is an average order for \(f\) and \(g'(n) = g(n)\) for all but finitely many \(n\text{,}\) then \(g'\) is also an average order for \(f\text{.}\) Indeed, the difference between summing \(g(n)\) and summing \(g'(n)\) is some constant (finite) value; after dividing by \(x\text{,}\) the difference goes to zero as \(x \to \infty\text{.}\)

Nevertheless we will say “the average order” instead of “an average order”, just because it sounds nicer.

Subsection 5.1.1 Average order of number of divisors

Theorem 5.1.2.

For \(d(n)\text{,}\) the number of positive divisors of \(n\text{,}\) we have \(\sum_{n \leq x} d(n) = x \log x + O(x)\text{.}\) Thus, \(\overline{d}(x) \sim \log x\text{.}\) The average order of \(d(n)\text{,}\) the number of divisors of \(n\text{,}\) is \(\log n\text{.}\) That is, \(\overline{d}(x) = \frac{1}{x}\sum_{n \leq x} d(n) \sim \frac{1}{x} \sum_{n \leq x} \log n\text{.}\)

Proof.

We have

\begin{align*} \sum_{n \leq x} d(n) \amp = \sum_{n \leq x} \sum_{d \mid n} 1 \\ \amp = \sum_{d \leq x} \sum_{n \leq x, d \mid n} 1 \\ \amp = \sum_{d \leq x} \sum_{dk \leq x} 1 \\ \amp = \sum_{d \leq x} \left\lfloor \frac{x}{d} \right\rfloor \\ \amp = \sum_{d \leq x} \left( \frac{x}{d} + O(1) \right) \\ \amp = x \sum_{d \leq x} \frac{1}{d} + O(x) \\ \amp = x (\log x + O(1)) + O(x) \\ \amp = x \log x + O(x) \text{.} \end{align*}

We used \(\sum_{d \leq x} \frac{1}{d} = \log x + O(1)\text{.}\)

From this it follows that

\begin{equation*} \overline{d}(x) = \frac{1}{x} \sum_{n \leq x} d(n) = \log x + O(1)\text{.} \end{equation*}

In particular

\begin{equation*} \frac{\overline{d}(x)}{\log x} = 1 + \frac{O(1)}{\log x}\text{.} \end{equation*}

The fraction on the right goes to zero as \(x \to \infty\text{,}\) hence \(\overline{d}(x) \sim \log x\) as claimed.

Finally, we have seen that

\begin{equation*} \sum_{n \leq x} \log n = x \log x - x + O(1) \text{.} \end{equation*}

After dividing by \(x\) we get \(\log x + O(1)\text{,}\) matching \(\overline{d}(x)\text{.}\)

Subsection 5.1.2 Average order of sum of divisors

Theorem 5.1.3.

For the divisor sum function \(\sigma\) we have \(\sum_{n \leq x} \sigma(n) = \frac{\pi^2}{12} x^2 + O(x \log x)\text{.}\)

Proof.

We have

\begin{align*} \sum_{n \leq x} \sigma(n) \amp = \sum_{n \leq x} \sum_{d \mid n} d \\ \amp = \sum_{dk \leq x} d \\ \amp = \sum_{k \leq x} \sum_{d \leq x/k} d \\ \amp = \sum_{k \leq x} \frac{1}{2} \left\lfloor \frac{x}{k} \right\rfloor \left( \left\lfloor \frac{x}{k} \right\rfloor + 1 \right) \\ \amp = \frac{1}{2} \sum_{k \leq x} \left( \frac{x^2}{k^2} + O(x/k) \right) \\ \amp = \frac{1}{2} x^2 \sum_{k \leq x} \frac{1}{k^2} + O\left( x \sum_{k \leq x} \frac{1}{k} \right) \text{.} \end{align*}

We can use

\begin{equation*} \sum_{k \leq x} \frac{1}{k} = \log x + O(1) \end{equation*}

to rewrite

\begin{equation*} O\left( x \sum_{k \leq x} \frac{1}{k} \right) = O(x \log x) + O(x) = O(x \log x) \text{.} \end{equation*}

For the \(x^2\) term, we use these facts. First,

\begin{equation} \sum_{k = 1}^\infty \frac{1}{k^2} = \frac{\pi^2}{6} \text{.}\tag{5.1.2} \end{equation}

This is Euler’s solution of the Basel problem. For now we will use this without giving a proof (many proofs exist and can be found in various textbooks or online). Second, we have seen that for \(\alpha \gt 1\text{,}\)

\begin{equation} \sum_{n \gt x} \frac{1}{n^\alpha} = O(x^{-\alpha+1}) \text{.}\tag{5.1.3} \end{equation}

Combining these,

\begin{align*} \frac{1}{2} x^2 \sum_{k \leq x} \frac{1}{k^2} \amp = \frac{1}{2} x^2 \left( \sum_{k=1}^\infty \frac{1}{k^2} - \sum_{k \gt x} \frac{1}{k^2} \right) \\ \amp = \frac{1}{2} x^2 \left( \frac{\pi^2}{6} - O(x^{-1}) \right) \\ \amp = \frac{\pi^2}{12} x^2 - O(x) \text{.} \end{align*}

This completes the proof.

Corollary 5.1.4.

The average order of \(\sigma(n)\) is \(\frac{\pi^2}{6} n\text{.}\)

Proof.

The claim is that

\begin{equation*} \frac{1}{x} \sum_{n \leq x} \sigma(n) \sim \frac{1}{x} \sum_{n \leq x} \frac{\pi^2}{6} n \text{.} \end{equation*}

The left hand side is

\begin{equation*} \frac{1}{x} \sum_{n \leq x} \sigma(n) = \frac{\pi^2}{12} x + O(\log x) \end{equation*}

and the right hand side is

\begin{align*} \frac{1}{x} \sum_{n \leq x} \frac{\pi^2}{6} n \amp = \frac{\pi^2}{12} \frac{1}{x} \lfloor x \rfloor ( \lfloor x \rfloor + 1 ) \\ \amp = \frac{\pi^2}{12} x + O(1) \text{.} \end{align*}

The two quantities are asymptotic since

\begin{equation*} \lim_{x \to \infty} \frac{\frac{\pi^2}{12} x + O(\log x)}{\frac{\pi^2}{12} x + O(1) } = 1\text{.} \end{equation*}

This is not quite the same thing as saying that the average value of \(\sigma(n)/n\) should be \(\pi^2/6\text{.}\) In our statements above we averaged \(\sigma(n)\text{.}\) We have not evaluated the sum or the average of \(\sigma(n)/n\text{.}\)

Proposition 5.1.5.

\begin{equation} \sum_{n \leq x} \frac{\sigma(n)}{n} = \frac{\pi^2}{6} x + O(\log x) \text{.}\tag{5.1.4} \end{equation}

Proof.

We have

\begin{align*} \sum_{n \leq x} \frac{\sigma(n)}{n} \amp = \sum_{n \leq x} \frac{1}{n} \sum_{d \mid n} d \\ \amp = \sum_{dk \leq x} \frac{d}{dk} \\ \amp = \sum_{dk \leq x} \frac{1}{k} \\ \amp = \sum_{k \leq x} \sum_{d \leq x/k} \frac{1}{k} \\ \amp = \sum_{k \leq x} \frac{1}{k} \left\lfloor \frac{x}{k} \right\rfloor \\ \amp = \sum_{k \leq x} \frac{1}{k} \left( \frac{x}{k} - \left\{ \frac{x}{k} \right\} \right) \\ \amp = \sum_{k \leq x} \frac{x}{k^2} - O(\sum_{k \leq x} \frac{1}{k}) \\ \amp = \frac{\pi^2}{6} x - O(1/x) - O(\log x) \text{.} \end{align*}

This completes the proof.

Subsection 5.1.3 Average order of Euler totient

Theorem 5.1.6.

\(\sum_{n \leq x} \varphi(n) = \frac{3}{\pi^2} x^2 + O(x \log x) \text{.}\)

Proof.

The proof is similar to before. We use

\begin{equation*} \varphi(n) = n \sum_{d \mid n} \frac{\mu(d)}{d} \text{,} \end{equation*}

where \(\mu\) is the Möbius function. After some algebra very similar to the proof of the previous theorem,

\begin{equation*} \sum_{n \leq x} \varphi(n) = \frac{1}{2} x^2 \left( \sum_{d=1}^\infty \frac{\mu(d)}{d^2} \right) + O(x \log x)\text{.} \end{equation*}

We have mentioned that

\begin{equation} \left(\sum_{n=1}^\infty \frac{1}{n^s} \right) \left(\sum_{n=1}^\infty \frac{\mu(n)}{n^s} \right) = 1\text{,}\tag{5.1.5} \end{equation}

so in particular for \(s=2\) we get

\begin{equation} \sum_{n=1}^{\infty} \frac{\mu(n)}{n^2} = \frac{1}{\sum_{n=1}^{\infty} \frac{1}{n^2}} = \frac{1}{\pi^2/6} = \frac{6}{\pi^2}\text{.}\tag{5.1.6} \end{equation}

This completes the proof.

Subsection 5.1.4 Extensions

We have been concerned with averages (and sums) of arithmetic functions. It would be interesting to investigate other statistical properties such as the standard deviation, or the second, third, and higher moments. (The \(k\)th moment is simply the average of the \(k\)th power. Moments are important in statistics and analysis.)

As described in the textbook’s chapter notes, there has been some study of modified averages taking, instead of all \(n \leq x\text{,}\) just those \(n \leq x\) with \(n \equiv a \pmod{q}\) for some \(a, q\text{.}\) It seems there is plenty still to be figured out in that direction.

Another way to modify the sequence would be to fix an irrational number \(\epsilon\) and a parameter \(0 \leq p \leq 1\) and take those \(n \leq x\) such that the fractional part \(\{n\epsilon\} \lt p\text{.}\) (This is a way to take a sort of “random” subset of integers.)

Prev Top Next