Section 4.2 Sums of Monotone Functions
Our goal here is to compare sums with integrals,
\begin{equation*}
\sum_{y \lt n \leq x} f(n),
\qquad
\int_y^x f(t) dt\text{.}
\end{equation*}
In general these should be “close”: the integral represents the area under the graph of \(f\) between \(y\) and \(x\text{,}\) while the sum represents the total area of a “histogram” with bars of width \(1\) and height \(f(n)\text{.}\) This is, roughly, a Riemann sum for the integral. In this section we’ll make this more precise.
By standard results in real analysis, every function that is monotone on a closed, bounded interval is Riemann integrable (even if it is not continuous). Likewise, every unimodal function (increasing up to some point, then decreasing) is integrable, since it is piecewise monotone. So, it makes sense to talk about the integrals in this context.
The comparison between sum and integral is straightforward when the interval has integer endpoints:
Theorem 4.2.2.
If \(a\) and \(b\) are integers with \(a \lt b\) and \(f\) is (non-strictly) monotone on \([a,b]\text{,}\) then
\begin{equation}
\min\{f(a),f(b)\} \leq \sum_{n=a}^b f(n) - \int_a^b f(t) dt \leq \max\{f(a),f(b)\}\text{.}\tag{4.2.1}
\end{equation}
Proof.
First suppose \(f\) is monotone increasing. The minimum on the left is \(f(a)\) and the maximum on the right is \(f(b)\text{.}\) For every integer \(a \leq n \lt b\) and every real number \(t \in [n,n+1]\) we have
\begin{equation*}
f(a) \leq f(n) \leq f(t) \leq f(n+1) \leq f(b)\text{.}
\end{equation*}
Therefore
\begin{equation*}
f(n) = \int_n^{n+1} f(n) dt \leq \int_n^{n+1} f(t) dt \leq \int_n^{n+1} f(n+1) dt = f(n+1)\text{.}
\end{equation*}
The first and last equalities are simply integrals of constant values over an interval of length \(1\text{.}\)
Adding up these inequalities gives
\begin{equation*}
\int_a^b f(t) dt = \sum_{n=a}^{b-1} \int_n^{n+1} f(t) dt \leq \sum_{n=a}^{b-1} f(n+1)
= \sum_{n=a+1}^b f(n)\text{.}
\end{equation*}
Adding \(f(a)\) to both sides and then subtracting the integral from both sides gives
\begin{equation*}
f(a) \leq \sum_{n=a}^b f(n) - \int_a^b f(t) dt\text{.}
\end{equation*}
The upper bound by \(f(b)\) follows in the same way, starting instead from \(\sum_{n=a}^{b-1} f(n) \leq \int_a^b f(t) dt\) and adding \(f(b)\text{.}\) The case where \(f\) is decreasing is similar (or can be deduced by applying the increasing case to \(-f\)).
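For a quick concrete check, take \(f(t) = t^2\text{,}\) \(a = 1\text{,}\) \(b = 4\text{:}\) here
\begin{equation*}
\sum_{n=1}^{4} n^2 - \int_1^4 t^2 dt = 30 - 21 = 9\text{,}
\end{equation*}
which does lie between \(\min\{f(1),f(4)\} = 1\) and \(\max\{f(1),f(4)\} = 16\text{.}\)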
When the interval has arbitrary endpoints, we will obtain a similar comparison, although it takes a bit more work.
In the following statement, the textbook adds the condition \(y \lt \lfloor x \rfloor \text{.}\) This seems unnecessary.
Theorem 4.2.3.
If \(x,y \in \bbR\) with \(y \lt x\) and \(f\) is nonnegative and (non-strictly) monotone on \([y,x]\text{,}\) then
\begin{equation}
\left| \sum_{y \lt n \leq x} f(n) - \int_y^x f(t) dt \right| \leq \max\{f(y),f(x)\}\text{.}\tag{4.2.2}
\end{equation}
Proof.
First, take the case \(y \geq \lfloor x \rfloor\text{.}\) This means that there is no integer \(n\) such that \(y \lt n \leq x\text{,}\) so the sum is empty and its value is zero. At the same time, the interval \([y,x]\) has width at most \(1\text{,}\) and \(f\) is nonnegative, so the integral is nonnegative and at most the maximum value of \(f\) on the interval; by monotonicity this maximum is attained at an endpoint. So the claimed inequality holds in this case.
Now suppose \(y \lt \lfloor x \rfloor\text{,}\) so there is at least one integer \(n\) such that \(y \lt n \leq x\text{.}\) Suppose that \(f\) is (non-strictly) increasing. We will compare the integral \(\int_y^x f(t) dt\) with the sum \(\sum_{y \lt n \leq x} f(n)\text{.}\) This is not precisely a Riemann sum for the integral, but it is close. Let us regard the sum as the total area of the following figure: for each integer \(n\) with \(y \lt n \leq x\text{,}\) we take the rectangle whose base is the interval from \(n-1\) to \(n\) on the horizontal axis, and whose height is \(f(n)\text{,}\) the value of \(f\) at the right-hand endpoint of the base. So, this is something like a right-hand Riemann sum. It is not precisely a Riemann sum because the first base interval may extend to the left of \(y\text{,}\) and the last base interval (ending at \(\lfloor x \rfloor\)) may end to the left of \(x\text{.}\)
So, this sum (that is, the total area of the rectangles) differs from the integral (the total area under the graph of \(f\)) in three ways. First, in each interval \([n-1,n]\) the rectangle extends above the graph. This is because the function \(f\) is increasing, and the rectangle’s height is given by the height at the right-hand endpoint, i.e., the highest point of the graph on that interval. So, each rectangle includes a “triangle” region on top of the graph. Second, the leftmost rectangle includes a vertical strip over the base \([\lfloor y \rfloor, y]\) that is not included in the integral. Third, in contrast, the integral includes a vertical strip of area over the base \([\lfloor x \rfloor, x]\) which is not covered by any rectangle.
First consider the area covered by the rectangles but not by the integral; this accounts for the first and second types of error in the previous paragraph. We claim it adds up to at most \(f(x)\text{.}\) This is because the “error regions” of the rectangles occupy disjoint ranges of heights: in the rectangle over base \([n-1,n]\text{,}\) the “error region” lies between heights \(f(n-1)\) and \(f(n)\text{.}\) The leftmost rectangle is the one exception, as its “error region” may extend all the way down to height \(0\text{.}\) Still, if we slide these regions horizontally into one single vertical strip of width \(1\text{,}\) they do not overlap. Therefore their total area is at most that width times the height of the strip, which is \(f(\lfloor x \rfloor)\text{,}\) the greatest height of any rectangle involved in the sum. So the total of these errors is at most \(f(\lfloor x \rfloor) \leq f(x)\text{.}\)
Next consider the area covered by the integral but not the rectangles. This is the vertical strip on the right of the region, over the base \([\lfloor x \rfloor, x]\text{.}\) This is contained in a rectangle with that same base, and height \(f(x)\text{.}\) The width of that base is at most \(1\text{.}\) So, the area of this rectangle is at most \(f(x)\text{.}\) And then, the error from this source is at most \(f(x)\text{.}\)
Now the total error, that is, the difference between the sum and the integral, is the difference of these two quantities: the area counted by the rectangles but not by the integral, minus the area counted by the integral but not by the rectangles. Since each of these quantities is nonnegative and at most \(f(x)\text{,}\) their difference has magnitude at most \(f(x)\text{.}\) In other words, we have
\begin{equation*}
\left| \sum_{y \lt n \leq x} f(n) - \int_y^x f(t) dt \right| \leq f(x) = \max\{f(y),f(x)\}
\end{equation*}
as claimed.
What remains is the case where \(f\) is decreasing. This follows similarly, using the left-hand sum instead (the rectangle for \(n\) now has base \([n,n+1]\) and height \(f(n)\)).
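For a concrete check with non-integer endpoints, take \(f(t) = t\text{,}\) \(y = 0.5\text{,}\) \(x = 2.9\text{:}\) the sum is \(1 + 2 = 3\text{,}\) the integral is
\begin{equation*}
\int_{0.5}^{2.9} t dt = \frac{2.9^2 - 0.5^2}{2} = 4.08\text{,}
\end{equation*}
and indeed \(|3 - 4.08| = 1.08 \leq 2.9 = \max\{f(y),f(x)\}\text{.}\)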
Corollary 4.2.4.
If \(y \in \bbR\) and \(f\) is nonnegative and monotone on \([y,\infty)\text{,}\) then, for \(x \gt y\text{,}\)
\begin{equation}
\sum_{y \lt n \leq x} f(n) = \int_y^x f(t) dt + O(\max\{f(y),f(x)\})\text{.}\tag{4.2.3}
\end{equation}
Remark 4.2.5.
I believe the intention here is to regard \(y\) as a fixed number, and everything else as a function of \(x\text{.}\) The sum and the integral are functions of \(x\) explicitly. If \(f\) is increasing, then \(O(\max\{f(y),f(x)\}) = O(f(x)) = O(f)\text{.}\) If \(f\) is decreasing, then \(O(\max\{f(y),f(x)\}) = O(f(y)) = O(1)\text{.}\)
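For instance, with \(f(t) = \sqrt{t}\) (increasing) the error term in (4.2.3) is \(O(\sqrt{x})\text{,}\) while with \(f(t) = 1/t\) (decreasing) it is \(O(1)\text{;}\) compare the examples below.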
Proof.
We have seen that
\begin{equation}
\left| \sum_{y \lt n \leq x} f(n) - \int_y^x f(t)dt \right| \leq \max\{f(y),f(x)\}\tag{4.2.4}
\end{equation}
for all \(x \gt y\text{.}\) So
\begin{equation*}
\sum_{y \lt n \leq x} f(n) - \int_y^x f(t)dt = c(x) \max\{f(y),f(x)\}
\end{equation*}
where \(-1 \leq c(x) \leq 1\text{;}\) this is \(O(\max\{f(y),f(x)\})\text{.}\)
Corollary 4.2.6.
If \(f\) is nonnegative and unimodal on \([1,\infty)\) then
\begin{equation}
\sum_{n \leq x} f(n) = \int_1^x f(t)dt + O(1)\text{.}\tag{4.2.5}
\end{equation}
Remark 4.2.7.
The book says that \(f\) should be unimodal on \([1,x]\text{.}\) But requiring \(f\) to be unimodal on \([1,x]\) for all \(x \gt 1\) is equivalent to requiring \(f\) to be unimodal on \([1,\infty)\) (exercise). And it must be intended to require unimodality for all \(x\text{;}\) we can require \(f\) to be unimodal on \([1,x]\) for a particular \(x\text{,}\) but then “\(O(1)\)” doesn’t make sense.
Proof.
Say \(f\) is increasing on \([1,x_0]\) and decreasing on \([x_0,\infty)\text{.}\) Then for \(x \gt x_0\) we have
\begin{align*}
\left| \sum_{n \leq x} f(n) - \int_1^x f(t)dt \right|
\amp \leq \left| f(1) + \sum_{1 \lt n \leq x_0} f(n) - \int_1^{x_0} f(t)dt \right| \\
\amp \quad + \left| \sum_{x_0 \lt n \leq x} f(n) - \int_{x_0}^x f(t)dt \right| \\
\amp \leq f(1) + \max\{f(1),f(x_0)\} + \max\{f(x_0),f(x)\} \\
\amp = f(1) + 2 f(x_0) \text{.}
\end{align*}
This is bounded, hence \(O(1)\text{.}\) (If \(x \leq x_0\text{,}\) then \(f\) is monotone on \([1,x]\) and the same argument with the single interval \([1,x]\) gives the bound \(f(1) + \max\{f(1),f(x)\} \leq f(1) + 2 f(x_0)\text{.}\))
Example 4.2.8.
Let \(\alpha \geq 0\text{.}\) Then:
\begin{equation}
\sum_{n \leq x} n^\alpha = \frac{x^{\alpha+1}}{\alpha+1} + O(x^\alpha)\text{.}\tag{4.2.6}
\end{equation}
Indeed,
\begin{equation*}
\sum_{n \leq x} n^\alpha = \int_1^x t^\alpha dt + O(\max\{1^\alpha,x^\alpha\})\text{.}
\end{equation*}
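The maximum on the right is \(x^\alpha\) for \(x \geq 1\) (since \(\alpha \geq 0\)), and evaluating the integral gives
\begin{equation*}
\int_1^x t^\alpha dt = \frac{x^{\alpha+1} - 1}{\alpha+1} = \frac{x^{\alpha+1}}{\alpha+1} + O(1)\text{,}
\end{equation*}
where the \(O(1)\) can be absorbed into the \(O(x^\alpha)\) term.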
Example 4.2.9.
For \(\alpha \gt 0\text{,}\) \(\alpha \neq 1\text{,}\) we have
\begin{equation}
\sum_{n \leq x} \frac{1}{n^\alpha} = \frac{x^{-\alpha+1}}{-\alpha+1} + O(1)\text{.}\tag{4.2.7}
\end{equation}
Indeed,
\begin{equation*}
\sum_{n \leq x} \frac{1}{n^\alpha} = \int_1^x \frac{1}{t^\alpha} dt + O(\max\{1/1^\alpha,1/x^\alpha\})\text{.}
\end{equation*}
The maximum on the right is \(O(1)\text{.}\) Evaluating the integral gives the result.
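Explicitly,
\begin{equation*}
\int_1^x \frac{1}{t^\alpha} dt = \frac{x^{-\alpha+1} - 1}{-\alpha+1} = \frac{x^{-\alpha+1}}{-\alpha+1} + O(1)\text{.}
\end{equation*}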
Example 4.2.10.
Let \(\alpha \gt 1\text{.}\) Then:
\begin{equation}
\sum_{n \gt x} \frac{1}{n^\alpha} = O(x^{-\alpha + 1})\text{.}\tag{4.2.8}
\end{equation}
Exercise.
We have
\begin{equation}
\sum_{n \leq x} \frac{1}{n} = \log x + O(1)\text{.}\tag{4.2.9}
\end{equation}
Exercise.
We have
\begin{equation}
\sum_{n \leq x} \log n = x \log x - x + O(\log x)\text{.}\tag{4.2.10}
\end{equation}
Exercise.
Let \(\alpha \geq 0\text{.}\) Then:
\begin{equation}
\sum_{n \leq x} n^\alpha \log n
= \frac{x^{\alpha+1}}{\alpha+1} \left( \log x - \frac{1}{\alpha+1} \right)
+ O(x^\alpha \log x)\text{.}\tag{4.2.11}
\end{equation}
Exercise.