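In my video How to Extend the Sum of Any* Function, I spent a decent amount of time building up a tower of increasingly nested sums that looked similar to this:
$$\begin{aligned} & f(x_0) \\ & + \sum_{k_1=0}^{x-1} \Delta f(x_0) \\ & + \sum_{k_1=0}^{x-1} \sum_{k_2=0}^{k_1-1} \Delta^2 f(x_0) \\ & + \cdots \\ & + \sum_{k_1=0}^{x-1} \cdots \sum_{k_n=0}^{k_{n-1}-1} \Delta^n f(x_0). \end{aligned}$$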
I then showed that each line of this expression is simply equal to a binomial coefficient, so the whole thing reduces to
$$\sum_{k=0}^{n} \binom{x}{k} \Delta^k f(x_0).$$
If you made it to the end of the video, you’ll know the plot twist: This is the discrete version of a Taylor polynomial! Notice the similarity:
$$\sum_{k=0}^{n} \binom{x}{k} \Delta^k f(x_0) \qquad\qquad \sum_{k=0}^{n} \frac{x^k}{k!} f^{(k)}(x_0).$$
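To make the discrete side concrete, here is a minimal Python sketch of the binomial sum. It relies on the standard Newton forward difference identity: for a nonnegative integer $x$ and $n \ge x$, the sum reproduces $f(x_0+x)$ exactly. The helper names and the test function are illustrative choices, not taken from the video.

```python
from math import comb

def forward_difference(f, x0, k):
    """Return the k-th forward difference of f at x0 (step size 1)."""
    values = [f(x0 + j) for j in range(k + 1)]
    for _ in range(k):
        # Each pass replaces the list by its consecutive differences.
        values = [b - a for a, b in zip(values, values[1:])]
    return values[0]

def binomial_sum(f, x0, x, n):
    """Evaluate sum_{k=0}^{n} C(x, k) * (Delta^k f)(x0) for an integer x >= 0."""
    return sum(comb(x, k) * forward_difference(f, x0, k) for k in range(n + 1))

def f(t):
    # Arbitrary test function, chosen only for illustration.
    return 2**t + t * t

# With n >= x the two printed values should agree.
print(binomial_sum(f, 3, 4, 4), f(7))
```

For a polynomial of degree at most $n$ the higher differences vanish, so the sum is exact for every integer $x \ge 0$, much like a Taylor polynomial is exact for polynomials of low enough degree.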
In this post I’ll show that if we apply the same line of reasoning to integrals, we end up with a derivation of Taylor polynomials, including Lagrange’s form of the error term. Then we’ll see that Taylor’s theorem also applies to complex functions, though with a looser error bound.
Taylor’s Theorem for Real Functions
We’ll assume that $f:\mathbb{R}\to\mathbb{R}$ is $n+1$ times differentiable and that $f^{(n+1)}$ is continuous. This continuity is not required for the theorem to hold, but it makes the derivation easier. Also, for brevity, we’ll center the Taylor polynomials at 0 (so specifically we’re looking at Maclaurin polynomials), and we’ll always take $x$ to be positive. It’s easy to modify the result to remove these constraints; in fact, we do that and more in the next section.
In my video, we arrived at that tower of nested sums by repeatedly applying the fundamental theorem of discrete calculus:
$$f(x_0+a) = f(x_0) + \sum_{k=0}^{a-1} \Delta f(x_0+k).$$
Here we’ll do the same thing, but with the plain old fundamental theorem of calculus:
$$f(x) = f(0) + \int_0^x f'(t)\,dt. \tag{1}$$
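To start, let’s apply the fundamental theorem of calculus (1) to the $f'(t)$ in its own definition:
$$f(x) = f(0) + \int_0^x \left( f'(0) + \int_0^{t_1} f''(t_2)\,dt_2 \right) dt_1 = f(0) + \int_0^x f'(0)\,dt_1 + \int_0^x \int_0^{t_1} f''(t_2)\,dt_2\,dt_1.$$
This introduces a double integral. Applying (1) again, this time to the $f''(t_2)$, gives
$$f(x) = f(0) + \int_0^x f'(0)\,dt_1 + \int_0^x \int_0^{t_1} f''(0)\,dt_2\,dt_1 + \int_0^x \int_0^{t_1} \int_0^{t_2} f'''(t_3)\,dt_3\,dt_2\,dt_1.$$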
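Now we’ve added a triple integral. And we can keep going! We can apply (1) to the $f'''(t_3)$ to introduce a quadruple integral, and so on. After repeating the process $n$ times, we’ll have
$$\begin{aligned} f(x) = {} & f(0) \\ & + \int_0^x f'(0)\,dt_1 \\ & + \int_0^x \int_0^{t_1} f''(0)\,dt_2\,dt_1 \\ & + \cdots \\ & + \int_0^x \int_0^{t_1} \cdots \int_0^{t_{n-1}} f^{(n)}(0)\,dt_n \cdots dt_2\,dt_1 \\ & + \int_0^x \int_0^{t_1} \cdots \int_0^{t_n} f^{(n+1)}(t_{n+1})\,dt_{n+1} \cdots dt_2\,dt_1. \end{aligned} \tag{2}$$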
Apart from the last line, this is the same as the tower of nested sums at the beginning of this post. Except, of course, all the sums have been replaced with integrals.
Note that, again with the exception of the last line, the integrands are all constants. This makes the nested integrals extremely easy to compute. I’ll leave it to you to verify that
$$\underbrace{\int_0^x \int_0^{t_1} \cdots \int_0^{t_{n-1}}}_{n \text{ integrals}} f^{(n)}(0)\,dt_n \cdots dt_2\,dt_1 = \frac{x^n}{n!} f^{(n)}(0).$$
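If you would rather check this identity mechanically, here is a minimal Python sketch that integrates a constant from 0 repeatedly, using exact rational arithmetic. Polynomials are stored as coefficient lists; the function names and the constant 7 (standing in for $f^{(n)}(0)$) are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def integrate_from_zero(coeffs):
    """Map the coefficients of p(t) = c0 + c1*t + ... to those of the integral of p from 0 to x."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

def nested_integral_of_constant(c, n):
    """Integrate the constant c from 0, n times in a row (innermost integral first)."""
    poly = [Fraction(c)]
    for _ in range(n):
        poly = integrate_from_zero(poly)
    return poly

# 7 stands in for f^(n)(0), with n = 4; only the x^4 coefficient should be nonzero.
coeffs = nested_integral_of_constant(7, 4)
print(coeffs, Fraction(7, factorial(4)))
```

Each pass raises the degree by one and divides by the new exponent, which is where the $n!$ in the denominator comes from.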
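Summing the lines with constant integrands therefore gives the degree $n$ Taylor polynomial $T_n(x) = \sum_{k=0}^{n} \frac{x^k}{k!} f^{(k)}(0)$. So the difference between $f(x)$ and $T_n(x)$ is this last remaining nested integral:
$$f(x) - T_n(x) = \int_0^x \int_0^{t_1} \cdots \int_0^{t_n} f^{(n+1)}(t_{n+1})\,dt_{n+1} \cdots dt_2\,dt_1. \tag{3}$$
In other words, the nested integral is the error in the approximation of $f$ by Taylor polynomials. All that’s left to do is to bound that error term.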
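Fortunately, we’ve assumed that $f^{(n+1)}$ is continuous. Therefore, by the extreme value theorem, it attains a minimum $m$ and a maximum $M$ on $[0,x]$. Replacing the integrand of the error term by each of these constants and evaluating the nested integrals as before, we get
$$\frac{x^{n+1}}{(n+1)!}\,m \le f(x) - T_n(x) \le \frac{x^{n+1}}{(n+1)!}\,M.$$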
Finally, again by the continuity of $f^{(n+1)}$, the intermediate value theorem tells us that there exists some number $\xi\in[0,x]$ for which
$$f(x) - T_n(x) = \frac{x^{n+1}}{(n+1)!} f^{(n+1)}(\xi).$$
We have just derived the Lagrange form of the Taylor remainder!
Taylor’s theorem with Lagrange error term:
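Suppose $f$ is a real function and $f^{(n+1)}$ is continuous. Then there exists a number $\xi\in[0,x]$ such that
$$f(x) = T_n(x) + \frac{x^{n+1}}{(n+1)!} f^{(n+1)}(\xi),$$
where $T_n$ is the degree $n$ Taylor polynomial of $f$ centered at 0.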
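As a quick numerical illustration, here is a minimal Python sketch for $f(x) = e^x$, chosen because every derivative of $e^x$ is again $e^x$, so the equation can be solved for $\xi$ directly. The function names and the sample values $x = 1$, $n = 5$ are illustrative assumptions.

```python
import math

def taylor_exp(x, n):
    """Degree-n Taylor polynomial of exp centered at 0, evaluated at x."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def lagrange_xi(x, n):
    """Solve exp(xi) * x**(n+1) / (n+1)! = exp(x) - T_n(x) for xi."""
    remainder = math.exp(x) - taylor_exp(x, n)
    return math.log(remainder * math.factorial(n + 1) / x ** (n + 1))

x, n = 1.0, 5
xi = lagrange_xi(x, n)
# The theorem says xi should land somewhere in [0, x].
print(xi, 0.0 <= xi <= x)
```

For most functions $\xi$ cannot be found in closed form like this; the theorem only guarantees that it exists.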
Bonus: Taylor’s Theorem for Complex Functions
Everything up to and including (3) is perfectly valid even if the output of $f$ is complex. However, the later steps relied on $f$ being real-valued, so the Lagrange form of the remainder is only valid for $f:\mathbb{R}\to\mathbb{R}$. But we can still find a useful bound for complex functions.
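To start, suppose that $f$ is a complex-valued function of a real variable. Take the modulus of both sides of (3) to see that
$$|f(x) - T_n(x)| \le \int_0^x \int_0^{t_1} \cdots \int_0^{t_n} \left|f^{(n+1)}(t_{n+1})\right| dt_{n+1} \cdots dt_2\,dt_1 \le \frac{x^{n+1}}{(n+1)!} \max_{t\in[0,x]} \left|f^{(n+1)}(t)\right|. \tag{4}$$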
With this, we have a bound on the Taylor remainder for $f:\mathbb{R}\to\mathbb{C}$. It turns out we can also apply this bound to $f:\mathbb{C}\to\mathbb{C}$. To be precise,
Taylor’s theorem for complex functions:
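Suppose $f$ is a holomorphic function on an open set $\Omega\subseteq\mathbb{C}$. Let $L$ be the line segment connecting two points $z_0$ and $z$. If $L\subseteq\Omega$, then
$$|f(z) - T_n(z)| \le \frac{|z-z_0|^{n+1}}{(n+1)!} \max_{w\in L} \left|f^{(n+1)}(w)\right|,$$
where $T_n$ is the degree $n$ Taylor polynomial of $f$ centered at $z_0$.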
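Here is a minimal Python sketch that checks the bound numerically for $f(z) = e^z$: the remainder should be at most $|z-z_0|^{n+1}/(n+1)!$ times the maximum of $|f^{(n+1)}|$ on $L$. Since $|e^w| = e^{\operatorname{Re} w}$ is monotone along a line segment, that maximum sits at an endpoint. The function names and sample points are illustrative assumptions.

```python
import cmath
import math

def taylor_exp(z0, z, n):
    """Degree-n Taylor polynomial of exp centered at z0, evaluated at z."""
    return sum(cmath.exp(z0) * (z - z0) ** k / math.factorial(k) for k in range(n + 1))

def remainder_bound(z0, z, n):
    """|z - z0|^(n+1) / (n+1)! times the maximum of |exp| on the segment from z0 to z."""
    # Every derivative of exp is exp, and |exp(w)| = e^(Re w) peaks at an endpoint of the segment.
    peak = max(abs(cmath.exp(z0)), abs(cmath.exp(z)))
    return abs(z - z0) ** (n + 1) / math.factorial(n + 1) * peak

z0, z, n = 0j, 1 + 1j, 4
error = abs(cmath.exp(z) - taylor_exp(z0, z, n))
bound = remainder_bound(z0, z, n)
print(error, bound, error <= bound)
```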
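To prove this, we just have to translate, rotate, and scale the complex plane to move $L$ to the real axis so that we can use (4). To this end, note that the change of variables $t\mapsto z_0+t(z-z_0)$ maps the interval $[0,1]$ to $L$. Define $g:[0,1]\to\mathbb{C}$ by
$$g(t) = f(z_0+t(z-z_0)),$$
and let $S_n$ denote the degree $n$ Taylor polynomial of $g$ centered at 0. Clearly $g(1)=f(z)$ and, since the chain rule gives $g^{(k)}(t) = (z-z_0)^k f^{(k)}(z_0+t(z-z_0))$,
$$S_n(1) = \sum_{k=0}^{n} \frac{g^{(k)}(0)}{k!} = \sum_{k=0}^{n} \frac{(z-z_0)^k}{k!} f^{(k)}(z_0) = T_n(z).$$
Applying the bound (4) for complex-valued functions of a real variable to $g$ at $x=1$ then gives
$$|f(z) - T_n(z)| = |g(1) - S_n(1)| \le \frac{1}{(n+1)!} \max_{t\in[0,1]} \left|g^{(n+1)}(t)\right| = \frac{|z-z_0|^{n+1}}{(n+1)!} \max_{w\in L} \left|f^{(n+1)}(w)\right|.$$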
Strangely, people tend not to mention this form of the theorem for complex functions. Taylor’s theorem is almost always stated strictly in terms of functions from $\mathbb{R}$ to $\mathbb{R}$. It seems to be an accepted fact that other theorems from complex analysis yield stronger results. For example, Wikipedia claims that the usefulness of Taylor’s theorem is “dwarfed” by other general results in complex analysis. But unless I am missing something, this is not universally true.
There are cases where a function is large, but its $(n+1)$st derivative is small. For example, suppose that we want to determine the accuracy of Taylor polynomials for $f(z)=z^{3.5}$ centered very far from the origin. Since $f^{(4)}(z)$ is small when $z$ is far from the origin, Taylor’s theorem tells us that a degree 3 Taylor approximation will have a very small error. This result does not follow from the Cauchy inequalities or any obvious corollary of Cauchy’s integral formula, though these results are sometimes said to be more powerful than Taylor’s theorem.
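To put numbers on this, here is a minimal Python sketch restricted to the positive real axis, where $z^{3.5}$ is unambiguous and can be computed as $z^3\sqrt{z}$. It uses the `decimal` module because $f$ is around $10^{14}$ at the sample point while the error is around $10^{-3}$, which is too fine for ordinary floats to resolve reliably. The center 10000, the step of 1, and the function names are illustrative assumptions.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # plenty of digits to resolve a tiny error on a huge value

def f(x):
    """f(x) = x**3.5 for positive real x, computed as x**3 * sqrt(x)."""
    return x**3 * x.sqrt()

def taylor3(x0, x):
    """Degree-3 Taylor polynomial of x**3.5 centered at x0 > 0, evaluated at x."""
    h = x - x0
    r = x0.sqrt()
    d0 = x0**3 * r                    # f(x0)
    d1 = Decimal("3.5") * x0**2 * r   # f'(x0)   = 3.5 * x0^2.5
    d2 = Decimal("8.75") * x0 * r     # f''(x0)  = 3.5 * 2.5 * x0^1.5
    d3 = Decimal("13.125") * r        # f'''(x0) = 3.5 * 2.5 * 1.5 * x0^0.5
    return d0 + d1 * h + d2 * h**2 / 2 + d3 * h**3 / 6

def remainder_bound(x0, x):
    """|x - x0|^4 / 4! times the max of |f''''| on the segment, with f''''(t) = 6.5625 / sqrt(t)."""
    lo = min(x0, x)  # f'''' is decreasing on the positive reals, so its max is at the smaller endpoint
    return abs(x - x0) ** 4 / 24 * Decimal("6.5625") / lo.sqrt()

x0, x = Decimal(10000), Decimal(10001)
error = abs(f(x) - taylor3(x0, x))
bound = remainder_bound(x0, x)
print(f(x), error, bound)
```

The bound here only involves the fourth derivative on the segment itself, so it stays small no matter how large $f$ is nearby.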