Thoughts on Numerical Integration (Part 4): Derivation of Trapezoid Rule

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in an introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and MathWorld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In the previous post in this series, I discussed three different ways of numerically approximating the definite integral \displaystyle \int_a^b f(x) \, dx, the area under a curve f(x) between x=a and x=b.

In this series, we’ll choose equal-sized subintervals of the interval [a,b]. If h = (b-a)/n is the width of each subinterval, so that x_0 = a and x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right] \equiv L_n

using left endpoints,

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right] \equiv R_n

using right endpoints, and

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right] \equiv M_n

using the midpoints of the subintervals.

All three of these approximations were obtained by approximating the shaded region above by rectangles. However, perhaps it might be better to use some other shape besides rectangles. In the Trapezoid Rule, we approximate the area by using (surprise!) trapezoids, as in the figure below.

The first trapezoid has height h and bases f(x_0) and f(x_1), and so the area of the first trapezoid is \frac{1}{2} h[ f(x_0) + f(x_1) ]. The other areas are found similarly. Adding these together, we get the approximation

T_n = \displaystyle \frac{h}{2}[f(x_0) + f(x_1)] + \frac{h}{2} [f(x_1) + f(x_2)] + \dots + \frac{h}{2} [f(x_{n-2})+f(x_{n-1})] + \frac{h}{2} [f(x_{n-1})+f(x_n)]

= \displaystyle \frac{h}{2} [f(x_0) + 2f(x_1) + 2f(x_2) + \dots + 2f(x_{n-2}) + 2f(x_{n-1}) + f(x_n)].

Interestingly, T_n is the average of the two endpoint approximations L_n and R_n:

\displaystyle \frac{L_n+R_n}{2} =  \frac{L_n}{2} + \frac{R_n}{2}

= \displaystyle \frac{h}{2} \left[f(x_0) + f(x_1) + f(x_2) + \dots + f(x_{n-1}) \right]

+\displaystyle \frac{h}{2} \left[f(x_1) + f(x_2) + \dots + f(x_{n-1}) + f(x_{n}) \right]

= \displaystyle \frac{h}{2} \left[f(x_0) + 2f(x_1) + \dots + 2f(x_{n-1}) + f(x_n) \right]

= T_n.

Of course, as a matter of computation, it’s a lot quicker to directly compute T_n instead of computing L_n and R_n separately and then averaging.
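
The formulas above are easy to experiment with. Here is a minimal Python sketch that computes L_n, R_n, and T_n and checks the averaging identity; the integrand f(x) = x^2 on [0, 2] with n = 4 subintervals is a hypothetical example chosen for illustration, not taken from the post.

```python
import math

def endpoint_and_trapezoid_sums(f, a, b, n):
    """Return (L_n, R_n, T_n) for n equal subintervals of [a, b]."""
    h = (b - a) / n
    x = [a + k * h for k in range(n + 1)]   # x_0, x_1, ..., x_n
    L = h * sum(f(xk) for xk in x[:-1])     # left endpoints x_0, ..., x_{n-1}
    R = h * sum(f(xk) for xk in x[1:])      # right endpoints x_1, ..., x_n
    T = (h / 2) * (f(x[0]) + 2 * sum(f(xk) for xk in x[1:-1]) + f(x[-1]))
    return L, R, T

L, R, T = endpoint_and_trapezoid_sums(lambda x: x**2, 0.0, 2.0, 4)
assert math.isclose(T, (L + R) / 2)   # T_n is the average of L_n and R_n
```

Notice that the direct formula for T_n evaluates f only once at each interior point, which is exactly why it is quicker than averaging L_n and R_n.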

Thoughts on Numerical Integration (Part 3): Derivation of left, right, and midpoint rules


For the sake of completeness, I discuss here the origins of the left-endpoint, right-endpoint, and midpoint rules of numerical integration. (These topics are often presented in calculus texts.) Consider the problem of finding \displaystyle \int_a^b f(x) \, dx, the area under a curve f(x) between x=a and x=b.

To start the process of numerical integration, the interval [a,b] is divided into subintervals. Usually, for convenience, the intervals are chosen to be the same length, a convention that I’ll follow in this series. That said, if the function is known to vary wildly on some parts of the domain but not so wildly on other parts, then computational efficiency can be gained by varying the sizes of the subintervals, choosing smaller subintervals for the places where the function varies wildly.

In any event, we’ll choose equal-sized subintervals for the duration of this series.

One numerical approximation can be made by choosing left endpoints. In the picture below, the interval [a,b] was divided into four equal subintervals. Let h = (b-a)/4, so that x_0 = a, x_1 = x_0 +h, x_2 = x_0 + 2h, x_3 = x_0 + 3h, and x_4 = x_0 + 4h = b. We then can draw rectangles using the left endpoints of each subinterval. The sum of the areas of these rectangles below is

hf(x_0) + hf(x_1) + hf(x_2) +hf(x_3),

and so this serves as an approximation to the area under the curve. In general, if there are n subintervals and x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right]

That said, left endpoints were not necessary for making an approximation. We could have instead chosen the right endpoints of each subinterval. The sum of the areas of the rectangles below is

hf(x_1) + hf(x_2) + hf(x_3) +hf(x_4),

and so this serves as an approximation to the area under the curve. In general, if there are n subintervals, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right]

Finally, any point in each subinterval could have been used for making an approximation. In the picture below, we use the midpoints of the subintervals, where c_k = (x_k + x_{k-1})/2. The sum of the areas of the rectangles below is

hf(c_1) + hf(c_2) + hf(c_3) +hf(c_4),

and so this serves as an approximation to the area under the curve. In general, if there are n subintervals, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right]
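
The three rules can be sketched in a few lines of Python. The integrand f(x) = x^2 on [0, 1] is a hypothetical example, chosen because its exact integral is 1/3 and so the three sums are easy to compare.

```python
def riemann_sums(f, a, b, n):
    """Return the left-endpoint, right-endpoint, and midpoint sums."""
    h = (b - a) / n
    left = h * sum(f(a + k * h) for k in range(n))           # x_0, ..., x_{n-1}
    right = h * sum(f(a + (k + 1) * h) for k in range(n))    # x_1, ..., x_n
    mid = h * sum(f(a + (k + 0.5) * h) for k in range(n))    # c_1, ..., c_n
    return left, right, mid

L4, R4, M4 = riemann_sums(lambda x: x**2, 0.0, 1.0, 4)
print(L4, R4, M4)   # the midpoint sum lands closest to the exact value 1/3
```
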


Thoughts on Numerical Integration (Part 2): The bell curve


In this post, I’d like to take a closer look at the indefinite integral \displaystyle \int e^{-x^2} dx, which is closely related to the area under the bell curve \displaystyle \frac{1}{\sqrt{2\pi}} e^{-x^2/2}. This integral cannot be expressed in terms of elementary functions. However, using substitution and integration by parts, there are some related integrals that can be computed:

\displaystyle \int x e^{-x^2} dx = -\displaystyle \frac{1}{2} e^{-x^2}

\displaystyle \int x^3 e^{-x^2} dx = -\displaystyle \frac{x^2+1}{2} e^{-x^2}

\displaystyle \int x^5 e^{-x^2} dx = -\displaystyle \frac{x^4+2x^2+2}{2} e^{-x^2}

\displaystyle \int x^7 e^{-x^2} dx = -\displaystyle \frac{x^6+3x^4+6x^2+6}{2} e^{-x^2}
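
These antiderivatives are easy to spot-check numerically. The sketch below (with hypothetical sample points and step size of my choosing) verifies the last formula by comparing a central-difference derivative of the claimed antiderivative against x^7 e^{-x^2}.

```python
import math

def F(x):
    """Claimed antiderivative of x**7 * exp(-x**2) from the last formula above."""
    return -(x**6 + 3 * x**4 + 6 * x**2 + 6) / 2 * math.exp(-x**2)

def integrand(x):
    return x**7 * math.exp(-x**2)

h = 1e-5
for x in (0.3, 1.0, 1.7):   # hypothetical sample points
    dF = (F(x + h) - F(x - h)) / (2 * h)   # central-difference derivative
    assert math.isclose(dF, integrand(x), rel_tol=1e-4, abs_tol=1e-8)
```
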

Based on these examples, it stands to reason that, if \displaystyle \int e^{-x^2} dx can be written in terms of elementary functions, it should have the form

\displaystyle \int e^{-x^2} dx = f(x) e^{-x^2},

where f(x) is some polynomial to be determined. We will now show that this is impossible.

Suppose f(x) = \displaystyle \sum_{k=0}^n a_k x^k, a polynomial of degree n to be determined. Then we have

\displaystyle \frac{d}{dx} \left[ f(x) e^{-x^2} \right] = e^{-x^2}

or

f'(x) e^{-x^2} - 2 x f(x) e^{-x^2} =e^{-x^2}

or

f'(x) - 2x f(x) = 1.

In other words, every term on the left-hand side except a constant term must cancel. However, this is impossible: 2x f(x) is a polynomial of degree n+1, while f'(x) is a polynomial of degree n-1 (or is identically zero if f is constant). Therefore, the left-hand side has degree n+1 and so cannot be the constant 1.
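
For instance, if f(x) = a_1 x + a_0, then

f'(x) - 2x f(x) = a_1 - 2a_0 x - 2a_1 x^2,

which has degree 2 when a_1 \ne 0 and so cannot equal the constant 1. (And if a_1 = 0, then -2a_0 x = 1 is equally impossible.)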

A similar argument shows that f(x) cannot have the form f(x) = \displaystyle \sum_{k=0}^n a_k x^{b_k}, where the exponents b_k may or may not be integers.

This may be enough to convince a calculus student that there is no elementary antiderivative of \displaystyle e^{-x^2}. Indeed, although the proof goes well beyond first-year calculus, there is a theorem that says that if \displaystyle \int x^a e^{bx^2} \, dx can be expressed in terms of elementary functions, then the antiderivative must have the form f(x) e^{b x^2}. So the guess above actually can be rigorously justified. References:

  • Elena Anne Marchisotto and Gholam-Ali Zakeri, “An Invitation to Integration in Finite Terms,” The College Mathematics Journal, Vol. 25, No. 4 (Sep. 1994), pp. 295–308
  • J. F. Ritt, Integration in Finite Terms: Liouville’s Theory of Elementary Methods, Columbia University Press, New York, 1948

Thoughts on Numerical Integration (Part 1): Why numerical integration?


First, let’s talk about why numerical integration is necessary in the first place. Indeed, I can still remember a high school calculus teacher asking me this question nearly 20 years ago, and this question really got me thinking about what we’re collectively teaching in the secondary curriculum. Indeed, in a Calculus I course, it seems like every integral can be computed if only the proper trick is used. We teach students to search for these different tricks:

  • Let u = x^2+9 to find \displaystyle \int \frac{6x \, dx}{\sqrt{x^2+9}}.
  • Let x = 3\tan \theta to find \displaystyle \int \frac{6 \, dx}{\sqrt{x^2+9}}
  • Use integration by parts to find \displaystyle \int x^3 e^x \, dx

In fact, we teach so many tricks that we may give the impression that every integral can be computed if only the proper trick is employed. Indeed, my university hosts an annual “Integration Bee” that challenges students to find the right technique(s) to evaluate some pretty tough integrals.

Unfortunately, not every integral can be expressed in terms of a finite number of elementary functions (polynomials, rational functions, exponential functions, logarithms, and trigonometric and inverse trigonometric functions). One function that is well known to many students but does not have an elementary antiderivative is \displaystyle \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, otherwise known as the bell curve. For most numbers a and b, the area

\displaystyle \int_a^b \frac{1}{\sqrt{2\pi}} e^{-x^2/2} \, dx

cannot be found exactly, and so we ask students to either use a table in the back of the textbook or else use a function on their scientific calculators to find the answer.
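
In Python, for example, such areas are a one-liner via the standard library’s math.erf, since the area from a to b equals \frac{1}{2}[\mathrm{erf}(b/\sqrt{2}) - \mathrm{erf}(a/\sqrt{2})]; the interval [-1, 1] below is a hypothetical example.

```python
import math

def normal_area(a, b):
    """Area under (1/sqrt(2*pi)) * exp(-x**2/2) between x = a and x = b."""
    return 0.5 * (math.erf(b / math.sqrt(2)) - math.erf(a / math.sqrt(2)))

print(round(normal_area(-1.0, 1.0), 4))   # 0.6827, the familiar 68% of the normal curve
```

Under the hood, of course, erf itself must be evaluated numerically, which is precisely the point of this series.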

Just for the fun of it, I went through my Ph.D. thesis and wrote down some of the integrals that I had to evaluate numerically while in school. As an applied mathematician, I was initially stunned by the teacher’s innocent question because so much of my work would be utterly impossible were it not for numerical integration. Here are some of the easier ones:

  • \displaystyle \int_0^t \frac{1-e^{-x}}{x} dx
  • \displaystyle \int_{2R}^\infty t^2 g(t) \left( \frac{a_1 t^4 + a_2 t^2 + a_3}{(t^2-R^2)^7} +\frac{b_1 t^2 + b_2}{(t^2-R^2)^5} + \frac{c}{(t^2-R^2)^3} \right) dt
  • \displaystyle \int_{d_2}^\infty \int_0^{d_1} \frac{y^2-x^2}{(x^2+y^2)^2} \left(e^{-a(x+d_1)-b d_2} - c\right) dx \, dy
  • \displaystyle \int_{x/2}^\infty \sqrt{r^2 - k^2/4} \phi(r) \, dr
  • \displaystyle \int_{x/2}^\infty \left( \frac{z \sqrt{4r^2-z^2}}{4} + r^2 \arcsin \left( \frac{z}{2r} \right) \right) \phi(r) \, dr
  • \displaystyle \int_0^{2R} e^{-sz} \exp \left[ -c \left( z \sqrt{4R^2-z^2}  + 4R^2 \arcsin \frac{z}{2R} \right) \right] dz
  • \displaystyle \int_0^d \exp \left[ -sz - \lambda \left(z - \frac{z^2}{4d} \right) \right] dz
  • \displaystyle \int_d^{d \sqrt{2}} \exp \left[ -sz - \lambda \left( \frac{d (\pi+1)}{2} - d \arcsin \frac{d}{z} + \frac{z^2}{4d} - \sqrt{z^2-d^2} \right) \right] dz
  • \displaystyle \int_0^\infty \exp \left[-sz - \eta \left(1 - e^{-cz/2} - \frac{cz}{4} e^{-cz/2} \right) \right] dz
  • \displaystyle \int_{-\infty}^x \frac{e^t}{t} dt

All this to say, there are plenty of integrals that arise from a real-world context that have a numerical answer but cannot be computed using the techniques commonly taught in the first-year calculus sequence.


Terrific video on Taylor series

Some time ago, I posted a series on the lecture that I’ll give to students to remind them about Taylor series. I won’t repost the whole thing here, but the basic idea is to motivate the concept inductively by starting with a polynomial and then to reinforce it with both numerical calculation and comparison of graphs.

After giving this lecture recently, one of my students told me about this terrific video on Taylor series that does many of the same things, with the added bonus of engaging animations. I highly recommend it.

Convexity and Orthogonality at Saddle Points

Today, the Texas Section of the Mathematical Association of America is holding its annual conference. Like many other professional conferences these days, this conference will be held virtually, and so my contribution to the conference is saved on YouTube and is available to the public.

Here’s the abstract of my talk: “At a saddle point (like the middle of a Pringles potato chip), the directions of maximum upward concavity and maximum downward concavity are perpendicular. The usual proof requires a fair amount of linear algebra: eigenvectors of different eigenvalues of a real symmetric matrix, like the Hessian, must be orthogonal. For this reason, the orthogonality of these two directions is not often stated in calculus textbooks, let alone proven, when the Second Partial Derivative Test for identifying local extrema and saddle points is discussed. In this talk, we present an elementary proof of the orthogonality of these two directions that requires only ideas from Calculus III and trigonometry. Not surprisingly, this proof can be connected to the usual proof from linear algebra.”

If you have 12 minutes to spare, here’s the talk.

Differentiation and Integration

As I tell my calculus students, differentiation is a science. There are rules to follow, but if you follow them carefully, you can compute the derivative of anything. This leads to one of my favorite classroom activities. However, integration is as much art as science; for example, see my series on different techniques for computing

\displaystyle \int_0^{2\pi} \frac{dx}{\cos^2 x + 2 a \sin x \cos x + (a^2 + b^2) \sin^2 x}

The contrast between differentiation and integration was more vividly illustrated in a recent xkcd webcomic:

Source: https://xkcd.com/2117/

My Favorite One-Liners: Part 117

I absolutely love this joke. The integral looks diabolical but can be computed mentally.

For what it’s worth, even Wolfram Alpha was unable to compute this integral exactly, though it was able to produce an answer to as many decimal places as needed. Feel free to click the link if you’d like the (highly suggestive) answer.

Source: https://www.facebook.com/CTYJohnsHopkins/photos/a.323810509981/10151131549924982/?type=3&theater