Thoughts on Numerical Integration (Part 10): Trapezoid Rule and exploration of error analysis

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In the previous post in this series, I discussed three different ways of numerically approximating the definite integral \displaystyle \int_a^b f(x) \, dx, the area under a curve f(x) between x=a and x=b.

In this series, we’ll choose equal-sized subintervals of the interval [a,b]. If h = (b-a)/n is the width of each subinterval so that x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right] \equiv L_n

using left endpoints,

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right] \equiv R_n

using right endpoints, and

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right] \equiv M_n

using the midpoints of the subintervals. We have also derived the Trapezoid Rule

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{2} [f(x_0) + 2f(x_1) + \dots + 2f(x_{n-1}) + f(x_n)] \equiv T_n

and Simpson’s Rule (if n is even)

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{3} \left[y_0 + 4 y_1 + 2 y_2 + 4 y_3 + \dots + 2y_{n-2} + 4 y_{n-1} +  y_{n} \right] \equiv S_n.

In the previous post in this series, we saw that both the left-endpoint and right-endpoint rules have a linear rate of convergence: if twice as many subintervals are taken, then the error appears to go down by a factor of 2. If ten times as many subintervals are used, then the error should go down by a factor of 10. However, the Midpoint Rule has a quadratic rate of convergence: if twice as many subintervals are taken, then the error appears to go down by a factor of 4. If ten times as many subintervals are used, then the error should go down by a factor of 100.

Let’s now explore the results of the Trapezoid Rule applied to \displaystyle \int_1^2 x^9 \, dx = 102.3 using different numbers of subintervals. The results are summarized in the table below.

Once again, the data points fit a quadratic polynomial well, indicating quadratic convergence.

More subtly, it appears that the Trapezoid Rule isn’t quite as good as the Midpoint Rule. Here are the results from the Midpoint Rule (which also appeared in the previous post in this series):

For n = 100 subintervals, the error of the Trapezoid Rule is |102.3191-102.3| = 0.0191, which the error from the Midpoint Rule is |102.2904-102.3| = 0.0096. In other words, while both of these methods are superior to the left- and right-endpoint rules, it appears that the error from the Midpoint Rule is about half of the error from the Trapezoid Rule. The Midpoint Rule appears to be better.

To me, this is far from an obvious conclusion. Geometrically, it’s far from clear that the rectangles from the Midpoint Rule…

… provide a better approximation than using trapezoids …

… yet it appears that’s exactly what happened. This can be rigorously proven, as we’ll explore later in this series.

Thoughts on Numerical Integration (Part 9): Midpoint rule and exploration of error analysis

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In the previous post in this series, I discussed three different ways of numerically approximating the definite integral \displaystyle \int_a^b f(x) \, dx, the area under a curve f(x) between x=a and x=b.

In this series, we’ll choose equal-sized subintervals of the interval [a,b]. If h = (b-a)/n is the width of each subinterval so that x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right] \equiv L_n

using left endpoints,

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right] \equiv R_n

using right endpoints, and

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right] \equiv M_n

using the midpoints of the subintervals. We have also derived the Trapezoid Rule

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{2} [f(x_0) + 2f(x_1) + \dots + 2f(x_{n-1}) + f(x_n)] \equiv T_n

and Simpson’s Rule (if n is even)

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{3} \left[y_0 + 4 y_1 + 2 y_2 + 4 y_3 + \dots + 2y_{n-2} + 4 y_{n-1} +  y_{n} \right] \equiv S_n.

In the previous post in this series, we saw that both the left-endpoint and right-endpoint rules have a linear rate of convergence: if twice as many subintervals are taken, then the error appears to go down by a factor of 2. If ten times as many subintervals are used, then the error should go down by a factor of 10. Let’s now explore the results of the midpoint rule applied to \displaystyle \int_1^2 x^9 \, dx = 102.3 using different numbers of subintervals. The results are summarized in the table below. The first immediate observation is that these approximations are far better than the left- and right-endpoint rule approximations! Indeed, we see that M_{10} \approx 101.3, using only ten subintervals, is a far better approximation than (from the previous post) either L_{100} = 99.8 or R_{100} \approx 104.9 using 100 subintervals! The lesson to learn: choosing a good algorithm is often far better than simply doing lots of computations.

There’s a second observation: the rate of convergence appears to be much, much faster. Indeed, the data points appear to fit a parabola very well instead of a straight line. Said another way, if twice as many subintervals are taken, then the error appears to go down by a factor of 4. If ten times as many subintervals are used, then the error should go down by a factor of 100. This illustrates quadratic convergence, as opposed to the mere linear convergence of the left- and right-endpoint rules.

Thoughts on Numerical Integration (Part 8): Left and right endpoint rules and exploration of error analysis

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In the previous post in this series, I discussed three different ways of numerically approximating the definite integral \displaystyle \int_a^b f(x) \, dx, the area under a curve f(x) between x=a and x=b.

In this series, we’ll choose equal-sized subintervals of the interval [a,b]. If h = (b-a)/n is the width of each subinterval so that x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right] \equiv L_n

using left endpoints,

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right] \equiv R_n

using right endpoints, and

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right] \equiv M_n

using the midpoints of the subintervals. We have also derived the Trapezoid Rule

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{2} [f(x_0) + 2f(x_1) + \dots + 2f(x_{n-1}) + f(x_n)] \equiv T_n

and Simpson’s Rule (if n is even)

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{3} \left[y_0 + 4 y_1 + 2 y_2 + 4 y_3 + \dots + 2y_{n-2} + 4 y_{n-1} +  y_{n} \right] \equiv S_n.

All of the above approximations to \displaystyle \int_a^b f(x) \, dx are precisely that — approximations. That begs the obvious question: how can we get better approximations. One obvious answer is taking more subintervals. The figures below show the left-endpoint approximations using n = 4 and n = 40 subintervals. Geometrically, it’s clear that the orange rectangles in the second picture do a better job of approximating the area under the curve. Unfortunately, simply taking more subintervals has its limitations. Using a spreadsheet as in the previous post in this series, one can implement 100 or even 1000 subintervals without much difficult. However, as demonstrated in the video below, implementing any of these methods with 10,000 subintervals is pretty time-consuming. (Tl/dw: It can take literally a couple of minutes.)

Instead of relying on sheer computational firepower, let’s instead investigate how good these numerical methods actually are. To begin, let’s explore the left-endpoint rule applied to \displaystyle \int_1^2 x^9 \, dx = 102.3 using different numbers of subintervals. The results are summarized in the table below. As n increases, the left endpoint approximations L_n are indeed getting closer and closer to the actual value of \displaystyle \int_1^2 x^9 \, dx = 102.3. Interestingly, when the width h is plotted with these approximations, the data points fall almost exactly on a straight line. The same phenomenon occurs when using right endpoints: So, it appears that the errors in both the left- and right-endpoint rules are a linear function of the size of the subintervals. Said another way, if twice as many subintervals are taken, then the error appears to go down by a factor of 2. If ten times as many subintervals are used, then the error should go down by a factor of 10. As we’ll see in the next few posts, the errors for the Midpoint Rule, the Trapezoid Rule, and especially Simpson’s Rule are much better than the errors from these two methods.

Thoughts on Numerical Integration (Part 7): Implementation with Excel

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In the previous post in this series, I discussed three different ways of numerically approximating the definite integral \displaystyle \int_a^b f(x) \, dx, the area under a curve f(x) between x=a and x=b.

In this series, we’ll choose equal-sized subintervals of the interval [a,b]. If h = (b-a)/n is the width of each subinterval so that x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right] \equiv L_n

using left endpoints,

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right] \equiv R_n

using right endpoints, and

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right] \equiv M_n

using the midpoints of the subintervals. We have also derived the Trapezoid Rule

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{2} [f(x_0) + 2f(x_1) + \dots + 2f(x_{n-1}) + f(x_n)] \equiv T_n

and Simpson’s Rule (if n is even)

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{3} \left[y_0 + 4 y_1 + 2 y_2 + 4 y_3 + \dots + 2y_{n-2} + 4 y_{n-1} +  y_{n} \right] \equiv S_n.

Computing any of the above formulas on a hand-held calculator can tax the patience of even the most error-conscious student. Indeed, I prefer that my students, when first learning these concepts, use a spreadsheet instead of a calculator or even a computer program, as I think that the visual layout of the spreadsheet aids in understanding how the formula works. In what follows, I implement the above formulas for the integral \displaystyle \int_1^2 x^9 \, dx using n=10 subintervals, so that h = (2-1)/10 = 0.1. To implement the left-endpoint rule, I enter the labels “x” and “x^9” in cells A1 and B1 of a spreadsheet. I then enter 1 (the left endpoint) in cell A2. In cell A3, I enter “=A2+0.1”, instructing the spreadsheet to add 0.1 to the value in cell A2. Then, instead of typing all of the other values of x_k, I use the fill-down feature to repeat this pattern for cells A3 through A11. In cell B2, I enter “=A1^9”, applying the function f(x) = x^9 to the x-coordinate in cell A2. Again, I use the fill-down feature to repeat this pattern for cells B3-B11. The fill-down feature saves a lot of time! Finally, in cell B13, I enter “=0.1*SUM(B2:B11)”, adding the values in cells B2 through B11 and multiplying the sum by h. The result, 78.6581, is the approximation using the left-endpoint rule with 10 subintervals. Once this is done, the right-endpoint rule can be obtained almost for free. The only change is to change the value of cell A2 from 1 to 1.1. Everything else should automatically update. The midpoint rule is also obtained quickly by changing the value of cell A2 from 1 to 1.05, the midpoint of the first subinterval [1,1.01]. Implementing the Trapezoid Rule requires a little more work. We reset the value of A2 back to 1, the value of the left-endpoint. We also fill down the pattern one extra row (in this case, row 12). To implement the Trapezoid Rule, we have to multiply all function values (except for those at the endpoints) by 2. To implement this, I introduce column C. These weights can be typed by hand, but again the fill-down feature can speed things up. Then, in column D, I multiply the values in columns B and C. For example, the result in cell D2 is obtained by typing “=B2*C2”. Once again, the fill-down feature is used for all rows. Finally, the approximation itself is obtained by typing “=0.1/2*SUM(D2:D12)” in cell D13. After implementing the Trapezoid Rule, Simpson’s Rule is not much more effort. The biggest change is the alternating weights, so that the endpoints have weight 1 while the others oscillate between 4 and 2, ending on 4 on the second-to-last value of x. Again, these could be typed by hand, but it’s easiest to enter 4 in cell C3, 2 in cell C4, and then “=C3” in cell C5. The fill-down feature can take care of the rest of the weights. The Simpson’s Rule approximation is obtained by typing “=0.1/3*SUM(D2:D12)” in cell D13, with a new denominator of 3.

Thoughts on Numerical Integration (Part 6): Connection between Simpson’s Rule, Trapezoid Rule, and Midpoint Rule

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:
  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?
In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.
In the previous post in this series, I discussed three different ways of numerically approximating the definite integral \displaystyle \int_a^b f(x) \, dx, the area under a curve f(x) between x=a and x=b.
In this series, we’ll choose equal-sized subintervals of the interval [a,b]. If h = (b-a)/n is the width of each subinterval so that x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right] \equiv L_n

using left endpoints,

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right] \equiv R_n

using right endpoints, and

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right] \equiv M_n

using the midpoints of the subintervals. We have also derived the Trapezoid Rule

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{2} [f(x_0) + 2f(x_1) + \dots + 2f(x_{n-1}) + f(x_n)] \equiv T_n

and Simpson’s Rule (if n is even)

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{3} \left[y_0 + 4 y_1 + 2 y_2 + 4 y_3 + \dots + 2y_{n-2} + 4 y_{n-1} +  y_{n} \right] \equiv S_n.

There is a somewhat surprising connection between the last three formulas. Let’s divide the interval [a,b] into 2n subintervals with h = (b-a)/(2n) and x_0 = a, x_1 = x_0 + h, x_2 = x_0 + 2h, and so on. Then Simpson’s Rule becomes

S_{2n} = \displaystyle \frac{h}{3} \left[y_0 + 4 y_1 + 2 y_2 + 4 y_3 + \dots + 2y_{2n-2} + 4 y_{2n-1} +  y_{2n} \right].

Next, let’s divide the interval [a,b] into n subintervals, but let’s not redefine the values of h and the x_k. Instead, the width of each subinterval will be (b-a)/n, which is equal to 2h. (In other words, since there are half as many subintervals, each one is twice as long.) Also, the endpoints of these subintervals will be x_0 = a, x_2 = x_0 + 2h, x_4 = x_0 + 4h, and so on. So, keeping the same labeling convention as with Simpson’s Rule, the Trapezoid Rule becomes

T_n = \displaystyle \frac{2h}{2} [f(x_0) + 2f(x_2) + 2f(x_4) + \dots + 2f(x_{2n-2}) + f(x_{2n})]

= h [f(x_0) + 2f(x_2) + 2f(x_4) + \dots + 2f(x_{2n-2}) + f(x_{2n})].

(Again, the width of the subintervals in this case is 2h, where h = (b-a)/2n.) Furthermore, the midpoint of subinterval [x_0, x_2] will be x_1, the midpoint of subinterval [x_2,x_4] will be x_3, and so on. Therefore, keeping the same labeling convention, the Midpoint Rule becomes

M_n = \displaystyle 2h [f(x_1) + f(x_3) + f(x_5) + \dots + f(x_{2n-1}) ].

It turns out that \displaystyle \frac{2}{3} M_n + \frac{1}{3} T_n, a certain weighted average of T_n and M_n, is equal to

\displaystyle \frac{4h}{3} [f(x_1) + f(x_3) + \dots + f(x_{2n-1}) ] + \frac{h}{3} [f(x_0) + 2f(x_2) + \dots + 2f(x_{2n-2}) + f(x_{2n})]

= \displaystyle  \frac{h}{3} [f(x_0) + 4 f(x_1) + 2f(x_2) + \dots + 2f(x_{2n-2}) + 4 f(x_{2n-1} + f(x_{2n})]

= S_{2n}.

So, if the Midpoint Rule and the Trapezoid Rule have already been computed for n subintervals, then Simpson’s Rule for 2n subintervals can be computed at almost no additional effort.

Thoughts on Numerical Integration (Part 5): Derivation of Simpson’s Rule

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In the previous post in this series, I discussed three different ways of numerically approximating the definite integral \displaystyle \int_a^b f(x) \, dx, the area under a curve f(x) between x=a and x=b.

In this series, we’ll choose equal-sized subintervals of the interval [a,b]. If h = (b-a)/n is the width of each subinterval so that x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right] \equiv L_n

using left endpoints,

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right] \equiv R_n

using right endpoints, and

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right] \equiv M_n

using the midpoints of the subintervals. We have also derived the Trapezoid Rule:

\int_a^b f(x) \, dx \approx \displaystyle \frac{h}{2} [f(x_0) + 2f(x_1) + \dots + 2f(x_{n-1}) + f(x_n)] \equiv T_n

This last approximation was obtained by connecting adjacent points on the curve by line segments, creating trapezoids:

In this post, we will derive Simpson’s Rule. Instead of connecting two adjacent points with line segments, we will connect three adjacent points with a parabola. In the picture below, the points (x_0, f(x_0)), (x_1, f(x_1)) and (x_2,f(x_2)) are connected with one parabola, while the points (x_2, f(x_2)), (x_3, f(x_3)) and (x_4,f(x_4)) are connected with a different second parabola.

Clearly, for this to work, there has to be an even number of subintervals. (By contrast, for the Trapezoid Rule, the Midpoint Rule, or the endpoint rules, the number of subintervals could be even or odd.)

The derivation of Simpson’s Rule is more complicated than the derivation of the Trapezoid Rule because we need to use calculus to find the area under these parabolas. To begin, we make the simplifying assumption that x_1 = 0. Since each subinterval has width h, this means that x_0 = -h and x_2 = h.

To find the area under this parabola, we first need to find the equation of the parabola y = ax^2 + bx + c connecting the three points (-h,y_0), (0,y_1), and (h,y_2). This entails solving a system of three equations in three unknowns:

a(-h)^2 + b(-h) + c = y_0

a(0)^2+b(0) + c = y_1

ah^2 + bh + c = y_2,

or

ah^2 - bh + c = y_0

c = y_1

ah^2 + bh + c = y_2.

While most 3×3 systems are cumbersome to solve, this system is straightforward. Clearly, c = y_1. Also, subtracting the first equation from the third equation yields

2bh = y_2 - y_0, or b = \displaystyle \frac{y_2 - y_0}{2h}

Finally, we solve for a by substituting into the third equation:

ah^2 + \displaystyle \frac{y_2 - y_0}{2h} h + y_1 = y_2

ah^2 + \displaystyle \frac{y_2 - y_0}{2} + y_1 = y_2

ah^2 = \displaystyle \frac{y_0 - y_2}{2} - \frac{2y_1}{2} + \frac{2y_2}{2}

ah^2 = \displaystyle \frac{y_0 - 2y_1 + y_2}{2}

a = \displaystyle \frac{y_0 - 2y_1 + y_2}{2h^2}

Next, we find the integral of y = ax^2 + bx + c between x = -h and x = h:

\displaystyle \int_{-h}^h (ax^2 + bx + c) \, dx = \left[ \frac{ax^3}{3} + \frac{bx^2}{2} + cx \right]^h_{-h}

= \displaystyle \left[ \frac{ah^3}{3} + \frac{bh^2}{2} + ch \right] - \left[ -\frac{ah^3}{3} + \frac{bh^2}{2} - ch \right]

= \displaystyle \frac{2ah^3}{3} + 2ch

= \displaystyle \frac{(y_0 - 2y_1 + y_2)h}{3} + 2y_1h

= \displaystyle \frac{h(y_0 + 4y_1 + y_2)}{3}.

We now turn to the more general case of finding the area under the parabola passing through (x_0,y_0), (x_1,y_1), and (x_2,y_2), where x_1 = x_0 +h and x_2 = x_1 + 2h. Geometrically, it should be clear that this parabola can be obtained from the above parabola by a horizontal translation. Since the area under the curve is not changed by a horizontal translation, the area (and the formula) will be the same.

More formally, if y = ax^2 + bx + c passes through the points (-h,y_0), (0,y_1), and (h,y_2), then y = a(x-x_1)^2 + b(x-x_1) + c will pass through the points (x_0,y_0), (x_1,y_1), and (x_2,y_2). The area under this curve is

\displaystyle \int_{x_0}^{x_2} \left[ a(x-x_1)^2 + b(x-x_1) + c \right] \, dx.

After using the substitution u = x-x_1, this becomes

\displaystyle \int_{-h}^h (au^2 + bu + c) \, du,

which is the same integral that we saw earlier. Therefore,

\displaystyle \int_{x_0}^{x_2} \left[ a(x-x_1)^2 + b(x-x_1) + c \right] \, dx = \displaystyle \frac{h(y_0 + 4y_1 + y_2)}{3}.

Finally, we need to find the sum of the areas under all of these parabolas. Similarly, the area under the parabola passing through (x_2,y_2), (x_3,y_3), and (x_4,y_4) will be \displaystyle \frac{h(y_2 + 4y_3 + y_4)}{3}. So, for the particular example shown above, the total area under the parabolas will be

\displaystyle \frac{h(y_0 + 4y_1 + y_2)}{3} + \frac{h(y_2 + 4y_3 + y_4)}{3} = \frac{h}{3} (y_0 + 4 y_1 + 2 y_2 + 4 y_3 + y_4).

The coefficients of 4 arose from the above integrals, while the coefficient of 2 came from combining the two areas. In general, if there are n subintervals and n is even, then Simpson’s Rule gives the approximation

S_n = \displaystyle \frac{h}{3} \left(y_0 + 4 y_1 + 2 y_2 + 4 y_3 + \dots + 2y_{n-2} + 4 y_{n-1} +  y_{n} \right).

Thoughts on Numerical Integration (Part 4): Derivation of Trapezoid Rule

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In the previous post in this series, I discussed three different ways of numerically approximating the definite integral \displaystyle \int_a^b f(x) \, dx, the area under a curve f(x) between x=a and x=b.

In this series, we’ll choose equal-sized subintervals of the interval [a,b]. If h = (b-a)/n is the width of each subinterval so that x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right] \equiv L_n

using left endpoints,

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right] \equiv R_n

using right endpoints, and

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right] \equiv M_n

using the midpoints of the subintervals.

All three of these approximations were obtained by approximating the above shaded region by rectangles. However, perhaps it might be better to use some other shape besides rectangles. In the Trapezoidal Rule, we approximate the area by using (surprise!) trapezoids, as in the figure below.

The first trapezoid has height h and bases f(x_0) and f(x_1), and so the area of the first trapezoid is \frac{1}{2} h[ f(x_0) + f(x_1) ]. The other areas are found similarly. Adding these together, we get the approximation

T_n = \displaystyle \frac{h}{2}[f(x_0) + f(x_1)] + \frac{h}{2} [f(x_1) + f(x_2)] + \dots +

+ \displaystyle \frac{h}{2} [f(x_{n-2})+f(x_{n-1})] + \frac{h}{2} [f(x_{n-1})+f(x_n)]

= \displaystyle \frac{h}{2} [f(x_0) + 2f(x_1) + 2f(x_2) + \dots + 2f(x_{n-2}) + 2f(x_{n-1}) + f(x_n)].

Interestingly, T_n is the average of the two endpoint approximations L_n and R_n:

\displaystyle \frac{L_n+R_n}{2} =  \frac{L_n}{2} + \frac{R_n}{2}

= \displaystyle \frac{h}{2} \left[f(x_0) + f(x_1) + f(x_2) + \dots + f(x_{n-1}) \right]

+\displaystyle \frac{h}{2} \left[f(x_1) + f(x_2) + \dots + f(x_{n-1}) + f(x_{n}) \right]

= \displaystyle \frac{h}{2} \left[f(x_0) + 2f(x_1) + \dots + 2f(x_{n-1}) + f(x_n) \right]

= T_n.

Of course, as a matter of computation, it’s a lot quicker to directly compute T_n instead of computing L_n and R_n separately and then averaging.

 

 

Thoughts on Numerical Integration (Part 3): Derivation of left, right, and midpoint rules

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

For the sake of completeness, I discuss here the origins of the left-endpoint, right-endpoint, and midpoint rules of numerical integration. (These topics are often presented in calculus texts.) Consider the problem of finding $\displaystyle \int_a^b f(x) \, dx$, the area under a curve f(x) between x=a and x=b.

To start the process of numerical integration, the interval [a,b] is divided into subintervals. Usually, for convenience, the intervals are chosen to be the same length, a convention that I’ll follow in this series. That said, if the function is known to vary wildly on some parts of the domain but not so wildly on other parts, then computational efficiency can be gained by varying the sizes of the subintervals, choosing smaller subintervals for the places where the function varies wildly.

In any event, we’ll choose equal-sized subintervals for the duration of this series.

One numerical approximation can be made by choosing left endpoints. In the picture below, the interval [a,b] was divided into four equal subintervals. Let h = (b-a)/4, so that x_0 = a, x_1 = x_0 +h, x_2 = x_0 + 2h, x_3 = x_0 + 3h, and x_4 = x_0 + 4h = b. We then can draw rectangles using the left endpoints of each subinterval. The sum of the areas of these rectangles below is

hf(x_0) + hf(x_1) + hf(x_2) +hf(x_3),

and so this serves as an approximation to the area under the curve. In general, if there are n subintervals and x_k = x_0 + kh, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right]

That said, left endpoints were not necessary for making an approximation. We could have instead chosen the right endpoints of each subinterval. The sum of the areas of the rectangles below is

hf(x_1) + hf(x_2) + hf(x_3) +hf(x_4),

and so this serves as an approximation to the area under the curve. In general, if there are n subintervals, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right]

As a final approximation, any point in each subinterval could’ve been used for making an approximation. In the picture below, we use the midpoints of the subintervals, where c_k = (x_k + x_{k-1})/2. The sum of the areas of the rectangles below is

hf(c_1) + hf(c_2) + hf(c_3) +hf(c_4),

and so this serves as an approximation to the area under the curve. In general, if there are n subintervals, then the integral may be approximated as

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right]

 

Thoughts on Numerical Integration (Part 2): The bell curve

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In this post, I’d like to take a closer look at the indefinite integral \displaystyle \int e^{-x^2} dx, which is closely related to the area under the bell curve \displaystyle \frac{1}{\sqrt{2\pi}} e^{-x^2/2} dx. This integral cannot be computed using elementary functions. However, using integration by parts, there are some related integrals that can be computed:

\displaystyle \int x e^{-x^2} dx = -\displaystyle \frac{1}{2} e^{-x^2}

\displaystyle \int x^3 e^{-x^2} dx = -\displaystyle \frac{x^2+1}{2} e^{-x^2}

\displaystyle \int x^5 e^{-x^2} dx = -\displaystyle \frac{x^4+2x^2+2}{2} e^{-x^2}

\displaystyle \int x^7 e^{-x^2} dx = -\displaystyle \frac{x^6+3x^4+6x^2+6}{2} e^{-x^2}

Based on these examples, it stands to reason that, if \displaystyle \int e^{-x^2} dx can be written in terms of elementary functions, it should have the form

\displaystyle \int e^{-x^2} dx = f(x) e^{-x^2},

where f(x) is some polynomial to be determined. We will now show that this is impossible.

Suppose f(x) = \displaystyle \sum_{k=0}^n a_k x^k, a polynomial of degree n to be determined. Then we have

\displaystyle \frac{d}{dx} \left[ f(x) e^{-x^2} \right] = e^{-x^2}

or

f'(x) e^{-x^2} - 2 x f(x) e^{-x^2} =e^{-x^2}

or

f'(x) - 2x f(x) = 1.

In other words, all terms on the left-hand side except the constant term must cancel. However, this is impossible: 2x f(x) is a polynomial of degree n+1 while f'(x) is a polynomial of degree n-1. Therefore, the left hand side must have degree n+1 and therefore cannot be a constant.

A similar argument shows that f(x) cannot have the form f(x) = \displaystyle \sum_{k=0}^n a_k x^{b_k}, where the exponents b_k may or may not be integers.

This may be enough to convince a calculus student that there is no elementary antiderivative of \displaystyle e^{-x^2} dx. Indeed, although the proof goes well beyond first-year calculus, there is a theorem that says that if \displaystyle \int x^a e^{bx^2} can be expressed in terms of elementary functions, then the antiderivative must have the form f(x) e^{b x^2}. So the guess above actually can be rigorously justified. References:

  • Elena Anne Marchisotto and Gholam-Ali Zakeri, “An Invitation to Integration in Finite Terms,” The College Mathematics Journal , Sep., 1994, Vol. 25, No. 4 (Sep., 1994), pp. 295-308
  • J. F. Ritt, Integration in Finite Terms: Liouville’s Theory of Elementary Methods, Columbia University Press, New York, 1948

Thoughts on Numerical Integration (Part 1): Why numerical integration?

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in a introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

First, let’s talk about why numerical integration is necessary in the first place. Indeed, I can still remember a high school calculus teacher asking me this question nearly 20 years ago, and this question really got me thinking about what we’re collectively teaching in the secondary curriculum. Indeed, in a Calculus I course, it seems like every integral can be computed if only the proper trick is used. We teach students to search for these different tricks:

  • Let u = x^2+5 to find \displaystyle \int \frac{6x \, dx}{\sqrt{x^2+9}}.
  • Let x = 3\tan \theta to find \displaystyle \int \frac{6 \, dx}{\sqrt{x^2+9}}
  • Use integration by parts to find \displaystyle \int x^3 e^x \, dx

In fact, we teach so many tricks that we may give the impression that every integral can be computed if only the proper trick is employed. Indeed, my university hosts an annual “Integration Bee” that challenges students to find the right technique(s) to evaluate some pretty tough integrals.

Unfortunately, not every integral can be solved in terms of a finite number of elementary functions (polynomials, rational functions, exponential functions, logarithms, trigonometric and inverse trigonometric functions). One function that is commonly known to many students which does not have an elementary antiderivative is \displaystyle \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, otherwise known as the bell curve. For most numbers a and b, the area

\displaystyle \int_a^b \frac{1}{\sqrt{2\pi}} e^{-x^2/2}

cannot be found exactly, and so we ask students to either use a table in the back of the textbook or else use a function on their scientific calculators to find the answer.

Just for the fun of it, I went through my Ph.D. thesis and wrote down some of the integrals that I had to integrate numerically while in school. As an applied mathematician, I was initially stunned by the teacher’s innocent question because so much of my work would be utterly impossible if it wasn’t for numerical integration. Here are some of the easier ones:

  • \displaystyle \int_0^t \frac{1-e^{-x}}{x} dx
  • \displaystyle \int_{2R}^\infty t^2 g(t) \left( \frac{a_1 t^4 + a_2 t^2 + a_3}{(t^2-R^2)^7} +\frac{b_1 t^2 + b_2}{(t^2-R^2)^5} + \frac{c}{(t^2-R^2)^3} \right) dt
  • \displaystyle \int_{d_2}^\infty \int_0^{d_1} \frac{y^2-x^2}{(x^2+y^2)^2} \left(e^{-a(x+d_1)-b d_2} - c\right) dx \, dy
  • \displaystyle \int_{x/2}^\infty \sqrt{r^2 - k^2/4} \phi(r) \, dr
  • \displaystyle \int_{x/2}^\infty \left( \frac{z \sqrt{4r^2-z^2}}{4} + r^2 \arcsin \left( \frac{z}{2r} \right) \right) \phi(r) \, dr
  • \displaystyle \int_0^{2R} e^{-sz} \exp \left[ -c \left( z \sqrt{4R^2-z^2}  + 4R^2 \arcsin \frac{z}{2R} \right) \right] dz
  • \displaystyle \int_0^d \exp \left[ -sz - \lambda \left(z - \frac{z^2}{4d} \right) \right] dz
  • \displaystyle \int_d^{d \sqrt{2}} \exp \left[ -sz - \lambda \left( \frac{d (\pi+1)}{2} - d \arcsin \frac{d}{z} + \frac{z^2}{4d} - \sqrt{z^2-d^2} \right) \right] dz
  • \displaystyle \int_0^\infty \exp \left[-sz - \eta \left(1 - e^{-cz/2} - \frac{cz}{4} e^{-cz/2} \right) \right] dz
  • \displaystyle \int_{-\infty}^x \frac{e^t}{t} dt

All this to say, there are plenty of integrals that arise from a real-world context that have a numerical answer but cannot be computed using the techniques commonly taught in the first-year calculus sequence.