Thoughts on Numerical Integration (Part 16): Midpoint rule and local rate of convergence

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:
  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?
In this series, I hope to answer these questions. While these are standard questions in an introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and MathWorld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.
In this post, we will perform an error analysis for the Midpoint Rule

\int_a^b f(x) \, dx \approx h \left[f(c_1) + f(c_2) + \dots + f(c_n) \right] \equiv M_n

where n is the number of subintervals and h = (b-a)/n is the width of each subinterval, so that x_k = x_0 + kh. Also, c_i = (x_{i-1} + x_i)/2 is the midpoint of the ith subinterval.
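For concreteness, here is a minimal Python sketch of how M_n could be computed; the function name midpoint_rule and the choice of test integrand are purely illustrative, not part of the rule itself.

```python
def midpoint_rule(f, a, b, n):
    """Approximate the integral of f on [a, b] using n subintervals,
    evaluating f at the midpoint of each subinterval."""
    h = (b - a) / n
    # The midpoints are a + (i + 1/2) * h for i = 0, 1, ..., n - 1.
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Example: the exact value of the integral of x^9 on [1, 2] is 1023/10 = 102.3.
print(midpoint_rule(lambda x: x**9, 1, 2, 100))  # about 102.29, slightly below 102.3
```
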
As noted in previous posts of this series, a true exploration of error analysis requires the generalized mean-value theorem, which is perhaps a bit much for a talented high school student learning about this technique for the first time. That said, the ideas behind the proof are accessible to high school students, using only ideas from the secondary curriculum (especially the Binomial Theorem), if we restrict our attention to the special case f(x) = x^k, where k \ge 5 is a positive integer.

For this special case, the true area under the curve f(x) = x^k on the subinterval [x_i, x_i +h] will be

\displaystyle \int_{x_i}^{x_i+h} x^k \, dx = \frac{1}{k+1} \left[ (x_i+h)^{k+1} - x_i^{k+1} \right]

= \displaystyle \frac{1}{k+1} \left[x_i^{k+1} + {k+1 \choose 1} x_i^k h + {k+1 \choose 2} x_i^{k-1} h^2 + {k+1 \choose 3} x_i^{k-2} h^3 + {k+1 \choose 4} x_i^{k-3} h^4+ {k+1 \choose 5} x_i^{k-4} h^5+ O(h^6) - x_i^{k+1} \right]

= \displaystyle \frac{1}{k+1} \bigg[ (k+1) x_i^k h + \frac{(k+1)k}{2} x_i^{k-1} h^2 + \frac{(k+1)k(k-1)}{6} x_i^{k-2} h^3+ \frac{(k+1)k(k-1)(k-2)}{24} x_i^{k-3} h^4

+ \displaystyle \frac{(k+1)k(k-1)(k-2)(k-3)}{120} x_i^{k-4} h^5 \bigg] + O(h^6)

= x_i^k h + \displaystyle \frac{k}{2} x_i^{k-1} h^2 + \frac{k(k-1)}{6} x_i^{k-2} h^3 + \frac{k(k-1)(k-2)}{24} x_i^{k-3} h^4 + \frac{k(k-1)(k-2)(k-3)}{120} x_i^{k-4} h^5 + O(h^6)

In the above, the shorthand O(h^6) can be formally defined, but here we’ll just take it to mean “terms that have a factor of h^6 or higher that we’re too lazy to write out.” Since h is supposed to be a small number, these terms will be small in magnitude and thus can be safely ignored. I wrote the above formula to include terms up to and including h^5 because I’ll need this later in this series of posts. For now, looking only at the Midpoint Rule, it will suffice to write this integral as

\displaystyle \int_{x_i}^{x_i+h} x^k \, dx =x_i^k h + \displaystyle \frac{k}{2} x_i^{k-1} h^2 + \frac{k(k-1)}{6} x_i^{k-2} h^3 + O(h^4).
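
Readers who would rather not grind through the binomial algebra by hand can spot-check this expansion with a few lines of SymPy; the choice k = 9 below is purely illustrative, and any exponent works the same way.

```python
import sympy as sp

x_i, h, t = sp.symbols('x_i h t', positive=True)
k = 9  # illustrative exponent; the series only needs k >= 5 for later posts

# Exact integral of t^k from x_i to x_i + h
exact = sp.integrate(t**k, (t, x_i, x_i + h))

# Expand in powers of h, keeping terms through h^3 and discarding O(h^4)
truncated = sp.series(exact, h, 0, 4).removeO()
print(sp.expand(truncated))
# For k = 9 this prints the three terms h*x_i**9, 9*h**2*x_i**8/2, and 12*h**3*x_i**7
# (in some order), matching x_i^k h + (k/2) x_i^{k-1} h^2 + (k(k-1)/6) x_i^{k-2} h^3
# since 9*8/6 = 12.
```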

Using the midpoint of the subinterval, the Midpoint Rule approximation of \displaystyle \int_{x_i}^{x_i+h} x^k \, dx is \displaystyle \left(x_i+ \frac{h}{2} \right)^k h. Using the Binomial Theorem, this expands as

 x_i^k h + \displaystyle {k \choose 1} x_i^{k-1} \frac{h^2}{2}  + {k \choose 2} x_i^{k-2} \frac{h^3}{4} + {k \choose 3} x_i^{k-3} \frac{h^4}{8}  + {k \choose 4} x_i^{k-4} \frac{h^5}{16} + O(h^6)

 = x_i^k h + \displaystyle \frac{k}{2} x_i^{k-1} h^2  + \frac{k(k-1)}{8} x_i^{k-2} h^3 + \frac{k(k-1)(k-2)}{48} x_i^{k-3} h^4

\displaystyle + \frac{k(k-1)(k-2)(k-3)}{384} x_i^{k-4} h^5 + O(h^6)

Once again, this is a little bit overkill for the present purposes, but we’ll need this formula later in this series of posts. Truncating somewhat earlier, we find that the Midpoint Rule for this subinterval gives

x_i^k h + \displaystyle \frac{k}{2} x_i^{k-1} h^2  + \displaystyle \frac{k(k-1)}{8} x_i^{k-2} h^3 + O(h^4)

Subtracting from the actual integral, the error in this approximation will be equal to

\displaystyle x_i^k h + \frac{k}{2} x_i^{k-1} h^2 + \frac{k(k-1)}{6} x_i^{k-2} h^3 - x_i^k h - \frac{k}{2} x_i^{k-1} h^2  - \frac{k(k-1)}{8} x_i^{k-2} h^3 + O(h^4)

= \displaystyle \frac{k(k-1)}{24} x_i^{k-2} h^3 + O(h^4)

In other words, unlike the left-endpoint and right-endpoint approximations, both of the first two terms x_i^k h and \displaystyle \frac{k}{2} x_i^{k-1} h^2 cancel perfectly, leaving us with a local error on the order of h^3.
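Before moving on to the global error, here is a quick numerical sanity check of this local estimate; the values k = 9 and x_i = 1.5 are chosen purely for illustration.

```python
# Compare the actual one-subinterval error of the Midpoint Rule with the
# predicted leading term k(k-1)/24 * x_i^(k-2) * h^3.
k, x_i = 9, 1.5

for h in [0.1, 0.01, 0.001]:
    exact = ((x_i + h)**(k + 1) - x_i**(k + 1)) / (k + 1)
    midpoint = (x_i + h / 2)**k * h
    predicted = k * (k - 1) / 24 * x_i**(k - 2) * h**3
    print(h, exact - midpoint, predicted)
# Both error columns drop by roughly a factor of 1000 each time h is cut by 10,
# and they agree more and more closely as h shrinks.
```
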
The logic for determining the global error is much the same as what we used earlier for the left-endpoint rule. The total error when approximating \displaystyle \int_a^b x^k \, dx = \int_{x_0}^{x_n} x^k \, dx will be the sum of the errors for the integrals over [x_0,x_1], [x_1,x_2], through [x_{n-1},x_n]. Therefore, the total error will be

E \approx \displaystyle \frac{k(k-1)}{24} \left(x_0^{k-2} + x_1^{k-2} + \dots + x_{n-1}^{k-2} \right) h^3.

So that this formula doesn’t appear completely mystical, let’s check it against the numerical observations that we made earlier in this series for the Midpoint Rule approximations to \displaystyle \int_1^2 x^9 \, dx for different numbers of subintervals. If we take n = 100 and h = 0.01, then the error should be approximately equal to

\displaystyle \frac{9 \times 8}{24} \left(1^7 + 1.01^7 + \dots + 1.99^7 \right) (0.01)^3 \approx 0.0093731,

which, as expected, is close to the actual error of 102.3 - 102.2904379 \approx 0.00956211.
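These two numbers are easy to reproduce with a short Python sketch (again with k = 9, a = 1, b = 2, and n = 100):

```python
k, a, b, n = 9, 1.0, 2.0, 100
h = (b - a) / n

exact = (b**(k + 1) - a**(k + 1)) / (k + 1)                    # 102.3
midpoint = h * sum((a + (i + 0.5) * h)**k for i in range(n))   # about 102.29044
predicted = k * (k - 1) / 24 * sum((a + i * h)**(k - 2) for i in range(n)) * h**3

print(exact - midpoint)   # actual error, about 0.00956
print(predicted)          # predicted error, about 0.00937
```
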
Let y_i = x_i^{k-2}, so that the error becomes

E \approx \displaystyle \frac{k(k-1)}{24} \left(y_0 + y_1 + \dots + y_{n-1} \right) h^3 + O(h^4) = \displaystyle \frac{k(k-1)}{24} \overline{y} n h^3,

where \overline{y} = (y_0 + y_1 + \dots + y_{n-1})/n is the average of the y_i. Clearly, this average is somewhere between the smallest and the largest of the y_i. Since y = x^{k-2} is a continuous function, that means that there must be some value of x_* between x_0 and x_{n-1} — and therefore between a and b — so that x_*^{k-2} = \overline{y} by the Intermediate Value Theorem. We conclude that the error can be written as

E \approx \displaystyle \frac{k(k-1)}{24} x_*^{k-2} nh^3.

Finally, since h is the length of one subinterval, we see that nh = b-a is the total length of the interval [a,b]. Therefore,

E \approx \displaystyle \frac{k(k-1)}{24} x_*^{k-2} (b-a)h^2 \equiv ch^2,

where the constant c is determined by a, b, and k. In other words, for the special case f(x) = x^k, we have established that the error from the Midpoint Rule is approximately quadratic in h, confirming the numerical observations we made earlier, without resorting to the generalized mean-value theorem.
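
As one final check, a short sketch that doubles n a few times shows the error falling by roughly a factor of 4 each time, exactly what “quadratic in h” predicts (the specific values of n are arbitrary):

```python
k, a, b = 9, 1.0, 2.0
exact = (b**(k + 1) - a**(k + 1)) / (k + 1)

def midpoint_error(n):
    h = (b - a) / n
    return exact - h * sum((a + (i + 0.5) * h)**k for i in range(n))

for n in [25, 50, 100, 200]:
    print(n, midpoint_error(n), midpoint_error(n) / midpoint_error(2 * n))
# The ratio in the last column hovers near 4: halving h cuts the error
# by about a factor of 4, consistent with an error proportional to h^2.
```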

Thoughts on Numerical Integration (Part 15): Right endpoint rule and global rate of convergence

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:
  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?
In this series, I hope to answer these questions. While these are standard questions in an introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and MathWorld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.
In the previous post in this series, we found that the local error of the right endpoint approximation to \displaystyle \int_{x_i}^{x_i+h} x^k \, dx was equal to 

\displaystyle \frac{k}{2} x_i^{k-1} h^2 + O(h^3).

We now consider the global error when integrating over the interval [a,b] and not just a particular subinterval.
The total error when approximating \displaystyle \int_a^b x^k \, dx = \int_{x_0}^{x_n} x^k \, dx will be the sum of the errors for the integrals over [x_0,x_1], [x_1,x_2], through [x_{n-1},x_n]. Therefore, the total error will be

E \approx \displaystyle \frac{k}{2} \left(x_1^{k-1} + x_2^{k-1} + \dots + x_{n}^{k-1} \right) h^2.

So that this formula doesn’t appear completely mystical, let’s check it against the numerical observations that we made earlier in this series for the right-endpoint approximations to \displaystyle \int_1^2 x^9 \, dx for different numbers of subintervals. If we take n = 100 and h = 0.01, then the error should be approximately equal to

\displaystyle \frac{9}{2} \left(1.01^8 + 1.02^8 + \dots + 2^8 \right) (0.01)^2 \approx 2.61276,

which, as expected, is close to the actual error of 104.8741246 - 102.3 \approx 2.57412.
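Here is a short Python sketch that reproduces both numbers (k = 9, a = 1, b = 2, n = 100):

```python
k, a, b, n = 9, 1.0, 2.0, 100
h = (b - a) / n

exact = (b**(k + 1) - a**(k + 1)) / (k + 1)                  # 102.3
right = h * sum((a + i * h)**k for i in range(1, n + 1))     # about 104.874
predicted = k / 2 * sum((a + i * h)**(k - 1) for i in range(1, n + 1)) * h**2

print(right - exact)   # actual error, about 2.574
print(predicted)       # predicted error, about 2.613
```
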
We now perform a more detailed analysis of the global error, which is almost a perfect copy-and-paste from the previous analysis. Let y_i = x_i^{k-1}, so that the error becomes

E \approx \displaystyle \frac{k}{2} \left(y_1 + y_2 + \dots + y_n \right) h^2 + O(h^3) = \displaystyle \frac{k}{2} \overline{y} n h^2,

where \overline{y} = (y_1 + y_2 + \dots + y_{n})/n is the average of the y_i. Clearly, this average is somewhere between the smallest and the largest of the y_i. Since y = x^{k-1} is a continuous function, that means that there must be some value of x_* between x_1 and x_{n} — and therefore between a and b — so that x_*^{k-1} = \overline{y} by the Intermediate Value Theorem. We conclude that the error can be written as

E \approx \displaystyle \frac{k}{2} x_*^{k-1} nh^2.

Finally, since h is the length of one subinterval, we see that nh = b-a is the total length of the interval [a,b]. Therefore,

E \approx \displaystyle \frac{k}{2} x_*^{k-1} (b-a)h \equiv ch,

where the constant c is determined by a, b, and k. In other words, for the special case f(x) = x^k, we have established that the error from the right-endpoint rule is approximately linear in h — without resorting to the generalized mean-value theorem.

Thoughts on Numerical Integration (Part 14): Right endpoint rule and local rate of convergence

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in an introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and MathWorld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In this post, we will perform an error analysis for the right-endpoint rule

\int_a^b f(x) \, dx \approx h \left[f(x_1) + f(x_2) + \dots + f(x_n) \right] \equiv R_n

where n is the number of subintervals and h = (b-a)/n is the width of each subinterval, so that x_k = x_0 + kh.

As noted in previous posts of this series, a true exploration of error analysis requires the generalized mean-value theorem, which is perhaps a bit much for a talented high school student learning about this technique for the first time. That said, the ideas behind the proof are accessible to high school students, using only ideas from the secondary curriculum, if we restrict our attention to the special case f(x) = x^k, where k \ge 5 is a positive integer.

For this special case, the true area under the curve f(x) = x^k on the subinterval [x_i, x_i +h] will be

\displaystyle \int_{x_i}^{x_i+h} x^k \, dx = \frac{1}{k+1} \left[ (x_i+h)^{k+1} - x_i^{k+1} \right]

= \displaystyle \frac{1}{k+1} \left[x_i^{k+1} + {k+1 \choose 1} x_i^k h + {k+1 \choose 2} x_i^{k-1} h^2 + O(h^3) - x_i^{k+1} \right]

= \displaystyle \frac{1}{k+1} \left[ (k+1) x_i^k h + \frac{(k+1)k}{2} x_i^{k-1} h^2 + O(h^3) \right]

= x_i^k h + \displaystyle \frac{k}{2} x_i^{k-1} h^2 + O(h^3)

In the above, the shorthand O(h^3) can be formally defined, but here we’ll just take it to mean “terms that have a factor of h^3 or higher that we’re too lazy to write out.” Since h is supposed to be a small number, these terms will be much smaller in magnitude than the terms that have h or h^2 and thus can be safely ignored.

Using only the right endpoint of the subinterval, the right-endpoint approximation of \displaystyle \int_{x_i}^{x_i+h} x^k \, dx is

(x_i+h)^k h = x_i^k h + k x_i^{k-1} h^2 + O(h^3).

Subtracting, the error in this approximation will be equal to

\displaystyle x_i^k h + k x_i^{k-1} h^2 - x_i^k h - \frac{k}{2} x_i^{k-1} h^2 + O(h^3) = \displaystyle \frac{k}{2} x_i^{k-1} h^2 + O(h^3)

Repeating the logic from the previous post in this series, this local error on [x_i, x_i+h], which is proportional to h^2, generates a total error on [a,b] that is proportional to h. That is, the right-endpoint rule has an error that is approximately linear in h, confirming the numerical observation that we made earlier in this series.
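
A quick numerical sanity check of this local h^2 behavior, with k = 9 and x_i = 1.5 chosen purely for illustration:

```python
# Compare the actual one-subinterval error of the right-endpoint rule with
# the predicted leading term (k/2) * x_i^(k-1) * h^2.
k, x_i = 9, 1.5

for h in [0.1, 0.01, 0.001]:
    exact = ((x_i + h)**(k + 1) - x_i**(k + 1)) / (k + 1)
    right = (x_i + h)**k * h
    predicted = k / 2 * x_i**(k - 1) * h**2
    print(h, right - exact, predicted)
# Both error columns shrink by roughly a factor of 100 each time h is cut by 10,
# and they agree more closely as h gets smaller.
```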

Engaging students: Finding the area of regular polygons

In my capstone class for future secondary math teachers, I ask my students to come up with ideas for engaging their students with different topics in the secondary mathematics curriculum. In other words, the point of the assignment was not to devise a full-blown lesson plan on this topic. Instead, I asked my students to think about three different ways of getting their students interested in the topic in the first place.

I plan to share some of the best of these ideas on this blog (after asking my students’ permission, of course).

This student submission comes from my former student Conner Dunn. His topic, from Geometry: finding the area of regular polygons.

green line

How can technology be used to effectively engage students with this topic? Note:  It’s not enough to say “such-and-such is a great website”; you need to explain in some detail why it’s a great website.

A good way to get students into the concept and see its real-life use is to give them a realistic problem. The natural world doesn’t typically give you perfectly regular polygons, but we certainly like making them ourselves. Better yet, we can make regular polygons of a certain area knowing the methodology behind computing their areas. Using GeoGebra, I can challenge students to construct a regular polygon of an exact area using what they know about the use of equilateral triangles to compute area.

For example, let’s say I ask for a hexagon with an area of 6√3 square units. While there are a few strategies for constructing a regular hexagon that Geometry students may know of, the strategy to shoot for here is to recognize that this means we want a hexagon with a side length of 2 and then construct the triangles. GeoGebra allows students to use line segments and give them certain lengths, as well as construct angles using a virtual compass tool. Below is the solution to this example, constructing 6 equilateral triangles (each with an area of √3 square units) to form the regular hexagon.

[Image: geogebra-hexagon.png — the GeoGebra construction described above]

green line

How does this topic extend what your students should have learned in previous courses?

By the end of the unit, students will have learned the formula for finding the area of a regular polygon (A = (1/2) * a * p, with a being the apothem and p being the perimeter). But a big part of this unit is how we derive the formula from the process in which we solve for the area using this equilateral triangle method. From many previous courses, students will have learned both the order of operations and properties of equality in equations, and we use this previous knowledge to connect a geometric understanding of area to an algebraic one. For example, when we have the idea of multiplying the area of an equilateral triangle by the number of sides, n, in the polygon, we have A = (1/2) b * h * n. It is by use of the commutative property that students can rearrange the variables like this: A = (1/2) h * (b * n). And then we conclude that b * n is the perimeter of the polygon, which reveals the role the perimeter plays in our equation. This may seem subtle, but students who are fluent in this knowledge can work through these geometric derivations much more easily.
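
As a quick sanity check, tying this formula back to the hexagon example from the first section: a side length of 2 gives a perimeter of p = 12 and an apothem of a = √3 (the height of one of the equilateral triangles), so A = (1/2) * a * p = (1/2)(√3)(12) = 6√3, which matches adding up the six equilateral triangles of area √3 each.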

green line

How has this topic appeared in high culture (art, classical music, theatre, etc.)?

A big part of the method for understanding area of polygons is seeing how we can perfectly fit equilateral triangles inside of our polygons of choice. Perfectly fitting shapes into and around other shapes is something you see in mosaic art everywhere, particularly in Islamic architecture.

While mosaic artists are not necessarily calculating the areas of their art pieces (they might, but I doubt it), a big part of what makes these buildings so nice to look at is how the shapes fit with one another so nicely. This is an art that’s very intentional in its aesthetic appeal. This is something I think Geometry students can take to heart when confronting Geometry problems (and just as well in future courses). It’s the overlooked skill of literally connecting pieces together in order to get something we want. In the case of the Islamic architect, what we want is a pleasing building to look at, but math, of course, brings in more possible things to shoot for and equips us with plenty of pieces to (literally) connect together.

Thoughts on Numerical Integration (Part 13): Left endpoint rule and global rate of convergence

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:
  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?
In this series, I hope to answer these questions. While these are standard questions in an introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and MathWorld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.
In the previous post in this series, we found that the local error of the left endpoint approximation to \displaystyle \int_{x_i}^{x_i+h} x^k \, dx was equal to 

\displaystyle \frac{k}{2} x_i^{k-1} h^2 + O(h^3).

We now consider the global error when integrating over the interval [a,b] and not just a particular subinterval.
The total error when approximating \displaystyle \int_a^b x^k \, dx = \int_{x_0}^{x_n} x^k \, dx will be the sum of the errors for the integrals over [x_0,x_1], [x_1,x_2], through [x_{n-1},x_n]. Therefore, the total error will be

E \approx \displaystyle \frac{k}{2} \left(x_0^{k-1} + x_1^{k-1} + \dots + x_{n-1}^{k-1} \right) h^2.

So that this formula doesn’t appear completely mystical, let’s check it against the numerical observations that we made earlier in this series for the left-endpoint approximations to \displaystyle \int_1^2 x^9 \, dx for different numbers of subintervals. If we take n = 100 and h = 0.01, then the error should be approximately equal to

\displaystyle \frac{9}{2} \left(1^8 + 1.01^8 + \dots + 1.99^8 \right) (0.01)^2 \approx 2.49801,

which, as expected, is close to the actual error of 102.3 - 99.76412456 \approx 2.53588.
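As before, a short Python sketch reproduces both numbers (k = 9, a = 1, b = 2, n = 100):

```python
k, a, b, n = 9, 1.0, 2.0, 100
h = (b - a) / n

exact = (b**(k + 1) - a**(k + 1)) / (k + 1)              # 102.3
left = h * sum((a + i * h)**k for i in range(n))         # about 99.764
predicted = k / 2 * sum((a + i * h)**(k - 1) for i in range(n)) * h**2

print(exact - left)   # actual error, about 2.536
print(predicted)      # predicted error, about 2.498
```
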
We now perform a more detailed analysis of the global error. Let y_i = x_i^{k-1}, so that the error becomes

E \approx \displaystyle \frac{k}{2} \left(y_0 + y_1 + \dots + y_{n-1} \right) h^2 + O(h^3) = \displaystyle \frac{k}{2} \overline{y} n h^2,

where \overline{y} = (y_0 + y_1 + \dots + y_{n-1})/n is the average of the y_i. Clearly, this average is somewhere between the smallest and the largest of the y_i. Since y = x^{k-1} is a continuous function, that means that there must be some value of x_* between x_0 and x_{n-1} — and therefore between a and b — so that x_*^{k-1} = \overline{y} by the Intermediate Value Theorem. We conclude that the error can be written as

E \approx \displaystyle \frac{k}{2} x_*^{k-1} nh^2.

Finally, since h is the length of one subinterval, we see that nh = b-a is the total length of the interval [a,b]. Therefore,

E \approx \displaystyle \frac{k}{2} x_*^{k-1} (b-a)h \equiv ch,

where the constant c is determined by a, b, and k. In other words, for the special case f(x) = x^k, we have established that the error from the left-endpoint rule is approximately linear in h — without resorting to the generalized mean-value theorem.
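
To see this linear behavior in action, here is a short sketch that doubles n a few times and watches the error fall by roughly a factor of 2 each time (the specific values of n are arbitrary):

```python
k, a, b = 9, 1.0, 2.0
exact = (b**(k + 1) - a**(k + 1)) / (k + 1)

def left_error(n):
    h = (b - a) / n
    return exact - h * sum((a + i * h)**k for i in range(n))

for n in [25, 50, 100, 200]:
    print(n, left_error(n), left_error(n) / left_error(2 * n))
# The ratio in the last column approaches 2: halving h roughly halves the
# error, consistent with an error proportional to h.
```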

Engaging students: Finding the area of a circle

In my capstone class for future secondary math teachers, I ask my students to come up with ideas for engaging their students with different topics in the secondary mathematics curriculum. In other words, the point of the assignment was not to devise a full-blown lesson plan on this topic. Instead, I asked my students to think about three different ways of getting their students interested in the topic in the first place.

I plan to share some of the best of these ideas on this blog (after asking my students’ permission, of course).

This student submission comes from my former student Brendan Gunnoe. His topic, from Geometry: finding the area of a circle.

green line

History: Squaring the circle

The ancient Greeks and other groups at the time had a fascination with geometry. These cultures tended to like thinking in terms of simpler geometric shapes, such as circles, equilateral triangles, and squares. One of the classic problems proposed by these ancient peoples was “Can you create a square with the same area as a given circle in finitely many steps, using only a compass and straightedge?” This problem stood for thousands of years, stumping even the most brilliant mathematicians who attempted it. Eventually, in the year 1882, it was finally proven impossible because of a property of the number π. It’s easy to see that π isn’t an integer, and it can also be shown (with considerably more work) that it isn’t rational. What was left to determine was whether π is algebraic (a solution of a polynomial equation with rational coefficients) or transcendental. The proof from 1882 showed that π is in fact transcendental, which is why the construction asked for in the original question cannot be done.

green line

Curriculum: Using limit of triangular approximations to get the integral

The teacher starts off class by drawing a circle with an inscribed triangle, another with a square, and so on until a hexagon is inscribed. The teacher then draws isosceles triangles that originate at the circle’s center and extend to the corners of the polygons. The teacher could ask questions like “What do you notice about the total area of the triangles and the area of the circle as we keep adding sides to the polygon?” and “What do you notice about the triangles we made and the little wedges of the circle, what’s the same and what’s different about them?”. Then the teacher could arrange both the triangles and wedges in an alternating up and down fashion, almost like two zippers, to line up the triangles and wedges. The teacher could ask “What’s the length of the top of the triangles? What about the tops of the wedges, what’s their length?”.

Finally, the teacher asks “What happens when we let the number of pieces get REALLY big? What happens to the difference between the area of the triangles and wedges? What about the tops of the triangles and the tops of the wedges?”. In the limit, the upper edge converges to half of the circumference of the circle and the height of the triangles converges to the radius of the circle. Using this line of thinking, the teacher guides the students into seeing how you can derive the equation for the area of a circle by approximating it with triangles and then looking at what happens in the limit.

green line

Application

A telescope’s lens is what’s used to control how much light gets into the eyepiece. Suppose you’re an astronomer and want to take a photo of the full Moon on a clear night, which gives off 0.25 lumens per second per square meter. Suppose your camera needs to collect a total of at least 3 lumens to produce a good photo and 5 lumens to get an amazing photo. What’s the radius of a lens (in centimeters) that can take a good photo in 10 minutes? What’s the radius of a lens that can take an amazing photo in 10 minutes?

Now suppose you’re working with the Hubble space telescope in low Earth orbit, trying to get photos of a nearby star system. The radius of the main telescope is 120 cm and the star system you want to observe is giving off light at a rate of 10^{-5} lumens per second per square meter. How long will it take to get a good photo with Hubble? What about a great photo?
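
One possible way to work these numbers, assuming the given rates mean lumens collected per second per square meter of lens area, so that total light = rate × lens area × exposure time (the helper functions below are just for illustration):

```python
import math

def lens_radius(rate, lumens_needed, seconds):
    """Radius (in meters) of a circular lens that collects the required light
    in the given time, assuming collected light = rate * area * time."""
    area = lumens_needed / (rate * seconds)   # solve rate * (pi r^2) * t = lumens
    return math.sqrt(area / math.pi)

def exposure_time(rate, lumens_needed, radius_m):
    """Seconds needed for a lens of the given radius to collect the light."""
    return lumens_needed / (rate * math.pi * radius_m**2)

# Full Moon, 10-minute exposure (600 s), 0.25 lumens per second per square meter
print(100 * lens_radius(0.25, 3, 600))   # about 8 cm for a good photo
print(100 * lens_radius(0.25, 5, 600))   # about 10 cm for an amazing photo

# Hubble: radius 120 cm = 1.2 m, 1e-5 lumens per second per square meter
print(exposure_time(1e-5, 3, 1.2) / 3600)   # about 18 hours for a good photo
print(exposure_time(1e-5, 5, 1.2) / 3600)   # about 31 hours for a great photo
```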

https://en.wikipedia.org/wiki/Hubble_Space_Telescope

https://en.wikipedia.org/wiki/Squaring_the_circle

Thoughts on Numerical Integration (Part 12): Left endpoint rule and local rate of convergence

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in an introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and MathWorld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In this post, we will perform an error analysis for the left-endpoint rule

\int_a^b f(x) \, dx \approx h \left[f(x_0) + f(x_1) + \dots + f(x_{n-1}) \right] \equiv L_n

where n is the number of subintervals and h = (b-a)/n is the width of each subinterval, so that x_k = x_0 + kh.

As noted in previous posts of this series, a true exploration of error analysis requires the generalized mean-value theorem, which is perhaps a bit much for a talented high school student learning about this technique for the first time. That said, the ideas behind the proof are accessible to high school students, using only ideas from the secondary curriculum, if we restrict our attention to the special case f(x) = x^k, where k \ge 5 is a positive integer.

For this special case, the true area under the curve f(x) = x^k on the subinterval [x_i, x_i +h] will be

\displaystyle \int_{x_i}^{x_i+h} x^k \, dx = \frac{1}{k+1} \left[ (x_i+h)^{k+1} - x_i^{k+1} \right]

= \displaystyle \frac{1}{k+1} \left[x_i^{k+1} + {k+1 \choose 1} x_i^k h + {k+1 \choose 2} x_i^{k-1} h^2 + O(h^3) - x_i^{k+1} \right]

= \displaystyle \frac{1}{k+1} \left[ (k+1) x_i^k h + \frac{(k+1)k}{2} x_i^{k-1} h^2 + O(h^3) \right]

= x_i^k h + \displaystyle \frac{k}{2} x_i^{k-1} h^2 + O(h^3)

In the above, the shorthand O(h^3) can be formally defined, but here we’ll just take it to mean “terms that have a factor of h^3 or higher that we’re too lazy to write out.” Since h is supposed to be a small number, these terms will be much smaller in magnitude than the terms that have h or h^2 and thus can be safely ignored.

Using only the left-endpoint of the subinterval, the left-endpoint approximation of \displaystyle \int_{x_i}^{x_i+h} x^k \, dx is x_i^k h. Therefore, the error in this approximation will be equal to

\displaystyle \frac{k}{2} x_i^{k-1} h^2 + O(h^3).
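
As a quick numerical sanity check of this estimate (with k = 9 and x_i = 1.5 chosen purely for illustration):

```python
# Compare the actual one-subinterval error of the left-endpoint rule with
# the predicted leading term (k/2) * x_i^(k-1) * h^2.
k, x_i = 9, 1.5

for h in [0.1, 0.01, 0.001]:
    exact = ((x_i + h)**(k + 1) - x_i**(k + 1)) / (k + 1)
    left = x_i**k * h
    predicted = k / 2 * x_i**(k - 1) * h**2
    print(h, exact - left, predicted)
# Both error columns fall by roughly a factor of 100 each time h is cut by 10,
# and they agree more closely as h gets smaller.
```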

In the next post of this series, we’ll show that the global error when integrating between a and b — as opposed to between x_i and x_i + h — is approximately linear in h.