Reminding students about Taylor series (Part 4)

I’m in the middle of a series of posts describing how I remind students about Taylor series. In the previous posts, I described how I lead students to the definition of the Maclaurin series

$f(x) = \displaystyle \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k$ ,

which converges to $f(x)$ within some radius of convergence for all functions that commonly appear in the secondary mathematics curriculum.

Step 4. Let’s now get some practice with Maclaurin series. Let’s start with $f(x) = e^x$ .

What’s $f(0)$ ? That’s easy: $f(0) = e^0 = 1$ .

Next, to find $f'(0)$ , we first find $f'(x)$ . What is it? Well, that’s also easy: $f'(x) = \frac{d}{dx} (e^x) = e^x$ . So $f'(0)$ is also equal to $1$ .

How about $f''(0)$ ? Yep, it’s also $1$ . In fact, it’s clear that $f^{(n)}(0) = 1$ for all $n$ , though we’ll skip the formal proof by induction.

Plugging into the above formula, we find that

$e^x = \displaystyle \sum_{k=0}^{\infty} \frac{1}{k!} x^k = \sum_{k=0}^{\infty} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$

It turns out that the radius of convergence for this power series is $\infty$ . In other words, the series on the right converges for all values of $x$ . Though we’ll skip this for review purposes, this can be formally checked by using the Ratio Test.
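For readers who do want the check, here is a sketch of the Ratio Test computation. The symbol $a_k = x^k / k!$ for the $k$th term is introduced here only for convenience, and $x \ne 0$ is assumed (at $x = 0$ the series is trivially just $1$).

```latex
\lim_{k \to \infty} \left| \frac{a_{k+1}}{a_k} \right|
  = \lim_{k \to \infty} \left| \frac{x^{k+1}}{(k+1)!} \cdot \frac{k!}{x^k} \right|
  = \lim_{k \to \infty} \frac{|x|}{k+1}
  = 0
```

Since this limit is less than $1$ no matter which $x$ is chosen, the series converges for every $x$ .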

At this point, students generally feel confident about the mechanics of finding a Taylor series expansion, and that’s a good thing. However, in my experience, their command of Taylor series is still somewhat artificial. They can go through the motions of taking derivatives and finding the Taylor series, but this complicated symbol in $\displaystyle \sum$ notation still doesn’t have much meaning.

So I shift gears somewhat to discuss the rate of convergence. My hope is to deepen students’ knowledge by getting them to believe that $f(x)$ really can be approximated to high precision with only a few terms. Perhaps not surprisingly, it converges quicker for small values of $x$ than for big values of $x$ .

Pedagogically, I like to use a spreadsheet like Microsoft Excel to demonstrate the rate of convergence. A calculator could be used, but students can see quickly with Excel how quickly (or slowly) the terms get smaller. I usually construct the spreadsheet in class on the fly (the fill down feature is really helpful for doing this quickly), with the end product looking something like this:

In this way, students can immediately see that the Taylor series is accurate to four significant digits by going up to the $x^4$ term and that about ten or eleven terms are needed to get a figure that is as accurate as the precision of the computer will allow. In other words, for all practical purposes, an infinite number of terms are not necessary.
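The same table can be reproduced outside of a spreadsheet. Below is a minimal Python sketch of the idea; the function name `exp_partial_sums` and the value $x = 0.5$ are assumptions chosen for illustration and are not necessarily what appears in the spreadsheet.

```python
import math

def exp_partial_sums(x, n_terms):
    """Return the partial sums of sum_{k>=0} x^k / k!, one per added term."""
    partial_sums = []
    term = 1.0   # the k = 0 term, x^0 / 0!
    total = 0.0
    for k in range(n_terms):
        total += term
        partial_sums.append(total)
        term *= x / (k + 1)   # turn x^k / k! into x^(k+1) / (k+1)!
    return partial_sums

x = 0.5  # illustrative value only (an assumption, not taken from the spreadsheet)
for k, s in enumerate(exp_partial_sums(x, 12)):
    print(f"k = {k:2d}   partial sum = {s:.15f}   error = {abs(s - math.exp(x)):.2e}")
```

Each row plays the role of one spreadsheet row: the highest power included, the running total, and the gap between that total and the built-in value of $e^x$ .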

In short, this is how a calculator computes $e^x$ : adding up the first few terms of a Taylor series. Back in high school, when students hit the $e^x$ button on their calculators, they’ve trusted the result but the mechanics of how the calculator gets the result was shrouded in mystery. No longer.

Then I shift gears by trying a larger value of $x$ :

I ask my students the obvious question: What went wrong? They’re usually able to volunteer a few ideas:

The convergence is slower for larger values of $x$ .
The series will converge, but more terms are needed (and I’ll later use the fill down feature to get enough terms so that it does converge as accurately as double precision will allow).
The individual terms get bigger until $k=11$ and then start getting smaller. I’ll ask my students why this happens, and I’ll eventually get an explanation like

$\displaystyle \frac{(11.5)^6}{6!} < \frac{(11.5)^6}{6!} \times \frac{11.5}{7} = \frac{(11.5)^7}{7!}$

but

$\displaystyle \frac{(11.5)^{11}}{11!} > \frac{(11.5)^{11}}{11!} \times \frac{11.5}{12} = \frac{(11.5)^{12}}{12!}$
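The rise and fall of the terms can also be checked numerically. Each term is the previous one multiplied by $x/k$ , so the terms grow while $k < x$ and shrink once $k > x$ . The following is a small Python sketch; the helper name `exp_series_terms` is hypothetical.

```python
def exp_series_terms(x, n_terms):
    """Return the terms x^k / k! for k = 0, ..., n_terms - 1."""
    terms = []
    term = 1.0
    for k in range(n_terms):
        terms.append(term)
        term *= x / (k + 1)   # next term = current term times x / (k + 1)
    return terms

terms = exp_series_terms(11.5, 30)

# Index of the largest term in the list.
largest_k = max(range(len(terms)), key=lambda k: terms[k])
print("largest term occurs at k =", largest_k)

# Compare the neighbours used in the explanation above.
for k in (6, 7, 11, 12):
    print(f"k = {k:2d}   term = {terms[k]:.4f}")
```

For $x = 11.5$ the largest term sits at $k = 11$ , since $11.5/11 > 1$ but $11.5/12 < 1$ .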

At this point, I’ll mention that calculators use some tricks to speed up convergence. For example, the calculator can simply store a few values of $e^x$ in memory, like $e^{16}$ , $e^{8}$ , $e^{4}$ , $e^{2}$ , and $e^{1} = e$ . I then ask my class how these could be used to find $e^{11.5}$ . After some thought, they will volunteer that

$e^{11.5} = e^8 \cdot e^2 \cdot e \cdot e^{0.5}$ .

The first three values don’t need to be computed — they’ve already been stored in memory — while the last value can be computed via Taylor series. Also, since $0.5 < 1$ , the series for $e^{0.5}$ will converge pretty quickly. (Some students may volunteer that the above product is logically equivalent to turning $11$ into binary.)
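Here is a rough Python sketch of this trick. It is only an illustration of the idea described above, not a claim about how any particular calculator is built: the table `STORED_POWERS` uses `math.exp` as a stand-in for constants kept in memory, and the function names and the restriction $0 \le x < 32$ are assumptions.

```python
import math

# Stand-ins for the constants a calculator might keep in memory (assumption).
STORED_POWERS = {16: math.exp(16), 8: math.exp(8), 4: math.exp(4),
                 2: math.exp(2), 1: math.e}

def exp_small(x, n_terms=20):
    """Sum the first n_terms of the Maclaurin series of e^x (meant for 0 <= x < 1)."""
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        term *= x / (k + 1)
    return total

def exp_by_reduction(x):
    """Approximate e^x for 0 <= x < 32 using stored powers plus a short series."""
    if not 0 <= x < 32:
        raise ValueError("this sketch only handles 0 <= x < 32")
    whole = int(x)        # integer part, e.g. 11 for x = 11.5
    frac = x - whole      # fractional part, e.g. 0.5
    result = exp_small(frac)
    # Peel off the binary digits of the integer part: 11 = 8 + 2 + 1.
    for power in sorted(STORED_POWERS, reverse=True):
        if whole >= power:
            result *= STORED_POWERS[power]
            whole -= power
    return result

print(exp_by_reduction(11.5), math.exp(11.5))
```

The loop over the stored powers is exactly the binary expansion that some students notice: $11 = 8 + 2 + 1$ .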

At this point, after doing these explicit numerical examples, I’ll show graphs of $e^x$ and graphs of the Taylor polynomials of $e^x$ , observing that the polynomials get closer and closer to the graph of $e^x$ as more terms are added. (For example, see the graphs on the Wikipedia page for Taylor series, though I prefer to use Mathematica for in-class purposes.) In my opinion, the convergence of the graphs becomes meaningful to students only after doing some numerical examples, as done above.

At this point, I hope my students are familiar with the definition of Taylor (Maclaurin) series, can apply the definition to $e^x$ , and have some intuition that the nasty Taylor series expression practically means: add a bunch of terms together until you’re satisfied with the convergence.

In the next post, we’ll consider another Taylor series which ought to be (but usually isn’t) really familiar to students: an infinite geometric series.

P.S. Here’s the Excel spreadsheet that I used to make the above figures: Taylor.

Reminding students about Taylor series (Part 3)

Sadly, at least at my university, Taylor series is the topic that is least retained by students years after taking Calculus II. They can remember the rules for integration and differentiation, but their command of Taylor series seems to slip through the cracks. In my opinion, the reason for this lack of retention is completely understandable from a student’s perspective: Taylor series is usually the last topic covered in a semester, and so students learn them quickly for the final and quickly forget about them as soon as the final is over.

Of course, when I need to use Taylor series in an advanced course but my students have completely forgotten this prerequisite knowledge, I have to get them up to speed as soon as possible. Here’s the sequence that I use to accomplish this task. Covering this sequence usually takes me about 30 minutes of class time.

I should emphasize that I present this sequence in an inquiry-based format: I ask leading questions of my students so that the answers of my students are driving the lecture. In other words, I don’t ask my students to simply take dictation. It’s a little hard to describe a question-and-answer format in a blog, but I’ll attempt to do this below.

In the previous post, I described how I lead students to the equations

$f(x) = \displaystyle \sum_{k=0}^n \frac{f^{(k)}(0)}{k!} x^k$ .

and

$f(x) = \displaystyle \sum_{k=0}^n \frac{f^{(k)}(a)}{k!} (x-a)^k$ ,

where $f(x)$ is a polynomial and $a$ can be any number.

Step 3. What happens if the original function $f(x)$ is not a polynomial? For one thing, the right-hand side can no longer be a finite sum. As long as the sum on the right-hand side stops at some degree $n$ , the right-hand side is a polynomial, but the left-hand side is assumed to not be a polynomial.

To resolve this, we can cross our fingers and hope that

$f(x) = \displaystyle \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k$ ,

$f(x) = \displaystyle \sum_{k=0}^{\infty}\frac{f^{(k)}(a)}{k!} (x-a)^k$ .

In other words, let’s make the right-hand side an infinite series, and hope for the best. This is the definition of the Taylor series expansions of $f$ .

Note: At this point in the review, I can usually see the light go on in my students’ eyes. Usually, they can now recall their work with Taylor series in the past… and they wonder why they weren’t taught this topic inductively (like I’ve tried to do in the above exposition) instead of deductively (like the presentation in most textbooks).

While we’d like to think that the Taylor series expansions always work, there are at least two things that can go wrong.

First, the sum on the right is an infinite series, and there’s no guarantee that the series will converge in the first place. There are plenty of examples of series that diverge, like $\displaystyle \sum_{k=0}^\infty \frac{1}{k+1}$ .
Second, even if the series converges, there’s no guarantee that the series will converge to the “right” answer $f(x)$ . The canonical example of this behavior is $f(x) = e^{-1/x^2}$ , which is so “flat” near $x=0$ that every single derivative of $f$ is equal to $0$ at $x =0$ .

For the first complication, there are multiple tests devised in Calculus II, especially the Ratio Test, to determine the values of $x$ for which the series converges. This establishes a radius of convergence for the series.

The second complication is far more difficult to address rigorously. The good news is that, for all commonly occurring functions in the secondary mathematics curriculum, the Taylor series of a function properly converges (when it does converge). So we will happily ignore this complication for the remainder of the presentation.

Indeed, it’s remarkable that the series should converge to $f(x)$ at all. Think about the meaning of the terms on the right-hand side:

$f(a)$ is the $y-$ coordinate at $x=a$ .
$f'(a)$ is the slope of the curve at $x=a$ .
$f''(a)$ is a measure of the concavity of the curve at — you guessed it — $x=a$ .
$f'''(a)$ is an even more subtle description of the curve… once again, at $x=a$ .

In other words, if the Taylor series converges to $f(x)$ , then every twist and turn of the function, even at points far away from $x=a$ , is encoded somehow in the shape of the curve at the one point $x=a$ . So analytic functions (which have Taylor series that converge to the original functions) are indeed quite remarkable.

Reminding students about Taylor series (Part 2)

In this series of posts, I will describe the sequence of examples that I use to remind students about Taylor series. (One time, just for fun, I presented this topic at the end of a semester of Calculus I, and it seemed to go well even for that audience who had not seen Taylor series previously.)

I should emphasize that I present this sequence inductively and in an inquiry-based format: I ask leading questions of my students so that the answers of my students are driving the lecture. In other words, I don’t ask my students to simply take dictation. It’s a little hard to describe a question-and-answer format in a blog, but I’ll attempt to do this below.

Step 1. Find the unique quartic (fourth-degree) polynomial so that $f(0) = 6$ , $f'(0) = -3$ , $f''(0) = 6$ , $f'''(0) = 2$ , and $f^{(4)}(0) = 10$ .

I’ve placed a thought bubble if you’d like to think about it before scrolling down to see the answer. Here’s a hint to get started: let $f(x) = ax^4 + bx^3 + cx^2 + dx + e$ , and start differentiating. Remember that $a$ , $b$ , $c$ , $d$ , and $e$ are constants.

We begin with the information that $f(0) = 6$ . How else can we find $f(0)$? Since $f(x) = ax^4 + bx^3 + cx^2 + dx + e$ , we see that $f(0) = e$ . Therefore, it must be that $e = 6$ .

How about $f'(0)$ ? We see that $f'(x) = 4ax^3 + 3bx^2 + 2cx + d$ , and so $f'(0) = d$ . Since $f'(0) = -3$ , we have that $d = -3$ .

Next, $f''(x) = 12ax^2 + 6bx + 2c$ , and so $f''(0) = 2c$ . Since $f''(0) = 6$ , we have that $2c = 6$ , or $c = 3$ .

Next, $f'''(x) = 24ax + 6b$ , and so $f'''(0) = 6b$ . Since $f'''(0) = 2$ , we have that $6b = 2$ , or $b = \frac{1}{3}$ .

Finally, $f^{(4)}(x) = 24a$ , and so $f^{(4)}(0) = 24a$ . Since $f^{(4)}(0) = 10$ , we have $24a = 10$ , or $a = \frac{5}{12}$ .

What do we get when we put all of this information together? The polynomial must be

$f(x) = \frac{5}{12} x^4 + \frac{1}{3} x^3 + 3 x^2 - 3x + 6$ .

Step 2. How are these coefficients related to the information given in the problem?

Let’s start with the leading coefficient, $a = \frac{5}{12}$ . How did we get this answer? It came from dividing $10$ by $24$ . Where did the $10$ come from? It was the given value of $f^{(4)}(0)$ , and so

$a = \displaystyle \frac{f^{(4)}(0)}{24}$ .

Next, $b = \frac{1}{3}$ , which arose from dividing $2$ by $6$ . The number $2$ was the given value of $f'''(0)$ , and so

$b =\displaystyle \frac{f'''(0)}{6}$ .

Moving to the next coefficient, $c = 3$ , which arose from dividing $f''(0) = 6$ by $2$ . So

$c = \displaystyle\frac{f''(0)}{2}$ .

Finally, it’s clear that

$d = f'(0)$ and $e = f(0)$ .

This last line doesn’t quite fit the pattern of the first three lines. The first three lines all have fractions, but these last two expressions don’t. How can we fix this? In the hopes of finding a pattern, let’s (unnecessarily) write $d$ and $e$ as fractions by dividing by $1$ :

$d = \displaystyle\frac{f'(0)}{1}$ and $e = \displaystyle \frac{f(0)}{1}$ .

Let’s now rewrite the polynomial $f(x)$ in light of this discussion:

$f(x) = \displaystyle \frac{f^{(4)}(0)}{24} x^4 + \frac{f'''(0)}{6} x^3 + \frac{f''(0)}{2} x^2 + \frac{f'(0)}{1}x + \frac{f(0)}{1}$ .

What pattern do we see in the numerators? It’s apparent that the number of derivatives matches the power of $x$ . For example, the $x^3$ term has a coefficient involving the third derivative of $f$ . The last two terms fit this pattern as well, since $x = x^1$ and the last term is multiplied by $x^0 = 1$ .

What pattern do we see in the denominators? $1, 1, 2, 6, 24 \dots$ where have we seen those before? Oh yes, the factorials! We know that $4! = 4 \cdot 3 \cdot 2 \cdot 1 = 24$ , $3! = 3 \cdot 2 \cdot 1 = 6$ , $2! = 2 \cdot 1 = 2$ , $1! = 1$ , and $0!$ is defined to be $1$ . So $f(x)$ can be rewritten as

$f(x) = \displaystyle \frac{f^{(4)}(0)}{4!} x^4 + \frac{f'''(0)}{3!} x^3 + \frac{f''(0)}{2!} x^2 + \frac{f'(0)}{1!}x + \frac{f(0)}{0!}$ .

How can this be written more compactly? By using $\displaystyle \sum-$ notation:

$f(x) = \displaystyle \sum_{k=0}^4 \frac{f^{(k)}(0)}{k!} x^k$ .
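As a quick sanity check, the formula can be applied directly to the five values given in Step 1. The short Python sketch below uses exact fractions; the variable names are my own.

```python
from fractions import Fraction
from math import factorial

# The given values f(0), f'(0), f''(0), f'''(0), f^(4)(0) from Step 1.
derivatives_at_zero = [6, -3, 6, 2, 10]

# The coefficient of x^k is f^(k)(0) / k!.
coefficients = [Fraction(d, factorial(k)) for k, d in enumerate(derivatives_at_zero)]

for k, c in enumerate(coefficients):
    print(f"coefficient of x^{k}: {c}")
```

The coefficients come out as $6$ , $-3$ , $3$ , $\frac{1}{3}$ , and $\frac{5}{12}$ , which are the values of $e$ , $d$ , $c$ , $b$ , and $a$ found in Step 1.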

Why does the sum stop at 4? Because the original polynomial had degree 4. In general, if the polynomial had degree $n$ , it’s reasonable to guess that

$f(x) = \displaystyle \sum_{k=0}^n \frac{f^{(k)}(0)}{k!} x^k$ .

This is called the Maclaurin series, or the Taylor series about $x =0$ . While I won’t prove it here, one can find Taylor series expansions about points other than $0$ :

$f(x) = \displaystyle \sum_{k=0}^n \frac{f^{(k)}(a)}{k!} (x-a)^k$ ,

where $a$ can be any number. Though not proven here, these series are exactly true for polynomials.
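While this is no substitute for a proof, the claim is easy to spot-check by computer. The Python sketch below re-expands the quartic from Step 1 about $a = 2$ and compares it with the original polynomial using exact fractions. The helper names, the choice $a = 2$ , and the test point $x = 7/3$ are all assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

def evaluate(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coeffs[k] multiplies x^k."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def derivative(coeffs):
    """Return the coefficient list of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def taylor_about(coeffs, a, x):
    """Evaluate sum_k f^(k)(a) / k! * (x - a)^k for the polynomial f."""
    total = Fraction(0)
    current = list(coeffs)          # coefficients of f^(k), starting at k = 0
    for k in range(len(coeffs)):
        total += evaluate(current, a) / factorial(k) * (x - a) ** k
        current = derivative(current)
    return total

# The quartic from Step 1, listed from the constant term up to x^4.
quartic = [Fraction(6), Fraction(-3), Fraction(3), Fraction(1, 3), Fraction(5, 12)]

x = Fraction(7, 3)
print(taylor_about(quartic, 2, x) == evaluate(quartic, x))
```

Because the arithmetic is exact, agreement here is not a matter of rounding; the two expressions are the same number.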

In the next post, we’ll discuss what happens if $f(x)$ is not a polynomial.

Reminding students about Taylor series (Part 1)

At my university, Calculus II covers approximately the same topics covered in an AP Calculus BC course: integrals and derivatives with logarithms and exponential functions, various techniques of integration (including integration by parts and trigonometric substitutions), and convergence of infinite series.

In my opinion, the single most important of these topics is Taylor series (or, if you prefer, Maclaurin series), as these approximations to transcendental functions like $e^x$ and $\sin x$ are used over and over again in higher mathematics.

$\bullet$ A good working knowledge of Taylor series is necessary for computing series solutions of ordinary differential equations.

$\bullet$ In physics, elementary approximations like $\sin x \approx x$ are used over and over again. For example, the governing differential equation for the motion of oscillating pendulums is

$\displaystyle \frac{d^2 \theta}{dt^2} + \frac{g}{\ell} \sin \theta = 0$ ,

where $g$ is the acceleration due to gravity and $\ell$ is the length of the pendulum. This differential equation cannot be solved in terms of elementary functions, and its exact solution is quite complicated.

However, for small angles, we may use the approximation $\sin \theta \approx \theta$ , so that the differential equation becomes

$\displaystyle \frac{d^2 \theta}{dt^2} + \frac{g}{\ell} \theta = 0$ .

By eliminating the $\sin \theta$ term, we now have a second-order differential equation with constant coefficients, which can be solved in a straightforward manner using standard techniques from differential equations. If $\theta(0) = \theta_0$ and $\theta'(0) = 0$ (i.e., the pendulum is pulled a small angle $\theta_0$ and is then released), the solution is

$\theta(t) = \theta_0 \cos\left(t \sqrt{\displaystyle \frac{g}{\ell}} \right)$ .

In other words, the pendulum exhibits sinusoidal behavior. (FYI, for an amazing display of kinetic art, see this demonstration of pendulum waves.)
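To get a feel for how good the approximation $\sin \theta \approx \theta$ is, here is a small Python sketch that prints the relative error for a few angles. The function name and the particular angles are assumptions for illustration.

```python
import math

def small_angle_relative_error(degrees):
    """Relative error made by replacing sin(theta) with theta (theta in radians)."""
    theta = math.radians(degrees)
    return abs(theta - math.sin(theta)) / math.sin(theta)

for degrees in (1, 5, 10, 20, 45):
    print(f"{degrees:2d} degrees: relative error = {small_angle_relative_error(degrees):.5f}")
```

The error stays below one percent for an angle of $10$ degrees and grows steadily as the angle increases, which is why the approximation is reserved for small swings.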

$\bullet$ The primary way that students interface with Taylor series is through their calculators. When a calculator computes $\cos 1000^\circ$ , it doesn’t draw a unit circle, trace out an angle of $1000^\circ$ in standard position, and find the $x-$ coordinate of the terminal point. Instead, the calculator converts $1000^\circ$ into radians and adds the first few terms of the Taylor series expansion for $\cos x.$

The calculator may use a few tricks to accelerate convergence. For this example, using some trigonometric identities, $\cos 1000^\circ = \cos 280^\circ = \cos 80^\circ = \sin 10^\circ$ , and (as I’ll discuss) the Maclaurin series for $\sin x$ at $x = 10^\circ$ converges much faster than the Maclaurin series for $\cos x$ at $x = 1000^\circ$ .
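The difference in speed is easy to see numerically. The Python sketch below sums the first three nonzero terms of each Maclaurin series; the function names and the choice of three terms are assumptions for illustration, and the sketch makes no claim about what a real calculator does internally.

```python
import math

def sin_series(x, n_terms):
    """Sum the first n_terms nonzero terms of x - x^3/3! + x^5/5! - ..."""
    total, term = 0.0, x
    for k in range(n_terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def cos_series(x, n_terms):
    """Sum the first n_terms nonzero terms of 1 - x^2/2! + x^4/4! - ..."""
    total, term = 0.0, 1.0
    for k in range(n_terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total

exact = math.cos(math.radians(1000))
print("library value of cos(1000 degrees):", exact)
print("sin series at 10 degrees, 3 terms: ", sin_series(math.radians(10), 3))
print("cos series at 1000 degrees, 3 terms:", cos_series(math.radians(1000), 3))
```

Three terms of the sine series at $10^\circ$ already agree with the library value to many decimal places, while three terms of the cosine series at $1000^\circ$ are nowhere close.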

I’ve argued the importance of Taylor series in higher-level courses in both mathematics and physics. Sadly, at least at my university, Taylor series is probably the topic that is least retained by students years after taking Calculus II. They can remember the rules for integration and differentiation, but their command of Taylor series seems to slip through the cracks.

In my opinion, the reason for this lack of retention is completely understandable from a student’s perspective: Taylor series is usually the last topic covered in a semester, and so students learn them quickly for the final and quickly forget about them as soon as the final is over.

Of course, when I need to use Taylor series in an advanced course but my students have completely forgotten this prerequisite knowledge, I have to get them up to speed as soon as possible. Over the next few posts, I will present the sequence of examples that I use to accomplish this task. Covering this sequence usually takes me about 30-40 minutes of class time, depending on the class.

I should emphasize that, as much as possible, I present this sequence inductively and in an inquiry-based format: I ask leading questions of my students so that the answers of my students are driving the lecture. In other words, I don’t ask my students to simply take dictation. It’s a little hard to describe a question-and-answer format in a blog, but I’ll attempt to do this below.

Beginning with the next post, I’ll describe this sequence.

Beyond the Chalkboard: The Job of a Math Professor

About 15 years ago, when I was starting my career as an assistant professor, I was asked to write an article for Imagine magazine, which is targeted toward gifted students in grades 7-12, about what it’s like to be a math professor. While I would probably write something slightly different today (since my job responsibilities have shifted toward administration, academic advising, and the preparation of future secondary mathematics teachers), I think much of what I wrote still applies today.

Source: J. Quintanilla, “Beyond the Chalkboard: The Job of a Math Professor,” Imagine, Vol. 5, No. 4, p. 10 (March/April 1998).

One part of my job is deceptively simple to explain: I teach math to college students. Some people think I’ve got the easiest job in the world. I teach only two classes a semester for just six hours a week, and I have a flexible schedule with summers off. How cozy!

This view of my job is, of course, misleading. The hours I spend actually lecturing are only the tip of the iceberg. Delivering lectures that make sense and maintain students’ interest for a full hour takes considerable practice and effort. Meanwhile, I am constantly fine-tuning the curriculum, writing exams, and of course grading homework assignments — tasks which keep me working late on many nights. My most time-consuming project in recent months, however, has been to write eighty–and counting–letters of recommendation for former students.

My work with students outside the classroom includes one-on-one tutoring, guiding student research projects, advising students about possible majors and careers, and sometimes just lending an ear when someone’s had a rough day. As a professor I am a public figure on campus, and my current and former students come to me for counsel on a wide range of issues, many of which are only tangentially related to mathematics. I hope that through my words and counsel I am contributing to my students’ development as people as well as scholars.

In addition to lecturing, writing recommendations and counseling, I also have to produce original research. At my university, the quality of my teaching and my research will be weighted equally when I am evaluated for tenure in five years. The relative importance of teaching and research, however, varies from college to college. In general, small liberal arts colleges tend to emphasize teaching, while major universities want their professors to be primarily researchers.

When I started graduate school, I was introduced to my current field of research: applying ideas from probability theory to study theoretical problems in materials science. I have found that my research evolves over four stages: months of frustration, several days of sheer ecstasy when I’m overflowing with ideas, weeks of double-checking that my ideas actually make sense, and finally months of writing up my results for publication in scientific journals. I purposely work on three or four research projects simultaneously, hoping that the cycle of each is slightly out of phase with the others. Though I work on my research all year, it gets my undivided attention during the summer when I’m not teaching.

Of the many aspects of this job, teaching is for me the most satisfying. I know that most of my students will not become professional mathematicians, so I incorporate “fun lectures” into the curriculum. These lectures illustrate how the mathematics we’re studying can be applied to fields of science. In my “Hunt for Red October” lecture, for example, I talk about applying trigonometry to linguistics, opera, and submarine detection; in my “Voyager 2” lecture, I describe how conic sections are used in planetary exploration. For my favorite fun lecture, I dress up in knickers, carry my golf clubs into class, and use calculus to analyze the trajectory of golf balls. These lectures have become quite popular with my students, and I love to watch their eyes light up when they’re excited about learning new things, such as how mathematics can be applied to real life.

Does this career sound appealing to you? If so, heed these words of warning: To become a successful professor, you have to really, really want this career. I am not a math professor for its financial rewards; friends of mine in industry earn salaries that are triple what I make. I don’t mind, and I’m not envious of them–I get to do what I love for a living, and I’m not starving. But this job isn’t for everyone. There are innumerable distractions and frustrations along the way that will derail aspiring professors who are not entirely focused on the goal.

For example, I always thought that I would be assured a job after graduation. In 1987, the National Science Foundation projected a shortfall of 675,000 scientists and engineers over the coming two decades. I was a high school senior in 1987, so I assumed I would be able to write my own ticket after earning a doctorate.

Time would show that this NSF projection, now derisively labeled “The Myth,” was amazingly inaccurate. There is currently an overproduction of Ph.D.s in mathematics, and the job market for aspiring math professors is tight. The unemployment rate for freshly-minted Ph.D.s in mathematics has hovered around 10% throughout the 1990s. A new Ph.D. can expect a nomadic life, bouncing all over the country from one postdoctoral appointment to another, before finally landing a tenure-track position.

Faced with such daunting employment prospects, I braced myself for an unstable life in pursuit of my dream of becoming a professor. Even so, I must admit that getting avalanched by more than a hundred rejection letters was extremely disheartening. In the end, though, I was blessed with a tenure-track position straight out of grad school.

Like many jobs, the job of a math professor is frustrating at times and can feel overwhelming. But when it does, I think about the excited, curious students at a recent fun lecture and remind myself: I love this job!

Engaging students: Solving one-step and two-step inequalities

In my capstone class for future secondary math teachers, I ask my students to come up with ideas for engaging their students with different topics in the secondary mathematics curriculum. In other words, the point of the assignment was not to devise a full-blown lesson plan on this topic. Instead, I asked my students to think about three different ways of getting their students interested in the topic in the first place.

I plan to share some of the best of these ideas on this blog (after asking my students’ permission, of course).

This first student submission comes from my former student Jesse Faltys (who, by the way, was the instigator for me starting this blog in the first place). Her topic: how to engage students when teaching one-step and two-step inequalities.

A. Applications – How could you as a teacher create an activity or project that involves your topic?

Index Card Game: Make two sets of cards. The first should consist of different inequalities. The second should consist of the matching graph. Put your students in pairs and distribute both sets of cards. The students will then practice solving their inequalities and determine which graph illustrates which inequality.
Inequality Friends: Distribute index cards with simple inequalities to a handful of your students (four or five different inequalities), and pass out cards that contain only numbers to the rest of the students. Have your students rotate around the room and determine whether their numbers and inequalities are compatible. If they know that their number belongs with that inequality, then the students should become “members” and form a group. Once all the students have formed their groups, they should present to the class how they solved their inequality and why all their numbers are “members” of that group.

Both applications allow for a quick assessment by the teacher. Having the students initially work in pairs to explore the inequality and its matching graph allows them to discover on their own, while ending the class with a group activity allows the teacher to assess each student individually.

B. Curriculum: How does this topic extend what your students should have learned in previous courses?

In a previous course, students learned to solve one- and two-step linear equations. The process for solving a one-step equation is similar to the process for solving a one-step inequality. Properties of inequalities are used to isolate the variable on one side of the inequality. These properties are listed below. The students should know most of them from the previous course, and therefore should not be overwhelmed with new rules.

Properties of Inequality

1. When you add or subtract the same number from each side of an inequality, the inequality remains true. (Same as previous knowledge with solving one-step equations)

2. When you multiply or divide each side of an inequality by a positive number, the inequality remains true. (Same as previous knowledge with solving one-step equations)

3. When you multiply or divide each side of an inequality by a negative number, the direction of the inequality symbol must be reversed for the inequality to remain true. (THIS IS DIFFERENT)

There is one obvious difference when working with inequalities: when multiplying or dividing by a negative number, the direction of the inequality symbol changes. Pointing out to students that they are using what they already know, with just one adjustment to the rules, could help ease their minds about new subject matter.
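For a class with access to computers, the third property can be spot-checked with a few lines of Python. The inequality $-2x + 3 < 7$ below is a made-up example (not from the original submission), and the function names are hypothetical.

```python
def satisfies_original(x):
    """Does x satisfy the two-step inequality -2x + 3 < 7?"""
    return -2 * x + 3 < 7

def satisfies_solved(x):
    # Subtract 3 from each side: -2x < 4.
    # Divide each side by -2 and reverse the symbol: x > -2.
    return x > -2

# Sample values from -10 to 10 in steps of 0.25.
sample_values = [n / 4 for n in range(-40, 41)]

# The solved form agrees with the original inequality at every sample value.
print(all(satisfies_original(x) == satisfies_solved(x) for x in sample_values))

# Forgetting to reverse the symbol (x < -2) does not agree.
print(all(satisfies_original(x) == (x < -2) for x in sample_values))
```

The first check succeeds and the second fails, which is the "one adjustment to the rules" in action.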

C. Culture – How has this topic appeared in pop culture?

Amusement Parks – If you have ever been to an amusement park, you are familiar with the height requirements on many of the rides. The chart provided below shows rides at Disney that require riders to be 35 inches or taller. Which rides will you ride?

(Height of Student $\ge$ Height restriction)

Blizzard Beach	Summit Plummet	48″
Magic Kingdom	Barnstormer at Goofy’s Wiseacres Farm	35″
Animal Kingdom	Primeval Whirl	48″
Blizzard Beach	Downhill Double Dipper	48″
DisneyQuest	Mighty Ducks Pinball Slam	48″
Typhoon Lagoon	Bay Slide	52″
Animal Kingdom	Kali River Rapids	38″
DisneyQuest	Buzz Lightyear’s AstroBlaster	51″
DisneyQuest	Cyberspace Mountain	51″
Epcot	Test Track	40″
Epcot	Soarin’	40″
Hollywood Studios	Star Tours: The Adventures Continue	40″
Magic Kingdom	Space Mountain	44″
Magic Kingdom	Stitch’s Great Escape	40″
Typhoon Lagoon	Humunga Kowabunga	48″
Animal Kingdom	Expedition Everest	44″
Blizzard Beach	Cross Country Creek	48″
Epcot	Mission Space	44″
Hollywood Studios	The Twilight Zone Tower of Terror	40″
Hollywood Studios	Rock ‘n’ Roller Coaster Starring Aerosmith	48″
Magic Kingdom	Splash Mountain	40″
Magic Kingdom	Big Thunder Mountain Railroad	40″
Animal Kingdom	Dinosaur	40″
Epcot	Wonders of Life / Body Wars	40″

Sports – Zdeno Chara is the tallest person who has ever played in the NHL. He is 206 cm tall and is allowed to use a stick that is longer than the NHL’s maximum allowable length. The official rulebook of the NHL states limits for the equipment players can use. One of these rules states that no hockey stick can exceed 160 cm. (Hockey stick $\le$ 160 cm) The world’s largest hockey stick and puck are in Duncan, British Columbia. The stick is over 62 m in length and weighs almost 28,000 kg. Is your equipment legal?

Weather – Every time the news is on, our culture references inequalities through the range in temperature throughout the day. For example, the most extreme change in temperature in Canada took place in January 1962 in Pincher Creek, Alberta. A warm, dry wind, known as a chinook, raised the temperature from -19 °C to 22 °C in one hour. Represent the temperature during this hour using a double inequality. (-19 < the temperature < 22) What inequality describes today’s weather, and how does it compare with the weather in 1962?

Video pranks in class

Dr. Matthew Weathers is an assistant professor of Mathematics and Computer Science at Biola University and also a world-class showman. Here are some of his biggest hits (often for Halloween or April Fools’ Day). Enjoy. (If you’re interested, you can find more at his YouTube page, http://www.youtube.com/user/MDWeathers/videos?sort=p&view=0&flow=grid.)

Geometric magic trick

This is a magic trick that my math teacher taught me when I was about 13 or 14. I’ve found that it’s a big hit when performed for grade-school children.

Magician: Tell me a number between 3 and 10.

Child: (gives a number, call it $x$ )

Magician: On a piece of paper, draw a shape with $x$ corners.

Child: (draws a figure; an example for $x=6$ is shown)

Important Note: For this trick to work, the original shape has to be convex… something shaped like an L or M won’t work. Also, I chose a maximum of 10 mostly for ease of drawing and counting (and, for later, calculating).

Magician: Tell me another number between 3 and 10.

Child: (gives a number, call it $y$ )

Magician: Now draw that many dots inside of your shape.

Child: (starts drawing $y$ dots inside the figure; an example for $y = 7$ is shown) While the child does this, the Magician calculates $2y + x - 2$ , writes the answer on a piece of paper, and turns the answer face down.

Magician: Now connect the dots with lines until you get all triangles. Just be sure that no two lines cross each other.

Child: (connects the dots until the shape is divided into triangles; an example is shown)

Magician: Now count the number of triangles.

Child: (counts the triangles)

Magician: Was your answer… (and turns the answer over)?

The reason this magic trick works so well is that it’s so counter-intuitive. No matter what convex $x$-gon is drawn, no matter where the $y$ points are located, and no matter how lines are drawn to create triangles, there will always be $2y + x - 2$ triangles. For the example above, $2y+x-2 = 2\times 7 + 6 - 2 = 18$ , and there are indeed $18$ triangles in the figure.

Why does this magic trick work? I offer a thought bubble if you’d like to think about it before scrolling down to see the answer.

This trick works by counting the measures of all the angles in two different ways.

Method #1: If there are $T$ triangles created, then the sum of the measures of the angles in each triangle is $180$ degrees. So the sum of the measures of all of the angles must be $180 T$ degrees.

Method #2: The sum of the measures of the angles around each interior point is $360$ degrees. Since there are $y$ interior points, the sum of these angles is $360y$ degrees.

The measures of the remaining angles add up to the sum of the measures of the interior angles of a convex polygon with $x$ sides. So the sum of these measures is $180(x-2)$ degrees.

In other words, it must be the case that

$180T = 360y + 180(x-2)$ , or $T = 2y + x - 2$ .
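For readers who would like to see the count come out right without drawing anything, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function names are made up for this post, the polygon is a regular $x$-gon, and the triangulation is built in one particular way (a fan of $x-2$ triangles from one corner, after which each new dot is placed inside an existing triangle and joined to its three corners). The angle-counting argument above covers every possible triangulation; the code only exercises this one family.

```python
import math
import random


def predicted_triangles(x, y):
    """Triangle count predicted by the trick: T = 2y + x - 2."""
    return 2 * y + x - 2


def count_triangles(x, y, rng):
    """Triangulate a regular x-gon with y interior dots and count the triangles."""
    # Corners of a regular x-gon (any convex polygon would do).
    verts = [(math.cos(2 * math.pi * k / x), math.sin(2 * math.pi * k / x))
             for k in range(x)]
    # Fan triangulation from corner 0: x - 2 triangles and no dots yet.
    triangles = [(verts[0], verts[k], verts[k + 1]) for k in range(1, x - 1)]
    for _ in range(y):
        # Remove a randomly chosen triangle and place a new dot inside it.
        a, b, c = triangles.pop(rng.randrange(len(triangles)))
        u, v = rng.random(), rng.random()
        if u + v > 1:  # reflect so that the dot stays inside the triangle
            u, v = 1 - u, 1 - v
        p = (a[0] + u * (b[0] - a[0]) + v * (c[0] - a[0]),
             a[1] + u * (b[1] - a[1]) + v * (c[1] - a[1]))
        # Joining the dot to the three corners replaces 1 triangle with 3.
        triangles += [(a, b, p), (b, c, p), (c, a, p)]
    return len(triangles)


rng = random.Random(2013)
print(count_triangles(6, 7, rng), predicted_triangles(6, 7))  # prints "18 18"
```

In this construction each dot adds exactly two triangles to the original $x - 2$, which is another way of seeing where $2y + x - 2$ comes from.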

Welch’s formula

When conducting a hypothesis test or computing a confidence interval for the difference $\overline{X}_1 - \overline{X}_2$ of two means, where at least one mean arises from a small sample, the Student t distribution must be employed. In particular, the number of degrees of freedom for the Student t distribution must be computed. Many textbooks suggest using Welch’s formula:

$df = \frac{\displaystyle (SE_1^2 + SE_2^2)^2}{\displaystyle \frac{SE_1^4}{n_1-1} + \frac{SE_2^4}{n_2-1}},$

rounded down to the nearest integer. In this formula, $SE_1 = \displaystyle \frac{\sigma_1}{\sqrt{n_1}}$ is the standard error associated with the first average $\overline{X}_1$ , where $\sigma_1$ (if known) is the standard deviation of the first population and $n_1$ is the number of samples that are averaged to find $\overline{X}_1$ . In practice, $\sigma_1$ is not known, and so the sample standard deviation $s_1$ is used as the estimate $\sigma_1 \approx s_1$ .

The terms $SE_2$ and $n_2$ are similarly defined for the average $\overline{X}_2$ .
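Since the formula is easy to mistype on a calculator, here is a short Python sketch of the computation. The function name `welch_df` and the sample values are my own choices for illustration and do not come from any textbook or data set; the sample standard deviations $s_1$ and $s_2$ stand in for $\sigma_1$ and $\sigma_2$, as described above.

```python
import math


def welch_df(s1, n1, s2, n2):
    """Welch's degrees of freedom (before rounding down).

    s1, s2 are sample standard deviations, used as estimates of sigma_1, sigma_2.
    n1, n2 are the sample sizes.
    """
    se1_sq = s1 ** 2 / n1  # SE_1^2
    se2_sq = s2 ** 2 / n2  # SE_2^2
    numerator = (se1_sq + se2_sq) ** 2
    denominator = se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    return numerator / denominator


# Hypothetical data: s1 = 2.0 with n1 = 12, and s2 = 5.0 with n2 = 8.
df = welch_df(2.0, 12, 5.0, 8)
print(df)              # about 8.5
print(math.floor(df))  # rounded down: 8
```

I return the unrounded value and round down as a separate step so that the intermediate result can be inspected.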

In Welch’s formula, the term $SE_1^2 + SE_2^2$ in the numerator is equal to $\displaystyle \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$ . This is the square of the standard error $SE_D$ associated with the difference $\overline{X}_1 - \overline{X}_2$ , since

$SE_D = \displaystyle \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$ .

This leads to the “Pythagorean” relationship

$SE_1^2 + SE_2^2 = SE_D^2$ ,

which (in my experience) is a reasonable aid to help students remember the formula for $SE_D$ .

Naturally, a big problem that students encounter when using Welch’s formula is that the formula is really, really complicated, and it’s easy to make a mistake when entering information into their calculators. (Indeed, it might be that the pre-programmed calculator function simply gives the wrong answer.) Also, since the formula is complicated, students don’t have a lot of psychological reassurance that, when they come out the other end, their answer is actually correct. So, when teaching this topic, I tell my students the following rule of thumb so that they can at least check if their final answer is plausible:

$\min(n_1,n_2)-1 \le df \le n_1 + n_2 -2$ .

To my surprise, I have never seen this formula in a statistics textbook, even though it’s quite simple to state and not too difficult to prove using techniques from first-semester calculus.

Let’s rewrite Welch’s formula as

$df = \left( \displaystyle \frac{1}{n_1-1} \left[ \frac{SE_1^2}{SE_1^2 + SE_2^2}\right]^2 + \frac{1}{n_2-1} \left[ \frac{SE_2^2}{SE_1^2 + SE_2^2} \right]^2 \right)^{-1}$

For the sake of simplicity, let $m_1 = n_1 - 1$ and $m_2 = n_2 -1$ , so that

$df = \left( \displaystyle \frac{1}{m_1} \left[ \frac{SE_1^2}{SE_1^2 + SE_2^2}\right]^2 + \frac{1}{m_2} \left[ \frac{SE_2^2}{SE_1^2 + SE_2^2} \right]^2 \right)^{-1}$

Now let $x = \displaystyle \frac{SE_1^2}{SE_1^2 + SE_2^2}$ . All of these terms are nonnegative (and, in practice, they’re all positive), so that $x \ge 0$ . Also, the numerator is no larger than the denominator, so that $x \le 1$ . Finally, we notice that

$1-x = 1 - \displaystyle \frac{SE_1^2}{SE_1^2 + SE_2^2} = \frac{SE_2^2}{SE_1^2 + SE_2^2}$ .

Using these observations, Welch’s formula reduces to the function

$f(x) = \left( \displaystyle \frac{x^2}{m_1} + \frac{(1-x)^2}{m_2} \right)^{-1}$ ,

and the central problem is to find the maximum and minimum values of $f(x)$ on the interval $0 \le x \le 1$ . Since $f(x)$ is differentiable on $[0,1]$ , the absolute extrema can be found by checking the endpoints and the critical point(s).

First, the endpoints. If $x=0$ , then $f(0) = \left( \displaystyle \frac{1}{m_2} \right)^{-1} = m_2$ . On the other hand, if $x=1$ , then $f(1) = \left( \displaystyle \frac{1}{m_1} \right)^{-1} = m_1$ .

Next, the critical point(s). These are found by solving the equation $f'(x) = 0$ :

$f'(x) = -\left( \displaystyle \frac{x^2}{m_1} + \frac{(1-x)^2}{m_2} \right)^{-2} \left[ \displaystyle \frac{2x}{m_1} - \frac{2(1-x)}{m_2} \right] = 0$

$\displaystyle \frac{2x}{m_1} - \frac{2(1-x)}{m_2} = 0$

$\displaystyle \frac{2x}{m_1} = \frac{2(1-x)}{m_2}$

$xm_2= (1-x)m_1$

$xm_2 = m_1 - xm_1$

$x(m_1 + m_2) = m_1$

$x = \displaystyle \frac{m_1}{m_1 + m_2}$

Plugging back into the original equation, we find the local extremum

$f \left( \displaystyle \frac{m_1}{m_1+m_2} \right) = \left( \displaystyle \frac{1}{m_1} \frac{m_1^2}{(m_1+m_2)^2} + \frac{1}{m_2} \left[1-\frac{m_1}{m_1+m_2}\right]^2 \right)^{-1}$

$f \left( \displaystyle \frac{m_1}{m_1+m_2} \right) = \left( \displaystyle \frac{1}{m_1} \frac{m_1^2}{(m_1+m_2)^2} + \frac{1}{m_2} \left[\frac{m_2}{m_1+m_2}\right]^2 \right)^{-1}$

$f \left( \displaystyle \frac{m_1}{m_1+m_2} \right) = \left( \displaystyle \frac{m_1}{(m_1+m_2)^2} + \frac{m_2}{(m_1+m_2)^2} \right)^{-1}$

$f \left( \displaystyle \frac{m_1}{m_1+m_2} \right) = \left( \displaystyle \frac{m_1+m_2}{(m_1+m_2)^2} \right)^{-1}$

$f \left( \displaystyle \frac{m_1}{m_1+m_2} \right) = \left( \displaystyle \frac{1}{m_1+m_2} \right)^{-1}$

$f \left( \displaystyle \frac{m_1}{m_1+m_2} \right) = m_1+m_2$

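Based on the three candidate values that we’ve found (two endpoints and one critical point), and since $m_1 + m_2$ is larger than both $m_1$ and $m_2$ , it’s clear that the absolute minimum of $f(x)$ on $[0,1]$ is the smaller of $m_1$ and $m_2$ , while the absolute maximum is equal to $m_1 + m_2$ .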

$\hbox{QED}$

In conclusion, I suggest offering the following guidelines to students to encourage their intuition about the plausibility of their answers:

If $SE_1$ is much smaller than $SE_2$ (i.e., $x \approx 0$ ), then $df$ will be close to $m_2 = n_2 - 1$ .
If $SE_1$ is much larger than $SE_2$ (i.e., $x \approx 1$ ), then $df$ will be close to $m_1 = n_1 - 1$ .
Otherwise, $df$ could be as large as $m_1 + m_2 = n_1 + n_2 - 2$ , but no larger.
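These guidelines can also be checked numerically. The following Python sketch simply evaluates $f(x)$ on a grid of values in $[0,1]$ and reports the smallest and largest values found. The names `f` and `grid_extremes`, the grid size, and the choice $m_1 = 11$ , $m_2 = 7$ are assumptions made only for this illustration; a grid search suggests the bounds but does not replace the calculus argument above.

```python
def f(x, m1, m2):
    """Welch's formula rewritten in terms of x = SE_1^2 / (SE_1^2 + SE_2^2)."""
    return 1.0 / (x ** 2 / m1 + (1 - x) ** 2 / m2)


def grid_extremes(m1, m2, steps=1000):
    """Smallest and largest values of f on an evenly spaced grid over [0, 1]."""
    values = [f(k / steps, m1, m2) for k in range(steps + 1)]
    return min(values), max(values)


m1, m2 = 11, 7  # hypothetical: n1 = 12 and n2 = 8
low, high = grid_extremes(m1, m2)
print(low, high)                   # close to min(m1, m2) and to m1 + m2
print(f(m1 / (m1 + m2), m1, m2))   # value at the critical point, close to m1 + m2
```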

Statistical significance

When teaching my Applied Statistics class, I’ll often use the following xkcd comic to reinforce the meaning of statistical significance.

The idea that’s being communicated is that, when performing a hypothesis test, the observed significance level $P$ is the probability, computed under the assumption that the null hypothesis is true, of obtaining a result at least as extreme as the one observed due to dumb luck as opposed to a real effect (the alternative hypothesis). So if the significance level is $0.05$ and the experiment is repeated about 20 times when there is no real effect, it wouldn’t be surprising for one of those experiments to falsely reject the null hypothesis.

In practice, statisticians use the Bonferroni correction when performing multiple simultaneous tests to avoid the erroneous conclusion displayed in the comic.
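To put numbers on this, here is a small Python sketch. It assumes that the 20 tests are independent and that every null hypothesis is true, which is a simplification I am making for illustration. The function names are my own.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one false rejection among independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(alpha, num_tests):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / num_tests


# Twenty tests, each at the 0.05 level, as in the comic.
print(familywise_error_rate(0.05, 20))  # about 0.64

# With the Bonferroni correction, each test uses 0.05 / 20 = 0.0025 instead.
corrected = bonferroni_threshold(0.05, 20)
print(familywise_error_rate(corrected, 20))  # about 0.049, just under 0.05
```

Under these assumptions, at least one false positive turns up roughly 64% of the time without the correction, while the corrected tests keep the overall chance below $0.05$.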

Source: http://www.xkcd.com/882/