Why 0^0 is undefined

[TI calculator screenshot: evaluating 0^0]

Here’s an explanation for why 0^0 is undefined that should be within the grasp of pre-algebra students:

Part 1.

  • What is 0^3? Of course, it’s 0.
  • What is 0^2? Again, 0.
  • What is 0^1? Again, 0.
  • What is 0^{1/2}, or \sqrt{0}? Again, 0.
  • What is 0^{1/3}, or \sqrt[3]{0}? In other words, what number, when cubed, is 0? Again, 0.
  • What is 0^{1/10}, or \sqrt[10]{0}? In other words, what number, when raised to the 10th power, is 0? Again, 0.

As the exponent gets closer to 0, the answer remains 0. So, from this perspective, it looks like 0^0 ought to be equal to 0.

Part 2.

  • What is 3^0? Of course, it’s 1.
  • What is 2^0? Again, 1.
  • What is 1^0? Again, 1.
  • What is \left( \displaystyle \frac{1}{2} \right)^0? Again, 1.
  • What is \left( \displaystyle \frac{1}{3} \right)^0? Again, 1.
  • What is \left( \displaystyle \frac{1}{10} \right)^0? Again, 1.

As the base gets closer to 0, the answer remains 1. So, from this perspective, it looks like 0^0 ought to be equal to 1.

In conclusion: looking at it one way, 0^0 should be defined to be 0. From another perspective, 0^0 should be defined to be 1.

Of course, we can’t define a number to be two different things! So we’ll just say that 0^0 is undefined — just like dividing by 0 is undefined — rather than pretend that 0^0 switches between two different values.

____________________

Here’s a more technical explanation about why 0^0 is an indeterminate form, using calculus.

Part 1. As before,

\displaystyle \lim_{x \to 0^+} 0^x = \lim_{x \to 0^+} 0 = 0.

The first equality is true because, inside of the limit, x is permitted to get close to 0 but cannot actually equal 0, and there’s no ambiguity about 0^x = 0 if x >0. (Naturally, 0^x is undefined if x < 0.)

The second equality is true because the limit of a constant is the constant.

Part 2. As before,

\displaystyle \lim_{x \to 0} x^0 = \lim_{x \to 0} 1 = 1.

Once again, the first equality is true because, inside of the limit, x is permitted to get close to 0 but cannot actually equal 0, and there’s no ambiguity about x^0 = 1 if x \ne 0.

As before, the answers from Parts 1 and 2 are different. But wait, there’s more…

Part 3. Here’s another way that 0^0 can be considered, just to give us a headache. Let’s evaluate

\displaystyle \lim_{x \to 0^+} x^{1/\ln x}

Clearly, the base tends to 0 as x \to 0^+. Also, \ln x \to -\infty as x \to 0^+, so that \displaystyle \frac{1}{\ln x} \to 0 as x \to 0^+. In other words, this limit has the indeterminate form 0^0.

To evaluate this limit, let’s take a logarithm under the limit:

\displaystyle \lim_{x \to 0^+} \ln x^{1/\ln x} = \displaystyle \lim_{x \to 0^+} \frac{1}{\ln x} \cdot \ln x

\displaystyle \lim_{x \to 0^+} \ln x^{1/\ln x} = \displaystyle \lim_{x \to 0^+} 1

\displaystyle \lim_{x \to 0^+} \ln x^{1/\ln x} = 1

Therefore, without the extra logarithm,

\displaystyle \lim_{x \to 0^+} x^{1/\ln x} = e^1 = e
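In fact, x^{1/\ln x} isn’t merely close to e in the limit: for every x > 0 with x \ne 1, we have x^{1/\ln x} = e^{\ln x \cdot \frac{1}{\ln x}} = e exactly. Here’s a quick numerical check (a sketch in Python; the sample values are arbitrary):

import math

for x in [0.5, 0.1, 0.001, 1e-9]:
    # the second number printed is 2.718281828... every time
    print(x, x ** (1 / math.log(x)))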

Part 4. It gets even better. Let k be any positive real number. By the same logic as above,

\displaystyle \lim_{x \to 0^+} x^{\ln k/\ln x} = e^{\ln k} = k

So, for any k \ge 0, we can find a function f(x) with the indeterminate form 0^0 so that \displaystyle \lim_{x \to 0^+} f(x) = k.

In other words, we could justify defining 0^0 to be any nonnegative number. Clearly, it’s better instead to simply say that 0^0 is undefined.

P.S. I don’t know if it’s possible to have an indeterminate form of 0^0 where the answer is either negative or infinite. I tend to doubt it, but I’m not sure.

A surprising appearance of e

Here’s a simple probability problem that should be accessible to high school students who have learned the Multiplication Rule:

Suppose that you play the lottery every week for about 20 years. Each time you play, the chance that you win is 1 chance in 1000. What is the probability that, after playing 1000 times, you never win?

This is a straightforward application of the Multiplication Rule from probability. The chance of not winning on any one play is 0.999. Therefore, the chance of not winning 1000 consecutive times is (0.999)^{1000}, which we can approximate with a calculator.

[TI calculator screenshot: computing (0.999)^{1000}]

Well, that was easy enough. Now, just for the fun of it, let’s find the reciprocal of this answer.

[TI calculator screenshot: the reciprocal of the previous answer]

Hmmm. Two point seven one. Where have I seen that before? Hmmm… Nah, it couldn’t be that.

What if we changed the number 1000 in the above problem to 1,000,000, so that each play wins with 1 chance in 1,000,000? Then the probability of never winning would be (0.999999)^{1000000}.

[TI calculator screenshot: computing (0.999999)^{1000000} and its reciprocal]

There’s no denying it now… it looks like the reciprocal is approximately e, so that the probability of never winning for both problems is approximately 1/e.
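If you’d like to reproduce the calculator screens, here’s a quick sketch in Python:

import math

p1 = 0.999 ** 1000          # chance of never winning in 1000 plays
p2 = 0.999999 ** 1000000    # chance of never winning in 1,000,000 plays
print(1 / p1)               # roughly 2.7196
print(1 / p2)               # roughly 2.71828
print(math.e)               # 2.718281828459045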

Why is this happening? I offer a thought bubble if you’d like to think about this before proceeding to the answer.

[thought bubble]

The above calculations are numerical examples that demonstrate the limit

\displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x

In particular, in the special case x = -1, we find

\displaystyle \lim_{n \to \infty} \left(1 - \frac{1}{n}\right)^n = e^{-1} = \displaystyle \frac{1}{e}

The first limit can be proved using L’Hopital’s Rule. By continuity of the function f(x) = \ln x, we have

\ln \left[ \displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n \right] = \displaystyle \lim_{n \to \infty} \ln \left[ \left(1 + \frac{x}{n}\right)^n \right]

\ln \left[ \displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n \right] = \displaystyle \lim_{n \to \infty} n \ln \left(1 + \frac{x}{n}\right)

\ln \left[ \displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n \right] = \displaystyle \lim_{n \to \infty} \frac{ \displaystyle \ln \left(1 + \frac{x}{n}\right)}{\displaystyle \frac{1}{n}}

The right-hand side has the indeterminate form 0/0 as n \to \infty (both the numerator and the denominator tend to 0), and so we may use L’Hopital’s rule, differentiating both the numerator and the denominator with respect to n.

\ln \left[ \displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n \right] = \displaystyle \lim_{n \to \infty} \frac{ \displaystyle \frac{1}{1 + \frac{x}{n}} \cdot \frac{-x}{n^2} }{\displaystyle \frac{-1}{n^2}}

\ln \left[ \displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n \right] = \displaystyle \lim_{n \to \infty} \displaystyle \frac{x}{1 + \frac{x}{n}}

\ln \left[ \displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n \right] = \displaystyle \frac{x}{1 + 0}

\ln \left[ \displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n \right] = x

Applying the exponential function to both sides, we conclude that

\displaystyle \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n= e^x

____________________

In an undergraduate probability class, the problem can be viewed as a special case of a Poisson distribution approximating a binomial distribution when there’s a large number of trials and a small probability of success.

The above calculation also justifies (in Algebra II and Precalculus) how the formula for continuous compound interest A = Pe^{rt} can be derived from the formula for discrete compound interest A = P \displaystyle \left( 1 + \frac{r}{n} \right)^{nt}.
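For example, here’s a quick numerical comparison (a sketch in Python; the principal, rate, and time below are arbitrary choices):

import math

P, r, t = 1000.0, 0.05, 10.0       # $1000 at 5% for 10 years
for n in (1, 12, 365, 10000):      # compounding periods per year
    print(n, P * (1 + r / n) ** (n * t))
print(P * math.exp(r * t))         # continuous limit: about 1648.72

As n grows, the discrete amounts climb toward the continuous value.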

All this to say, Euler knew what he was doing when he decided that e was so important that it deserved to be named.

Square roots with a calculator (Part 11)

This is the last in a series of posts about square roots and other roots, a series that hopefully provided a deeper look at an apparently simple concept. In this post, we discuss how calculators are programmed to compute square roots quickly.

Today’s movie clip, therefore, is set in modern times:

[Embedded video clip]

So how do calculators find square roots anyway? First, we recognize that \sqrt{a} is a root of the polynomial f(x) = x^2 - a. Therefore, Newton’s method (or the Newton-Raphson method) can be used to find the root of this function. Newton’s method dictates that we begin with an initial guess x_1 and then iteratively find the next guesses using the recursively defined sequence

x_{n+1} = x_n - \displaystyle \frac{f(x_n)}{f'(x_n)}

For the case at hand, since f'(x) = 2x, we may write

x_{n+1} = x_n - \displaystyle \frac{x_n^2 -a}{2 x_n},

which reduces to

x_{n+1} = \displaystyle \frac{2x_n^2 - (x_n^2 -a)}{2 x_n} = \frac{x_n^2 + a}{2x_n} = \frac{1}{2} \left( x_n + \frac{a}{x_n} \right)
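Here’s a minimal sketch of this iteration in Python (the function name, starting guess, and stopping rule are my own choices):

def newton_sqrt(a, x=1.0, tol=1e-12):
    """Approximate sqrt(a) by iterating x -> (x + a/x)/2."""
    while abs(x * x - a) > tol * max(a, 1.0):
        x = 0.5 * (x + a / x)
    return x

print(newton_sqrt(2000.0, x=1000.0))   # 44.721359549995796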

This algorithm can be programmed in C++, Python, etc. (as sketched above). For pedagogical purposes, however, I’ve found that a spreadsheet like Microsoft Excel is a good way to sell this to students. In the spreadsheet below, I use Excel to find \sqrt{2000}. In cell A1, I entered 1000 as a first guess for \sqrt{2000}. Notice that this is a really lousy first guess! Then, in cell A2, I typed the formula

=1/2*(A1+2000/A1)

So Excel computes

x_2 = \frac{1}{2} \left( x_1 + \displaystyle \frac{2000}{x_1} \right) = \frac{1}{2} \left( 1000 + \displaystyle \frac{2000}{1000} \right) = 501.

Then I filled down that formula into cells A3 through A16.

[Excel screenshot: the iterates in cells A1 through A16 converging to \sqrt{2000} \approx 44.7213595]

Notice that this algorithm quickly converges to \sqrt{2000}, even though the initial guess was terrible. After 7 steps, the answer is only correct to 2 significant digits (45). After 8 steps, the answer is correct to 4 significant digits (44.72). On the 9th step, the answer is correct to 9 significant digits (44.7213595).

Indeed, there’s a theorem that essentially states that, when this algorithm converges, the number of correct digits basically doubles with each successive step. That’s a lot better than the methods shown at the start of this series of posts, which produced only one extra digit with each step.

This algorithm works for finding kth roots as well as square roots. Since \sqrt[k]{a} is a root of f(x) = x^k - a, Newton’s method reduces to

x_{n+1} = x_n - \displaystyle \frac{x_n^k - a}{k x_n^{k-1}} = \displaystyle \frac{(k-1) x_n^k + a}{k x_n^{k-1}} = \displaystyle \frac{k-1}{k} x_n + \frac{1}{k} \cdot \frac{a}{x_n^{k-1}},

which reduces to the above sequence if k =2.
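The same sketch generalizes directly (again, the names and stopping rule are mine):

def newton_kth_root(a, k, x=1.0, tol=1e-12):
    """Approximate the kth root of a > 0 by Newton's method on x^k - a."""
    while abs(x ** k - a) > tol * max(a, 1.0):
        x = ((k - 1) * x + a / x ** (k - 1)) / k
    return x

print(newton_kth_root(1729.03, 3))   # 12.002383785..., a number we'll meet again in the next post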

See also this Wikipedia page for further historical information as well as discussion about how the above recursive sequence can be obtained without calculus.

Square roots and logarithms without a calculator (Part 10)

This is the fifth in a series of posts about calculating roots without a calculator, with special consideration to how these tales can engage students more deeply with the secondary mathematics curriculum. Most students today have a hard time believing that square roots can be computed without a calculator, so hopefully these stories give them some appreciation for their elders.

Today’s story takes us back to a time before the advent of cheap pocket calculators: 1949.

The following story comes from the chapter “Lucky Numbers” of Surely You’re Joking, Mr. Feynman!, a collection of tales by the late Nobel Prize-winning physicist Richard P. Feynman. Feynman was arguably the greatest American-born physicist — the subject of the excellent biography Genius: The Life and Science of Richard Feynman — and he had a tendency to one-up anyone who tried to one-up him. (He was also a serial philanderer, but that’s another story.) Here’s a story involving how, in the summer of 1949, he calculated \sqrt[3]{1729.03} without a calculator.

The first time I was in Brazil I was eating a noon meal at I don’t know what time — I was always in the restaurants at the wrong time — and I was the only customer in the place. I was eating rice with steak (which I loved), and there were about four waiters standing around.

A Japanese man came into the restaurant. I had seen him before, wandering around; he was trying to sell abacuses. (Note: At the time of this story, before the advent of pocket calculators, the abacus was arguably the world’s most powerful hand-held computational device.) He started to talk to the waiters, and challenged them: He said he could add numbers faster than any of them could do.

The waiters didn’t want to lose face, so they said, “Yeah, yeah. Why don’t you go over and challenge the customer over there?”

The man came over. I protested, “But I don’t speak Portuguese well!”

The waiters laughed. “The numbers are easy,” they said.

They brought me a paper and pencil.

The man asked a waiter to call out some numbers to add. He beat me hollow, because while I was writing the numbers down, he was already adding them as he went along.

I suggested that the waiter write down two identical lists of numbers and hand them to us at the same time. It didn’t make much difference. He still beat me by quite a bit.

However, the man got a little bit excited: he wanted to prove himself some more. “Multiplição!” he said.

Somebody wrote down a problem. He beat me again, but not by much, because I’m pretty good at products.

The man then made a mistake: he proposed we go on to division. What he didn’t realize was, the harder the problem, the better chance I had.

We both did a long division problem. It was a tie.

This bothered the hell out of the Japanese man, because he was apparently well trained on the abacus, and here he was almost beaten by this customer in a restaurant.

“Raios cubicos!” he says with a vengeance. Cube roots! He wants to do cube roots by arithmetic. It’s hard to find a more difficult fundamental problem in arithmetic. It must have been his topnotch exercise in abacus-land.

He writes down a number on some paper— any old number— and I still remember it: 1729.03. He starts working on it, mumbling and grumbling: “Mmmmmmagmmmmbrrr”— he’s working like a demon! He’s poring away, doing this cube root.

Meanwhile I’m just sitting there.

One of the waiters says, “What are you doing?”

I point to my head. “Thinking!” I say. I write down 12 on the paper. After a little while I’ve got 12.002.

The man with the abacus wipes the sweat off his forehead: “Twelve!” he says.

“Oh, no!” I say. “More digits! More digits!” I know that in taking a cube root by arithmetic, each new digit is even more work than the one before. It’s a hard job.

He buries himself again, grunting “Rrrrgrrrrmmmmmm …,” while I add on two more digits. He finally lifts his head to say, “12.0!”

The waiters are all excited and happy. They tell the man, “Look! He does it only by thinking, and you need an abacus! He’s got more digits!”

He was completely washed out, and left, humiliated. The waiters congratulated each other.

How did the customer beat the abacus?

The number was 1729.03. I happened to know that a cubic foot contains 1728 cubic inches, so the answer is a tiny bit more than 12. The excess, 1.03, is only one part in nearly 2000, and I had learned in calculus that for small fractions, the cube root’s excess is one-third of the number’s excess. So all I had to do is find the fraction 1/1728, and multiply by 4 (divide by 3 and multiply by 12). So I was able to pull out a whole lot of digits that way.

A few weeks later, the man came into the cocktail lounge of the hotel I was staying at. He recognized me and came over. “Tell me,” he said, “how were you able to do that cube-root problem so fast?”

I started to explain that it was an approximate method, and had to do with the percentage of error. “Suppose you had given me 28. Now the cube root of 27 is 3 …”

He picks up his abacus: zzzzzzzzzzzzzzz— “Oh yes,” he says.

I realized something: he doesn’t know numbers. With the abacus, you don’t have to memorize a lot of arithmetic combinations; all you have to do is to learn to push the little beads up and down. You don’t have to memorize 9+7=16; you just know that when you add 9, you push a ten’s bead up and pull a one’s bead down. So we’re slower at basic arithmetic, but we know numbers.

Furthermore, the whole idea of an approximate method was beyond him, even though a cube root often cannot be computed exactly by any method. So I never could teach him how I did cube roots or explain how lucky I was that he happened to choose 1729.03.

The key part of the story, “for small fractions, the cube root’s excess is one-third of the number’s excess,” deserves some elaboration, especially since this computational trick isn’t often taught in those terms anymore. If f(x) = (1+x)^n, then f'(x) = n (1+x)^{n-1}, so that f'(0) = n. Since f(0) = 1, the equation of the tangent line to f(x) at x = 0 is

L(x) = f(0) + f'(0) \cdot (x-0) = 1 + nx.

The key observation is that, for x \approx 0, the graph of L(x) will be very close indeed to the graph of f(x). In Calculus I, this is sometimes called the linearization of f at x = 0. In Calculus II, we observe that these are the first two terms in the Taylor series expansion of f about x = 0.

For Feynman’s problem, n = \frac{1}{3}, so that \sqrt[3]{1+x} \approx 1 + \frac{1}{3} x if x \approx 0. Then \sqrt[3]{1729.03} can be rewritten as

\sqrt[3]{1729.03} = \sqrt[3]{1728} \sqrt[3]{ \displaystyle \frac{1729.03}{1728} }

\sqrt[3]{1729.03} = 12 \sqrt[3]{\displaystyle 1 + \frac{1.03}{1728}}

\sqrt[3]{1729.03} \approx 12 \left( 1 + \displaystyle \frac{1}{3} \times \frac{1.03}{1728} \right)

\sqrt[3]{1729.03} \approx 12 + 4 \times \displaystyle \frac{1.03}{1728}

This last equation explains the line “all I had to do is find the fraction 1/1728, and multiply by 4.” With enough patience, the first few digits of the correction can be mentally computed since

\displaystyle \frac{1.03}{500} < \frac{1.03}{432} = 4 \times \frac{1.03}{1728} < \frac{1.03}{400}

\displaystyle \frac{1.03 \times 2}{1000} < 4 \times \frac{1.03}{1728} < \frac{1.03 \times 25}{10000}

0.00206 < 4 \times \displaystyle \frac{1.03}{1728} < 0.002575

So Feynman could determine quickly that the answer was 12.002\hbox{something}.

By the way,

\sqrt[3]{1729.03} \approx 12.00238378\dots

\hbox{Estimate} \approx 12.00238426\dots

So the linearization provides an estimate accurate to eight significant digits. Additional digits could be obtained by using the next term in the Taylor series.
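These figures are easy to check (a quick sketch in Python):

print(12 + 4 * 1.03 / 1728)   # the estimate: 12.002384259...
print(1729.03 ** (1 / 3))     # the true value: 12.002383785...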

____________________

I have a similar story to tell. Back in 1996 or 1997, when I first moved to Texas and was making new friends, I quickly discovered that one way to get odd facial expressions out of strangers was by mentioning that I was a math professor. Occasionally, however, someone would test me to see if I really was a math professor. One guy (who is now a good friend; later, we played in the infield together on our church-league softball team) asked me to figure out \sqrt{97} without a calculator — before someone could walk to the next room and return with the calculator. After two seconds of panic, I realized that I was really lucky that he happened to pick a number close to 100. Using the same logic as above,

\sqrt{97} = \sqrt{100} \sqrt{1 - 0.03} \approx 10 \left(1 - \displaystyle \frac{0.03}{2}\right) = 9.85.

Knowing that this came from a linearization and that the tangent line to y = \sqrt{1+x} lies above the curve, I knew that this estimate was too high. But I didn’t have time to work out a correction (besides, I couldn’t remember the full Taylor series off the top of my head), so I answered/guessed 9.849, hoping that I did the arithmetic correctly. You can imagine the amazement when someone punched it into the calculator and got 9.84886\dots
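For the record, the second-order term of the Taylor series, \sqrt{1+x} \approx 1 + \frac{1}{2} x - \frac{1}{8} x^2, would have supplied the missing correction:

\sqrt{97} \approx 10 \left( 1 - \displaystyle \frac{0.03}{2} - \frac{(0.03)^2}{8} \right) = 9.848875,

which agrees with \sqrt{97} = 9.8488578\dots to five significant digits.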

Taylor series without calculus

Is calculus really necessary for obtaining a Taylor series? Years ago, while perusing an old Schaum’s outline, I found a very curious formula for the area of a circular segment:

[Figure: a circular segment cut from a circle of radius R by a central angle \theta]

A = \displaystyle \frac{R^2}{2} (\theta - \sin \theta)

The thought occurred to me that \theta was the first term in the Taylor series expansion of \sin \theta about \theta = 0, and perhaps there was a way to use this picture to generate the remaining terms of the Taylor series.

This insight led to a paper which was published in the College Mathematics Journal: cmj38-1-058-059. To my surprise and delight, this paper was later selected for inclusion in The Calculus Collection: A Resource for AP and Beyond, which is a collection of articles from the publications of the Mathematical Association of America specifically targeted toward teachers of AP Calculus.

Although not included in the article, it can be proven that this iterative method does indeed yield the successive Taylor polynomials of \sin \theta, adding one extra term with each successive step.

I carefully scaffolded these steps into a project that I twice assigned to my TAMS precalculus students. Both semesters, my students got it… and they were impressed to know the formula that their calculators use to compute \sin \theta. So I think this project is entirely within the grasp of precocious precalculus students.

____________________

I personally don’t know of a straightforward way of obtaining the expansion of \cos \theta without calculus. However, once the expansion of \sin \theta is known, the expansion of \cos \theta can be surmised without calculus. To do this, we note that

\cos \theta = 1 - 2 \sin^2 \left( \displaystyle \frac{\theta}{2} \right) = 1 - 2 \left( \displaystyle \frac{\theta}{2} - \frac{(\theta/2)^3}{3!} + \frac{(\theta/2)^5}{5!} \dots \right)^2

Truncating the series after n terms and squaring — and being very careful with the necessary simplifications — yield the first n terms in the Taylor series of \cos \theta.
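This claim can be checked symbolically; here’s a sketch using Python’s sympy (the variable names are mine):

from sympy import symbols, expand

t = symbols('t')
s = t/2 - (t/2)**3/6 + (t/2)**5/120   # sin(t/2), truncated after three terms
print(expand(1 - 2*s**2))
# The low-order terms are 1 - t**2/2 + t**4/24 - t**6/720,
# matching the Taylor series of cos t through degree 6.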

Reminding students about Taylor series (Part 6)

Sadly, at least at my university, Taylor series is the topic that is least retained by students years after taking Calculus II. They can remember the rules for integration and differentiation, but their command of Taylor series seems to slip through the cracks. In my opinion, the reason for this lack of retention is completely understandable from a student’s perspective: Taylor series is usually the last topic covered in a semester, and so students learn them quickly for the final and quickly forget about them as soon as the final is over.

Of course, when I need to use Taylor series in an advanced course but my students have completely forgotten this prerequisite knowledge, I have to get them up to speed as soon as possible. Here’s the sequence that I use to accomplish this task. Covering this sequence usually takes me about 30 minutes of class time.

I should emphasize that I present this sequence in an inquiry-based format: I ask leading questions of my students so that the answers of my students are driving the lecture. In other words, I don’t ask my students to simply take dictation. It’s a little hard to describe a question-and-answer format in a blog, but I’ll attempt to do this below.

In the previous posts, I described how I lead students to the definition of the Maclaurin series

f(x) = \displaystyle \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k,

which converges to f(x) within some radius of convergence for all functions that commonly appear in the secondary mathematics curriculum.

____________________

Step 7. Let’s now turn to trigonometric functions, starting with f(x) = \sin x.

What’s f(0)? Plugging in, we find f(0) = \sin 0 = 0.

As before, we continue until we find a pattern. Next, f'(x) = \cos x, so that f'(0) = 1.

Next, f''(x) = -\sin x, so that f''(0) = 0.

Next, f'''(x) = -\cos x, so that f'''(0) = -1.

No pattern yet. Let’s keep going.

Next, f^{(4)}(x) = \sin x, so that f^{(4)}(0) = 0.

Next, f^{(5)}(x) = \cos x, so that f^{(5)}(0) = 1.

Next, f^{(6)}(x) = -\sin x, so that f^{(6)}(0) = 0.

Next, f^{(7)}(x) = -\cos x, so that f^{(7)}(0) = -1.

OK, it looks like we have a pattern… albeit more awkward than the patterns for e^x and \displaystyle \frac{1}{1-x}. Plugging into the series, we find that

\displaystyle \sin x= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \dots

If we stare at the pattern of terms long enough, we can write this more succinctly as

\sin x = \displaystyle \sum_{n=0}^\infty (-1)^n \frac{x^{2n+1}}{(2n+1)!}

The (-1)^n factor accounts for the alternating signs (starting positive at n=0), while the exponent 2n+1 ensures that each power and factorial is odd.

Let’s see… \sin x has a Taylor expansion that only has odd exponents. In what other sense are the words “sine” and “odd” associated?

In Precalculus, a function f(x) is called odd if f(-x) = -f(x) for all numbers x. For example, f(x) = x^9 is odd since f(-x) = (-x)^9 = -x^9, because 9 is (you guessed it) an odd number. Also, \sin(-x) = -\sin x, and so \sin x is also an odd function. So we shouldn’t be that surprised to see only odd exponents in the Taylor expansion of \sin x.

A pedagogical note: In my opinion, it’s better (for review purposes) to avoid the \displaystyle \sum notation and simply use the “dot, dot, dot” expression instead. The point of this exercise is to review a topic that’s been long forgotten so that these Taylor series can be used for other purposes. My experience is that the \displaystyle \sum adds a layer of abstraction that students don’t need to overcome for the time being.

____________________

Step 8. Let’s now try f(x) = \cos x.

What’s f(0)? Plugging in, we find f(0) = \cos 0 = 1.

Next, f'(x) = -\sin x, so that f'(0) = 0.

Next, f''(x) = -\cos x, so that f''(0) = -1.

It looks like the same pattern of numbers as above, except shifted by one derivative. Let’s keep going.

Next, f'''(x) = \sin x, so that f'''(0) = 0.

Next, f^{(4)}(x) = \cos x, so that f^{(4)}(0) = 1.

Next, f^{(5)}(x) = -\sin x, so that f^{(5)}(0) = 0.

Next, f^{(6)}(x) = -\cos x, so that f^{(6)}(0) = -1.

OK, it looks like we have a pattern somewhat similar to that of \sin x, except only involving the even terms. I guess that shouldn’t be surprising since, from precalculus, we know that \cos x is an even function since \cos(-x) = \cos x for all x.

Plugging into the series, we find that

\displaystyle \cos x= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} \dots

If we stare at the pattern of terms long enough, we can write this more succinctly as

\cos x = \displaystyle \sum_{n=0}^\infty (-1)^n \frac{x^{2n}}{(2n)!}

____________________

As we saw with e^x, the above series converge quickest for values of x near 0. In the case of \sin x and \cos x, convergence can be accelerated by first using trigonometric identities to reduce the angle.

For example, the series for \cos 1000^o will converge quite slowly (after converting 1000^o into radians). However, we know that

\cos 1000^o= \cos(1000^o - 720^o) =\cos 280^o

using the periodicity of \cos x. Next, since 280^o is in the fourth quadrant, we can use the reference angle to find an equivalent angle in the first quadrant:

\cos 1000^o = \cos 280^o = \cos(360^o - 80^o) = \cos 80^o

Finally, using the cofunction identity \cos x = \sin(90^o - x), we find

\cos 1000^o = \cos 80^o = \sin(90^o - 80^o) = \sin 10^o.

In this way, the sine or cosine of any angle can be reduced to the sine or cosine of some angle between 0^o and 45^o, that is, between 0 and \pi/4 radians. Since \pi/4 < 1, the above power series will converge reasonably rapidly.
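Here’s the whole pipeline in one short sketch (Python; the reduction below simply hard-codes the steps worked out above):

import math

x = math.radians(10)                   # cos 1000° reduces to sin 10°
term, total = x, 0.0
for n in range(10):                    # partial sums of x - x^3/3! + x^5/5! - ...
    total += term
    term *= -x * x / ((2*n + 2) * (2*n + 3))
print(total)                           # 0.17364817766...
print(math.cos(math.radians(1000)))   # the same value, as a check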

____________________

Step 10. For the final part of this review, let’s take a second look at the Taylor series

e^x = \displaystyle 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \frac{x^6}{6!} + \frac{x^7}{7!} + \dots

Just to be silly — for no apparent reason whatsoever, let’s replace x by ix and see what happens:

e^{ix} = \displaystyle 1 + ix + \frac{(ix)^2}{2!} + \frac{(ix)^3}{3!} + \frac{(ix)^4}{4!} + \dots = 1 + ix - \frac{x^2}{2!} - i \, \frac{x^3}{3!} + \frac{x^4}{4!} + \dots

e^{ix} = \displaystyle 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} \dots + i \left[\displaystyle x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \dots \right]

after separating the terms that do and don’t have an i.

Hmmmm… looks familiar….

So it makes sense to define

e^{ix} = \cos x + i \sin x,

which is called Euler’s formula, thus revealing an unexpected connection between e^x and the trigonometric functions.
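This identity is easy to test numerically; here’s a sketch using Python’s cmath (the test value is arbitrary):

import cmath, math

x = 0.7
print(cmath.exp(1j * x))                   # (0.7648421872...+0.6442176872...j)
print(complex(math.cos(x), math.sin(x)))   # the same complex number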

Reminding students about Taylor series (Part 5)

Sadly, at least at my university, Taylor series is the topic that is least retained by students years after taking Calculus II. They can remember the rules for integration and differentiation, but their command of Taylor series seems to slip through the cracks. In my opinion, the reason for this lack of retention is completely understandable from a student’s perspective: Taylor series is usually the last topic covered in a semester, and so students learn them quickly for the final and quickly forget about them as soon as the final is over.

Of course, when I need to use Taylor series in an advanced course but my students have completely forgotten this prerequisite knowledge, I have to get them up to speed as soon as possible. Here’s the sequence that I use to accomplish this task. Covering this sequence usually takes me about 30 minutes of class time.

I should emphasize that I present this sequence in an inquiry-based format: I ask leading questions of my students so that the answers of my students are driving the lecture. In other words, I don’t ask my students to simply take dictation. It’s a little hard to describe a question-and-answer format in a blog, but I’ll attempt to do this below.

In the previous posts, I described how I lead students to the definition of the Maclaurin series

f(x) = \displaystyle \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k,

which converges to f(x) within some radius of convergence for all functions that commonly appear in the secondary mathematics curriculum.

____________________

Step 5. That was easy; let’s try another one: f(x) = \displaystyle \frac{1}{1-x} = (1-x)^{-1}.

What’s f(0)? Plugging in, we find f(0) = \displaystyle \frac{1}{1-0} = 1.

Next, to find f'(0), we first find f'(x). Using the Chain Rule, we find f'(x) = -(1-x)^{-2} \cdot (-1) = \displaystyle \frac{1}{(1-x)^2}, so that f'(0) = 1.

Next, we differentiate again: f''(x) = (-2) \cdot (1-x)^{-3} \cdot (-1) = \displaystyle \frac{2}{(1-x)^3}, so that f''(0) = 2.

Hmmm… no obvious pattern yet… so let’s keep going.

For the next term, f'''(x) = (-3) \cdot 2(1-x)^{-4} \cdot (-1) = \displaystyle \frac{6}{(1-x)^4}, so that f'''(0) = 6.

For the next term, f^{(4)}(x) = (-4) \cdot 6(1-x)^{-5} \cdot (-1) = \displaystyle \frac{24}{(1-x)^5}, so that f^{(4)}(0) = 24.

Oohh… it’s the factorials again! It looks like f^{(n)}(0) = n!, and this can be formally proved by induction.

Plugging into the series, we find that

\displaystyle \frac{1}{1-x} = \sum_{n=0}^\infty \frac{n!}{n!} x^n = \sum_{n=0}^\infty x^n = 1 + x + x^2 + x^3 + \dots.

Like the series for e^x, this series converges quickest for x \approx 0. Unlike the series for e^x, this series does not converge for all real numbers. As can be checked with the Ratio Test, this series only converges if |x| < 1.

The right-hand side is a special kind of series typically discussed in precalculus. (Students often pause at this point, because most of them have forgotten this too.) It is an infinite geometric series whose first term is 1 and whose common ratio is x. So starting from the right-hand side, one can obtain the left-hand side using the formula

a + ar + ar^2 + ar^3 + \dots = \displaystyle \frac{a}{1-r}

by letting a = 1 and r = x. Also, as stated in precalculus, this series only converges if the common ratio satisfies |r| < 1, as before.

In other words, in precalculus, we start with the geometric series and end with the function. With Taylor series, we start with the function and end with the series.

____________________

Step 6. A whole bunch of other Taylor series can be quickly obtained from the one for \displaystyle \frac{1}{1-x}. Let’s take the derivative of both sides (and ignore the fact that one should prove that differentiating this infinite series term by term is permissible). Since

\displaystyle \frac{d}{dx} \left( \frac{1}{1-x} \right) = \frac{1}{(1-x)^2}

and

\displaystyle \frac{d}{dx} \left( 1 + x + x^2 + x^3 + x^4 + \dots \right) = 1 + 2x + 3x^2 + 4x^3 + \dots,

we have

\displaystyle \frac{1}{(1-x)^2} = 1 + 2x + 3x^2 + 4x^3 + \dots.

____________________

Next, let’s replace x with -x in the Taylor series in Step 5, obtaining

\displaystyle \frac{1}{1+x} = 1 - x + x^2 - x^3 + x^4 - x^5 \dots

Now let’s take the indefinite integral of both sides:

\displaystyle \int \frac{dx}{1+x} = \int \left( 1 - x + x^2 - x^3 + x^4 - x^5 \dots \right) \, dx

\ln(1+x) = \displaystyle x - \frac{x^2}{2} + \frac{x^3}{3} -\frac{ x^4}{4} + \frac{x^5}{5} -\frac{ x^6}{6} \dots + C

To solve for the constant of integration, let x = 0:

\ln(1) = 0+ C \Longrightarrow C = 0

Plugging back in, we conclude that

\ln(1+x) = x - \displaystyle \frac{x^2}{2} + \frac{x^3}{3} -\frac{ x^4}{4} + \frac{x^5}{5} -\frac{ x^6}{6} \dots

The Taylor series expansion for \ln(1-x) can be found by replacing x with -x:

\ln(1-x) = -x - \displaystyle \frac{x^2}{2} - \frac{x^3}{3} -\frac{ x^4}{4} - \frac{x^5}{5} -\frac{ x^6}{6} \dots

Subtracting, we find

\ln(1+x) - \ln(1-x) = \ln \displaystyle \left( \frac{1+x}{1-x} \right) = 2x + \frac{2x^3}{3}+ \frac{2x^5}{5} \dots

My understanding is that this latter series is used by calculators when computing logarithms.
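For example, taking x = \frac{1}{3} makes \displaystyle \frac{1+x}{1-x} = 2, so the series above delivers \ln 2 quite rapidly. A quick sketch in Python:

import math

x = 1/3
total = sum(2 * x**(2*n + 1) / (2*n + 1) for n in range(10))
print(total)         # 0.69314718055..., from just ten terms
print(math.log(2))   # 0.6931471805599453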

____________________

Next, let’s replace x with -x^2 in the Taylor series in Step 5, obtaining

\displaystyle \frac{1}{1+x^2} = 1 - x^2 + x^4 - x^6 + x^8 - x^{10} \dots

Now let’s take the indefinite integral of both sides:

\displaystyle \int \frac{dx}{1+x^2} = \int \left(1 - x^2 + x^4 - x^6 + x^8 - x^{10} \dots\right) \, dx

\tan^{-1}x = \displaystyle x - \frac{x^3}{3} + \frac{x^5}{5} -\frac{ x^7}{7} + \frac{x^9}{9} -\frac{ x^{11}}{11} \dots + C

To solve for the constant of integration, let x = 0:

\tan^{-1}(0) = 0 + C \Longrightarrow C = 0

Plugging back in, we conclude that

\tan^{-1}x = \displaystyle x - \frac{x^3}{3} + \frac{x^5}{5} -\frac{ x^7}{7} + \frac{x^9}{9} -\frac{ x^{11}}{11} \dots
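As a quick sanity check (a sketch in Python, with an arbitrary test value):

import math

x = 0.2
total = sum((-1)**n * x**(2*n + 1) / (2*n + 1) for n in range(8))
print(total)          # 0.19739555985...
print(math.atan(x))   # matches to many digits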

____________________

In summary, a whole bunch of Taylor series can be extracted quite quickly by differentiating and integrating a simple infinite geometric series. I’m a firm believer in minimizing the number of formulas that I memorize. Any time I personally need any of the above series, I’ll quickly use the above steps to derive them from that of \displaystyle \frac{1}{1-x}.

Reminding students about Taylor series (Part 4)

I’m in the middle of a series of posts describing how I remind students about Taylor series. In the previous posts, I described how I lead students to the definition of the Maclaurin series

f(x) = \displaystyle \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k,

which converges to f(x) within some radius of convergence for all functions that commonly appear in the secondary mathematics curriculum.

____________________

Step 4. Let’s now get some practice with Maclaurin series. Let’s start with f(x) = e^x.

What’s f(0)? That’s easy: f(0) = e^0 = 1.

Next, to find f'(0), we first find f'(x). What is it? Well, that’s also easy: f'(x) = \frac{d}{dx} (e^x) = e^x. So f'(0) is also equal to 1.

How about f''(0)? Yep, it’s also 1. In fact, it’s clear that f^{(n)}(0) = 1 for all n, though we’ll skip the formal proof by induction.

Plugging into the above formula, we find that

e^x = \displaystyle \sum_{k=0}^{\infty} \frac{1}{k!} x^k = \sum_{k=0}^{\infty} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots

It turns out that the radius of convergence for this power series is \infty. In other words, the series on the right converges for all values of x. While we’ll skip it for review purposes, this can be formally checked by using the Ratio Test.

____________________

At this point, students generally feel confident about the mechanics of finding a Taylor series expansion, and that’s a good thing. However, in my experience, their command of Taylor series is still somewhat artificial. They can go through the motions of taking derivatives and finding the Taylor series, but this complicated symbol in \displaystyle \sum notation still doesn’t have much meaning.

So I shift gears somewhat to discuss the rate of convergence. My hope is to deepen students’ knowledge by getting them to believe that f(x) really can be approximated to high precision with only a few terms. Perhaps not surprisingly, the series converges more quickly for small values of x than for large values of x.

Pedagogically, I like to use a spreadsheet like Microsoft Excel to demonstrate the rate of convergence. A calculator could be used, but students can see quickly with Excel how quickly (or slowly) the terms get smaller. I usually construct the spreadsheet in class on the fly (the fill down feature is really helpful for doing this quickly), with the end product looking something like this:

[Excel screenshot: partial sums of the Taylor series for e^x at a small value of x]

In this way, students can immediately see that the Taylor series is accurate to four significant digits by going up to the x^4 term and that about ten or eleven terms are needed to get a figure that is as accurate as the precision of the computer will allow. In other words, for all practical purposes, an infinite number of terms are not necessary.

In short, this is how a calculator computes e^x: adding up the first few terms of a Taylor series. Back in high school, when students hit the e^x button on their calculators, they’ve trusted the result but the mechanics of how the calculator gets the result was shrouded in mystery. No longer.
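The same demonstration works outside of Excel; here’s a sketch in Python (shown here at x = 1):

import math

x = 1.0
term, total = 1.0, 0.0
for k in range(15):       # add up the terms x^k/k! for k = 0, ..., 14
    total += term
    term *= x / (k + 1)
print(total)              # 2.7182818284..., accurate to about 12 digits
print(math.exp(x))        # 2.718281828459045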

Then I shift gears by trying a larger value of x:

[Excel screenshot: the same computation attempted at x = 11.5]

I ask my students the obvious question: What went wrong? They’re usually able to volunteer a few ideas:

  • The convergence is slower for larger values of x.
  • The series will converge, but more terms are needed (and I’ll later use the fill down feature to get enough terms so that it does converge as accurately as double precision will allow).
  • The individual terms get bigger until k=11 and then start getting smaller. I’ll ask my students why this happens, and I’ll eventually get an explanation like

\displaystyle \frac{(11.5)^6}{6!} < \frac{(11.5)^6}{6!} \times \frac{11.5}{7} = \frac{(11.5)^7}{7!}

but

\displaystyle \frac{(11.5)^{12}}{12!} = \frac{(11.5)^{11}}{11!} \times \frac{11.5}{12} < \frac{(11.5)^{11}}{11!}

At this point, I’ll mention that calculators use some tricks to speed up convergence. For example, the calculator can simply store a few values of e^x in memory, like e^{16}, e^{8}, e^{4}, e^{2}, and e^{1} = e. I then ask my class how these could be used to find e^{11.5}. After some thought, they will volunteer that

e^{11.5} = e^8 \cdot e^2 \cdot e \cdot e^{0.5}.

The first three values don’t need to be computed — they’ve already been stored in memory — while the last value can be computed via Taylor series. Also, since 0.5 < 1, the series for e^{0.5} will converge pretty quickly. (Some students may volunteer that the above product is logically equivalent to writing 11 in binary.)
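Here’s a sketch of that trick (Python; the table of stored powers and the function name are illustrative, not what any particular calculator actually does):

import math

STORED = {16: math.exp(16), 8: math.exp(8), 4: math.exp(4),
          2: math.exp(2), 1: math.e}    # values a calculator might keep in memory

def exp_reduced(x):
    """Compute e^x for 0 <= x < 32 by peeling off stored powers, then a short Taylor series."""
    result = 1.0
    for p in (16, 8, 4, 2, 1):
        if x >= p:
            result *= STORED[p]
            x -= p
    term, total = 1.0, 0.0
    for k in range(12):                  # now 0 <= x < 1, so this converges fast
        total += term
        term *= x / (k + 1)
    return result * total

print(exp_reduced(11.5), math.exp(11.5))   # both about 98715.77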

At this point — after doing these explicit numerical examples — I’ll show graphs of e^x and graphs of the Taylor polynomials of e^x, observing that the polynomials get closer and closer to the graph of e^x as more terms are added. (For example, see the graphs on the Wikipedia page for Taylor series, though I prefer to use Mathematica for in-class purposes.) In my opinion, the convergence of the graphs becomes meaningful to students only after doing some numerical examples, as done above.

____________________

At this point, I hope my students are familiar with the definition of Taylor (Maclaurin) series, can apply the definition to e^x, and have the intuition that the nasty Taylor series expression practically means “add a bunch of terms together until you’re satisfied with the convergence.”

In the next post, we’ll consider another Taylor series which ought to be (but usually isn’t) really familiar to students: an infinite geometric series.

P.S. Here’s the Excel spreadsheet that I used to make the above figures: Taylor.

Reminding students about Taylor series (Part 3)

Sadly, at least at my university, Taylor series is the topic that is least retained by students years after taking Calculus II. They can remember the rules for integration and differentiation, but their command of Taylor series seems to slip through the cracks. In my opinion, the reason for this lack of retention is completely understandable from a student’s perspective: Taylor series is usually the last topic covered in a semester, and so students learn them quickly for the final and quickly forget about them as soon as the final is over.

Of course, when I need to use Taylor series in an advanced course but my students have completely forgotten this prerequisite knowledge, I have to get them up to speed as soon as possible. Here’s the sequence that I use to accomplish this task. Covering this sequence usually takes me about 30 minutes of class time.

I should emphasize that I present this sequence in an inquiry-based format: I ask leading questions of my students so that the answers of my students are driving the lecture. In other words, I don’t ask my students to simply take dictation. It’s a little hard to describe a question-and-answer format in a blog, but I’ll attempt to do this below.

In the previous post, I described how I lead students to the equations

f(x) = \displaystyle \sum_{k=0}^n \frac{f^{(k)}(0)}{k!} x^k.

and

f(x) = \displaystyle \sum_{k=0}^n \frac{f^{(k)}(a)}{k!} (x-a)^k,

where f(x) is a polynomial and a can be any number.

____________________

Step 3. What happens if the original function f(x) is not a polynomial? For one thing, the right-hand side can no longer be a finite sum: as long as the sum on the right-hand side stops at some degree n, the right-hand side is a polynomial, while the left-hand side is assumed not to be a polynomial.

To resolve this, we can cross our fingers and hope that

f(x) = \displaystyle \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k,

or

f(x) = \displaystyle \sum_{k=0}^{\infty}\frac{f^{(k)}(a)}{k!} (x-a)^k.

In other words, let’s make the right-hand side an infinite series, and hope for the best. This is the definition of the Taylor series expansions of f.

Note: At this point in the review, I can usually see the light go on in my students’ eyes. Usually, they can now recall their work with Taylor series in the past… and they wonder why they weren’t taught this topic inductively (like I’ve tried to do in the above exposition) instead of deductively (like the presentation in most textbooks).

While we’d like to think that the Taylor series expansions always work, there are at least two things that can go wrong.

  1. First, the sum on the right is an infinite series, and there’s no guarantee that the series will converge in the first place. There are plenty of examples of series that diverge, like \displaystyle \sum_{k=0}^\infty \frac{1}{k+1}.
  2. Second, even if the series converges, there’s no guarantee that the series will converge to the “right” answer f(x). The canonical example of this behavior is f(x) = e^{-1/x^2} (with f(0) defined to be 0), which is so “flat” near x = 0 that every single derivative of f is equal to 0 at x = 0.

For the first complication, there are multiple tests devised in Calculus II, especially the Ratio Test, to determine the values of x for which the series converges. This establishes a radius of convergence for the series.

The second complication is far more difficult to address rigorously. The good news is that, for all commonly occurring functions in the secondary mathematics curriculum, the Taylor series of a function properly converges (when it does converge). So we will happily ignore this complication for the remainder of the presentation.

Indeed, it’s remarkable that the series should converge to f(x) at all. Think about the meaning of the terms on the right-hand side:

  1. f(a) is the y-coordinate at x=a.
  2. f'(a) is the slope of the curve at x=a.
  3. f''(a) is a measure of the concavity of the curve at — you guessed it — x=a.
  4. f'''(a) is an even more subtle description of the curve… once again, at x=a.

In other words, if the Taylor series converges to f(x), then every twist and turn of the function, even at points far away from x=a, is encoded somehow in the shape of the curve at the one point x=a. So analytic functions (which have Taylor series that converge to the original function) are indeed quite remarkable.

 

Reminding students about Taylor series (Part 2)

In this series of posts, I will describe the sequence of examples that I use to remind students about Taylor series. (One time, just for fun, I presented this topic at the end of a semester of Calculus I, and it seemed to go well even for that audience who had not seen Taylor series previously.)

I should emphasize that I present this sequence inductively and in an inquiry-based format: I ask leading questions of my students so that the answers of my students are driving the lecture. In other words, I don’t ask my students to simply take dictation. It’s a little hard to describe a question-and-answer format in a blog, but I’ll attempt to do this below.

____________________

Step 1. Find the unique quartic (fourth-degree) polynomial so that f(0) = 6, f'(0) = -3, f''(0) = 6, f'''(0) = 2, and f^{(4)}(0) = 10.

I’ve placed a thought bubble if you’d like to think about it before scrolling down to see the answer. Here’s a hint to get started: let f(x) = ax^4 + bx^3 + cx^2 + dx + e, and start differentiating. Remember that a, b, c, d, and e are constants.

[thought bubble]

We begin with the information that f(0) = 6. How else can we find f(0)? Since f(x) = ax^4 + bx^3 + cx^2 + dx + e, we see that f(0) = e. Therefore, it must be that e = 6.

How about f'(0)? We see that f'(x) = 4ax^3 + 3bx^2 + 2cx + d, and so f'(0) = d. Since f'(0) = -3, we have that d = -3.

Next, f''(x) = 12ax^2 + 6bx + 2c, and so f''(0) = 2c. Since f''(0) = 6, we have that 2c = 6, or c = 3.

Next, f'''(x) = 24ax + 6b, and so f'''(0) = 6b. Since f'''(0) = 2, we have that 6b = 2, or b = \frac{1}{3}.

Finally, f^{(4)}(x) = 24a, and so f^{(4)}(0) = 24a. Since f^{(4)}(0) = 10, we have 24a = 10, or a = \frac{5}{12}.

What do we get when we put all of this information together? The polynomial must be

f(x) = \frac{5}{12} x^4 + \frac{1}{3} x^3 + 3 x^2 - 3x + 6.
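If you’d like to verify this answer, here’s a quick sketch using Python’s sympy (just a check, not part of the derivation):

from sympy import symbols, Rational, diff

x = symbols('x')
f = Rational(5, 12)*x**4 + Rational(1, 3)*x**3 + 3*x**2 - 3*x + 6
print([diff(f, x, k).subs(x, 0) for k in range(5)])   # [6, -3, 6, 2, 10], as required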

____________________

Step 2. How are these coefficients related to the information given in the problem?

[thought bubble]

Let’s start with the leading coefficient, a = \frac{5}{12}. How did we get this answer? It came from dividing 10 by 24. Where did the 10 come from? It was the given value of f^{(4)}(0), and so

a = \displaystyle \frac{f^{(4)}(0)}{24}.

Next, b = \frac{1}{3}, which arose from dividing 2 by 6. The number 2 was the given value of f'''(0), and so

b =\displaystyle \frac{f'''(0)}{6}.

Moving to the next coefficient, c = 3, which arose from dividing f''(0) = 6 by 2. So

c = \displaystyle\frac{f''(0)}{2}.

Finally, it’s clear that

d = f'(0) and e = f(0).

This last line doesn’t quite fit the pattern of the first three lines. The first three lines all have fractions, but these last two expressions don’t. How can we fix this? In the hopes of finding a pattern, let’s (unnecessarily) write d and e as fractions by dividing by 1:

d = \displaystyle\frac{f'(0)}{1} and e = \displaystyle \frac{f(0)}{1}.

Let’s now rewrite the polynomial f(x) in light of this discussion:

f(x) = \displaystyle \frac{f^{(4)}(0)}{24} x^4 + \frac{f'''(0)}{6} x^3 + \frac{f''(0)}{2} x^2 + \frac{f'(0)}{1}x + \frac{f(0)}{1}.

What pattern do we see in the numerators? It’s apparent that the number of derivatives matches the power of x. For example, the x^3 term has a coefficient involving the third derivative of f. The last two terms fit this pattern as well, since x = x^1 and the last term is multiplied by x^0 = 1.

What pattern do we see in the denominators? 1, 1, 2, 6, 24 \dots where have we seen those before? Oh yes, the factorials! We know that 4! = 4 \cdot 3 \cdot 2 \cdot 1 = 24, 3! = 3 \cdot 2 \cdot 1 = 6, 2! = 2 \cdot 1 = 2, 1! = 1, and 0! is defined to be 1. So f(x) can be rewritten as

f(x) = \displaystyle \frac{f^{(4)}(0)}{4!} x^4 + \frac{f'''(0)}{3!} x^3 + \frac{f''(0)}{2!} x^2 + \frac{f'(0)}{1!}x + \frac{f(0)}{0!}.

How can this be written more compactly? By using \displaystyle \sum-notation:

f(x) = \displaystyle \sum_{k=0}^4 \frac{f^{(k)}(0)}{k!} x^k.

Why does the sum stop at 4? Because the original polynomial had degree 4. In general, if the polynomial had degree n, it’s reasonable to guess that

f(x) = \displaystyle \sum_{k=0}^n \frac{f^{(k)}(0)}{k!} x^k.

This is called the Maclaurin series, or the Taylor series about x = 0. While I won’t prove it here, one can find Taylor series expansions about points other than 0:

f(x) = \displaystyle \sum_{k=0}^n \frac{f^{(k)}(a)}{k!} (x-a)^k,

where a can be any number. Though not proven here, these formulas are exactly true when f(x) is a polynomial.

In the next post, we’ll discuss what happens if f(x) is not a polynomial.