Why does 0! = 1? (Index)

I’m using the Twelve Days of Christmas (and perhaps a few extra days besides) to do something that I should have done a long time ago: collect past series of posts into a single, easy-to-reference post. The following posts formed my series on how I explain to students that $0! = 1$ .

Part 1: Multiplication and division.

Part 2: Combinatorics.

Is there an easy function without an easy Taylor series expansion?

After class one day, a student approached me with an interesting question:

Is there an easy function without an easy Taylor expansion?

This question really struck me for several reasons.

Most functions do not have an easy Taylor (or Maclaurin) expansion. After all, the formula for a Taylor expansion involves the $n$ th derivative of the original function, and higher-order derivatives usually get progressively messier with each successive differentiation.
Most of the series expansions that are taught in Calculus II arise from functions that somehow violate the above rule, like $f(x) = \sin x$ , $f(x) = \cos x$ , $f(x) = e^x$ , and $f(x) = 1/(1-x)$ .
Therefore, this student was under the misconception that most easy functions have easy Taylor expansions, while in reality most functions do not.

It took me a moment to answer his question, but I answered with $f(x) = \tan x$ . Successively using the Quotient Rule makes the derivatives of $\tan x$ messier and messier, but $\tan x$ definitely qualifies as an easy function that most students have seen since high school. It turns out that the Taylor expansion of $f(x) = \tan x$ can be written as an infinite series using the Bernoulli numbers, but that’s a concept that most calculus students haven’t seen yet.
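To see the messiness concretely, here is a minimal sketch that computes the first few Maclaurin coefficients of $\tan x$ exactly. Rather than applying the Quotient Rule over and over, it assumes the identity $y' = 1 + y^2$ for $y = \tan x$ and matches coefficients of $x^n$ on both sides. The function name `tan_maclaurin_coefficients` is made up for this illustration.

```python
from fractions import Fraction


def tan_maclaurin_coefficients(num_terms):
    """Return [a_0, ..., a_{num_terms-1}] with tan x = sum of a_n x^n.

    Uses y' = 1 + y^2 with y(0) = 0: comparing the coefficient of x^n
    on both sides gives (n + 1) a_{n+1} = [n == 0] + sum_i a_i a_{n-i}.
    """
    coeffs = [Fraction(0)]  # a_0 = tan 0 = 0
    for n in range(num_terms - 1):
        # Coefficient of x^n in y^2 (a Cauchy product).
        square_term = sum(coeffs[i] * coeffs[n - i] for i in range(n + 1))
        constant_term = 1 if n == 0 else 0
        coeffs.append((constant_term + square_term) / (n + 1))
    return coeffs


coeffs = tan_maclaurin_coefficients(8)
print([str(c) for c in coeffs])
# ['0', '1', '0', '1/3', '0', '2/15', '0', '17/315']
```

So $\tan x = x + \frac{1}{3} x^3 + \frac{2}{15} x^5 + \frac{17}{315} x^7 + \dots$ , and the numerators $1, 1, 2, 17$ show no pattern that a calculus student would be expected to guess.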

Earlier posts on Taylor series:

https://meangreenmath.com/2013/07/01/reminding-students-about-taylor-series-part-1/

https://meangreenmath.com/2013/07/02/reminding-students-about-taylor-series-part-2/

https://meangreenmath.com/2013/07/03/giving-students-a-refresher-about-taylor-series-part-3/

https://meangreenmath.com/2013/07/04/giving-students-a-refresher-about-taylor-series-part-4/

https://meangreenmath.com/2013/07/05/reminding-students-about-taylor-series-part-5/

https://meangreenmath.com/2013/07/06/reminding-students-about-taylor-series-part-6/

https://meangreenmath.com/2013/07/24/taylor-series-without-calculus-2/

“Or” / “and”

One of the formulas typically taught in mathematics is

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

In ordinary English, the probability that either event $A$ or $B$ happens is the probability of event $A$ plus the probability of event $B$ minus the probability that both occur.

For example, when rolling two fair six-sided dice, the probability that at least one four appears is

$P(A \cup B) = \displaystyle \frac{1}{6} + \frac{1}{6} - \frac{1}{36} = \displaystyle \frac{11}{36}$ .

It’s necessary to subtract something off at the end because it’s possible for the first die to be a four and simultaneously the second die to be a four.

This can be a conceptual barrier for students if it’s not directly addressed. In mathematics, the word “or” means “one or the other… or maybe both.” In the previous example, event $A$ was “first die is a four” and event $B$ was “second die is a four,” and it’s possible that both events could occur simultaneously.
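The dice calculation can be checked by brute force. The sketch below lists all 36 equally likely outcomes and counts them; the names `event_a`, `event_b`, and `prob` are just labels chosen for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))

event_a = {o for o in outcomes if o[0] == 4}  # first die is a four
event_b = {o for o in outcomes if o[1] == 4}  # second die is a four


def prob(event):
    """Probability of an event as an exact fraction of the 36 outcomes."""
    return Fraction(len(event), len(outcomes))


p_union = prob(event_a | event_b)
p_formula = prob(event_a) + prob(event_b) - prob(event_a & event_b)
print(p_union, p_formula)  # 11/36 11/36
```

The outcome $(4,4)$ belongs to both events, so adding $P(A)$ and $P(B)$ counts it twice. Subtracting $P(A \cap B) = 1/36$ removes the double count.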

Of course, this is different from the way we typically use “or” in spoken English. For example, in the final episode of each season of “The Bachelor,” the guy has to choose one woman or the other… and there’s no possibility of him choosing both! When a student says, “Next semester, my morning class will be history or physics,” we don’t think that there’s a possibility that the student will choose both classes… the student will choose one or the other, but not both.

In terms of mathematical logic, the word “or” in ordinary speech really is an “exclusive or.”
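The two meanings differ in exactly one row of the truth table. Here is a small sketch that prints both; the function names `inclusive_or` and `exclusive_or` are chosen only for this example.

```python
from itertools import product


def inclusive_or(a, b):
    """The "or" of mathematics: true if at least one statement is true."""
    return a or b


def exclusive_or(a, b):
    """The "or" of everyday speech: true if exactly one statement is true."""
    return a != b


print(f"{'A':6} {'B':6} {'inclusive':10} exclusive")
for a, b in product([True, False], repeat=2):
    print(f"{a!s:6} {b!s:6} {inclusive_or(a, b)!s:10} {exclusive_or(a, b)!s}")
```

Only the row where both statements are true gives different answers, which is precisely the case that the formula for $P(A \cup B)$ has to account for.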

This isn’t a big deal for students to see, but in my opinion it’s best to directly address this subtlety rather than have students confused about which meaning of the word “or” they should be using when doing their homework.

P.S. The good news is that the word “and” means the same thing in the language of probability/logic as its meaning in ordinary speech.

Why 0^0 is undefined

Here’s an explanation for why $0^0$ is undefined that should be within the grasp of pre-algebra students:

Part 1.

What is $0^3$ ? Of course, it’s $0$ .
What is $0^2$ ? Again, $0$ .
What is $0^1$ ? Again, $0$ .
What is $0^{1/2}$ , or $\sqrt{0}$ ? Again, $0$ .
What is $0^{1/3}$ , or $\sqrt[3]{0}$ ? In other words, what number, when cubed, is $0$ ? Again, $0$ .
What is $0^{1/10}$ , or $\sqrt[10]{0}$ ? In other words, what number, when raised to the 10th power, is $0$ ? Again, $0$ .

So as the exponent gets closer to $0$ , the answer remains $0$ . So, from this perspective, it looks like $0^0$ ought to be equal to $0$ .

Part 2.

What is $3^0$ ? Of course, it’s $1$ .
What is $2^0$ ? Again, $1$ .
What is $1^0$ ? Again, $1$ .
What is $\left( \displaystyle \frac{1}{2} \right)^0$ ? Again, $1$ .
What is $\left( \displaystyle \frac{1}{3} \right)^0$ ? Again, $1$ .
What is $\left( \displaystyle \frac{1}{10} \right)^0$ ? Again, $1$ .

So as the base gets closer to $0$ , the answer remains $1$ . So, from this perspective, it looks like $0^0$ ought to be equal to $1$ .

In conclusion: looking at it one way, $0^0$ should be defined to be $0$ . From another perspective, $0^0$ should be defined to be $1$ .

Of course, we can’t define a number to be two different things! So we’ll just say that $0^0$ is undefined — just like dividing by $0$ is undefined — rather than pretend that $0^0$ switches between two different values.
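The two patterns are easy to tabulate. This is only a numerical illustration using floating-point arithmetic, with one extra small value ( $1/1000$ ) added to each list; the variable names are chosen for this sketch.

```python
# Part 1: base 0, exponent shrinking toward 0 from the right.
exponents = [3, 2, 1, 1 / 2, 1 / 3, 1 / 10, 1 / 1000]
part1 = [0.0 ** p for p in exponents]

# Part 2: exponent 0, base shrinking toward 0.
bases = [3, 2, 1, 1 / 2, 1 / 3, 1 / 10, 1 / 1000]
part2 = [b ** 0 for b in bases]

print(part1)  # every entry equals 0
print(part2)  # every entry equals 1
```

Neither list ever reaches the case where the base and the exponent are both $0$ , which is exactly where the two patterns disagree.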

Here’s a more technical explanation about why $0^0$ is an indeterminate form, using calculus.

Part 1. As before,

$\displaystyle \lim_{x \to 0^+} 0^x = \lim_{x \to 0^+} 0 = 0$ .

The first equality is true because, inside of the limit, $x$ is permitted to get close to $0$ but cannot actually equal $0$ , and there’s no ambiguity about $0^x = 0$ if $x >0$ . (Naturally, $0^x$ is undefined if $x < 0$ .)

The second equality is true because the limit of a constant is the constant.

Part 2. As before,

$\displaystyle \lim_{x \to 0} x^0 = \lim_{x \to 0} 1 = 1$ .

Once again, the first equality is true because, inside of the limit, $x$ is permitted to get close to $0$ but cannot actually equal $0$ , and there’s no ambiguity about $x^0 = 1$ if $x \ne 0$ .

As before, the answers from Parts 1 and 2 are different. But wait, there’s more…

Part 3. Here’s another way that $0^0$ can be considered, just to give us a headache. Let’s evaluate

$\displaystyle \lim_{x \to 0^+} x^{1/\ln x}$

Clearly, the base tends to $0$ as $x \to 0^+$ . Also, $\ln x \to -\infty$ as $x \to 0^+$ , so that $\displaystyle \frac{1}{\ln x} \to 0$ as $x \to 0^+$ . In other words, this limit has the indeterminate form $0^0$ .

To evaluate this limit, let’s take a logarithm under the limit:

$\displaystyle \lim_{x \to 0^+} \ln x^{1/\ln x} = \displaystyle \lim_{x \to 0^+} \frac{1}{\ln x} \cdot \ln x$

$\displaystyle \lim_{x \to 0^+} \ln x^{1/\ln x} = \displaystyle \lim_{x \to 0^+} 1$

$\displaystyle \lim_{x \to 0^+} \ln x^{1/\ln x} = 1$

Therefore, without the extra logarithm,

$\displaystyle \lim_{x \to 0^+} x^{1/\ln x} = e^1 = e$

Part 4. It gets even better. Let $k$ be any positive real number. By the same logic as above,

$\displaystyle \lim_{x \to 0^+} x^{\ln k/\ln x} = e^{\ln k} = k$

So, for any $k \ge 0$ , we can find a function $f(x)$ of the indeterminate form $0^0$ so that $\displaystyle \lim_{x \to 0^+} f(x) = k$ . (Part 1 covers $k = 0$ , and Part 4 covers $k > 0$ .)

In other words, we could justify defining $0^0$ to be any nonnegative number. Clearly, it’s better instead to simply say that $0^0$ is undefined.
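Parts 3 and 4 can be checked numerically. The sketch below evaluates $x^{\ln k/\ln x}$ at a few small positive values of $x$ ; the function name `zero_to_zero_form` and the sample values of $k$ and $x$ are arbitrary choices for this illustration, and floating-point arithmetic means the results are only approximate.

```python
import math


def zero_to_zero_form(x, k):
    """Evaluate x ** (ln k / ln x) for 0 < x < 1 and k > 0.

    As x -> 0+, the base tends to 0 and the exponent ln k / ln x tends to 0.
    """
    return x ** (math.log(k) / math.log(x))


for k in [0.5, 1.0, math.e, 7.0]:
    values = [zero_to_zero_form(10.0 ** -n, k) for n in (1, 5, 50, 300)]
    print(k, values)
```

Up to rounding error, every value in the row for $k$ is $k$ itself. That is no accident: $x^{\ln k/\ln x} = e^{(\ln k/\ln x) \cdot \ln x} = e^{\ln k} = k$ for every $x$ in $(0,1)$ , so the function is constant even though it has the form $0^0$ in the limit.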

P.S. I don’t know if it’s possible to have an indeterminate form of $0^0$ where the answer is either negative or infinite. I tend to doubt it, but I’m not sure.

Why does 0.999… = 1? (Part 5)

Here’s one more way of convincing students that $0.\overline{9} = 1$ . Here’s the idea: how far apart are the two numbers?

First off, since $1 \ge 0.\overline{9}$ , we know that $1 - 0.\overline{9} \ge 0$ .

Of course, we know that $1-0.9 = 0.1$ . Since $0.\overline{9}$ must lie between $0.9$ and $1$ , we know that $1 - 0.\overline{9}$ must be less than $0.1$ .

Second, we know that $1-0.99 = 0.01$ . Since $0.\overline{9}$ must lie between $0.99$ and $1$ , we know that $1 - 0.\overline{9}$ must be less than $0.01$ .

Third, we know that $1-0.999 = 0.001$ . Since $0.\overline{9}$ must lie between $0.999$ and $1$ , we know that $1 - 0.\overline{9}$ must be less than $0.001$ .

By the same reasoning, we conclude that

$0 \le 1 - 0.\overline{9} < \displaystyle \frac{1}{10^n}$

for every positive integer $n$ . What’s the only number that’s greater than or equal to $0$ and less than every decimal of the form $0.00\dots001$ ? Clearly, the only such number is $0$ . Therefore,

$1 - 0.\overline{9} = 0$ , or $0.\overline{9} = 1$ .
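The pattern in the gaps can be computed exactly with rational arithmetic. In this sketch, `truncated_nines` is a helper name invented for the illustration; it builds the decimal $0.99\dots9$ with $n$ nines.

```python
from fractions import Fraction


def truncated_nines(n):
    """Return 0.99...9 with exactly n nines, as an exact fraction."""
    return Fraction(10 ** n - 1, 10 ** n)


for n in range(1, 6):
    gap = 1 - truncated_nines(n)
    print(n, truncated_nines(n), gap)
```

Each gap is exactly $1/10^n$ . Since $0.\overline{9}$ is at least as large as every one of these truncations, its distance from $1$ is at most every one of these gaps.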

I like this approach because it really gets at the heart of the difference between integers $\mathbb{Z}$ and real numbers $\mathbb{R}$ . For integers, there is always an integer to the immediate left and to the immediate right. In other words, if you give me any integer (say, $15$ ), I can tell you the largest integer that’s less than your number (in our example, $14$ ) and the smallest integer that’s bigger than your number ( $16$ ).

Real numbers, however, do not have this property. There is no real number to the immediate right of $0$ . This is easy to prove by contradiction. Suppose $x > 0$ is the real number to the immediate right of $0$ . That means that there are no real numbers between $0$ and $x$ . However, $x/2$ is bigger than $0$ and less than $x$ , providing the contradiction.

(For what it’s worth, the above proof doesn’t apply to the set of integers $\mathbb{Z}$ since $x/2$ doesn’t have to be an integer.)

By the same logic — visually, you can imagine reflecting the number line across the point $x = 0.5$ — there is no number to the immediate left of $1$ . So while $0.\overline{9}$ would appear to be to the immediate left of $1$ , they are in reality the same point.

Why does 0.999… = 1? (Part 4)

In this series, I discuss some ways of convincing students that $0.999\dots = 1$ and that, more generally, a real number may have more than one decimal representation even though a decimal representation corresponds to only one real number. This can be a major conceptual barrier for even bright students to overcome. I have met a few math majors within a semester of graduating — that is, they weren’t dummies — who could recite all of these ways and were perhaps logically convinced but remained psychologically unconvinced.

Method #5. This is a proof by contradiction; however, I think it should be convincing to a middle-school student who’s comfortable with decimal representations. Also, perhaps unlike Methods #1-4, this argument really gets to the heart of the matter: there can’t be a number in between $0.999\dots$ and $1$ , and so the two numbers have to be equal.

In the proof below, I’m deliberately avoiding the explicit use of algebra (say, letting $x$ be the midpoint) to make the proof accessible to pre-algebra students.

Suppose that $0.999\dots < 1$ . Then the midpoint of $0.999\dots$ and $1$ has to be strictly greater than $0.999\dots$ , since

$\displaystyle \frac{0.999\dots + 1}{2} > \displaystyle \frac{0.999\dots + 0.999\dots}{2} = 0.999\dots$

Similarly, the midpoint is strictly less than $1$ :

$\displaystyle \frac{0.999\dots + 1}{2} < \displaystyle \frac{1 +1}{2} =1$

(For the sake of convincing middle-school students, a number line with three tick marks — for $0.999\dots$ , $1$ , and the midpoint — might be more believable than the above inequalities.)

So what is the decimal representation of the midpoint? Since the midpoint is less than $1$ , the decimal representation has to be $0.\hbox{something}$ . Furthermore, the midpoint does not equal $0.999\dots$ . That means, somewhere in the decimal representation of the midpoint, there’s a digit that’s not equal to $9$ . In other words, the midpoint has to have one of the following 9 forms:

midpoint = $0.999\dots 990 \, \_ \, \_ \dots$

midpoint = $0.999\dots 991 \, \_ \, \_ \dots$

midpoint = $0.999\dots 992 \, \_ \, \_ \dots$

midpoint = $0.999\dots 993 \, \_ \, \_ \dots$

midpoint = $0.999\dots 994 \, \_ \, \_ \dots$

midpoint = $0.999\dots 995 \, \_ \, \_ \dots$

midpoint = $0.999\dots 996 \, \_ \, \_ \dots$

midpoint = $0.999\dots 997 \, \_ \, \_ \dots$

midpoint = $0.999\dots 998 \, \_ \, \_ \dots$

In any event, $9$ is the largest digit. That means that, no matter what, the midpoint is less than $0.999\dots$ , contradicting the fact that the midpoint is larger than $0.999\dots$ (if $0.999\dots < 1$ ).

Why does 0.999… = 1? (Part 3)

Method #4. This is a direct method using the formula for an infinite geometric series… and hence will only be convincing to students if they’re comfortable with using this formula. By definition,

$0.999\dots = \displaystyle \frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \dots$

This is an infinite geometric series. Its first term is $\displaystyle \frac{9}{10}$ , and the common ratio needed to go from one term to the next term is $\displaystyle \frac{1}{10}$ . Therefore,

$0.999\dots = \displaystyle \frac{ \displaystyle \frac{9}{10}}{ \quad \displaystyle 1 - \frac{1}{10} \quad}$

$0.999\dots = \displaystyle \frac{ \displaystyle \frac{9}{10}}{ \quad \displaystyle \frac{9}{10} \quad}$

$0.999\dots = 1$
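Here is a short sketch of the same computation in exact arithmetic, showing both the partial sums and the closed form $\frac{a}{1-r}$ . The names `first_term`, `ratio`, and `partial_sum` are labels chosen for this illustration.

```python
from fractions import Fraction

first_term = Fraction(9, 10)
ratio = Fraction(1, 10)


def partial_sum(n):
    """Sum of the first n terms 9/10 + 9/100 + ... + 9/10^n."""
    return sum(first_term * ratio ** k for k in range(n))


# Closed form a / (1 - r) for an infinite geometric series with |r| < 1.
infinite_sum = first_term / (1 - ratio)

print([str(partial_sum(n)) for n in (1, 2, 3)])  # ['9/10', '99/100', '999/1000']
print(infinite_sum)  # 1
```

The partial sums are the familiar truncations $0.9$ , $0.99$ , $0.999$ , and the formula gives the value they approach.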

Why does 0.999… = 1? (Part 2)

Methods #2 and #3 are indirect methods. We start with a decimal representation that we know and end with $0.999\dots$ .

Method #2. This technique should be accessible to any student who can do long division. With long division, we know full well that

$\displaystyle \frac{1}{3} = 0.333\dots$

Multiply both sides by $3$ :

$\displaystyle 3 \times \frac{1}{3} = 3 \times 0.333\dots$

$\displaystyle 1 = 0.999\dots$

Though not logically necessary, this method could be reinforced for students by also considering

$\displaystyle 1 = 9 \times \frac{1}{9} = 9 \times 0.111\dots = 0.999\dots$

Method #3. With long division, we know full well that

$\displaystyle \frac{1}{3} = 0.333\dots \quad$ and $~ \quad \displaystyle \frac{2}{3} = 0.666\dots$

Add them together:

$\displaystyle \frac{1}{3} + \frac{2}{3} = 0.333\dots + 0.666\dots$

$\displaystyle 1 = 0.999\dots$

Though not logically necessary, this method could be reinforced for students by also considering any (or all) of the following:

$1 = \displaystyle \frac{1}{9} + \frac{8}{9} = 0.111\dots + 0.888\dots = 0.999\dots$

$1 = \displaystyle \frac{2}{9} + \frac{7}{9} = 0.222\dots + 0.777\dots = 0.999\dots$

$1 = \displaystyle \frac{4}{9} + \frac{5}{9} = 0.444\dots + 0.555\dots = 0.999\dots$
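For students who want to see the digits appear, the long division itself can be sketched in a few lines. The helper `long_division_digits` is a name made up for this example, and it assumes a fraction between $0$ and $1$ .

```python
def long_division_digits(numerator, denominator, num_digits):
    """First num_digits decimal digits of numerator/denominator (0 <= n/d < 1)."""
    digits = []
    remainder = numerator
    for _ in range(num_digits):
        remainder *= 10
        digits.append(remainder // denominator)
        remainder %= denominator
    return digits


one_third = long_division_digits(1, 3, 8)
two_thirds = long_division_digits(2, 3, 8)

print(one_third)                                       # [3, 3, 3, 3, 3, 3, 3, 3]
print([3 * d for d in one_third])                      # Method #2: every digit is 9
print([a + b for a, b in zip(one_third, two_thirds)])  # Method #3: every digit is 9
```

Working digit by digit is legitimate here only because no column ever produces a carry: $3 \times 3 = 9$ and $3 + 6 = 9$ both fit in a single digit.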

Why does 0.999… = 1? (Part 1)

Our decimal number system is so wonderful that it’s often taken for granted. (If you doubt me, try multiplying $12$ and $61$ or finding an $18\%$ tip on a restaurant bill using only Roman numerals.)

However, there’s one little quirk about our numbering system that some students find quite unsettling:

If a number has a terminating decimal representation, then the same number also has a second, different decimal representation that does not terminate. (However, a number that does not have a terminating decimal representation does not have a second representation.)

Stated another way, a decimal representation corresponds to a unique real number. However, a real number may not have a unique decimal representation.

Some (perhaps many) students find such equalities to be unsettling at first glance, and for good reason. They’d prefer to think that there is a one-to-one correspondence between the set of real numbers and the set of decimal representations. Stated more simply, students are conditioned to think that if two numbers look different (like $24$ and $25$ ), then they ought to be different.

However, there’s a subtle difference between a number and a numerical representation. The number $1$ is defined to be the multiplicative identity in our system of arithmetic. However, this number has two different representations in our numbering system: $1$ and $0.999\dots$ . (Not to mention its representation in the numbering systems of the ancient Romans, Babylonians, Mayans, etc.)

As usual, let $[0,1]$ be the set of real numbers from $0$ to $1$ (inclusive), and let $D$ be the set of decimal representations of the form $0.d_1 d_2 d_3 \dots$ . Then there’s clearly a function $f : D \to \mathbb{R}$ , defined by

$f(0.d_1 d_2 d_3\dots) = \displaystyle \sum_{i=1}^\infty \frac{d_i}{10^i}$

If I want to give my students a headache, I’ll ask, “In Calculus II, you saw that some series converge and some series diverge. So what guarantee do we have that this series actually converges?” (The convergence of the series on the right can be verified using the Direct Comparison Test, the fact that $d_i \le 9$ , and the formula for an infinite geometric series.)
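A sketch of that comparison, assuming only that every digit satisfies $0 \le d_i \le 9$ :

$\displaystyle 0 \le \sum_{i=1}^\infty \frac{d_i}{10^i} \le \sum_{i=1}^\infty \frac{9}{10^i} = \frac{ \displaystyle \frac{9}{10}}{ \quad \displaystyle 1 - \frac{1}{10} \quad} = 1$

Every term of the series is nonnegative and no larger than the corresponding term of a convergent geometric series, so the series converges, and its sum lies in $[0,1]$ .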

In the language of mathematics: Using the completeness axiom, it can be proven (though no student psychologically doubts this) that $f$ maps $D$ onto $[0,1]$ . In other words, every decimal representation corresponds to a real number, and every real number has a decimal representation. However, the function $f$ is a surjection but not a bijection. In other words, a real number may have more than one decimal representation.

This is a big conceptual barrier for some students — even really bright students — to overcome. They’re not used to thinking that two different decimal expansions can actually represent the same number.

The most commonly shown pair of equal but different decimal representations is $0.999\dots = 1$ . Other examples are

$0.125 = 0.124999\dots$

$3.458 = 3.457999 \dots$
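These examples can be verified exactly by treating the trailing nines as a geometric series. In the sketch below, `value_with_trailing_nines` is a hypothetical helper that takes the digits before the nines as a string.

```python
from fractions import Fraction


def value_with_trailing_nines(prefix_digits):
    """Exact value of 0.<prefix_digits>999... (nines forever after the prefix)."""
    n = len(prefix_digits)
    prefix_value = Fraction(int(prefix_digits), 10 ** n)
    # The tail 9/10^(n+1) + 9/10^(n+2) + ... is geometric and sums to 1/10^n.
    tail_value = Fraction(9, 10 ** (n + 1)) / (1 - Fraction(1, 10))
    return prefix_value + tail_value


print(value_with_trailing_nines("124"))      # 1/8, which is 0.125
print(3 + value_with_trailing_nines("457"))  # 1729/500, which is 3.458
```

In each case the infinite tail of nines contributes exactly one unit in the last place of the prefix, which bumps $0.124$ up to $0.125$ and $3.457$ up to $3.458$ .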

In this series, I will discuss some ways of convincing students that $0.999\dots = 1$ . That said, I have met a few math majors within a semester of graduating — that is, they weren’t dummies — who could recite all of these ways and were perhaps logically convinced but remained psychologically unconvinced. The idea that two different decimal representations could mean the same number just remained too high of a conceptual barrier for them to hurdle.

Method #1. This first technique is accessible to any algebra or pre-algebra student who’s comfortable assigning a variable to a number. We convert the decimal representation to a fraction using something out of the patented Bag of Tricks. If students aren’t comfortable with the first couple of steps (as in, “How would I have thought to do that myself?”), I tell my usual tongue-in-cheek story: Socrates gave the Bag of Tricks to Plato, Plato gave it to Aristotle, it passed down the generations, my teacher taught the Bag of Tricks to me, and I teach it to my students.

Let $x =0.999\dots$ . Multiply $x$ by $10$ , and subtract:

$10x = 9.999\dots$

$x = 0.999\dots$

$\therefore (10-1)x = 9$

$x =1$

$0.999\dots = 1$
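The same trick converts any purely repeating decimal to a fraction, which gives students a few sanity checks alongside $0.999\dots$ . This is a minimal sketch; `repeating_block_to_fraction` is a name invented for the illustration, and it handles only decimals of the form $0.\overline{d_1 d_2 \dots d_m}$ .

```python
from fractions import Fraction


def repeating_block_to_fraction(block):
    """Exact value of 0.<block><block><block>... for a string of digits.

    If x = 0.(block) and the block has m digits, then 10^m x - x = block,
    so x = block / (10^m - 1).
    """
    m = len(block)
    return Fraction(int(block), 10 ** m - 1)


print(repeating_block_to_fraction("9"))       # 1
print(repeating_block_to_fraction("3"))       # 1/3
print(repeating_block_to_fraction("142857"))  # 1/7
```

Students who already accept that $0.333\dots = 1/3$ may find it harder to object when the identical procedure returns $1$ for $0.999\dots$ .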

Why does x^0 = 1 and x^(-n) = 1/x^n? (Part 2)

I distinctly remember when, in my second year as a college professor, a really good college student — with an SAT Math score over 650 — asked me why $x^0 = 1$ and $x^{-n} = \displaystyle \frac{1}{x^n}$ . Of course, he knew that these rules were true and he could apply them in complex problems, but he didn’t know why they were true. And he wanted to have this deeper knowledge of mathematics beyond the ability to solve routine algebra problems.

He also related that he had asked his math teachers in high school why these rules worked, but he never got a satisfactory response. So he asked his college professor.

Looking back on it, I see that this was one of the incidents that sparked my interest in teacher education. As always, I never hold a grudge against a student for asking a question. Indeed, I respected my student for posing a really good question, and I was upset for him that he had not received a satisfactory answer to his question.

This is the second of two posts where I give two answers to this question from two different points of view.

Answer #2. This explanation relies on one of the laws of exponents:

$x^n \cdot x^m = x^{n+m}$

For positive integers $n$ and $m$ , this can be proven by repeated multiplication:

$x^n x^m = (x \cdot x \dots \cdot x) \cdot (x \cdot x \dots \cdot x)$ repeated $n$ times and $m$ times

$x^n x^m = x \cdot x \cdot \dots \cdot x \cdot x \cdot \dots \cdot x$ repeated $n+m$ times

$x^n \cdot x^m = x^{n+m}$

Ideally, $x^0$ and $x^{-n}$ should be defined so that this rule still holds even if one (or both) of $n$ and $m$ is either zero or a negative integer. In particular, we should define $x^0$ so that the following rule holds:

$x^n \cdot x^0 = x^{n+0}$

$x^n \cdot x^0 = x^n$

In other words, the product of something with $x^0$ should be the original something. Clearly, the only way to make this work is if we define $x^0 = 1$ .

In the same way, we should define $x^{-n}$ so that the following rule holds:

$x^n \cdot x^{-n} = x^{n + (-n)}$

$x^n \cdot x^{-n} = x^0$

Being a good MIT freshman and using previous work, we see that

$x^n \cdot x^{-n} = 1$

Dividing, we see that

$x^{-n} = \displaystyle \frac{1}{x^n}$
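As a consistency check (not a proof), the definitions can be tested with exact rational arithmetic. The base $x = 2/3$ below is an arbitrary nonzero choice, and the range of exponents is just a small sample.

```python
from fractions import Fraction

x = Fraction(2, 3)  # any nonzero base works; this value is arbitrary

print(x ** 0)             # 1
print(x ** -2)            # 9/4, which is 1 / (2/3)^2
print(x ** 2 * x ** -2)   # 1

# The law x^n * x^m = x^(n+m) now holds for all small integer exponents.
law_holds = all(
    x ** n * x ** m == x ** (n + m)
    for n in range(-5, 6)
    for m in range(-5, 6)
)
print(law_holds)  # True
```

Any other choice for $x^0$ or $x^{-n}$ would make at least one of these comparisons fail, which is the sense in which the law of exponents forces the definitions.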