Exponential growth and decay (Part 3): Paying off credit-card debt

The following problem in differential equations has a very practical application for anyone who either (1) has taken out a loan to buy a house or a car or (2) is trying to pay off credit-card debt. To my surprise, most math majors haven’t thought through the obvious applications of exponential functions as a means of engaging their future students, even though these applications are directly pertinent to their lives (both the students’ and the teachers’).

You have a balance of $2,000 on your credit card. Interest is compounded continuously with a relative rate of growth of 25% per year. If you pay the minimum amount of $50 per month (or $600 per year), how long will it take for the balance to be paid?

In yesterday’s post, I showed that the answer to this question was about 7.2 years. To obtain this answer, I started with the differential equation

$\displaystyle \frac{dA}{dt} = 0.25 A - 600$

which, given the initial condition $A(0) = 2000$ , has solution

$A(t) = 2400 - 400 e^{0.25t}$ .

Today, I’ll give some pedagogical thoughts about how this problem, and other similar problems inspired by financial considerations, could fit into a Precalculus course… and hopefully improve the financial literacy of high school students.

I’ve read many Precalculus books; not many of them include applying exponential functions to the paying off of credit-card debt (or a mortgage on a house or car). Of course, yesterday’s derivation was well above the comprehension level of students in Precalculus. However, there’s no reason why Precalculus students couldn’t be given the general formula

$A = \displaystyle \frac{k}{r} - \left( \frac{k}{r} - P \right) e^{rt}$ ,

where $P$ is the initial amount, $r$ is the relative rate of growth, and $k$ is the amount paid per year. In other words, students could be given the formula without the full explanation of where it comes from. After all, many Precalculus textbooks give the formula for Newton’s Law of Cooling (the subject of a future post) with neither derivation nor explanation (though its derivation is nearly identical to the work of yesterday’s post). So I don’t see why giving students the above formula for paying off credit-card debt isn’t more common.

Plugging in $k = 600$ , $r = 0.25$ , and $P = 2000$ into this equation again yields the function

$A(t) = 2400 - 400 e^{0.25t}$ ,

from which we find that it will take $t = 4\ln 6 \approx 7.2$ years to pay off the debt.
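As a quick check of this arithmetic, here is a minimal Python sketch of the general formula. The function names `balance` and `payoff_time` are labels introduced only for this illustration. The second function comes from setting $A = 0$ in the formula above and solving for $t$, which gives $t = \frac{1}{r} \ln \frac{k}{k - rP}$ ; this only makes sense when $k > rP$ , since otherwise the payments never outpace the interest.

```python
import math

def balance(t, P, r, k):
    """Amount owed after t years: A = k/r - (k/r - P) e^{rt}."""
    return k / r - (k / r - P) * math.exp(r * t)

def payoff_time(P, r, k):
    """Years until the balance reaches 0, found by solving A(t) = 0 for t."""
    if k <= r * P:
        raise ValueError("payments do not exceed the interest; the balance never reaches 0")
    return math.log(k / (k - r * P)) / r

t_min = payoff_time(P=2000, r=0.25, k=600)
print(round(t_min, 2))                                # 4 ln 6, about 7.17
print(abs(balance(t_min, 2000, 0.25, 600)) < 1e-6)    # True: the balance is 0 at that time
```

Students could change `k` or `r` and immediately see how sensitive the payoff time is to each.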

A natural follow-up question is “How much money actually was spent to pay off this debt?” By this point, the answer is quite easy: the debtor paid $\$600$ per year for $4\ln 6$ years, and so the amount spent is

$\$600 \times 4 \ln 6 = \$2400 \ln 6 \approx \$4300$ .

When I teach this topic in differential equations, I let that answer sink in for a while. The original debt was only \$2000, but ultimately \$4300 needs to be paid over 7.2 years in order to pay off the debt.

The natural question is, “Why did it take so long?” Of course, the answer is that the debtor paid only the minimum amount: $50 per month, or $600 per year. It stands to reason that if extra money were paid each month, then the debt would be paid off faster and at lesser expense.

To give one example, let’s repeat the calculation if the debtor paid twice as much ($100 per month, or $1200 per year). Then the amount owed as a function of time would be

$A(t) = \displaystyle \frac{1200}{0.25} - \left( \frac{1200}{0.25} - 2000 \right) e^{0.25t} = 4800 - 2800 e^{0.25t}$

To find when the credit card will be paid off, we set $A(t) = 0$ :

$0 = 4800 - 2800 e^{0.25t}$

$2800 e^{0.25t} = 4800$

$e^{0.25t} = \displaystyle \frac{12}{7}$

$0.25t = \displaystyle \ln \left( \frac{12}{7} \right)$

$t = \displaystyle 4 \ln \left( \frac{12}{7} \right)$

$t \approx 2.16$

That’s certainly a lot faster! Also, the amount that’s spent over that time is also considerably less:

$\displaystyle \$1200 \times 4 \ln \left( \frac{12}{7} \right) = \$4800 \ln \left( \frac{12}{7} \right) \approx \$2587$ .
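To put the two payment plans side by side, here is a short Python sketch. It assumes the total paid is simply the yearly payment $k$ times the payoff time, and the helper name `payoff_time` is introduced only for this illustration; it solves $A(t) = 0$ in the general formula $A = \frac{k}{r} - \left( \frac{k}{r} - P \right) e^{rt}$ .

```python
import math

def payoff_time(P, r, k):
    """Years until the balance reaches 0 (requires k > r * P)."""
    return math.log(k / (k - r * P)) / r

P, r = 2000, 0.25
totals = {}
for k in (600, 1200):                 # $50 per month and $100 per month
    t = payoff_time(P, r, k)
    totals[k] = k * t                 # yearly payment times number of years
    print(f"${k} per year: {t:.2f} years, about ${totals[k]:,.0f} paid in total")
```

Doubling the payment cuts the payoff time by more than two-thirds, and the interest paid (the total minus the original \$2000) drops from about \$2300 to under \$600.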

So, along with being a good way to practice proficiency with exponential and logarithmic functions, this problem lends itself to students discovering some basic principles of financial literacy.

Exponential growth and decay (Part 2): Paying off credit-card debt

You have a balance of $2,000 on your credit card. Interest is compounded continuously with a relative rate of growth of 25% per year. If you pay the minimum amount of $50 per month (or $600 per year), how long will it take for the balance to be paid?

In this post, I present the actual solution of this problem. In tomorrow’s post, I’ll give some pedagogical thoughts about how this problem, and other similar problems inspired by financial considerations, could fit into a Precalculus course.

Let’s treat this problem as a differential equation (though it could also be considered as a first-order difference equation… more on that later). Let $A(t)$ be the amount of money on the credit card after $t$ years. Then there are two competing forces on the amount of money that will be owed in the future:

The effect of compound interest, which will increase the amount owed by $0.25 A(t)$ per year.
The amount that’s paid off each year, which will decrease the amount owed by $\$600$ per year.

Combining, we obtain the differential equation

$\displaystyle \frac{dA}{dt} = 0.25 A - 600$

There are a variety of techniques by which this differential equation can be solved. One technique is separation of variables, thus pretending that $dA/dt$ is actually a fraction. (In the derivation below, I will be a little sloppy with the arbitrary constant of integration for the sake of simplicity.)

$\displaystyle \frac{dA}{0.25 A - 600} = dt$

$\displaystyle \int \frac{dA}{0.25 A - 600} = \displaystyle \int dt$

$\displaystyle 4 \int \frac{0.25 dA}{0.25 A - 600} = \displaystyle \int dt$

$4 \ln |0.25A - 600| = t + C$

$\ln |0.25A - 600| = 0.25 t + C$

$|0.25A - 600| = e^{0.25 t + C}$

$|0.25 A - 600| = C e^{0.25t}$

$0.25A - 600 = C e^{0.25t}$

$0.25 A = 600 + C e^{0.25t}$

$A = 2400 + C e^{0.25t}$

To solve for the missing constant $C$ , we use the initial condition $A(0) = 2000$ :

$A(0) = 2400 + C e^0$

$2000 = 2400 + C$

$-400 = C$

We thus conclude that the amount of money owed after $t$ years is

$A(t) = 2400 - 400 e^{0.25t}$
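Since I was a little sloppy with the constant of integration, it's worth confirming that this function really does satisfy both the differential equation and the initial condition. The following Python sketch does so numerically; the step size `h` and the sample times are arbitrary choices for illustration, and the derivative is approximated with a central difference rather than computed exactly.

```python
import math

def A(t):
    """Proposed solution: amount owed after t years."""
    return 2400 - 400 * math.exp(0.25 * t)

h = 1e-5
for t in (0.0, 1.0, 3.0, 7.0):
    lhs = (A(t + h) - A(t - h)) / (2 * h)   # numerical estimate of dA/dt
    rhs = 0.25 * A(t) - 600                 # right-hand side of the differential equation
    assert abs(lhs - rhs) < 1e-4

print(A(0.0))   # 2000.0, matching the initial condition
```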

To determine when the amount of the credit card will be reduced to $0, we set $A(t) = 0$ and solve for $t$ :

$0 = 2400 - 400 e^{0.25 t}$

$400 e^{0.25 t} = 2400$

$e^{0.25t} = 6$

$0.25t = \ln 6$

$t = 4 \ln 6$

$t \approx 7.2 \hbox{~years}$
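For comparison, here is a sketch of the difference-equation point of view mentioned above. It rests on assumptions that are not part of the original problem: interest is charged once per month at a rate of $0.25/12$ , and the \$50 payment is made at the end of each month. The function name `months_to_pay_off` is likewise just a label for this illustration.

```python
def months_to_pay_off(balance, annual_rate, monthly_payment):
    """Count monthly payments until the balance is no longer positive."""
    monthly_rate = annual_rate / 12
    if monthly_payment <= balance * monthly_rate:
        raise ValueError("payment does not exceed the first month's interest")
    months = 0
    while balance > 0:
        balance = balance * (1 + monthly_rate) - monthly_payment
        months += 1
    return months

n = months_to_pay_off(2000, 0.25, 50)
print(n, n / 12)   # 87 7.25
```

Under these assumptions the debt is gone after 87 payments (the last one being a partial payment), or 7.25 years, which agrees well with the continuous model's $4 \ln 6 \approx 7.2$ years.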

In tomorrow’s post, I’ll give some pedagogical thoughts about this problem and similar problems.

Exponential growth and decay (Part 1): Phrasing of homework questions

I just completed a series of posts concerning the different definitions of the number $e$ . As part of this series, we considered the formula for continuous compound interest

$A = Pe^{rt}$

Indeed, this formula can be applied to other phenomena besides the accumulation of money. Unfortunately, as they appear in Precalculus textbooks, the wording of questions involving exponential growth or decay can be either really awkward or mathematically imprecise (or both). Here’s a sampling of problems that I’ve collected from various sources:

One thousand bacteria on a petri dish are placed in an incubator, encouraging a relative rate of growth of 10% per hour. How many bacteria will there be in two days?

This is mathematically precise, as it relates to the differential equation $A'(t) = r A(t)$ with solution $A = P e^{rt}$ . The meaning of the value of $r$ is clear from dimensional analysis: the units of $A'(t)$ are $\hbox{bacteria}/ \hbox{hour}$ , while the units of $A(t)$ are $\hbox{bacteria}$ . Therefore, the units of $r$ must be $\hbox{hour}^{-1}$ . So saying that there’s a “relative rate of growth of 10% per hour” makes total sense.

Of course, when Precalculus students are solving this problem, they have no idea about what a differential equation is, making the word relative seem superfluous to the problem.
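For the record, here is the computation that the problem is asking for, as a short Python sketch. The only subtlety is the units: since $r$ is measured per hour, two days must be converted to 48 hours. The variable names are chosen only for this illustration.

```python
import math

P = 1000          # initial number of bacteria
r = 0.10          # relative rate of growth, per hour
t = 2 * 24        # two days, converted to hours

N = P * math.exp(r * t)
print(f"{N:,.0f}")   # 121,510
```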

A sum of $5000 is invested at an interest rate of 9% per year. Find the time required for the money to double if the interest is compounded continuously.

What the problem is trying to say is “Let $r = 0.09$ .” But this is a horrible way to write this in ordinary English! After all, if we plug $r = 0.09$ and $t = 1$ into the formula, we obtain

$A = P e^{0.09 \times 1} \approx 1.09417P$

So it would appear that the interest rate after one year is about 9.417%, and not 9%.

Indeed, if we read the problem at face value that the interest rate is 9% per year, then it stands to reason that, after one year, we have

$P(1.09) = P e^{r \cdot 1}$

$1.09 = e^r$

$\ln 1.09 = r$
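The gap between these two readings is easy to quantify. The Python sketch below computes the one-year growth when $r = 0.09$ , the value of $r$ that would produce exactly 9% growth in one year, and the doubling time under each reading; the variable names are chosen only for this illustration.

```python
import math

r = 0.09
growth_in_one_year = math.exp(r) - 1       # reading 1: plug r = 0.09 into A = P e^{rt}
r_for_9_percent = math.log(1.09)           # reading 2: the balance grows by exactly 9% in a year

print(f"{growth_in_one_year:.5f}")               # 0.09417
print(f"{r_for_9_percent:.5f}")                  # 0.08618
print(f"{math.log(2) / r:.2f}")                  # doubling time under reading 1: 7.70 years
print(f"{math.log(2) / r_for_9_percent:.2f}")    # doubling time under reading 2: 8.04 years
```

So a literal-minded student who takes the second reading would report a doubling time about four months longer than the answer in the back of the book.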

In a nutshell, saying that there is “an interest rate of 9% per year” can easily be interpreted to mean that the annual percentage rate is 9% per year, and this can be a conceptual barrier for literal-minded students.

I don’t have a good solution for this impasse between ordinary English and giving clear directions to students about what numbers should be used in the formula. But I do think that it’s important for teachers to be aware of this possible misunderstanding as students read their homework questions.

Different definitions of logarithm (Part 8)

There are two apparently different definitions of a logarithm that appear in the secondary mathematics curriculum:

From Algebra II and Precalculus: If $a > 0$ and $a \ne 1$ , then $f(x) = \log_a x$ is the inverse function of $g(x) = a^x$ .
From Calculus: for $x > 0$ , we define $\ln x = \displaystyle \int_1^x \frac{1}{t} dt$ .

In the previous posts of this series, we established the connection between these two apparently different ways of defining a logarithm.

In this post, I describe some natural consequences of Definition 2 as it relates to calculus involving logarithmic and exponential functions. None of the results that follow are new to the senior math majors taking my capstone course. However, they almost unanimously tell me that they never learned why these rules worked when they were taking the calculus sequence several semesters ago… instead, they were given these rules to just memorize without worrying about where they came from. (I don’t doubt that some of them were taught the logical development that’s shown below, but perhaps they forgot about it since they were never held responsible for this logical development on homework or exams when they were freshmen or sophomores.)

A pedagogical note: because none of these theorems should be a surprise to students, I’ll write the left-hand sides and the equal sign on the board and then leave the right-hand side blank. After my students tell me what the punch line of the theorem should be, then I’ll write the complete statement of the theorem on the board before proceeding to the proof. If my students can’t remember the punch-line (which has happened for Theorem 2 below), I’ll just start the proof until we reach the conclusion and then fill in the blank right-hand side of the theorem’s statement later.

Theorem 1. $\displaystyle \frac{d}{dx} (\ln x) = \displaystyle \frac{1}{x}$

Proof. This is a consequence of the Fundamental Theorem of Calculus. Loosely stated, the derivative of an integral is the original function.

Theorem 2. $\displaystyle \frac{d}{dx} (\log_a x) = \displaystyle \frac{1}{x \ln a}$

Proof. This is a consequence of the change-of-base formula:

$\log_a x = \displaystyle \frac{\log_e x}{\log_e a} = \displaystyle \frac{\ln x}{\ln a}$

$\frac{d}{dx} \left( \log_a x \right) = \displaystyle \frac{d}{dx} \left( \frac{\ln x}{\ln a} \right)$

Since $\ln a$ is just a constant, we conclude

$\frac{d}{dx} \left( \log_a x \right) = \displaystyle \frac{1}{\ln a} \cdot \frac{d}{dx} ( \ln x ) = \displaystyle \frac{1}{\ln a} \cdot \frac{1}{x}$

Theorem 3. $\displaystyle \frac{d}{dx} \left( a^x \right) = a^x \ln a$

Proof. This proof uses a standard trick. Suppose that

$y = a^x$

so that

$\log_a y = x$

We need to compute the derivative $\displaystyle \frac{dy}{dx}$ . To this end, let’s apply implicit differentiation to this last equation:

$\displaystyle \frac{1}{y \ln a} \cdot \frac{dy}{dx} = 1$

$\displaystyle \frac{dy}{dx} = y \ln a$

$\displaystyle \frac{d}{dx} \left( a^x \right) = a^x \ln a$

Technical Note: The above proof assumes that the derivative $\displaystyle \frac{dy}{dx}$ exists in the first place, and so, technically, implicit differentiation is inappropriate (even if it does produce the correct answer). However, the existence proof is quite difficult, requiring an advanced version of the Mean Value Theorem. So, for pedagogical reasons, I have absolutely no qualms about showing the above proof to students in a class that does not require second-semester real analysis as a prerequisite.

Pedagogical Note: After I present this proof, my students are usually stunned. They usually tell me that they never saw how the derivative of an inverse function could be computed back when they were in the calculus sequence… it was just another formula that they had to memorize. So (time permitting), to let this idea sink in a little further, I’ll go off on a tangent (pun intended)…

…to find the derivative of an inverse trigonometric function:

$y = \sin^{-1} x$

$\sin y = x$

$\cos y \cdot \displaystyle \frac{dy}{dx} = 1$

$\displaystyle \frac{dy}{dx} = \frac{1}{\cos y}$

$\displaystyle \frac{dy}{dx} = \frac{1}{\sqrt{1 - \sin^2 y}}$

$\displaystyle \frac{dy}{dx} = \frac{1}{\sqrt{1-x^2}}$

I’ll then invite students to figure out the derivatives of $y = \cos^{-1} x$ and $y = \tan^{-1} x$ on their own after class.
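One detail in the derivation above deserves a remark: replacing $\cos y$ by $\sqrt{1 - \sin^2 y}$ uses the positive square root, which is justified because $y = \sin^{-1} x$ lies between $-\pi/2$ and $\pi/2$ , where cosine is nonnegative. As a sanity check, the following Python sketch compares the formula with a central-difference estimate of the derivative. The helper name `derivative`, the step size, and the sample points are arbitrary choices for illustration.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in (-0.9, -0.5, 0.0, 0.3, 0.8):
    numeric = derivative(math.asin, x)
    formula = 1 / math.sqrt(1 - x * x)
    print(x, abs(numeric - formula) < 1e-6)   # True at each sample point
```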

Theorem 4. $\displaystyle \frac{d}{dx} \left( e^x \right) = e^x$

Proof. Let $a = e$ in Theorem 3. Alternatively, repeat the above argument for the inverse functions $e^x$ and $\ln x$ .
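None of this replaces the proofs, but a numerical spot check of Theorems 1 through 4 can be reassuring. The Python sketch below compares each formula with a central-difference estimate at a single point; the base $a = 3$ , the point $x = 2$ , and the helper name `derivative` are arbitrary choices for illustration.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

a, x = 3.0, 2.0
checks = {
    "Theorem 1": (derivative(math.log, x), 1 / x),
    "Theorem 2": (derivative(lambda u: math.log(u, a), x), 1 / (x * math.log(a))),
    "Theorem 3": (derivative(lambda u: a ** u, x), a ** x * math.log(a)),
    "Theorem 4": (derivative(math.exp, x), math.exp(x)),
}
for name, (numeric, formula) in checks.items():
    print(name, abs(numeric - formula) < 1e-6)   # True for each theorem
```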

Different definitions of logarithm (Part 7)

There are two apparently different definitions of a logarithm that appear in the secondary mathematics curriculum:

From Algebra II and Precalculus: If $a > 0$ and $a \ne 1$ , then $f(x) = \log_a x$ is the inverse function of $g(x) = a^x$ .
From Calculus: for $x > 0$ , we define $\ln x = \displaystyle \int_1^x \frac{1}{t} dt$ .

The connection between these two apparently different ideas begins with the following theorem, which was proven in the few previous posts.

Theorem. Let $a \in \mathbb{R}^+ \setminus \{1\}$ . Suppose that $f: \mathbb{R}^+ \rightarrow \mathbb{R}$ has the following four properties:

$f(1) = 0$

$f(a) = 1$

$f(xy) = f(x) + f(y)$ for all $x, y \in \mathbb{R}^+$

$f$ is continuous

Then $f(x) = \log_a x$ for all $x \in \mathbb{R}^+$ .

At this point, we have provided enough groundwork to make the connection between these two different ways of viewing a logarithm.

Let’s define the function (for $x > 0$ )

$A(x) = \displaystyle \int_1^x \frac{1}{t} dt$ .

I’ll illustrate this with the appropriate area under the hyperbola $y = \frac{1}{x}$ . (Please forgive the crudeness of this drawing; I’m only using Microsoft Paint.)

So if $x$ is the right-hand limit, then $A(x)$ is just the shaded area under the curve.

Often, someone will interject, “Hey, I know how to do that… it’s just the natural logarithm of $x$ .” To which I will respond, “Yes, that’s true. But why is it the natural logarithm of $x$ ?” I have yet to encounter a student who can immediately answer this question (which, of course, is the whole point of me presenting this in class). In other words, I want my students to realize that, many semesters ago, they pretty much accepted on faith that the above integral is equal to $\ln x$ , but they were never told the reason why. And now — several semesters after completing the calculus sequence — we’re finally going over the reason why.

To start, I’ll say, “OK, $A$ is defined as an integral. That means that it must have…” Someone will usually volunteer, “A derivative.” I’ll respond, “That’s right. The Fundamental Theorem of Calculus says that this function is differentiable. So, if something is differentiable, then it also must be…” Someone will usually volunteer, “Continuous.” My response: “That’s right. So $A$ must be continuous. So that’s Property 4: this function is continuous.” I’ll continue: “Let’s see if we can get the other properties.”

I’ll next move to Property 1, as it’s the next easiest. I’ll ask the class, “Can you prove to me that $A(1) = 0$ ?” After a moment of thought, someone will notice that

$A(1) = \displaystyle \int_1^1 \frac{1}{t} dt$ .

must be equal to $0$ since the left and right endpoints of the integral are the same.

Then I’ll skip over to Property 3, which requires a little more thought. To begin, we can write

$A(xy) = \displaystyle \int_1^{xy} \frac{1}{t} dt = \displaystyle \int_1^{x} \frac{1}{t} dt + \int_x^{xy} \frac{1}{t} dt$ .

Before proceeding, I’ll ask my class why the above line has to be true. After a couple moments, someone will volunteer something like “The area from $1$ to $x$ plus the area from $x$ to $xy$ has to be equal to the area from $1$ to $xy$ .”

I’ll then say something like, “We can simplify one of the integrals on the right-hand side right away. Which one?” Students quickly see that the first integral on the right, $\displaystyle \int_1^{x} \frac{1}{t} dt$ , is of course equal to $A(x)$ . So then I’ll ask, “So what do I want the last integral to be equal to?” Students look back at Property 3 and answer, “That should be $A(y)$ .”

So, if we can show the final integral is equal to $A(y)$ , we have established Property 3. To this end, I will perform a somewhat unusual looking $u-$ substitution:

$t = ux$

In this formula, I encourage my students to think of $t$ as the old variable of integration, $u$ as the new variable of integration, and $x$ as an unknown number that is constant. So I’ll say parenthetically, “If $t = 5u$ , how do we find $dt$ ?” Students of course answer, “ $dt$ must be $5 du$ .” So I’ll follow up: “If $t = xu$ , how do we find $dt$ ?” Students get the idea:

$dt = x \, du$

So to complete the $u-$ substitution, we must adjust the limits of integration. For the lower limit,

$t = x \Longrightarrow u = \displaystyle \frac{x}{x} = 1$

For the upper limit,

$t = xy \Longrightarrow u = \displaystyle \frac{xy}{x} = y$

So we can now complete the $u-$ substitution of the second integral:

$A(xy) = \displaystyle \int_1^{xy} \frac{1}{t} dt = \displaystyle \int_1^{x} \frac{1}{t} dt + \int_x^{xy} \frac{1}{t} dt$ .

$A(xy) = \displaystyle \int_1^{x} \frac{1}{t} dt + \int_1^{y} \frac{1}{xu} x \, du$ .

$A(xy) = \displaystyle \int_1^{x} \frac{1}{t} dt + \int_1^{y} \frac{1}{u} \, du$ .

Students recognize that, except for the variable of integration, the last integral is just $A(y)$ , which leads to the punch line

$A(xy) = A(x) + A(y)$

In other words, we have established that the function $A$ satisfies Property 3.

So the only property left is Property 2. To that end, let’s define the number $e$ so that the area in green above is equal to 1. There’s no other way to describe this number…. we just increase $x$ far enough along the $x-$ axis until the area under the hyperbola is equal to 1. Wherever this happens, that’s the number that we’ll call $e$ . So, by definition, $A(e) = 1$ .

Therefore, by the above theorem, we conclude that $A(x) = \log_e x$ , written more simply as $\ln x$ .

To summarize: using the above theorem, we are able to establish that the integral $\displaystyle \int_1^x \frac{1}{t} dt$ has all of the properties of a logarithm and therefore must be a logarithmic function. The only catch is that we had to define $e$ to be the base of this logarithm through an unusual definition concerning the area under a hyperbola.
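To make this summary concrete, here is a Python sketch that treats $A(x)$ purely as an area, approximated numerically, with no logarithms involved. It checks Property 3 for one pair of numbers and then locates the number $e$ by searching for the point where the area equals 1. Everything here is an illustrative choice rather than part of the argument: the function name `area`, the use of Simpson's rule with 10,000 subintervals, the sample values $x = 2.5$ and $y = 3.2$ , and the bisection search on the interval from 2 to 3.

```python
import math

def area(x, n=10_000):
    """Approximate A(x), the integral of 1/t from 1 to x, by Simpson's rule (n even)."""
    h = (x - 1) / n
    total = 1.0 + 1.0 / x                      # endpoint values of 1/t
    for i in range(1, n):
        t = 1 + i * h
        total += (4 if i % 2 else 2) / t       # Simpson weights alternate 4, 2, 4, ...
    return total * h / 3

# Property 3 for one sample pair: A(xy) = A(x) + A(y)
x, y = 2.5, 3.2
print(abs(area(x * y) - (area(x) + area(y))) < 1e-9)   # True

# Locate e: A is increasing, with A(2) < 1 < A(3), so bisect on [2, 3]
lo, hi = 2.0, 3.0
for _ in range(50):
    mid = (lo + hi) / 2
    if area(mid) < 1:
        lo = mid
    else:
        hi = mid
print(round(lo, 6))   # 2.718282
```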

Of course, this is not the “standard” definition of $e$ that is usually encountered in a Precalculus class. More on these different definitions in a future series of posts.

One more pedagogical note: My experience is that I can cover the content of the first 7 posts of this series in a single 50-minute lecture and still keep my students’ attention. Naturally, I’ll recapitulate the highlights of this logical development at the start of the next lecture by way of review, as this is an awful lot to absorb at once.

Different definitions of logarithm (Part 6)

There are two apparently different definitions of a logarithm that appear in the secondary mathematics curriculum:

From Algebra II and Precalculus: If $a > 0$ and $a \ne 1$ , then $f(x) = \log_a x$ is the inverse function of $g(x) = a^x$ .
From Calculus: for $x > 0$ , we define $\ln x = \displaystyle \int_1^x \frac{1}{t} dt$ .

The connection between these two apparently different ideas begins with the following theorem.

Theorem. Let $a \in \mathbb{R}^+ \setminus \{1\}$ . Suppose that $f: \mathbb{R}^+ \rightarrow \mathbb{R}$ has the following four properties:

$f(1) = 0$

$f(a) = 1$

$f(xy) = f(x) + f(y)$ for all $x, y \in \mathbb{R}^+$

$f$ is continuous

Then $f(x) = \log_a x$ for all $x \in \mathbb{R}^+$ .

Note. To prove this theorem, I will show that $f(a^x) = x$ , thus proving that $f$ is the inverse of $g(x) = a^x$ .

The proof of this theorem divides into four cases:

Positive integers: $x = m \in \mathbb{Z}^+$
Positive rational numbers: $x = \frac{m}{n}$ , where $m,n \in \mathbb{Z}^+$
Negative rational numbers: $x \in \mathbb{Q}^-$
Real (possibly irrational) numbers: $x \in \mathbb{R}$

In today’s post, I’ll complete the proof by handling Case 4.

Before starting Case 4, I like to take inventory of where we stand in the proof at this point. We have now proven the theorem for all positive rational numbers and for all negative rational numbers. There’s only one rational number left: $x = 0$ . And this single case is simply handled through Property 1:

$f(a^0) = f(1) = 0$

I also like to keep track of which hypotheses have been used so far in the proof. A quick review of Cases 1-3 will reveal that Properties 1-3 have all been used at least once, but Property 4 (the assumption that $f$ is continuous) has not been used so far. Therefore, we had better expect to use it before completing the proof.

I won’t tell the class this (for fear of discouraging them), but the proof of Case 4 is a bit more abstract than Cases 1-3. I can give a numerical example that (hopefully) will shed some insight into the actual proof. However, for Case 4, the actual proof will not be a perfect parallel of the numerical example (as in Cases 1-3).

Idea behind Case 4. Let’s pick a familiar irrational number like $\sqrt{2}$ . There is a natural way to approximate $\sqrt{2}$ by a sequence of rational numbers… namely, the sequence of numbers obtained by taking one extra digit in the decimal expansion of $\sqrt{2}$ . In other words,

$r_1 = 1$

$r_2 = 1.4$

$r_3 = 1.41$

$r_4 = 1.414$

and so on.

In this way, $\displaystyle \lim_{n \to \infty} r_n = \sqrt{2}$ .

We would hope that the sequence

$f \left( a^1 \right), f \left( a^{1.4} \right), f \left( a^{1.41} \right), f \left( a^{1.414} \right), \dots$

converges to the obvious limit of

$f \left( a^{\sqrt{2}} \right)$ .

However, this sequence is also equal to

$1, 1.4, 1.41, 1.414, \dots$

since each exponent is rational. Since a sequence has only one limit, we conclude that these two limits should be equal:

$\sqrt{2} = f \left( a^{\sqrt{2}} \right)$
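This idea can be watched numerically. In the Python sketch below, I use $\log_a$ with $a = 3$ as a stand-in for $f$ (an assumption made only so that there is something to compute; the theorem says no other $f$ is possible). The helper name `truncated_sqrt2` is introduced only for this illustration, and its index is shifted by one from the list above: `truncated_sqrt2(0)` is $r_1 = 1$ , `truncated_sqrt2(1)` is $r_2 = 1.4$ , and so on.

```python
import math
from fractions import Fraction

a = 3.0                                  # arbitrary base with a > 0 and a != 1

def f(u):
    """Stand-in for f: the logarithm base a."""
    return math.log(u, a)

def truncated_sqrt2(n):
    """sqrt(2) truncated to n decimal places, as an exact rational number."""
    return Fraction(math.isqrt(2 * 10 ** (2 * n)), 10 ** n)

for n in range(7):
    r_n = float(truncated_sqrt2(n))
    print(r_n, f(a ** r_n))              # both columns approach sqrt(2)

print(math.sqrt(2), f(a ** math.sqrt(2)))
```

Up to rounding error, the two columns agree on every line, and both settle down to $\sqrt{2}$ .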

So that’s the idea of the formal proof, which we now tackle. In the proof below, I’ve marked with quotations some of the more parenthetical steps so that the main argument of the proof stands out a little bit better. You’ll notice that, unlike Cases 1-3, I don’t use as much directed questioning to get students to volunteer the next step of the proof with minimal assistance from me. That’s because I haven’t figured out a good way to use inquiry to quickly get through Case 4.

Proof of Case 4. Let $\{ r_n \}$ be a sequence of rational numbers that converges to $x$ . (Parenthetically, I’ll mention that the sequence of decimal approximations would be one such sequence, just to make this mysterious $\{ r_n \}$ thing that just appeared out of the blue a little less daunting. Of course, any sequence of rational numbers that converges to $x$ will do.) Therefore,

$f \left( a^x \right) = f \left( a^{\lim_{n \to \infty} r_n} \right)$

The function $g(x) = a^x$ is continuous. From the ordinary definition of continuous used in calculus, this means that

$\displaystyle \lim_{x \to c} g(x) = g(c)$ .

In other words, the function and the limit can be interchanged. (I’ll usually throw in my standard joke about functions commuting at this point in the lecture.) Stated in terms of a sequence $r_n \to x$ , this means that

$\displaystyle \lim_{n \to \infty} g(r_n) = g(x) = g \left( \lim_{n \to \infty} r_n \right)$ .

Stated another way,

$\displaystyle \lim_{n \to \infty} a^{r_n} = a^{ \lim_{n \to \infty} r_n}$ .

In light of the above work, we conclude that

$f \left( a^x \right) = f \left( a^{\lim_{n \to \infty} r_n} \right) = f \left( \displaystyle \lim_{n \to \infty} a^{r_n} \right)$

Stated simply, the function and the limit interchange.

We now perform a similar step for the function $f$ . Because $f$ is assumed to be continuous, we know that

$\displaystyle \lim_{n \to \infty} f(s_n) = f(c) = f \left( \lim_{n \to \infty} s_n \right)$

if $\{ s_n \}$ is a sequence that converges to $c$ . So, if we replace $s_n$ by $a^{r_n}$ and $c$ by $\displaystyle \lim_{n \to \infty} a^{r_n}$ , we conclude that

$\displaystyle \lim_{n \to \infty} f \left( a^{r_n} \right) = f \left( \displaystyle \lim_{n \to \infty} a^{r_n} \right)$

From the above insight, we see that we have the next step of the proof:

$f \left( a^x \right) = f \left( a^{\lim_{n \to \infty} r_n} \right)$

$= f \left( \displaystyle \lim_{n \to \infty} a^{r_n} \right)$

$= \displaystyle \lim_{n \to \infty} f \left(a^{r_n} \right)$

From now on, the concluding steps are pretty straightforward. The exponent on the last line is a rational number. Therefore, by Cases 2 and 3, we can produce the next step:

$f \left( a^x \right) = f \left( a^{\lim_{n \to \infty} r_n} \right)$

$= f \left( \displaystyle \lim_{n \to \infty} a^{r_n} \right)$

$= \displaystyle \lim_{n \to \infty} f \left(a^{r_n} \right)$

$= \displaystyle \lim_{n \to \infty} r_n$

Finally, by definition from the top of the proof, we can evaluate this limit:

$f \left( a^x \right) = f \left( a^{\lim_{n \to \infty} r_n} \right)$

$= f \left( \displaystyle \lim_{n \to \infty} a^{r_n} \right)$

$= \displaystyle \lim_{n \to \infty} f \left(a^{r_n} \right)$

$= \displaystyle \lim_{n \to \infty} r_n$

$= x$

This concludes the proof that $f \left( a^x \right) = x$ , even if $x$ is an arbitrary (possibly irrational) real number.

Different definitions of logarithm (Part 5)

There are two apparently different definitions of a logarithm that appear in the secondary mathematics curriculum:

From Algebra II and Precalculus: If $a > 0$ and $a \ne 1$ , then $f(x) = \log_a x$ is the inverse function of $g(x) = a^x$ .
From Calculus: for $x > 0$ , we define $\ln x = \displaystyle \int_1^x \frac{1}{t} dt$ .

The connection between these two apparently different ideas begins with the following theorem.

Theorem. Let $a \in \mathbb{R}^+ \setminus \{1\}$ . Suppose that $f: \mathbb{R}^+ \rightarrow \mathbb{R}$ has the following four properties:

$f(1) = 0$

$f(a) = 1$

$f(xy) = f(x) + f(y)$ for all $x, y \in \mathbb{R}^+$

$f$ is continuous

Then $f(x) = \log_a x$ for all $x \in \mathbb{R}^+$ .

Note. To prove this theorem, I will show that $f(a^x) = x$ , thus proving that $f$ is the inverse of $g(x) = a^x$ .

The proof of this theorem divides into four cases:

Positive integers: $x = m \in \mathbb{Z}^+$
Positive rational numbers: $x = \frac{m}{n}$ , where $m,n \in \mathbb{Z}^+$
Negative rational numbers: $x \in \mathbb{Q}^-$
Real (possibly irrational) numbers: $x \in \mathbb{R}$

In today’s post, I’ll describe how I prompt my students to prove Case 3 during class time. Case 4 will appear in tomorrow’s post.

Idea behind Case 3. Though not formally necessary for the proof, I’ve found it helpful to illustrate the idea of the proof with a specific example before proceeding to the general case. So — on the far end of the chalkboard, away from the space that I’ve allocated for the formal write-up of the proof — I’ll write

$f \left( a^{-2/3} \cdot a^{2/3} \right) =$

I’ll then ask, “How else can we simplify the left-hand side?” As we’ll see below, there are actually two legitimate ways of proceeding. Someone will usually suggest just simplifying the product, and so I’ll write this as the next step:

$f \left( a^{-2/3} \cdot a^{2/3} \right) = f \left( a^0 \right)$

I’ll then ask a very open-ended question, “Now what?” Usually, someone will suggest simplifying the right-hand side using Property 1:

$f \left( a^{-2/3} \cdot a^{2/3} \right) = f \left( a^0 \right) = 0$

By this point, after completing Cases 1 and 2, someone will usually suggest expanding the left-hand side:

$f \left( a^{-2/3} \right) + f \left( a^{2/3} \right) = 0$

I’ll then ask, “What can we do now?” Hopefully, someone will observe that the second term can be simplified using Case 2:

$f \left( a^{-2/3} \right) + \displaystyle \frac{2}{3} = 0$

$f \left( a^{-2/3} \right) = - \displaystyle \frac{2}{3}$

I’ll then note that we’ve finished what we set out to do: show that $f(a^x) = x$ when $x = - \frac{2}{3}$ , a negative rational number.

The natural next question is, “Can we do this for any negative rational number and not just $-\frac{2}{3}$ ?” This leads to the proof of Case 3. I’ve found that it’s helpful to walk through this proof line by line in step with the case of $x= -\frac{2}{3}$ , so that students can see how the steps of this more abstract proof correspond to the concrete example of $x = -\frac{2}{3}$ .

Proof of Case 3. Let $m, n \in \mathbb{Z}^+$ . Then

$f \left(a^{-m/n} \cdot a^{m/n} \right) = f \left(a^0 \right)$

$f \left( a^{-m/n} \right) + f \left( a^{m/n} \right) = 0$

$f \left( a^{-m/n} \right) + \displaystyle \frac{m}{n} = 0$

$f \left( a^{-m/n} \right) = - \displaystyle \frac{m}{n}$

Again, I’ve found that the special case $x = - \frac{2}{3}$ is pedagogically helpful, if not logically necessary to prove Case 3.

Different definitions of logarithm (Part 4)

There are two apparently different definitions of a logarithm that appear in the secondary mathematics curriculum:

From Algebra II and Precalculus: If $a > 0$ and $a \ne 1$ , then $f(x) = \log_a x$ is the inverse function of $g(x) = a^x$ .
From Calculus: for $x > 0$ , we define $\ln x = \displaystyle \int_1^x \frac{1}{t} dt$ .

The connection between these two apparently different ideas begins with the following theorem.

Theorem. Let $a \in \mathbb{R}^+ \setminus \{1\}$ . Suppose that $f: \mathbb{R}^+ \rightarrow \mathbb{R}$ has the following four properties:

$f(1) = 0$

$f(a) = 1$

$f(xy) = f(x) + f(y)$ for all $x, y \in \mathbb{R}^+$

$f$ is continuous

Then $f(x) = \log_a x$ for all $x \in \mathbb{R}^+$ .

Note. To prove this theorem, I will show that $f(a^x) = x$ , thus proving that $f$ is the inverse of $g(x) = a^x$ .

The proof of this theorem divides into four cases:

Positive integers: $x = m \in \mathbb{Z}^+$
Positive rational numbers: $x = \frac{m}{n}$ , where $m,n \in \mathbb{Z}^+$
Negative rational numbers: $x \in \mathbb{Q}^-$
Real (possibly irrational) numbers: $x \in \mathbb{R}$

In today’s post, I’ll describe how I prompt my students to prove Case 2 during class time. Cases 3-4 will appear in the coming posts.

Idea behind Case 2. Though not formally necessary for the proof, I’ve found it helpful to illustrate the idea of the proof with a specific example before proceeding to the general case. So — on the far end of the chalkboard, away from the space that I’ve allocated for the formal write-up of the proof — I’ll write

$2 = f(a^2)$

I’ll ask, “How do we know this is true?” The immediate answer: We just did Case 1. I’ll then do something a little unusual and rewrite this equation in a more complicated way:

$2 = f(a^2) = f \left( \left[a^{2/3} \right]^3 \right)$

After double-checking that the class agrees with this step (even though I just made the right-hand side more complicated instead of taking the usual step of simplifying it), I’ll then ask, “OK, we have something to the third power. What can we now do to the right-hand side?” Almost immediately, someone will volunteer the correct next steps using Property 3:

$2 = f(a^2) = f \left( a^{2/3} \cdot a^{2/3} \cdot a^{2/3} \right) = f \left( a^{2/3} \right) + f \left( a^{2/3} \right) + f \left( a^{2/3} \right)$

I’ll then ask, “How can we simplify the right-hand side?” After a moment of thought, someone will volunteer the correct next step:

$2 = f(a^2) = f \left( a^{2/3} \cdot a^{2/3} \cdot a^{2/3} \right) = f \left( a^{2/3} \right) + f \left( a^{2/3} \right) + f \left( a^{2/3} \right)$

$2 = 3 f \left( a^{2/3} \right)$

I’ll then ask, “How do we isolate the $f \left( a^{2/3} \right)$ term?” The obvious correct answer:

$\displaystyle \frac{2}{3} = f(a^{2/3})$

I’ll then note that we’ve finished what we set out to do: show that $f(a^x) = x$ when $x = \frac{2}{3}$ .

The natural next question is, “Can we do this for any positive rational number and not just $\frac{2}{3}$ ?” This leads to the proof of Case 2. I’ve found that it’s helpful to walk through this proof line by line in step with the case of $x=\frac{2}{3}$ , so that students can see how the steps of this more abstract proof correspond to the concrete example of $x =\frac{2}{3}$ .

Proof of Case 2. Let $x = \displaystyle \frac{m}{n}$ where $m, n \in \mathbb{Z}^+$ . Then, starting from Case 1,

$m = f(a^m)$

$m = f \left( \left[ a^{m/n} \right]^n \right)$

$m = f \left( a^{m/n} \cdot a^{m/n} \cdot \dots \cdot a^{m/n} \right)$

$m = f \left( a^{m/n} \right) + f \left( a^{m/n} \right) + \dots + f \left( a^{m/n} \right)$

$m = n f \left( a^{m/n} \right)$

$\displaystyle \frac{m}{n} = f \left( a^{m/n} \right)$

Of course, the special case $x = \frac{2}{3}$ is not logically necessary to prove Case 2, but I’ve found it to be pedagogically convenient. From the school of hard knocks, I’ve found that the proof of Case 2 goes over easier with students when they see the idea of the proof presented concretely and then abstractly.
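The chain of equalities in Case 2 can also be spot-checked numerically. The following is only a sanity check under stated assumptions, not a proof: it takes $f$ to be the base-$a$ logarithm (the function the theorem says $f$ must be), and the helper names `f` and `check_case_2` as well as the base $a = 5$ are hypothetical choices made for illustration.

```python
import math

def f(x, a):
    """Candidate function: the logarithm of x to the base a (an assumption for this check)."""
    return math.log(x) / math.log(a)

def check_case_2(a, m, n):
    """Return (n * f(a^(m/n)), f(a^m), f(a^(m/n))) for positive integers m and n."""
    root = a ** (m / n)
    return n * f(root, a), f(a ** m, a), f(root, a)

a = 5.0  # arbitrary base, chosen only for illustration
for m, n in [(2, 3), (7, 4), (1, 10)]:
    n_times_f, f_of_power, value = check_case_2(a, m, n)
    # The first two numbers should both be close to m; the last should be close to m/n.
    print(f"m/n = {m}/{n}: n*f = {n_times_f:.10f}, f(a^m) = {f_of_power:.10f}, f(a^(m/n)) = {value:.10f}")
```

Up to floating-point rounding, the first two printed values should agree with $m$ and the third with $\frac{m}{n}$, mirroring the lines $m = n f \left( a^{m/n} \right)$ and $\frac{m}{n} = f \left( a^{m/n} \right)$ of the proof.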

Different definitions of logarithm (Part 3)

There are two apparently different definitions of a logarithm that appear in the secondary mathematics curriculum:

From Algebra II and Precalculus: If $a > 0$ and $a \ne 1$ , then $f(x) = \log_a x$ is the inverse function of $g(x) = a^x$ .
From Calculus: for $x > 0$ , we define $\ln x = \displaystyle \int_1^x \frac{1}{t} dt$ .

The connection between these two apparently different ideas begins with the following theorem.

Theorem. Let $a \in \mathbb{R}^+ \setminus \{1\}$ . Suppose that $f: \mathbb{R}^+ \rightarrow \mathbb{R}$ has the following four properties:

$f(1) = 0$

$f(a) = 1$

$f(xy) = f(x) + f(y)$ for all $x, y \in \mathbb{R}^+$

$f$ is continuous

Then $f(x) = \log_a x$ for all $x \in \mathbb{R}^+$ .

Note. To prove this theorem, I will show that $f(a^x) = x$ , thus proving that $f$ is the inverse of $g(x) = a^x$ .

The proof of this theorem divides into four cases:

Positive integers: $x = m \in \mathbb{Z}^+$
Positive rational numbers: $x = \frac{m}{n}$ , where $m,n \in \mathbb{Z}^+$
Negative rational numbers: $x \in \mathbb{Q}^-$
Real (possibly irrational) numbers: $x \in \mathbb{R}$

In today’s post, I’ll describe how I prompt my students to prove Case 1 during class time. Cases 2-4 will appear in the coming posts.

Idea behind Case 1. Though not formally necessary for the proof, I’ve found it helpful to illustrate the idea of the proof with a specific example before proceeding to the general case. So — on the far end of the chalkboard, away from the space that I’ve allocated for the formal write-up of the proof — I’ll write

$f(a^4) =$

I’ll then ask, “How else can we write $a^4$ ?” Someone will usually suggest $a \cdot a \cdot a \cdot a$ , and so I’ll write this as the next step:

$f(a^4) = f(a \cdot a \cdot a \cdot a)$

I’ll then ask, “OK, we have a product here. How can we simplify the right-hand side?” After a moment of thought, someone will volunteer that Property 3 allows the right-hand side to be split up into pieces:

$f(a^4) = f(a \cdot a \cdot a \cdot a) = f(a) + f(a) + f(a) + f(a)$

(Technically, this requires mathematical induction to generalize Property 3 from a product of two numbers to a product of arbitrarily many numbers, but I don’t think that it’s worth the time to expound on this pedantic point.) I’ll then ask, “How can we simplify this?” Almost immediately, someone will usually volunteer Property 2:

$f(a^4) = f(a \cdot a \cdot a \cdot a) = f(a) + f(a) + f(a) + f(a) = 1 + 1 + 1 + 1 = 4$

I’ll then note that we’ve finished what we set out to do: show that $f(a^x) = x$ when $x = 4$ .

The natural next question is, “Can we do this for any positive integer and not just 4?” This leads to the proof of Case 1. I’ve found that it’s helpful to walk through this proof line by line in step with the case of $x=4$ , so that students can see how the steps of this more abstract proof correspond to the concrete example of $x =4$ .

Proof of Case 1.

$f(a^m) = f(a \cdot a \cdot \dots \cdot a)$

$= f(a) + f(a) + \dots + f(a)$

$= 1 + 1 + \dots + 1$

$= m$

Of course, the special case $x = 4$ is not logically necessary to prove Case 1. That said, from the school of hard knocks, I’ve found that the proof of Case 1 goes over easier with students when they see the idea of the proof presented concretely and then abstractly.
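For readers who want the induction step that was skipped above, here is a brief sketch. The symbols $x_1, \dots, x_{k+1}$ are introduced only for this aside and stand for arbitrary positive real numbers. The base case, a product of two numbers, is Property 3 itself. If the claim holds for a product of $k$ numbers, then

$f(x_1 x_2 \cdots x_k \cdot x_{k+1}) = f \left( [x_1 x_2 \cdots x_k] \cdot x_{k+1} \right)$

$= f(x_1 x_2 \cdots x_k) + f(x_{k+1})$

$= f(x_1) + f(x_2) + \dots + f(x_k) + f(x_{k+1})$

The second line uses Property 3, and the third line uses the induction hypothesis. Taking every $x_i$ equal to $a$ gives the step used in the proof of Case 1.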

Different definitions of logarithm (Part 2)

There are two apparently different definitions of a logarithm that appear in the secondary mathematics curriculum:

From Algebra II and Precalculus: If $a > 0$ and $a \ne 1$ , then $f(x) = \log_a x$ is the inverse function of $g(x) = a^x$ .
From Calculus: for $x > 0$ , we define $\ln x = \displaystyle \int_1^x \frac{1}{t} dt$ .

In this series of posts, we examine the interrelationship between these two different approaches to logarithms. This is a standard topic in my class for future teachers of secondary mathematics as a way of deepening their understanding of a topic that they think they know quite well.
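Before connecting the two definitions with a proof, it can be reassuring to see numerically that they appear to agree. The sketch below is an illustration only: the helper name `ln_by_integral`, the midpoint rule, and the number of steps are all assumptions made for this example. It approximates $\displaystyle \int_1^x \frac{1}{t} dt$ and compares the result with Python's built-in natural logarithm.

```python
import math

def ln_by_integral(x, steps=100_000):
    """Approximate the integral of 1/t from 1 to x using the midpoint rule."""
    if x <= 0:
        raise ValueError("x must be positive")
    width = (x - 1.0) / steps  # negative when x < 1, so the integral comes out negative
    total = 0.0
    for i in range(steps):
        midpoint = 1.0 + (i + 0.5) * width
        total += width / midpoint
    return total

for x in (0.5, 2.0, 10.0):
    print(x, ln_by_integral(x), math.log(x))

# The integral also appears to turn products into sums: compare x = 6 with x = 2 and x = 3.
print(ln_by_integral(6.0), ln_by_integral(2.0) + ln_by_integral(3.0))
```

The two columns should agree to several decimal places, and the last line hints at why the product-to-sum property is the natural bridge between the two definitions.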

The connection between these two apparently different ideas begins with the following theorem.

Theorem. Let $a \in \mathbb{R}^+ \setminus \{1\}$ . Suppose that $f: \mathbb{R}^+ \rightarrow \mathbb{R}$ has the following four properties:

$f(1) = 0$

$f(a) = 1$

$f(xy) = f(x) + f(y)$ for all $x, y \in \mathbb{R}^+$

$f$ is continuous

Then $f(x) = \underline{\hspace{1in}}$ for all $x \in \mathbb{R}^+$ .

When writing this on the board, I purposefully leave an underline for my students to fill in, because I want them to think. What familiar function has these four properties? I’ll usually invoke the old children’s joke: “If it looks like an elephant, smells like an elephant, feels like an elephant, and sounds like an elephant, then it must be an elephant.” After a moment of thought, someone will usually volunteer $f(x) = \log x$ . That’s almost correct, and so I’ll ask if Property 2 is satisfied by this function. After a couple more moments of thought, someone will volunteer the correct answer, $f(x) = \log_a x$ .

To prove this theorem, I will show that

$f(a^x) = x$ for all $x \in \mathbb{R}$ .

I’ll make the observation that the case of $x=0$ is Property 1, while the case of $x = 1$ is Property 2.

Then I’ll ask the class: “If I’m able to prove that $f(a^x) = x$ for all real $x$ , why does this mean that $f(x) = \log_a x$ ?” Perhaps unsurprisingly, this usually draws blank stares for a few seconds until someone realizes that this means that $f: \mathbb{R}^+ \rightarrow \mathbb{R}$ and $g: \mathbb{R} \rightarrow \mathbb{R}^+$ defined by $g(x) = a^x$ are inverse functions. So (by definition) $f(x)$ must be equal to $\log_a x$ .
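Written out as a short sketch, the step from $f(a^x) = x$ to $f(x) = \log_a x$ goes as follows. The symbol $y$ is introduced only for this aside. Let $y \in \mathbb{R}^+$ . Since $g(x) = a^x$ maps $\mathbb{R}$ onto $\mathbb{R}^+$ , we can write $y = a^x$ with $x = \log_a y$ . Then

$f(y) = f \left( a^{\log_a y} \right) = \log_a y$

where the last equality is $f(a^x) = x$ applied with $x = \log_a y$ . Since $y$ was an arbitrary positive number, $f$ and $\log_a$ agree on all of $\mathbb{R}^+$ .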

The proof of this theorem has four parts:

Positive integers: $x = m \in \mathbb{Z}^+$
Positive rational numbers: $x = \frac{m}{n}$ , where $m,n \in \mathbb{Z}^+$
Negative rational numbers: $x \in \mathbb{Q}^-$
Real (possibly irrational) numbers: $x \in \mathbb{R}$

Beginning with tomorrow’s post, I’ll discuss how I walk students through the proof in lecture.