Confirming Einstein’s Theory of General Relativity With Calculus, Part 4b: Acceleration in Polar Coordinates

In this series, I’m discussing how ideas from calculus and precalculus (with a touch of differential equations) can predict the precession in Mercury’s orbit and thus confirm Einstein’s theory of general relativity. The origins of this series came from a class project that I assigned to my Differential Equations students maybe 20 years ago.

In this part of the series, we will show that if the motion of a planet around the Sun is expressed in polar coordinates (r,\theta), with the Sun at the origin, then under Newtonian mechanics (i.e., without general relativity) the motion of the planet follows the differential equation

u''(\theta) + u(\theta) = \displaystyle \frac{1}{\alpha},

where u = 1/r and \alpha is a certain constant. Deriving this governing differential equation will require some principles from physics. If you’d rather skip the physics and get to the mathematics, we’ll get to solving this differential equation in the next post.

Part of the derivation of this governing differential equation will involve Newton’s Second Law

{\bf F} = m {\bf a},

where m is the mass of the planet and the force {\bf F} and the acceleration {\bf a} are vectors. In usual rectangular coordinates, the acceleration vector would be expressed as

{\bf a} = x''(t) {\bf i} + y''(t) {\bf j},

where the components of the acceleration in the x- and y-directions are x''(t) and y''(t), and the unit vectors {\bf i} and {\bf j} are perpendicular, pointing in the positive x and positive y directions.

Unfortunately, our problem involves polar coordinates, and rewriting the acceleration vector in polar coordinates, instead of rectangular coordinates, is going to take some work.

Suppose that the position of the planet is (r,\theta) in polar coordinates, so that the position in rectangular coordinates is {\bf r} = (r\cos \theta, r \sin \theta). This may be rewritten as

{\bf r} = r \cos \theta {\bf i} + r \sin \theta {\bf j} = r ( \cos \theta {\bf i} + \sin \theta {\bf j}) = r {\bf u}_r,

where

{\bf u}_r = \cos \theta {\bf i} + \sin \theta {\bf j}

is a unit vector that points away from the origin. We see that this is a unit vector since

\parallel {\bf u}_r \parallel^2 = {\bf u}_r \cdot {\bf u}_r = \cos^2 \theta + \sin^2 \theta =1.

We also define

{\bf u}_\theta = -\sin \theta {\bf i} + \cos \theta {\bf j}

to be a unit vector that is perpendicular to {\bf u}_r; it turns out that {\bf u}_\theta points in the direction of increasing \theta. To see that {\bf u}_r and {\bf u}_\theta are perpendicular, we observe

{\bf u}_r \cdot {\bf u}_\theta = -\sin \theta \cos \theta + \sin \theta \cos \theta = 0.

Computing the velocity and acceleration vectors in polar coordinates will have a twist that’s not experienced with rectangular coordinates since both {\bf u}_r and {\bf u}_\theta are functions of \theta. Indeed, we have

\displaystyle \frac{d{\bf u}_r}{d\theta} =  \frac{d \cos \theta}{d\theta} {\bf i} + \frac{d\sin \theta}{d\theta} {\bf j} = -\sin \theta {\bf i} + \cos \theta {\bf j} = {\bf u}_\theta.

Furthermore,

\displaystyle \frac{d{\bf u}_\theta}{d\theta} =  -\frac{d \sin \theta}{d\theta} {\bf i} + \frac{d\cos \theta}{d\theta} {\bf j} = -\cos \theta {\bf i} - \sin \theta {\bf j} = -{\bf u}_r.

These two equations will be needed in the derivation below.
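As a quick sanity check, the two derivative formulas above can be compared against finite differences. This is only a minimal numerical sketch: the test angle and the step size h are arbitrary choices of mine, and the helper names are hypothetical.

```python
import math

def u_r(theta):
    """Radial unit vector (cos θ, sin θ) in rectangular components."""
    return (math.cos(theta), math.sin(theta))

def u_theta(theta):
    """Perpendicular unit vector (-sin θ, cos θ) in rectangular components."""
    return (-math.sin(theta), math.cos(theta))

def central_difference(vec_func, theta, h=1e-6):
    """Approximate d(vec_func)/dθ componentwise with a central difference."""
    ahead = vec_func(theta + h)
    behind = vec_func(theta - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))

def max_abs_diff(v, w):
    """Largest componentwise gap between two 2-vectors."""
    return max(abs(a - b) for a, b in zip(v, w))

theta = 0.7  # arbitrary test angle (an assumption for illustration)

# d(u_r)/dθ should match u_θ.
err_r = max_abs_diff(central_difference(u_r, theta), u_theta(theta))

# d(u_θ)/dθ should match -u_r.
neg_u_r = tuple(-c for c in u_r(theta))
err_theta = max_abs_diff(central_difference(u_theta, theta), neg_u_r)

print(err_r, err_theta)  # both gaps should be tiny
```

Both printed gaps should sit at the level of finite-difference error, which is consistent with the two formulas but of course not a proof of them.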

We are now in position to express the velocity and acceleration of the orbiting planet in polar coordinates. Clearly, the position of the planet is r {\bf u}_r, or a distance r from the origin in the direction of {\bf u}_r. Therefore, by the Product Rule, the velocity of the planet is

{\bf v} = \displaystyle \frac{d}{dt} (r {\bf u}_r) = \displaystyle \frac{dr}{dt} {\bf u}_r + r \frac{d {\bf u}_r}{dt}.

We now apply the Chain Rule to the second term:

{\bf v} = \displaystyle \frac{dr}{dt} {\bf u}_r + r \frac{d {\bf u}_r}{d\theta} \frac{d\theta}{dt}

= \displaystyle \frac{dr}{dt} {\bf u}_r + r \frac{d\theta}{dt} {\bf u}_\theta.

Differentiating a second time with respect to time, using the Product Rule on each term and then the Chain Rule again, we find

{\bf a} = \displaystyle \frac{d {\bf v}}{dt} = \displaystyle \frac{d^2r}{dt^2} {\bf u}_r + \frac{dr}{dt} \frac{d{\bf u}_r}{dt} + \frac{dr}{dt} \frac{d\theta}{dt} {\bf u}_\theta + r \frac{d^2\theta}{dt^2} {\bf u}_\theta + r \frac{d\theta}{dt} \frac{d{\bf u}_\theta}{dt}

= \displaystyle \frac{d^2r}{dt^2} {\bf u}_r + \frac{dr}{dt} \frac{d{\bf u}_r}{d\theta} \frac{d\theta}{dt} + \frac{dr}{dt} \frac{d\theta}{dt} {\bf u}_\theta + r \frac{d^2\theta}{dt^2} {\bf u}_\theta +  r \frac{d\theta}{dt} \frac{d{\bf u}_\theta}{d\theta} \frac{d\theta}{dt}

= \displaystyle \frac{d^2r}{dt^2} {\bf u}_r + \frac{dr}{dt} \frac{d\theta}{dt} {\bf u}_\theta  + \frac{dr}{dt} \frac{d\theta}{dt} {\bf u}_\theta + r \frac{d^2\theta}{dt^2} {\bf u}_\theta -  r \left(\frac{d\theta}{dt} \right)^2 {\bf u}_r

= \displaystyle \left[ \frac{d^2r}{dt^2} -  r \left(\frac{d\theta}{dt} \right)^2 \right] {\bf u}_r + \left[ 2\frac{dr}{dt} \frac{d\theta}{dt} + r \frac{d^2\theta}{dt^2} \right] {\bf u}_\theta.

This will be needed in the next post, when we use both Newton’s Second Law and Newton’s Law of Gravitation, expressed in polar coordinates.
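The polar formula for the acceleration can be spot-checked numerically. The sketch below assumes a made-up motion r(t), \theta(t) (not a real orbit, just smooth functions whose derivatives are easy to write down), differentiates the rectangular position twice by finite differences, and compares the result with the bracketed expressions above.

```python
import math

# Hypothetical motion chosen only for illustration (not a real orbit).
def r(t):
    return 2.0 + 0.5 * math.cos(t)

def r_dot(t):
    return -0.5 * math.sin(t)

def r_ddot(t):
    return -0.5 * math.cos(t)

def theta(t):
    return 0.3 * t + 0.1 * t * t

def theta_dot(t):
    return 0.3 + 0.2 * t

def theta_ddot(t):
    return 0.2

def position(t):
    """Rectangular position (r cos θ, r sin θ)."""
    return (r(t) * math.cos(theta(t)), r(t) * math.sin(theta(t)))

def acceleration_numeric(t, h=1e-4):
    """Second central difference of the rectangular position."""
    behind, here, ahead = position(t - h), position(t), position(t + h)
    return tuple((a - 2 * b + c) / (h * h)
                 for a, b, c in zip(ahead, here, behind))

def acceleration_polar(t):
    """Polar formula, converted to rectangular components for comparison."""
    a_r = r_ddot(t) - r(t) * theta_dot(t) ** 2
    a_theta = 2 * r_dot(t) * theta_dot(t) + r(t) * theta_ddot(t)
    c, s = math.cos(theta(t)), math.sin(theta(t))
    # a = a_r u_r + a_theta u_theta, with u_r = (c, s) and u_theta = (-s, c)
    return (a_r * c - a_theta * s, a_r * s + a_theta * c)

t0 = 1.3  # arbitrary test time
gap = max(abs(n - p)
          for n, p in zip(acceleration_numeric(t0), acceleration_polar(t0)))
print(gap)  # should be small, limited by finite-difference error
```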

Confirming Einstein’s Theory of General Relativity With Calculus, Part 4a: Angular Momentum

In this series, I’m discussing how ideas from calculus and precalculus (with a touch of differential equations) can predict the precession in Mercury’s orbit and thus confirm Einstein’s theory of general relativity. The origins of this series came from a class project that I assigned to my Differential Equations students maybe 20 years ago.

In this part of the series, we will show that if the motion of a planet around the Sun is expressed in polar coordinates (r,\theta), with the Sun at the origin, then under Newtonian mechanics (i.e., without general relativity) the motion of the planet follows the differential equation

u''(\theta) + u(\theta) = \displaystyle \frac{1}{\alpha},

where u = 1/r and \alpha is a certain constant. Deriving this governing differential equation will require some principles from physics. If you’d rather skip the physics and get to the mathematics, we’ll get to solving this differential equation in a few posts.

One principle from physics that we’ll need is the Law of Conservation of Angular Momentum. Mathematically, this is expressed by

mr^2 \displaystyle \frac{d\theta}{dt} = \ell,

where \ell is a constant. Of course, this can be written as

\displaystyle \frac{d\theta}{dt} = \displaystyle \frac{\ell}{mr^2};

this will be used a couple of times in the derivation below.

As we’ll soon see, we will need to express the second derivative \displaystyle \frac{d^2 r}{d t^2} in a form that depends only on \theta. To do this, we use the Chain Rule to obtain

r' = \displaystyle \frac{dr}{dt}

= \displaystyle \frac{dr}{d\theta} \cdot \frac{d\theta}{dt}

= \displaystyle \frac{\ell}{mr^2} \frac{dr}{d\theta}

= \displaystyle - \frac{\ell}{m} \frac{d}{d\theta} \left( \frac{1}{r} \right).

This last step used the Chain Rule in reverse:

\displaystyle \frac{d}{d\theta} \left( \frac{1}{r} \right) = \frac{d}{dr} \left( \frac{1}{r} \right) \cdot \frac{dr}{d\theta} = -\frac{1}{r^2} \cdot \frac{dr}{d\theta}.

To examine the second derivative \displaystyle \frac{d^2 r}{d t^2}, we again use the Chain Rule:

\displaystyle \frac{d^2 r}{d t^2} = \displaystyle \frac{dr'}{dt}

= \displaystyle \frac{dr'}{d\theta} \cdot \frac{d\theta}{dt}

= \displaystyle \frac{\ell}{mr^2} \frac{dr'}{d\theta}

= \displaystyle \frac{\ell}{mr^2} \frac{d}{d\theta} \left[ \frac{dr}{dt} \right]

= \displaystyle \frac{\ell}{mr^2} \frac{d}{d\theta} \left[ - \frac{\ell}{m} \frac{d}{d\theta} \left( \frac{1}{r} \right) \right]

= \displaystyle - \frac{\ell^2}{m^2r^2} \frac{d}{d\theta} \left[ \frac{d}{d\theta} \left( \frac{1}{r} \right) \right]

= \displaystyle - \frac{\ell^2}{m^2r^2} \frac{d^2}{d\theta^2}  \left( \frac{1}{r} \right) .

While far from obvious now, this will be needed when we rewrite Newton’s Second Law in polar coordinates.
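Here is a rough numerical check of the final formula for \displaystyle \frac{d^2 r}{d t^2}. Everything specific in this sketch is an assumption made for illustration: the values of \ell and m, and the orbit shape r(\theta) = \alpha/(1 + \epsilon \cos \theta), which was picked only because 1/r is then easy to differentiate twice. The code advances \theta a small step forward and backward in time using d\theta/dt = \ell/(mr^2), takes a second difference of r, and compares it with the formula.

```python
import math

# Hypothetical constants and orbit shape, chosen only for illustration.
ELL = 1.3    # angular momentum constant ℓ
MASS = 0.8   # planet mass m
ALPHA = 1.5
EPS = 0.4

def r_of_theta(theta):
    """Assumed orbit shape r(θ) = α / (1 + ε cos θ)."""
    return ALPHA / (1 + EPS * math.cos(theta))

def theta_rate(theta):
    """Conservation of angular momentum: dθ/dt = ℓ / (m r²)."""
    return ELL / (MASS * r_of_theta(theta) ** 2)

def rk4_step(theta, h):
    """Advance θ by a time step h (h may be negative) with classical RK4."""
    k1 = theta_rate(theta)
    k2 = theta_rate(theta + 0.5 * h * k1)
    k3 = theta_rate(theta + 0.5 * h * k2)
    k4 = theta_rate(theta + h * k3)
    return theta + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def r_ddot_numeric(theta0, h=1e-3):
    """Second central difference of r in time, around the moment θ = θ0."""
    r_behind = r_of_theta(rk4_step(theta0, -h))
    r_here = r_of_theta(theta0)
    r_ahead = r_of_theta(rk4_step(theta0, h))
    return (r_ahead - 2 * r_here + r_behind) / (h * h)

def r_ddot_formula(theta0):
    """Right-hand side: -(ℓ² / (m² r²)) d²(1/r)/dθ²."""
    # For the assumed orbit, 1/r = (1 + ε cos θ)/α, so d²(1/r)/dθ² = -ε cos θ / α.
    u_second = -EPS * math.cos(theta0) / ALPHA
    return -(ELL ** 2) / (MASS ** 2 * r_of_theta(theta0) ** 2) * u_second

theta0 = 1.0  # arbitrary test angle
print(r_ddot_numeric(theta0), r_ddot_formula(theta0))  # should nearly agree
```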

How to check if a student really can perform the Chain Rule

In my experience, a problem like the following is the acid test for determining if a student really understands the Chain Rule:

Find f'(x) if f(x) = \left[6x^2 + \sin 5x \right]^3

The correct answer (unsimplified):

f'(x) = 3 \left[6x^2 + \sin 5x \right]^2 \left(12x + [\cos 5x] \cdot 5 \right)

However, even students who are quite proficient with the Chain Rule can often provide the following incorrect answer:

f'(x) = 3 \left[6x^2 + \sin 5x \right]^2 \left(12x + \cos 5x \right) \cdot 5

Notice the slightly incorrect placement of the 5 at the end of the derivative. Students can so easily get into the rhythm of just multiplying by the derivative of the inside that they can forget where the derivative of the inside should be placed.
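The difference between the two answers is easy to see numerically. In this small sketch (the test point and the function names are my own choices), a finite-difference estimate of f'(x) agrees with the correct answer and is nowhere near the answer with the misplaced 5.

```python
import math

def f(x):
    return (6 * x ** 2 + math.sin(5 * x)) ** 3

def correct_derivative(x):
    inner = 6 * x ** 2 + math.sin(5 * x)
    # The factor of 5 multiplies only the cosine term.
    return 3 * inner ** 2 * (12 * x + math.cos(5 * x) * 5)

def incorrect_derivative(x):
    inner = 6 * x ** 2 + math.sin(5 * x)
    # The common mistake: the 5 multiplies the whole expression.
    return 3 * inner ** 2 * (12 * x + math.cos(5 * x)) * 5

def central_difference(func, x, h=1e-6):
    """Finite-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 0.4  # arbitrary test point
numeric = central_difference(f, x0)
print(numeric, correct_derivative(x0), incorrect_derivative(x0))
```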

Needless to say, a problem like this often appears on my exams as a way of separating the A students from the B students.

Teaching the Chain Rule inductively

I taught Calculus I every spring between 1996 and 2008. Perhaps the hardest topic to teach in the entire course, at least for me, was the Chain Rule. In the early years, I would show students the technique, but it seemed like my students accepted it on faith that their professor knew what he was talking about. Also, it took them quite a while to become proficient with the Chain Rule… as opposed to the Product and Quotient Rules, which they typically mastered quite quickly (except for algebraic simplifications).

It took me several years before I found a way of teaching the Chain Rule so that the method really sinks in for my students by the end of the class period. Here’s the way that I now teach the Chain Rule.

On the day that I introduce the Chain Rule, I teach inductively (as opposed to deductively). At this point, my students are familiar with how to differentiate y = x^n for positive and negative integers n, the trigonometric functions, and y = \sqrt{x}. They also know the Product and Quotient Rules.

I begin class by listing a whole bunch of functions whose derivatives could be found by the Chain Rule, if my students knew the Chain Rule. However, since my students don’t know the Chain Rule yet, they have to find the derivatives some other way. For example:

Let y = (3x - 5)^2. Then

y = (3x - 5) \cdot (3x -5)

y' = 3 \cdot (3x -5) + (3x -5) \cdot 3

y' = 6(3x-5).

Let y = (x^3 + 4)^2. Then

y = (x^3 + 4) \cdot (x^3 + 4)

y' = 3x^2 \cdot (x^3 + 4) + (x^3 + 4) \cdot 3x^2

y' = 6x^2 (x^3 + 4)

Let y = (\sqrt{x} + 5)^2. Then

y = x + 10 \sqrt{x} + 25

y' = 1 + \displaystyle \frac{5}{\sqrt{x}}

Let y = \sin^2 x. Then

y = \sin x \cdot \sin x

y' = \cos x \cdot \sin x + \sin x \cdot \cos x

y' = 2 \sin x \cos x

Let y = \sin 2x. Then

y = 2 \sin x \cos x

y' = 2 \cos x \cos x - 2 \sin x \sin x

y' = 2 (\cos^2 x - \sin^2 x)

y' = 2 \cos 2x
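For anyone who wants to double-check the hand computations in the five examples above, here is a short sketch that compares each derivative with a finite-difference estimate. The test point is an arbitrary choice of mine (positive, so that \sqrt{x} is defined).

```python
import math

def central_difference(func, x, h=1e-6):
    """Finite-difference estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# (label, function, derivative computed by hand in the examples above)
examples = [
    ("(3x - 5)^2", lambda x: (3 * x - 5) ** 2, lambda x: 6 * (3 * x - 5)),
    ("(x^3 + 4)^2", lambda x: (x ** 3 + 4) ** 2,
     lambda x: 6 * x ** 2 * (x ** 3 + 4)),
    ("(sqrt(x) + 5)^2", lambda x: (math.sqrt(x) + 5) ** 2,
     lambda x: 1 + 5 / math.sqrt(x)),
    ("sin^2 x", lambda x: math.sin(x) ** 2,
     lambda x: 2 * math.sin(x) * math.cos(x)),
    ("sin 2x", lambda x: math.sin(2 * x), lambda x: 2 * math.cos(2 * x)),
]

x0 = 1.7  # arbitrary positive test point
errors = {label: abs(central_difference(func, x0) - by_hand(x0))
          for label, func, by_hand in examples}

for label, err in errors.items():
    print(label, err)  # each gap should be tiny
```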

The important thing is to list example after example after example, and have students compute the derivatives. All along, I keep muttering something like, “Boy, it would sure be nice if there was a short-cut that would save us from doing all this work.” Of course, there is a short-cut (the Chain Rule), but I don’t tell the students what it is. Instead, I make the students try to figure out the pattern for themselves. This is absolutely critical: I don’t spill the beans. I just wait and wait and wait until the students figure out the pattern for themselves… though I might give suggestive hints, like rewriting the 6 in the first example as 3 \times 2.

This can take 20-30 minutes, and perhaps over a dozen examples (like those above), as students are completely engaged and frustrated trying to figure out the short-cut. But my experience is that when it clicks, it really clicks. So this pedagogical technique requires a lot of patience on the part of the instructor to not “save time” by giving the answer but to allow the students the thrill of discovering the pattern for themselves.

Once the Chain Rule is discovered, then my experience is that students have been prepared for differentiating more complicated functions, like y = \sqrt{4 + \sin 2x} and y = \cos ( \sqrt{x} ). In other words, there’s a significant front-end investment of time as students discover the Chain Rule, but applying the Chain Rule generally moves along quite quickly once it’s been discovered.
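For reference, here is how the Chain Rule plays out on the two functions just mentioned. This is my own worked sketch of the computations, written in the same unsimplified-then-simplified style as the examples above.

```latex
% Outer function \sqrt{u}, inner function u = 4 + \sin 2x
\frac{d}{dx} \sqrt{4 + \sin 2x}
  = \frac{1}{2\sqrt{4 + \sin 2x}} \cdot \frac{d}{dx} \left( 4 + \sin 2x \right)
  = \frac{1}{2\sqrt{4 + \sin 2x}} \cdot \left( [\cos 2x] \cdot 2 \right)
  = \frac{\cos 2x}{\sqrt{4 + \sin 2x}}

% Outer function \cos u, inner function u = \sqrt{x}
\frac{d}{dx} \cos \left( \sqrt{x} \right)
  = -\sin \left( \sqrt{x} \right) \cdot \frac{d}{dx} \sqrt{x}
  = -\sin \left( \sqrt{x} \right) \cdot \frac{1}{2\sqrt{x}}
  = -\frac{\sin \sqrt{x}}{2\sqrt{x}}
```

Notice that the first function needs the Chain Rule twice, once for the square root and once for \sin 2x.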