Confirming Einstein’s Theory of General Relativity With Calculus, Part 4a: Angular Momentum

In this series, I’m discussing how ideas from calculus and precalculus (with a touch of differential equations) can predict the precession in Mercury’s orbit and thus confirm Einstein’s theory of general relativity. The origins of this series came from a class project that I assigned to my Differential Equations students maybe 20 years ago.

In this part of the series, we will show that if the motion of a planet around the Sun is expressed in polar coordinates $(r,\theta)$ , with the Sun at the origin, then under Newtonian mechanics (i.e., without general relativity) the motion of the planet follows the differential equation

$u''(\theta) + u(\theta) = \displaystyle \frac{1}{\alpha}$ ,

where $u = 1/r$ and $\alpha$ is a certain constant. Deriving this governing differential equation will require some principles from physics. If you’d rather skip the physics and get to the mathematics, we’ll get to solving this differential equation in a few posts.

One principle from physics that we’ll need is the Law of Conservation of Angular Momentum. Mathematically, this is expressed by

$mr^2 \displaystyle \frac{d\theta}{dt} = \ell$ ,

where $\ell$ is a constant. Of course, this can be written as

$\displaystyle \frac{d\theta}{dt} = \displaystyle \frac{\ell}{mr^2}$ ;

this will be used a couple of times in the derivation below.

As we’ll soon see, we will need to express the second derivative $\displaystyle \frac{d^2 r}{d t^2}$ in a form that depends only on $\theta$ . To do this, we use the Chain Rule to obtain

$r' = \displaystyle \frac{dr}{dt}$

$= \displaystyle \frac{dr}{d\theta} \cdot \frac{d\theta}{dt}$

$= \displaystyle \frac{\ell}{mr^2} \frac{dr}{d\theta}$

$= \displaystyle - \frac{\ell}{m} \frac{d}{d\theta} \left( \frac{1}{r} \right)$ .

This last step used the Chain Rule in reverse:

$\displaystyle \frac{d}{d\theta} \left( \frac{1}{r} \right) = \frac{d}{dr} \left( \frac{1}{r} \right) \cdot \frac{dr}{d\theta} = -\frac{1}{r^2} \cdot \frac{dr}{d\theta}$ .

To examine the second derivative $\displaystyle \frac{d^2 r}{d t^2}$ , we again use the Chain Rule:

$\displaystyle \frac{d^2 r}{d t^2} = \displaystyle \frac{dr'}{dt}$

$= \displaystyle \frac{dr'}{d\theta} \cdot \frac{d\theta}{dt}$

$= \displaystyle \frac{\ell}{mr^2} \frac{dr'}{d\theta}$

$= \displaystyle \frac{\ell}{mr^2} \frac{d}{d\theta} \left[ \frac{dr}{dt} \right]$

$= \displaystyle \frac{\ell}{mr^2} \frac{d}{d\theta} \left[ - \frac{\ell}{m} \frac{d}{d\theta} \left( \frac{1}{r} \right) \right]$

$= \displaystyle - \frac{\ell^2}{m^2r^2} \frac{d}{d\theta} \left[ \frac{d}{d\theta} \left( \frac{1}{r} \right) \right]$

$= \displaystyle - \frac{\ell^2}{m^2r^2} \frac{d^2}{d\theta^2} \left( \frac{1}{r} \right)$ .

While far from obvious now, this will be needed when we rewrite Newton’s Second Law in polar coordinates.
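For readers who like to double-check this sort of identity numerically, here is a minimal Python sketch. The test orbit `r(theta)`, the values of `ELL` and `M`, and the step size `H` are all arbitrary choices of mine for illustration. The derivatives are approximated by central differences, so the two columns should agree only up to a small numerical error.

```python
import math

ELL, M = 2.0, 1.5   # arbitrary illustrative values for ell and m (assumptions)
H = 1e-3            # finite-difference step (an assumption)

def r(theta):
    # Hypothetical test orbit; any smooth, positive r(theta) would do.
    return 3.0 / (1.0 + 0.5 * math.cos(theta))

def theta_dot(theta):
    # Conservation of angular momentum: dtheta/dt = ell / (m r^2)
    return ELL / (M * r(theta) ** 2)

def ddtheta(f, theta):
    # Central-difference approximation of df/dtheta
    return (f(theta + H) - f(theta - H)) / (2 * H)

def r_dot(theta):
    # Chain Rule: dr/dt = (dr/dtheta)(dtheta/dt)
    return ddtheta(r, theta) * theta_dot(theta)

def r_ddot(theta):
    # Chain Rule again: d^2r/dt^2 = (d r'/dtheta)(dtheta/dt)
    return ddtheta(r_dot, theta) * theta_dot(theta)

def formula(theta):
    # -(ell^2 / (m^2 r^2)) * d^2/dtheta^2 (1/r), using a second central difference
    u = lambda t: 1.0 / r(t)
    u_pp = (u(theta + H) - 2 * u(theta) + u(theta - H)) / H ** 2
    return -(ELL ** 2) / (M ** 2 * r(theta) ** 2) * u_pp

for theta in (0.3, 1.0, 2.5):
    print(theta, r_ddot(theta), formula(theta))
```

The left column applies the Chain Rule twice, exactly as in the derivation; the right column evaluates the final formula directly.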

Confirming Einstein’s Theory of General Relativity With Calculus, Part 3: Method of Successive Approximations

One technique that will be necessary for this confirmation is the method of successive approximations. This will be needed in the context of a differential equation; however, we can illustrate the concept by finding the roots of a polynomial. Consider the quadratic equation

$x^2 - x - 1 = 0$ .

(Naturally, we can solve for $x$ using the quadratic formula; more on that later.) To apply the method of successive approximation, we will rewrite this so that $x$ appears on the left side and some function of $x$ appears on the right side. I will choose

$x^2 = x + 1$ , or

$x = 1 + \displaystyle \frac{1}{x}$ .

Here’s the idea of the method of successive approximations to obtain a recursively defined sequence that (hopefully) converges to a solution of this equation:

Start with an initial guess $x_0$ .
Plug $x_0$ into the right-hand side to get a new guess, $x_1$ .
Plug $x_1$ into the right-hand side to get a new guess, $x_2$ .
And repeat.

For example, suppose that we choose $x_0 = 1$ . Then

$x_1 = 1 + \displaystyle \frac{1}{x_0} = 1 + \displaystyle \frac{1}{1} = 2$

$x_2 = 1 + \displaystyle \frac{1}{x_1} = 1 + \displaystyle \frac{1}{2} = \displaystyle \frac{3}{2} = 1.5$

$x_3 = 1 + \displaystyle \frac{1}{x_2} = 1 + \displaystyle \frac{1}{3/2} = \displaystyle \frac{5}{3} \approx 1.667$

$x_4 = 1 + \displaystyle \frac{1}{x_3} = 1 + \displaystyle \frac{1}{5/3} = \displaystyle \frac{8}{5} = 1.6$

$x_5 = 1 + \displaystyle \frac{1}{x_4} = 1 + \displaystyle \frac{1}{8/5} = \displaystyle \frac{13}{8} = 1.625$

$x_6 = 1 + \displaystyle \frac{1}{x_5} = 1 + \displaystyle \frac{1}{13/8} = \displaystyle \frac{21}{13} \approx 1.615$

$x_7 = 1 + \displaystyle \frac{1}{x_6} = 1 + \displaystyle \frac{1}{21/13} = \displaystyle \frac{34}{21} \approx 1.619$

This sequence can be computed by entering $1$ into a calculator, then entering $1 + 1 \div \hbox{Ans}$ , and then repeatedly hitting the $=$ button.
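The same computation can be sketched in a few lines of Python. The function name `successive_approximations` is my own choice, and I use the standard library's `Fraction` type so that the terms stay exact rather than being rounded.

```python
from fractions import Fraction

def successive_approximations(g, x0, n):
    """Return [x0, x1, ..., xn], where each term is g applied to the previous one."""
    xs = [x0]
    for _ in range(n):
        xs.append(g(xs[-1]))
    return xs

def g(x):
    # Right-hand side of x = 1 + 1/x
    return 1 + 1 / x

# Exact arithmetic with Fraction makes the pattern in the terms easy to see.
terms = successive_approximations(g, Fraction(1), 7)
for k, x in enumerate(terms):
    print(k, x, float(x))
```

Passing a float such as `1.0` instead of `Fraction(1)` gives the decimal approximations directly.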

We see that the sequence appears to be converging to something, and that something is a root of the equation $x^2 - x - 1 = 0$ , which we now find via the quadratic formula:

$x = \displaystyle \frac{1 \pm \sqrt{1 - 4 \cdot 1 \cdot (-1)}}{2} = \frac{1 \pm \sqrt{5}}{2}$ .

So it looks like the above sequence is converging to the positive root $(1 + \sqrt{5})/2 \approx 1.618$ .

(Parenthetically, you might notice that the Fibonacci sequence appears in the numerators and denominators of this sequence. As you might guess, that’s not a coincidence.)

Like most numerical techniques, this method doesn’t always work like we think it would. Another solution is the negative root $(1 - \sqrt{5})/2 \approx -0.618$ . Unfortunately, if we start with a guess near this root, like $x_0 = -0.62$ , the sequence unexpectedly diverges from $-0.618\dots$ but eventually converges to the positive root $1.618\dots$ :

$x_1 = 1 + \displaystyle \frac{1}{x_0} = 1 + \displaystyle \frac{1}{-0.62} = -0.6129\dots$

$x_2 = 1 + \displaystyle \frac{1}{x_1} = 1 + \displaystyle \frac{1}{-0.6129\dots} = -0.6315\dots$

$x_3 = 1 + \displaystyle \frac{1}{x_2} = 1 + \displaystyle \frac{1}{-0.6315\dots} = -0.5833\dots$

$x_4 = 1 + \displaystyle \frac{1}{x_3} = 1 + \displaystyle \frac{1}{-0.5833\dots} = -0.7142\dots$

$x_5 = 1 + \displaystyle \frac{1}{x_4} = 1 + \displaystyle \frac{1}{-0.7142\dots} = -0.4$

$x_6 = 1 + \displaystyle \frac{1}{x_5} = 1 + \displaystyle \frac{1}{-0.4} = -1.5$

$x_7 = 1 + \displaystyle \frac{1}{x_6} = 1 + \displaystyle \frac{1}{-1.5} = 0.3333\dots$

$x_8 = 1 + \displaystyle \frac{1}{x_7} = 1 + \displaystyle \frac{1}{0.3333\dots} = 4$

$x_9 = 1 + \displaystyle \frac{1}{x_8} = 1 + \displaystyle \frac{1}{4} = 1.25$

$x_{10} = 1 + \displaystyle \frac{1}{x_9} = 1 + \displaystyle \frac{1}{1.25} = 1.8$

$x_{11} = 1 + \displaystyle \frac{1}{x_{10}} = 1 + \displaystyle \frac{1}{1.8} = 1.555\dots$

$x_{12} = 1 + \displaystyle \frac{1}{x_{11}} = 1 + \displaystyle \frac{1}{1.555\dots} = 1.6428\dots$

$x_{13} = 1 + \displaystyle \frac{1}{x_{12}} = 1 + \displaystyle \frac{1}{1.6428\dots} = 1.6086\dots$

$x_{14} = 1 + \displaystyle \frac{1}{x_{13}} = 1 + \displaystyle \frac{1}{1.6086\dots} = 1.6216\dots$

$x_{15} = 1 + \displaystyle \frac{1}{x_{14}} = 1 + \displaystyle \frac{1}{1.6216\dots} = 1.6166\dots$

$x_{16} = 1 + \displaystyle \frac{1}{x_{15}} = 1 + \displaystyle \frac{1}{1.6166\dots} = 1.6185\dots$
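One standard way to understand this behavior, offered here only as a sketch, is to look at the derivative of $g(x) = 1 + 1/x$ at each root: the iteration is drawn toward a root where $|g'| < 1$ and pushed away from a root where $|g'| > 1$. The helper names below are mine.

```python
import math

def g(x):
    # Right-hand side of x = 1 + 1/x
    return 1 + 1 / x

def g_prime(x):
    # Derivative of g
    return -1 / x ** 2

def iterate(x0, n):
    """Apply g to x0 a total of n times and return the result."""
    x = x0
    for _ in range(n):
        x = g(x)
    return x

positive_root = (1 + math.sqrt(5)) / 2
negative_root = (1 - math.sqrt(5)) / 2

# |g'| is below 1 at the positive root and above 1 at the negative root.
print(abs(g_prime(positive_root)), abs(g_prime(negative_root)))

# Starting near the negative root, sixteen steps land near the positive root.
print(iterate(-0.62, 16))
```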

I should note that the method of successive approximations generally converges at a slower pace than Newton’s method. However, this method will be good enough when we use it to predict the precession in Mercury’s orbit.

A Postcard from Spokane

A brief aside from the current series on general relativity — and the mysterious 43 seconds of arc per century in Mercury’s orbit — that turned into further discussion about angle measurement.

A few months ago, I received this clever postcard from someone visiting Spokane, Washington. The sender clearly knew the recipient (me) well: rather than showing the jaw-dropping beauty of the Spokane area, the postcard impressed me with the mathematical precision given for Spokane’s location.

I started wondering about exactly how precisely the postcard was measuring the location of Spokane — was it the location of City Hall or some other important landmark? — and I went to Google Maps to find out. (For what it’s worth, xkcd had a comic about this some time ago.)

And then it finally hit me, after far longer than it should have taken, that the postcard is utterly nonsensical.

We would never say that someone’s height is 4 feet, 20 inches. There are 12 inches in a foot, and so we would instead say that the height is 5 feet, 8 inches.

Likewise, when specifying an angle with minutes and seconds, there are (just like with ordinary time) 60 seconds in a minute and 60 minutes in a degree (so that there are 3600 seconds in a degree). Therefore, specifying an angle with 67′ or 66″, as in the postcard, makes absolutely no sense.

Furthermore, if converted into standard notation, we obtain a location of $48^{\circ} 7' 36''$ north, $117^{\circ} 42' 6''$ west, which is about 40 miles NNW of Spokane. (Images made by https://www.gps-coordinates.net/). Note on the conversion into decimal:

$47 + \displaystyle \frac{67}{60} + \displaystyle \frac{36}{3600} = 48 + \displaystyle \frac{7}{60} + \displaystyle \frac{36}{3600} = 48.12666\dots$

and

$117 + \displaystyle \frac{41}{60} + \displaystyle \frac{66}{3600} = 117 + \displaystyle \frac{42}{60} + \displaystyle \frac{6}{3600} = 117.701666\dots$

It’s a shame that the designer of the postcard made this error, as I genuinely thought this was a clever and aesthetically pleasing design idea for a postcard.

While I’m not sure how this mistake happened, my best guess is that the designer used the location of $47.6736^\circ$ north, $117.4166^\circ$ west — which is indeed in Spokane — and then misconverted from decimal notation to minutes and seconds.
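The conversions above can be sketched in Python as follows. The function names are hypothetical, and `decimal_to_dms` assumes a nonnegative angle and makes no attempt to guard against floating-point edge cases.

```python
def normalize_dms(degrees, minutes, seconds):
    """Rewrite a degrees/minutes/seconds triple so that minutes and seconds are below 60."""
    total_seconds = degrees * 3600 + minutes * 60 + seconds
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds

def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees, minutes, and seconds to decimal degrees."""
    return degrees + minutes / 60 + seconds / 3600

def decimal_to_dms(angle):
    """Convert a nonnegative decimal angle to (degrees, minutes, seconds)."""
    degrees = int(angle)
    minutes_float = (angle - degrees) * 60
    minutes = int(minutes_float)
    seconds = (minutes_float - minutes) * 60
    return degrees, minutes, seconds

# The postcard's coordinates, rewritten in standard notation
print(normalize_dms(47, 67, 36), normalize_dms(117, 41, 66))

# What a correct conversion of the conjectured decimal location would look like
print(decimal_to_dms(47.6736), decimal_to_dms(117.4166))
```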

Confirming Einstein’s Theory of General Relativity With Calculus, Part 2d: Hyperbolas and Polar Coordinates

In a previous post, we showed that the polar equation

$r = \displaystyle \frac{\alpha}{1 + e \cos \theta}$

is equivalent to the rectangular equation

$\displaystyle \frac{\left(x + \displaystyle \frac{\alpha e}{1-e^2} \right)^2}{\displaystyle \frac{\alpha^2}{(1-e^2)^2}} + \frac{y^2}{\displaystyle \frac{\alpha^2}{1-e^2}} = 1$

as long as $e \ne 1$ (so that we are not dividing by $1-e^2 = 0$ ). Furthermore, if $0 < e < 1$ , then this represents an ellipse with eccentricity $e$ whose major axis lies on the $x$-axis, with one focus located at the origin.

While not directly related to our discussion of precession, it turns out that this equation represents a hyperbola if $e > 1$ . Under this assumption, $1-e^2 < 0$ and $e^2-1>0$ , so let me rewrite the previous equation in terms of $e^2-1$ :

$\displaystyle \frac{\left(x - \displaystyle \frac{\alpha e}{e^2-1} \right)^2}{\displaystyle \frac{\alpha^2}{(e^2-1)^2}} - \frac{y^2}{\displaystyle \frac{\alpha^2}{e^2-1}} = 1$

This matches the form of a left-right hyperbola

$\displaystyle \frac{(x-h)^2}{a^2} - \frac{(y-k)^2}{b^2} = 1$ ,

where the center of the hyperbola is located at

$(h,k) = \displaystyle \left( \frac{\alpha e}{e^2-1} , 0 \right)$

Also, for a hyperbola, the distance $c$ from the center to the foci satisfies

$c^2 = a^2 + b^2$ ,

so that

$c^2 = \displaystyle \frac{\alpha^2}{(e^2-1)^2} + \displaystyle \frac{\alpha^2}{e^2-1}$

$c^2 = \displaystyle \frac{\alpha^2 + \alpha^2 (e^2 - 1)}{(e^2-1)^2}$

$c^2 = \displaystyle \frac{\alpha^2 e^2}{(e^2-1)^2}$

$c = \displaystyle \frac{\alpha e}{e^2-1}$

The two foci are located a distance $c$ to the left and to the right of the center. Since it turns out that $c = h$ , this means that the origin is, once again, one of the foci of the hyperbola.

Furthermore, the eccentricity $c/a$ of the hyperbola is easily computed as

$\displaystyle \frac{c}{a} = \frac{ \displaystyle \frac{\alpha e}{e^2-1} }{ \displaystyle \frac{\alpha}{e^2-1}} = e$ ,

so that, once again, the well-chosen parameter $e$ is the eccentricity.
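As a quick numerical sanity check, here is a small Python sketch that plots nothing but simply plugs points from the polar equation into the left side of the hyperbola equation. The values $\alpha = 3$ and $e = 2$ and the sample angles are arbitrary choices of mine; each printed value should be very close to 1.

```python
import math

def polar_point(alpha, e, theta):
    """Rectangular coordinates of the point on r = alpha / (1 + e cos(theta))."""
    r = alpha / (1 + e * math.cos(theta))
    return r * math.cos(theta), r * math.sin(theta)

def hyperbola_lhs(alpha, e, x, y):
    """Left side of (x - h)^2 / a^2 - y^2 / b^2 = 1 for e > 1."""
    h = alpha * e / (e ** 2 - 1)
    a_sq = alpha ** 2 / (e ** 2 - 1) ** 2
    b_sq = alpha ** 2 / (e ** 2 - 1)
    return (x - h) ** 2 / a_sq - y ** 2 / b_sq

alpha, e = 3.0, 2.0   # illustrative values (assumptions)
for theta in (0.0, 0.5, 1.0, 1.5, 2.0):
    x, y = polar_point(alpha, e, theta)
    print(theta, hyperbola_lhs(alpha, e, x, y))
```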

Confirming Einstein’s Theory of General Relativity With Calculus, Part 2c: Circles, Parabolas, and Polar Coordinates

In the previous post, we showed that the polar equation

$r = \displaystyle \frac{\alpha}{1 + e \cos \theta}$

converts to

$x^2 (1-e^2) + 2 \alpha e x + y^2 = \alpha^2$

in rectangular coordinates. Furthermore, if $0 < e < 1$ , then this represents an ellipse with eccentricity $e$ whose semi-major axis lies along the $x-$ axis with one focus at the origin.

It turns out that, for different non-negative values of $e$ , the same polar equation represents different conic sections. These are not particularly relevant for our study of precession, but I’m including this anyway in this series as a small tangential discussion.

Let’s take a look at the easy case of $e = 0$ . With this substitution, the equation in rectangular coordinates simplifies to

$x^2 + y^2 = \alpha^2$ .

Of course, this is the equation of a circle that is centered at the origin with radius $\alpha$ .

The other easy case is $e = 1$ , so that $1-e^2 = 0$ . Then the equation in rectangular coordinates simplifies to

$2 \alpha x + y^2 = \alpha^2$

$y^2 = -2\alpha x + \alpha^2$

$y^2 = -2 \alpha \displaystyle \left( x - \frac{\alpha}{2} \right)$

$y^2 = -4 \cdot \displaystyle \frac{\alpha}{2} \left( x - \frac{\alpha}{2} \right)$

This matches the form of a parabola that opens to the left with a horizontal axis of symmetry:

$(y-k)^2 = -4 p (x-h)$ .

In this case, the vertex of the parabola is located at

$(h,k) = \displaystyle \left( \frac{\alpha}{2} , 0 \right)$ ,

while the focus of the parabola is located a distance $p = \displaystyle \frac{\alpha}{2}$ to the left of the vertex. In other words, the origin is the focus of the parabola. (For what it’s worth, the directrix of the parabola would be the vertical line $x = \alpha$ , located $p$ to the right of the vertex.)
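Here is a minimal Python sketch that checks both claims numerically for $e = 1$ : that points of the polar graph satisfy $(y-k)^2 = -4p(x-h)$ , and that each point is as far from the origin as it is from the vertical line located $p$ to the right of the vertex. The value $\alpha = 3$ and the sample angles are arbitrary choices of mine, and the function name is hypothetical.

```python
import math

def parabola_residuals(alpha, theta):
    """Return (equation residual, focus distance minus directrix distance) for e = 1."""
    h = alpha / 2                 # x-coordinate of the vertex
    p = alpha / 2                 # distance from the vertex to the focus
    directrix_x = h + p           # vertical line located p to the right of the vertex
    r = alpha / (1 + math.cos(theta))
    x, y = r * math.cos(theta), r * math.sin(theta)
    equation_residual = y ** 2 - (-4 * p * (x - h))
    distance_residual = math.hypot(x, y) - abs(directrix_x - x)
    return equation_residual, distance_residual

for theta in (0.0, 0.5, 1.0, 2.0, 2.5):
    print(theta, parabola_residuals(3.0, theta))
```

Both residuals should be zero up to rounding error.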

Confirming Einstein’s Theory of General Relativity With Calculus, Part 2b: Ellipses and Polar Coordinates

As part of our derivation, we’ll need to use the fact that, in polar coordinates, the graph of

$r = \displaystyle \frac{\alpha}{1 + e \cos \theta}$

turns out to be an ellipse if $0 < e < 1$ , with the origin at one focus.

We now prove this. Clearing the denominator, we obtain

$r + re \cos \theta = \alpha$ .

Switching to rectangular coordinates, this becomes

$\sqrt{x^2 + y^2} + e x = \alpha$

$\sqrt{x^2 + y^2} = \alpha - ex$

$x^2 + y^2 = (\alpha - ex)^2$

$x^2 + y^2 = \alpha^2 - 2\alpha ex + e^2 x^2$

$x^2 (1-e^2) + 2 \alpha e x + y^2 = \alpha^2$

$(1-e^2)\left(x^2 + 2 \displaystyle \frac{\alpha e}{1-e^2} x \right) + y^2 = \alpha^2$

$(1-e^2)\left(x^2 + 2 \displaystyle \frac{\alpha e}{1-e^2} x + \frac{\alpha^2 e^2}{(1-e^2)^2} \right) + y^2 = \alpha^2 + \displaystyle \frac{\alpha^2e^2}{1-e^2}$

$(1-e^2) \left(x + \displaystyle \frac{\alpha e}{1-e^2} \right)^2 + y^2 = \displaystyle \frac{\alpha^2(1-e^2)+\alpha^2 e^2}{1-e^2}$

$(1-e^2) \left(x + \displaystyle \frac{\alpha e}{1-e^2} \right)^2 + y^2 = \displaystyle \frac{\alpha^2}{1-e^2}$

$\displaystyle \frac{(1-e^2)^2}{\alpha^2} \left(x + \displaystyle \frac{\alpha e}{1-e^2} \right)^2 + \frac{1-e^2}{\alpha^2} y^2 = 1$

$\displaystyle \frac{\left(x + \displaystyle \frac{\alpha e}{1-e^2} \right)^2}{\displaystyle \frac{\alpha^2}{(1-e^2)^2}} + \frac{y^2}{\displaystyle \frac{\alpha^2}{1-e^2}} = 1$

Since we assumed that $0 < e < 1$ , we have $0 < 1 - e^2 < 1$ so that

$\displaystyle \frac{\alpha^2}{(1-e^2)^2} > \displaystyle \frac{\alpha^2}{1-e^2}$ .

Therefore, this matches the usual form of an ellipse in rectangular coordinates

$\displaystyle \frac{(x-h)^2}{a^2} + \frac{(y-k)^2}{b^2} = 1$ ,

where the center of the ellipse is located at

$(h,k) = \displaystyle \left( -\displaystyle \frac{\alpha e}{1-e^2}, 0 \right)$ ,

the semi-major axis is horizontal with length

$a = \displaystyle \frac{\alpha}{1-e^2}$ ,

and the semi-minor axis is vertical with length

$b = \displaystyle \frac{\alpha}{\sqrt{1-e^2}}$ .

Furthermore, the distance $c$ of the foci from the center of the ellipse satisfies the equation

$b^2 + c^2 = a^2$ ,

so that

$\displaystyle \frac{\alpha^2}{1-e^2} + c^2 = \displaystyle \frac{\alpha^2}{(1-e^2)^2}$

$c^2 = \displaystyle \frac{\alpha^2}{(1-e^2)^2} - \displaystyle \frac{\alpha^2}{1-e^2}$

$c^2 = \displaystyle \frac{\alpha^2 - \alpha^2(1-e^2)}{(1-e^2)^2}$

$c^2 = \displaystyle \frac{\alpha^2 e^2}{(1-e^2)^2}$

$c = \displaystyle \frac{\alpha e}{1-e^2}$

From this, we derive two nice properties of the ellipse. First, looking back on previous work, we see that $c = -h$ . Therefore, since the foci of the ellipse are a distance $c$ away from the center $(h,0)$ along the major axis, we conclude that one focus of the ellipse is located at $(h+c,0)$ , or $(0,0)$ . That is, the origin is one focus of the ellipse. (For the little it’s worth, the other focus is located at $(h-c,0)$ , or $(2h,0)$ .)

Second, the eccentricity of the ellipse is defined to be the ratio $c/a$ . This is now easily computed:

$\displaystyle \frac{c}{a} = \displaystyle \frac{\displaystyle \frac{\alpha e}{1-e^2}}{\displaystyle \frac{\alpha}{1-e^2}} = e$ .

In other words, the letter $e$ was well-chosen to represent the eccentricity of the ellipse.

For what it’s worth, here’s an alternate derivation of the formulas for $a$ and $b$ . For this ellipse, the planet’s closest approach to the Sun occurs at $\theta = 0$ :

$r(0) = \displaystyle \frac{\alpha}{1 + e \cos 0} = \frac{\alpha}{1 + e}$ ,

and the planet’s farthest distance from the Sun occurs at $\theta = \pi$ :

$r(\pi) = \displaystyle \frac{\alpha}{1 + e \cos \pi} = \frac{\alpha}{1 - e}$ .

Therefore, the length $2a$ of the major axis of the ellipse is the sum of these two distances:

$2a = \displaystyle \frac{\alpha}{1 + e} + \frac{\alpha}{1 - e}$

$2a = \displaystyle \frac{\alpha(1-e) + \alpha(1+e)}{(1 + e)(1 -e)}$

$2a= \displaystyle \frac{2\alpha}{1 - e^2}$

$a = \displaystyle \frac{\alpha}{1 - e^2}$ .

Since $c = ae$ , we can also compute $b$ :

$b^2 = a^2 - c^2$

$b^2 = a^2 - a^2 e^2$

$b^2 = a^2 ( 1-e^2)$

$b^2 = \displaystyle \frac{\alpha^2}{(1 - e^2)^2} (1-e^2)$

$b^2 = \displaystyle \frac{\alpha^2}{1 - e^2}$

$b = \displaystyle \frac{\alpha}{\sqrt{1 - e^2}}$
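To see these formulas in action, here is a short Python sketch. The values $\alpha = 3$ and $e = 0.8$ are arbitrary illustrative choices, and the function names are mine. For several points on the polar graph, it evaluates the left side of the ellipse equation (which should be 1) and the sum of the distances to the two foci, a distance $c$ on either side of the center (which should be $2a$ ).

```python
import math

def ellipse_parameters(alpha, e):
    """Center x-coordinate h, semi-axes a and b, and focal distance c for 0 < e < 1."""
    h = -alpha * e / (1 - e ** 2)
    a = alpha / (1 - e ** 2)
    b = alpha / math.sqrt(1 - e ** 2)
    c = math.sqrt(a ** 2 - b ** 2)
    return h, a, b, c

def check_point(alpha, e, theta):
    """Return (left side of the ellipse equation, sum of distances to the foci)."""
    h, a, b, c = ellipse_parameters(alpha, e)
    r = alpha / (1 + e * math.cos(theta))
    x, y = r * math.cos(theta), r * math.sin(theta)
    lhs = (x - h) ** 2 / a ** 2 + y ** 2 / b ** 2
    focal_sum = math.hypot(x - (h + c), y) + math.hypot(x - (h - c), y)
    return lhs, focal_sum

alpha, e = 3.0, 0.8   # illustrative values (assumptions)
h, a, b, c = ellipse_parameters(alpha, e)
print("h + c =", h + c, " c/a =", c / a, " 2a =", 2 * a)
for theta in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(theta, check_point(alpha, e, theta))
```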

Confirming Einstein’s Theory of General Relativity With Calculus, Part 2a: Graphically Exploring Precession

But what is precession? To explore this concept, let’s explore the graph of

$r = \displaystyle \frac{a}{1 + e \cos (1-k)\theta}$

for various values of $a$ , $e$ , and $k$ using Desmos. (Note that, in this context, the number $e$ does not mean Euler’s number $2.718\dots$ . The reason for choosing the letter $e$ for this parameter will become clear shortly.) Naturally, this demonstration could also be done with other tools like a graphing calculator.

I suggest beginning by setting $e=0$ and $k=0$ and altering the value of $a$ . This is the easiest behavior to explain. From the equation, $a$ is directly proportional to the distance from the origin $r$ . So, not surprisingly, increasing $a$ produces a larger graph, and decreasing $a$ produces a smaller graph.

Second, I suggest setting $a=3$ and $k=0$ but altering the value of $e$ . Starting at $e=0$ , the graph is a circle. This makes complete sense: if $e=0$ , then the equation simply becomes $r=a$ , so the distance from the origin is the same for all angles. However, as $e$ increases, the original circle becomes more and more stretched out. We will prove this analytically in a later post, but it turns out that, for $0 < e < 1$ , the graph is an ellipse, and the origin is one of the foci of the ellipse. The number $e$ is called the eccentricity of the ellipse (hence the letter $e$ ).

Again, if the value of $e$ is fixed but $a$ varies, the graph becomes either larger or smaller as $a$ becomes larger or smaller.

We notice that if $0 < e < 1$ and $k=0$ , then the denominator of

$r = \displaystyle \frac{a}{1 + e \cos \theta}$

varies between $1-e$ and $1+e$ . In particular, the denominator is always positive. Therefore, the value of $r$ is smallest (the graph is closest to the origin) when the denominator is greatest. This happens when $\theta$ is a multiple of $2\pi$ . So, for example, when $\theta = 0$ , then $r = a/(1+e)$ is as close as the graph gets to the origin. Let’s call this closest distance $P$ ; in the context of a planet’s orbit around the Sun, this represents the perihelion. Then we have $P = \displaystyle \frac{a}{1+e}$ .

When $e=1$ , the graph switches from an ellipse to a parabola, where the origin is the focus of the parabola. For $e>1$ , the graph becomes a hyperbola. However, since we’re mostly going to be concerned with stable planetary orbits in this series, we won’t dwell too much on the case $e \ge 1$ .

Third, I suggest setting $a=3$ and $e=0.8$ and then altering the value of $k$ . For $k=0$ , the graph is simply a single ellipse. However, by changing the value of $k$ , the graph changes into a spiral.

In the above figure, the spiral stopped “spiraling” because I had asked Desmos only to show the graph between $0 \le \theta \le 20 \pi$ . If I had changed the upper bound to something larger than $20\pi$ , the spiral would continue.

The precession in the spiral is defined to be the angular offset between each loop of the spiral. Clearly, this is a function of $k$ . To find this function, we again examine the function

$r = \displaystyle \frac{a}{1 + e \cos (1-k)\theta}$

Once again, if $0 < e < 1$ , then the denominator varies between $1-e$ and $1+e$ . In particular, the denominator is always positive. Therefore, the value of $r$ is least positive when the denominator is greatest, and the denominator is greatest when $(1-k)\theta$ is a multiple of $2\pi$ . So, for example, when $\theta = 0$ , then $r = a/(1+e)$ is as close as the graph gets to the origin.

When does the graph return to its closest point to the origin next? This would occur when $(1-k)\theta = 2\pi$ , or $\theta = \displaystyle \frac{2\pi}{1-k}$ . If $k =0$ , then the angle of closest approach to the origin would be $\theta =2\pi$ , and the graph simply cycles over itself. However, if $k > 0$ , then this angle $\theta$ will be larger than $2\pi$ , thus producing a spiral. Indeed, the amount of precession would be equal to

$\displaystyle \frac{2\pi}{1-k} - 2\pi = \frac{2\pi k}{1-k}$ .

In the picture above, $k = 0.05$ . Therefore, the amount of precession would be $\displaystyle \frac{2\pi (0.05)}{1-0.05} = \frac{2\pi}{19}$ radians $\approx 18.95^\circ$ . Therefore, after 19 “leafs” of the spiral, the graph would begin to cycle on top of itself.
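For those who would rather compute than graph, the precession formula can be sketched in Python as follows. The function names are mine, and the values $a = 3$ , $e = 0.8$ , and $k = 0.05$ are the ones suggested above.

```python
import math

def r(theta, a, e, k):
    """Distance from the origin for r = a / (1 + e cos((1 - k) theta))."""
    return a / (1 + e * math.cos((1 - k) * theta))

def precession(k):
    """Angular offset between successive closest approaches, in radians."""
    return 2 * math.pi / (1 - k) - 2 * math.pi

a, e, k = 3.0, 0.8, 0.05
theta_next = 2 * math.pi + precession(k)   # angle of the next closest approach

# The graph returns to its minimum distance a / (1 + e) at theta_next.
print(r(0.0, a, e, k), r(theta_next, a, e, k), a / (1 + e))

# Precession per loop, in degrees
print(math.degrees(precession(k)))
```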

Confirming Einstein’s Theory of General Relativity With Calculus, Part 1c: Outline of Argument

This is going to be a very long series, so I’d like to provide a tree-top view of how the argument will unfold.

We begin by using three principles from Newtonian physics — the Law of Conservation of Angular Momentum, Newton’s Second Law, and Newton’s Law of Gravitation — to show that the orbit of a planet, under Newtonian physics, satisfies the initial-value problem

$u''(\theta) + u(\theta) = \displaystyle \frac{GMm^2}{\ell^2}$ ,

$u(0) = \displaystyle \frac{1}{P}$ ,

$u'(0) = 0$ .

In these equations:

The orbit of the planet is in polar coordinates $(r,\theta)$ , where the Sun is placed at the origin.
The planet’s perihelion — closest distance from the Sun — is a distance of $P$ at angle $\theta = 0$ .
The function $u(\theta)$ is equal to $\displaystyle \frac{1}{r(\theta)}$ .
$G$ is the gravitational constant of the universe.
$M$ is the mass of the Sun.
$m$ is the mass of the planet.
$\ell$ is the angular momentum of the planet.

The solution of this differential equation is

$u(\theta) = \displaystyle \frac{1 + \epsilon \cos \theta}{\alpha}$ ,

so that

$r(\theta) = \displaystyle \frac{\alpha}{1 + \epsilon \cos \theta}$ .

In polar coordinates, this is the graph of an ellipse. Substituting $\theta = 0$ , we see that

$P = \displaystyle \frac{\alpha}{1 + \epsilon}$ .

In the solution for $u(\theta)$ , we have $\alpha = \displaystyle \frac{\ell^2}{GMm^2}$ and $\epsilon = \displaystyle \frac{\alpha - P}{P}$ . The number $\epsilon$ is the eccentricity of the ellipse, while $\alpha = P(1+\epsilon)$ is proportional to the size of the ellipse.
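We'll derive this solution analytically later in the series. In the meantime, here is a rough numerical check in Python: it integrates the Newtonian initial-value problem over one revolution with the classical fourth-order Runge-Kutta method (my choice of method, not something used in the series) and compares the result with $(1 + \epsilon \cos \theta)/\alpha$ , where $\epsilon = (\alpha - P)/P$ . The values $\alpha = 3$ and $P = 2$ and the number of steps are arbitrary assumptions.

```python
import math

def solve_newtonian_orbit(alpha, P, steps=2000):
    """Integrate u'' + u = 1/alpha with u(0) = 1/P, u'(0) = 0 over one revolution (RK4)."""
    h = 2 * math.pi / steps
    u, v = 1.0 / P, 0.0          # v stands for u'
    thetas, us = [0.0], [u]

    def rhs(u, v):
        # Equivalent first-order system: u' = v, v' = 1/alpha - u
        return v, 1.0 / alpha - u

    for n in range(steps):
        k1u, k1v = rhs(u, v)
        k2u, k2v = rhs(u + 0.5 * h * k1u, v + 0.5 * h * k1v)
        k3u, k3v = rhs(u + 0.5 * h * k2u, v + 0.5 * h * k2v)
        k4u, k4v = rhs(u + h * k3u, v + h * k3v)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        thetas.append((n + 1) * h)
        us.append(u)
    return thetas, us

def max_error(alpha, P):
    """Largest gap between the numerical solution and (1 + eps cos(theta)) / alpha."""
    eps = (alpha - P) / P
    thetas, us = solve_newtonian_orbit(alpha, P)
    return max(abs(u - (1 + eps * math.cos(t)) / alpha) for t, u in zip(thetas, us))

print(max_error(alpha=3.0, P=2.0))
```

The printed value is the worst-case discrepancy over the revolution; it should be tiny compared with the size of $u$ itself.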

Under general relativity, the governing initial-value problem changes to

$u''(\theta) + u(\theta) = \displaystyle \frac{GMm^2}{\ell^2} + \frac{3GM}{c^2} [u(\theta)]^2$ ,

$u(0) = \displaystyle \frac{1}{P}$ ,

$u'(0) = 0$ ,

where $c$ is the speed of light. We will see that the solution of this new differential equation can be well approximated by

$u(\theta) = \displaystyle \frac{1 + \epsilon \cos \theta}{\alpha} + \frac{\delta}{\alpha^2} + \frac{\delta \epsilon^2}{2\alpha^2} + \frac{\delta\epsilon}{\alpha^2} \theta \sin \theta - \frac{\delta \epsilon^2}{6\alpha^2} \cos 2\theta - \frac{\delta(3+\epsilon^2)}{3\alpha^2} \cos \theta$

$\approx \displaystyle \frac{1}{\alpha} \left[1 + \epsilon \cos \left(\theta - \frac{\delta \theta}{\alpha} \right) \right]$ .

This last equation describes a spiral that precesses by approximately

$\displaystyle \frac{2\pi \delta}{\alpha} \quad$ radians per orbit,

or, since $\delta = \displaystyle \frac{3GM}{c^2}$ and $\alpha = a(1-\epsilon^2)$ ,

$\displaystyle \frac{6\pi G M}{a c^2 (1-\epsilon^2)} \quad$ radians per orbit,

where $a$ is the length of the semimajor axis of the orbit.

This matches the amount of precession in Mercury’s orbit that is not explained by Newtonian physics, thus confirming Einstein’s theory of general relativity.
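As a preview, here is a rough Python sketch of that final calculation. The numerical constants are commonly quoted reference values that I am supplying as assumptions; they do not come from this post, and slightly different published values will change the last digit or two of the answer.

```python
import math

# Commonly quoted reference values (assumptions, not taken from this series)
GM_SUN = 1.32712440018e20   # gravitational parameter GM of the Sun, in m^3/s^2
C = 299_792_458.0           # speed of light, in m/s
A_MERCURY = 5.7909e10       # semimajor axis of Mercury's orbit, in m
ECC_MERCURY = 0.2056        # eccentricity of Mercury's orbit
PERIOD_DAYS = 87.969        # orbital period of Mercury, in days

def precession_per_orbit(gm, a, ecc, c=C):
    """Relativistic precession 6 pi G M / (a c^2 (1 - ecc^2)), in radians per orbit."""
    return 6 * math.pi * gm / (a * c ** 2 * (1 - ecc ** 2))

orbits_per_century = 36525 / PERIOD_DAYS
radians_per_century = precession_per_orbit(GM_SUN, A_MERCURY, ECC_MERCURY) * orbits_per_century
arcsec_per_century = math.degrees(radians_per_century) * 3600
print(arcsec_per_century)
```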

To the extent possible, I will take the perspective of a good student who has taken Precalculus and Calculus I. However, I will have to break this perspective a couple of times when I discuss principles from physics and derive the solutions of the above differential equations.

Here we go…

Confirming Einstein’s Theory of General Relativity With Calculus, Part 1b: Precession of Mercury

The figure below shows the (greatly exaggerated) effect of precession on a planet’s otherwise elliptical orbit. In the figure, each perihelion is precessed by an angle of $40^\circ$ . After nine orbits, the planet returns to its original position. Suppose, for the sake of argument, that each orbit of the planet depicted in the figure is four months, or one third of Earth’s year. Then the amount of precession would be $40^\circ$ per four months, or $120^\circ$ per year, or $12,000^\circ$ per century.

As I said, the figure above is greatly exaggerated. As we’ll see by the end of this series, Einstein’s general relativity predicts that, on top of the gravitational influences of the other planets, the orbit of Mercury should precess by 43″ of arc per century. That’s a really small angle. Since 1 $^\circ$ is equal to 60′ (minutes) of arc and each 1′ is equal to 60″ (seconds) of arc, 1″ of arc is the same as $(1/3600)^\circ$ , so that 43″ of arc per century is about $0.012^\circ$ per century. That’s about a million times smaller than the precession of the fictitious planet in the above figure.

How small is $0.012^\circ$ , really?

Courtesy of Wikipedia, the pictures below are the Copernicus crater on the Moon as well as an indicator of its location on the Moon. It is visible with binoculars.

The diameter of the crater is 93 km. Since the Moon is 384,400 km from Earth, that means the angle subtended by the crater, as viewed from the Earth, is about

$\arctan \left( \frac{93}{384,400} \right) \approx 0.014^\circ$ .

So how much is 43″ of arc per century? That’s about the same speed as, hypothetically, pointing at the left edge of this lunar crater (which cannot be seen by the naked eye) and then slowly moving your finger so that, about 115 years later, your finger is pointing at the right edge of the crater.

Said another way, the diameter of the Moon is about 3475 km, so that the angle subtended by the Moon, as viewed from the Earth, is about

$\arctan \left( \frac{3475}{384,400} \right) \approx 0.518^\circ$ .

So, at the rate of $0.012^\circ$ per century, it would take $0.518/0.012 \approx 43$ centuries, or about 4,300 years, to trace the angle subtended by the Moon.

Needless to say, 43″ of arc per century is really, really slow.
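The arithmetic in this post is easy to reproduce. Here is a small Python sketch using the same distances quoted above; the function name `subtended_angle_deg` is my own.

```python
import math

ARCSEC_PER_DEGREE = 3600

def subtended_angle_deg(size_km, distance_km):
    """Angle, in degrees, subtended by an object of the given size at the given distance."""
    return math.degrees(math.atan(size_km / distance_km))

rate_deg = 43 / ARCSEC_PER_DEGREE                 # precession rate, degrees per century
crater_deg = subtended_angle_deg(93, 384_400)     # Copernicus crater
moon_deg = subtended_angle_deg(3475, 384_400)     # the whole Moon

print(rate_deg, crater_deg, moon_deg)

# Centuries needed to sweep across the crater and across the Moon at this rate
print(crater_deg / rate_deg, moon_deg / rate_deg)
```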

Nevertheless, and remarkably, this itty, bitty precession was observable by careful 19th-century astronomers with the telescopes that were available then. At the time, this precession was the great unsolved mystery of Newtonian physics, one that was only answered two generations later with the discovery of general relativity.

Snell’s Law and a mystery novel

Lately, for my own leisure reading, I’ve been enjoying the murder-mystery novels of Dorothy Sayers. Her books are an enjoyable trip back in time, as she paints a very vivid portrait of English life during the interwar years of the 1920s and 1930s. (Of course, at the time she was writing, no one had any idea that the Great War would not actually be the war to end all wars, as was the popular sentiment of the time.) Indeed, her first novel was published literally a century ago in 1923. The lead character, Lord Peter Wimsey (back then, the aristocracy was still part of English culture), has a distinctive way of speaking that makes the novels so delightful. A hallmark of the Sayers novels is that she didn’t merely write whodunit stories; instead, she strove to write novels in which a detective story happens to happen.

As an aside, I learned in her novel Gaudy Night that the adjective Oxonian means “related to Oxford,” which led me to further learn that my hometown of Oxon Hill, Maryland was so named because somebody, centuries ago, thought that the landscape of that part of the state reminded him of Oxford, England. While that comparison might have been reasonable centuries ago, it certainly would raise eyebrows today.

Anyway, with all that as background, in her story Unnatural Death, the following figure depicts an aerial view of a witness’s testimony at a key point in the story. I think I can describe this much of the scene without giving away the plot: the witnesses stood just inside the door of elderly Miss Dawson’s bedroom. A screen blocked direct observation of Miss Dawson as she lay in bed, but the witnesses could see Miss Dawson in the mirror.

As I read the novel, I immediately noticed that the mirror in the figure was not a perfect reflector… at the mirror, the angles of reflection of the dashed path of light are quite different. Indeed, I pulled out my protractor: the angle where the word “Mirror” is located has a measure of about 52 degrees, while the opposite reflected angle has a measure of about 72 degrees.

As this was part of a murder-mystery novel, I thought: what could be the cause of this disparity? To be a good detective, any explanation, no matter how implausible, must be thought of and reasoned out.

One explanation of the different angles is that, somehow, the speed of light changed in the room. This is the same principle behind Snell’s Law, which explains the refraction of light as it travels between air and water. Since the speed of light in air ( $c_1$ ) is different than the speed of light in water ( $c_2$ ), the angle of incidence ( $\theta_1$ ) is different from the angle of refraction ( $\theta_2$ ), but they are related through the formula

$\displaystyle \frac{\sin \theta_1}{c_1} = \frac{\sin \theta_2}{c_2}$ .

This relationship occurs because of Fermat’s principle, which says that light always travels in a path that requires the least amount of time. Ordinarily, this means that light travels in a straight line. However, if the speed of light should change (say, when traveling through both air and water), then the path of the light is refracted.

Fermat’s principle also explains why light reflects at equal angles if the speed of light is constant (as amusingly illustrated in this PBS video by Dianna Cowern, a.k.a. Physics Girl). However, if the speed of light should somehow change in the room at the point where the light reflects, then the light would bounce at a different angle for the same reason that Snell’s Law works.

In this case, the angles $\theta_1$ and $\theta_2$ are complementary to the 52-degree and 72-degree angles, respectively. By the cofunction trigonometric identities, this means that

$\sin \theta_1 = \cos 52^\circ \quad$ and $\quad \sin \theta_2 = \cos 72^\circ$ ,

so that Snell’s Law can be rewritten as

$\displaystyle \frac{c_1}{c_2} = \frac{\cos 52^\circ}{\cos 72^\circ} \approx 1.992$ .

In other words, one explanation for the unusual path of light is that the speed of light was almost exactly twice as fast in one part of the room as in the other part… and the exact threshold of this change occurred at the point where the light hit the mirror. Perhaps there was some kind of fog, mist, or other contaminant in the air near poor Miss Dawson that was so thick that light slowed to half its usual speed. So that’s one explanation.
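For anyone who wants to try other protractor readings, the calculation can be sketched in Python. The function name `speed_ratio` is mine; it takes the two measured angles (the ones complementary to $\theta_1$ and $\theta_2$ ) in degrees.

```python
import math

def speed_ratio(angle1_deg, angle2_deg):
    """Ratio c1/c2 from Snell's Law, given the angles complementary to theta1 and theta2."""
    theta1 = math.radians(90 - angle1_deg)
    theta2 = math.radians(90 - angle2_deg)
    return math.sin(theta1) / math.sin(theta2)

print(speed_ratio(52, 72))
```

If the two measured angles were equal, as they should be for an honest mirror, the ratio would be exactly 1.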

The other explanation, of course, is that the artist who drew the picture just did a lousy job depicting the reflected light.

As this was part of a murder-mystery, both options are still open to investigation. (Yes, that was tongue-in-cheek.)

For what it’s worth, the figure in my book was not exactly the same as Sayers’ original drawing — clearly, modern word processing was used that was unavailable in the 1930s. One of these days, I may visit the Wade Center in Wheaton, Illinois, which has an impressive collection of Sayers’ works, to peruse a first-run printing of Unnatural Death to see if the figure in my book is faithful to the one that appeared when the novel was first published.