Solving Problems Submitted to MAA Journals (Part 7b)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

Not quite knowing how to start, I decided to begin by simplifying the problem and assume that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn’t solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I solved this special case in the previous post.

Next, to work on a special case that was somewhere between the general case and the first special case, I kept X as a standard normal distribution but changed Y to have a nonzero mean. As it turned out, this significantly complicated the problem (as we’ll see in the next post), and I got stuck.

So I changed course: for a second attempt, I kept X as a standard normal distribution but changed Y so that E(Y) = 0 and \hbox{SD}(Y) = \sigma, where \sigma could be something other than 1. The goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = \sigma^2+1, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next, I wrote Y = \sigma Z, where Z has a standard normal distribution. Then

E(X I_{X>\sigma Z}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x e^{-x^2/2} e^{-z^2/2} \, dx dz,

where we have used the joint probability density function for the independent random variables X and Z. The region of integration is \{(x,z) \in \mathbb{R}^2 \mid x > \sigma z \}, matching the requirement X > \sigma Z. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_{\sigma z}^\infty e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-\sigma^2 z^2/2} \right] e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-(\sigma^2+1) z^2/2} \, dz.

At this point, I rewrote the integrand to be the probability density function of a random variable:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{\sigma^2+1}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{ \frac{1}{\sigma^2+1}}} \exp \left[ -\frac{z^2}{2 \cdot \frac{1}{\sigma^2+1}} \right] \, dz.

The integrand is the probability density function of a normal random variable with mean 0 and variance \frac{1}{\sigma^2+1}, and so the integral must be equal to 1. We conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{\sqrt{2\pi(\sigma^2+1)}} }{ \frac{1}{2} } = \sqrt{\frac{2}{\pi(\sigma^2+1)}}.

Next, we compute the other conditional expectation:

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{E(X^2 I_{X>\sigma Z})}{P(X>\sigma Z)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x^2 e^{-x^2/2} e^{-z^2/2} \, dx dz.

The inner integral can be computed using integration by parts:

\displaystyle \int_{\sigma z}^\infty x^2 e^{-x^2/2} \, dx = \int_{\sigma z}^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_{\sigma z}^\infty + \int_{\sigma z}^\infty e^{-x^2/2} \, dx

= \sigma z e^{-\sigma^2 z^2/2} + \displaystyle \int_{\sigma z}^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-\sigma^2 z^2/2} e^{-z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz

= \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-(\sigma^2+1) z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz.

We could calculate the first integral, but we can immediately see that it’s going to be equal to 0 since the integrand z e^{-(\sigma^2+1) z^2/2} is an odd function. The double integral is equal to P(X>\sigma Z), which we’ve already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \frac{2}{\pi(\sigma^2 + 1)},

which is indeed less than 1. If \sigma = 1, we recover the conditional variance found in the first special case.
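As a sanity check on this formula, here is a minimal numerical sketch in Python. It is not part of the derivation above: it integrates over y first instead of x, using P(Y < x) = \Phi(x/\sigma), and then applies a simple midpoint rule. The function names, the integration window [-10, 10], and the number of steps are all my own illustrative choices.

```python
import math

def phi(x):
    """Standard normal probability density function."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def conditional_variance(sigma, lo=-10.0, hi=10.0, steps=20000):
    """Midpoint-rule estimate of Var(X | X > Y), X ~ N(0,1), Y ~ N(0, sigma^2)."""
    h = (hi - lo) / steps
    m1 = 0.0  # accumulates E(X I_{X>Y})
    m2 = 0.0  # accumulates E(X^2 I_{X>Y})
    for i in range(steps):
        x = lo + (i + 0.5) * h
        # Density of X at x times P(Y < x) = Phi(x / sigma), times the step width.
        w = phi(x) * Phi(x / sigma) * h
        m1 += x * w
        m2 += x * x * w
    p = 0.5  # P(X > Y), by the symmetry argument in the text
    mean = m1 / p
    return m2 / p - mean ** 2

def closed_form(sigma):
    """The formula derived above: 1 - 2 / (pi (sigma^2 + 1))."""
    return 1.0 - 2.0 / (math.pi * (sigma ** 2 + 1.0))

for sigma in (0.5, 1.0, 2.0, 3.0):
    print(sigma, conditional_variance(sigma), closed_form(sigma))
```

The two printed columns should agree closely for each value of \sigma, which is reassuring but of course not a proof.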

After tackling these two special cases, we start the general case with the next post.

Solving Problems Submitted to MAA Journals (Part 7a)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

I admit I did a double-take when I first read this problem. If X and Y are independent, then the event X>Y contains almost no information. How then, so I thought, could the conditional distribution of X given X>Y be narrower than the unconditional distribution of X?

Then I thought: I can believe that E(X \mid X > Y) is greater than E(X): if we’re given that X>Y, then we know that X must be larger than something. So maybe it’s possible for \hbox{Var}(X \mid X>Y) to be less than \hbox{Var}(X).

Still, not quite knowing how to start, I decided to begin by simplifying the problem and assume that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn’t solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I also hoped that solving this special case might give me some psychological confidence that I would eventually be able to solve the general case.

For the special case, the goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = 2, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next,

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_y^\infty x e^{-x^2/2} e^{-y^2/2} \, dx dy,

where we have used the joint probability density function for X and Y. The region of integration is \{(x,y) \in \mathbb{R}^2 \mid x > y \}, taking care of the requirement X > Y. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_y^\infty e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-y^2/2} \right] e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-y^2} \, dy.

At this point, I used a standard technique/trick of integration by rewriting the integrand to be the probability density function of a random variable. In this case, the random variable is normally distributed with mean 0 and variance 1/2:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{2}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{1/2}} \exp \left[ -\frac{y^2}{2 \cdot \frac{1}{2}} \right] \, dy.

The integral must be equal to 1, and so we conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{2\sqrt{\pi}} }{ \frac{1}{2} } = \frac{1}{\sqrt{\pi}}.

We parenthetically note that E(X \mid X>Y) > 0, matching my initial intuition.

Next, we compute the other conditional expectation:

E(X^2 \mid X > Y) = \displaystyle \frac{E(X^2 I_{X>Y})}{P(X>Y)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_y^\infty x^2 e^{-x^2/2} e^{-y^2/2} \, dx dy.

The inner integral can be computed using integration by parts:

\displaystyle \int_y^\infty x^2 e^{-x^2/2} \, dx = \int_y^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_y^\infty + \int_y^\infty e^{-x^2/2} \, dx

= y e^{-y^2/2} + \displaystyle \int_y^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > Y) = \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2/2} e^{-y^2/2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy

= \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy.

We could calculate the first integral, but we can immediately see that it’s going to be equal to 0 since the integrand y e^{-y^2} is an odd function. The double integral is equal to P(X>Y), which we’ve already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \left( \frac{1}{\sqrt{\pi}} \right)^2 = 1 - \frac{1}{\pi},

which is indeed less than 1.

This solves the problem for the special case of two independent standard normal random variables. This of course does not yet solve the general case, but my hope was that solving this problem might give me some intuition about the general case, which I’ll develop as this series progresses.
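The values 1/\sqrt{\pi} and 1 - 1/\pi can also be checked by simulation. The following is a rough Monte Carlo sketch in Python, assuming nothing beyond the standard library; the function name, the number of trials, and the seed are arbitrary choices of mine, and the estimates are only accurate to a couple of decimal places.

```python
import math
import random

def simulate_conditional_moments(trials=200_000, seed=0):
    """Estimate E(X | X > Y) and Var(X | X > Y) for independent standard normals."""
    rng = random.Random(seed)
    kept = []
    for _ in range(trials):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        if x > y:              # keep only the draws where the event X > Y occurs
            kept.append(x)
    n = len(kept)
    mean = sum(kept) / n
    var = sum((v - mean) ** 2 for v in kept) / n
    return mean, var

mean, var = simulate_conditional_moments()
print("mean:", mean, "vs", 1.0 / math.sqrt(math.pi))
print("variance:", var, "vs", 1.0 - 1.0 / math.pi)
```

About half of the simulated pairs satisfy X > Y, which matches P(X>Y) = \frac{1}{2}.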

Solving Problems Submitted to MAA Journals (Part 6e)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

Let D_r be the interior of the circle centered at the origin O with radius r. Also, let C(P,Q) denote the circle with diameter \overline{PQ}, and let R = OP be the distance of P from the origin.

In the previous post, we showed that

\hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) = \sqrt{1-r^2}.

To find \hbox{Pr}(C(P,Q) \subset D_1), I will integrate over this conditional probability:

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr,

where F(r) is the cumulative distribution function of R. For 0 \le r \le 1,

F(r) = \hbox{Pr}(R \le r) = \hbox{Pr}(P \in D_r) = \displaystyle \frac{\hbox{area}(D_r)}{\hbox{area}(D_1)} = \frac{\pi r^2}{\pi} = r^2.

Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr

= \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr.

To calculate this integral, I’ll use the substitution u = 1-r^2. Then the endpoints r=0 and r=1 become u = 1-0^2 = 1 and u = 1-1^2 = 0. Also, du = -2r \, dr. Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr

= \displaystyle \int_1^0 -\sqrt{u} \, du

= \displaystyle \int_0^1 \sqrt{u} \, du

= \displaystyle \frac{2}{3} \left[  u^{3/2} \right]_0^1

=\displaystyle  \frac{2}{3}\left[ (1)^{3/2} - (0)^{3/2} \right]

= \displaystyle \frac{2}{3},

confirming the answer I had guessed from simulations.
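For readers who would like to try such a simulation themselves, here is one possible sketch in Python. It is not necessarily the code I originally used; the function names, trial count, and seed are illustrative assumptions. It relies on two facts: the distance R of a uniform point from the origin has cumulative distribution function F(r) = r^2, so R can be generated as the square root of a uniform random number, and the circle with diameter \overline{PQ} lies inside the unit circle exactly when the distance from O to its center plus its radius is less than 1.

```python
import math
import random

def random_point_in_unit_disk(rng):
    """Uniform point in the unit disk; the radius is sqrt(U) because F(r) = r^2."""
    r = math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    return r * math.cos(theta), r * math.sin(theta)

def estimate_probability(trials=200_000, seed=0):
    """Monte Carlo estimate of Pr(C(P,Q) lies inside the unit circle)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        px, py = random_point_in_unit_disk(rng)
        qx, qy = random_point_in_unit_disk(rng)
        # Center of C(P,Q) is the midpoint of PQ; its radius is half of |PQ|.
        center_dist = math.hypot((px + qx) / 2.0, (py + qy) / 2.0)
        radius = math.hypot(px - qx, py - qy) / 2.0
        if center_dist + radius < 1.0:
            hits += 1
    return hits / trials

print(estimate_probability(), "vs", 2.0 / 3.0)
```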

Solving Problems Submitted to MAA Journals (Part 5e)

The following problem appeared in Volume 96, Issue 3 (2023) of Mathematics Magazine.

Evaluate the following sums in closed form:

f(x) = \displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{x^2}{2!} - \frac{x^4}{4!} \dots + (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right)

and

g(x) = \displaystyle \sum_{n=0}^\infty \left( \sin x - x + \frac{x^3}{3!} - \frac{x^5}{5!} \dots + (-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!} \right).

By using the Taylor series expansions of \sin x and \cos x and flipping the order of a double sum, I was able to show that

f(x) = -\displaystyle \frac{x \sin x}{2} \qquad \hbox{and} \qquad g(x) = \frac{x\cos x - \sin x}{2}.

I immediately got to thinking: there’s nothing particularly special about \sin x and \cos x for this analysis. Is there a way of generalizing this result to all functions with a Taylor series expansion?

Suppose

h(x) = \displaystyle \sum_{k=0}^\infty a_k x^k,

and let’s use the same technique to evaluate

\displaystyle \sum_{n=0}^\infty \left( h(x) - \sum_{k=0}^n a_k x^k \right) = \sum_{n=0}^\infty \sum_{k=n+1}^\infty a_k x^k

= \displaystyle \sum_{k=1}^\infty \sum_{n=0}^{k-1} a_k x^k

= \displaystyle \sum_{k=1}^\infty k a_k x^k

= x \displaystyle \sum_{k=1}^\infty k a_k x^{k-1}

= x \displaystyle \sum_{k=1}^\infty \left(a_k x^k \right)'

= x \displaystyle \left[ (a_0)' +  \sum_{k=1}^\infty \left(a_k x^k \right)' \right]

= x \displaystyle \sum_{k=0}^\infty \left(a_k x^k \right)'

= x \displaystyle \left( \sum_{k=0}^\infty a_k x^k \right)'

= x h'(x).
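A quick numerical experiment makes this general result more believable. The sketch below, written in Python, uses h(x) = e^x as a test function (my own choice, not one of the functions from the original problem), for which a_k = 1/k! and the claimed sum is x e^x. The function name and the number of terms are arbitrary.

```python
import math

def sum_of_tails(x, terms=60):
    """Sum over n = 0..terms-1 of exp(x) minus its degree-n Taylor polynomial."""
    total = 0.0
    partial = 0.0   # running Taylor polynomial sum_{k=0}^{n} x^k / k!
    term = 1.0      # current term x^k / k!, starting at k = 0
    for n in range(terms):
        partial += term                 # partial now includes the x^n / n! term
        total += math.exp(x) - partial  # tail of the series after degree n
        term *= x / (n + 1)             # advance to x^(n+1) / (n+1)!
    return total

for x in (-2.0, 0.5, 1.5):
    print(x, sum_of_tails(x), x * math.exp(x))
```

Since (e^x)' = e^x, the second and third printed columns should agree to many decimal places.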

To see why this matches our above results, let’s start with h(x) = \cos x and write out the full Taylor series expansion, including zero coefficients:

\cos x = 1 + 0x - \displaystyle \frac{x^2}{2!} + 0x^3 + \frac{x^4}{4!} + 0x^5 - \frac{x^6}{6!} \dots,

so that

x (\cos x)' = \displaystyle \sum_{n=0}^\infty \left( \cos x - \sum_{k=0}^n a_k x^k \right)

or

-x \sin x= \displaystyle \left(\cos x - 1 \right) + \left(\cos x - 1 + 0x \right) + \left( \cos x -1 + 0x + \frac{x^2}{2!} \right) + \left( \cos x -1 + 0x + \frac{x^2}{2!} + 0x^3 \right)

\displaystyle + \left( \cos x -1 + 0x + \frac{x^2}{2!} + 0x^3 - \frac{x^4}{4!} \right) + \left( \cos x -1 + 0x + \frac{x^2}{2!} + 0x^3 - \frac{x^4}{4!} + 0x^5 \right) \dots

After dropping the zero terms and collecting, we obtain

-x \sin x= \displaystyle 2 \left(\cos x - 1 \right) + 2 \left( \cos x -1 + \frac{x^2}{2!} \right) + 2 \left( \cos x -1 + \frac{x^2}{2!} - \frac{x^4}{4!} \right) \dots

-x \sin x = 2 f(x)

\displaystyle -\frac{x \sin x}{2} = f(x).

A similar calculation would apply to any even function h(x).

We repeat for

h(x) = \sin x = 0 + x + 0x^2 - \displaystyle \frac{x^3}{3!} + 0x^4 + \frac{x^5}{5!} + 0x^6 - \frac{x^7}{7!} \dots,

so that

x (\sin x)' = (\sin x - 0) + (\sin x - 0 - x) + (\sin x - 0 - x + 0x^2)

+ \displaystyle \left( \sin x - 0 - x + 0x^2 + \frac{x^3}{3!} \right) + \left( \sin x - 0 - x + 0x^2 + \frac{x^3}{3!} + 0x^4 \right)

+ \displaystyle \left( \sin x - 0 - x + 0x^2 + \frac{x^3}{3!} + 0x^4 - \frac{x^5}{5!} \right) + \left( \sin x - 0 - x + 0x^2 + \frac{x^3}{3!} + 0x^4 - \frac{x^5}{5!} + 0 x^6 \right) \dots,

or, after dropping the zero terms, collecting, and moving the leading \sin x to the left-hand side,

x\cos x - \sin x = 2(\sin x - x) + \displaystyle 2\left(\sin x - x + \frac{x^3}{3!} \right) + 2 \left( \sin x - x + \frac{x^3}{3!} - \frac{x^5}{5!} \right) \dots

or

x \cos x - \sin x = 2 g(x)

\displaystyle \frac{x \cos x - \sin x}{2} = g(x).

A similar argument applies for any odd function h(x).

Solving Problems Submitted to MAA Journals (Part 5d)

The following problem appeared in Volume 96, Issue 3 (2023) of Mathematics Magazine.

Evaluate the following sums in closed form:

f(x) = \displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{x^2}{2!} - \frac{x^4}{4!} \dots + (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right)

and

g(x) = \displaystyle \sum_{n=0}^\infty \left( \sin x - x + \frac{x^3}{3!} - \frac{x^5}{5!} \dots + (-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!} \right).

In the previous two posts, I showed that

f(x) = - \displaystyle \frac{x \sin x}{2} \qquad \hbox{and} \qquad g(x) = \displaystyle \frac{x \cos x - \sin x}{2};

the technique that I used was using the Taylor series expansions of \sin x and \cos x to write f(x) and g(x) as double sums and then interchanging the order of summation.

In this post, I share an alternate way of solving for f(x) and g(x). I wish I could take credit for this, but I first learned the idea from my daughter. If we differentiate g(x), we obtain

g'(x) = \displaystyle \sum_{n=0}^\infty \left( [\sin x]' - [x]' + \left[\frac{x^3}{3!}\right]' - \left[\frac{x^5}{5!}\right]' \dots + \left[(-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!}\right]' \right)

= \displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{3x^2}{3!} - \frac{5x^4}{5!} \dots + (-1)^{n-1} \frac{(2n+1)x^{2n}}{(2n+1)!} \right)

= \displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{3x^2}{3 \cdot 2!} - \frac{5x^4}{5 \cdot 4!} \dots + (-1)^{n-1} \frac{(2n+1)x^{2n}}{(2n+1)(2n)!} \right)

= \displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{x^2}{2!} - \frac{x^4}{4!} \dots + (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right)

= f(x).

Something similar happens when differentiating the series for f(x); however, it’s not quite so simple because of the -1 term. I begin by separating the n=0 term from the sum, so that a sum from n =1 to \infty remains:

f(x) = \displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{x^2}{2!} - \frac{x^4}{4!} \dots + (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right)

= (\cos x - 1) + \displaystyle \sum_{n=1}^\infty \left( \cos x - 1 + \frac{x^2}{2!} - \frac{x^4}{4!} \dots + (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right).

I then differentiate as before:

f'(x) = (\cos x - 1)' + \displaystyle \sum_{n=1}^\infty \left( [\cos x - 1]' + \left[ \frac{x^2}{2!} \right]' - \left[ \frac{x^4}{4!} \right]' \dots + \left[ (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right]' \right)

= -\sin x + \displaystyle \sum_{n=1}^\infty \left( -\sin x + \frac{2x}{2!}  - \frac{4x^3}{4!} \dots + (-1)^{n-1} \frac{(2n) x^{2n-1}}{(2n)!} \right)

= -\sin x + \displaystyle \sum_{n=1}^\infty \left( -\sin x + \frac{2x}{2 \cdot 1!}  - \frac{4x^3}{4 \cdot 3!} \dots + (-1)^{n-1} \frac{(2n) x^{2n-1}}{(2n)(2n-1)!} \right)

= -\sin x + \displaystyle \sum_{n=1}^\infty \left( -\sin x + x - \frac{x^3}{3!} + \dots + (-1)^{n-1} \frac{x^{2n-1}}{(2n-1)!} \right)

= -\sin x - \displaystyle \sum_{n=1}^\infty \left( \sin x - x + \frac{x^3}{3!} + \dots - (-1)^{n-1} \frac{x^{2n-1}}{(2n-1)!} \right).

At this point, we reindex the sum. We make the replacement k = n - 1, so that n = k+1 and k varies from k=0 to \infty. After the replacement, we then change the dummy index from k back to n.

f'(x) = -\sin x - \displaystyle \sum_{k=0}^\infty \left( \sin x - x + \frac{x^3}{3!} + \dots - (-1)^{(k+1)-1} \frac{x^{2(k+1)-1}}{(2(k+1)-1)!} \right)

= -\sin x -  \displaystyle \sum_{k=0}^\infty \left( \sin x - x + \frac{x^3}{3!} + \dots - (-1)^{k} \frac{x^{2k+1}}{(2k+1)!} \right)

= -\sin x -  \displaystyle \sum_{n=0}^\infty \left( \sin x - x + \frac{x^3}{3!} + \dots - (-1)^{n} \frac{x^{2n+1}}{(2n+1)!} \right)

With a slight alteration to the (-1)^n term, this sum is exactly the definition of g(x):

f'(x)= -\sin x -  \displaystyle \sum_{n=0}^\infty \left( \sin x - x + \frac{x^3}{3!} + \dots - (-1)^1 (-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!} \right)

= -\sin x -  \displaystyle \sum_{n=0}^\infty \left( \sin x - x + \frac{x^3}{3!} + \dots + (-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!} \right)

= -\sin x - g(x).

Summarizing, we have shown that g'(x) = f(x) and f'(x) = -\sin x - g(x). Differentiating f'(x) a second time, we obtain

f''(x) = -\cos x - g'(x) = -\cos x - f(x)

or

f''(x) + f(x) = -\cos x.

This last equation is a second-order nonhomogeneous linear differential equation with constant coefficients. A particular solution, using the method of undetermined coefficients, must have the form F(x) = Ax\cos x + Bx \sin x. Substituting, we see that

[Ax \cos x + B x \sin x]'' + A x \cos x + Bx \sin x = -\cos x

-2A \sin x - Ax \cos x + 2B \cos x - B x \sin x + Ax \cos x + B x \sin x = -\cos x

-2A \sin x  + 2B \cos x = -\cos x

We see that A = 0 and B = -1/2, which then lead to the particular solution

F(x) = -\displaystyle \frac{1}{2} x \sin x

Since \cos x and \sin x are solutions of the associated homogeneous equation f''(x) + f(x) = 0, we conclude that

f(x) = c_1 \cos x + c_2 \sin x - \displaystyle \frac{1}{2} x \sin x,

where the values of c_1 and c_2 depend on the initial conditions on f. As it turns out, it is straightforward to compute f(0) and f'(0), so we will choose x=0 for the initial conditions. We observe that f(0) and g(0) are both clearly equal to 0, so that f'(0) = -\sin 0 - g(0) = 0 as well.

The initial condition f(0)=0 clearly implies that c_1 = 0:

f(0) = c_1 \cos 0 + c_2 \sin 0 - \displaystyle \frac{1}{2} \cdot 0 \sin 0

0 = c_1

To find c_2, we first find f'(x):

f'(x) = c_2 \cos x - \displaystyle \frac{1}{2} \sin x - \frac{1}{2} x \cos x

f'(0) = c_2 \cos 0 - \displaystyle  \frac{1}{2} \sin 0 - \frac{1}{2} \cdot 0 \cos 0

0 = c_2.

Since c_1 = c_2 = 0, we conclude that f(x) = - \displaystyle \frac{1}{2} x \sin x, and so

g(x) = -\sin x - f'(x)

= -\sin x - \displaystyle  \left( -\frac{1}{2} \sin x - \frac{1}{2} x \cos x \right)

= \displaystyle \frac{x \cos x - \sin x}{2}.
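As a rough check that these closed forms are consistent with the relations g'(x) = f(x), f'(x) = -\sin x - g(x), and f''(x) + f(x) = -\cos x, here is a small Python sketch that approximates the derivatives by central differences. The helper names and the step size h are my own assumptions, and finite differences only confirm the identities up to a small numerical error.

```python
import math

def f(x):
    """Closed form found above for f."""
    return -0.5 * x * math.sin(x)

def g(x):
    """Closed form found above for g."""
    return 0.5 * (x * math.cos(x) - math.sin(x))

def derivative(func, x, h=1e-3):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def second_derivative(func, x, h=1e-3):
    """Central-difference approximation of func''(x)."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)

def residuals(x):
    """Each entry should be close to zero if the three relations hold at x."""
    r1 = derivative(g, x) - f(x)                       # g' = f
    r2 = derivative(f, x) + math.sin(x) + g(x)         # f' = -sin x - g
    r3 = second_derivative(f, x) + f(x) + math.cos(x)  # f'' + f = -cos x
    return r1, r2, r3

for x in (0.0, 0.7, 2.0, -3.5):
    print(x, residuals(x))
```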

Solving Problems Submitted to MAA Journals (Part 5c)

The following problem appeared in Volume 96, Issue 3 (2023) of Mathematics Magazine.

Evaluate the following sums in closed form:

f(x) = \displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{x^2}{2!} - \frac{x^4}{4!} \dots + (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right)

and

g(x) = \displaystyle \sum_{n=0}^\infty \left( \sin x - x + \frac{x^3}{3!} - \frac{x^5}{5!} \dots + (-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!} \right).

In the previous post, we showed that f(x) = - \frac{1}{2} x \sin x by writing the series as a double sum and then reversing the order of summation. We proceed with very similar logic to evaluate g(x). Since

\sin x = \displaystyle \sum_{k=0}^\infty (-1)^k \frac{x^{2k+1}}{(2k+1)!}

is the Taylor series expansion of \sin x, we may write g(x) as

g(x) = \displaystyle \sum_{n=0}^\infty \left( \sum_{k=0}^\infty (-1)^k \frac{x^{2k+1}}{(2k+1)!} - \sum_{k=0}^n (-1)^k \frac{x^{2k+1}}{(2k+1)!} \right)

= \displaystyle \sum_{n=0}^\infty \sum_{k=n+1}^\infty (-1)^k \frac{x^{2k+1}}{(2k+1)!}

As before, we employ one of my favorite techniques from the bag of tricks: reversing the order of summation. Also as before, the summand is independent of n, and so the inner sum is simply equal to the summand times the number of terms. We see that

g(x) = \displaystyle \sum_{k=1}^\infty \sum_{n=0}^{k-1} (-1)^k \frac{x^{2k+1}}{(2k+1)!}

= \displaystyle \sum_{k=1}^\infty (-1)^k \cdot k \frac{x^{2k+1}}{(2k+1)!}

= \displaystyle \frac{1}{2} \sum_{k=1}^\infty (-1)^k \cdot 2k \frac{x^{2k+1}}{(2k+1)!}.

At this point, the solution for g(x) diverges from the previous solution for f(x). I want to cancel the factor of 2k in the summand; however, the denominator is

(2k+1)! = (2k+1)(2k)!,

and 2k doesn’t cancel cleanly with (2k+1). Hypothetically, I could cancel as follows:

\displaystyle \frac{2k}{(2k+1)!} = \frac{2k}{(2k+1)(2k)(2k-1)!} = \frac{1}{(2k+1)(2k-1)!},

but that introduces an extra (2k+1) in the denominator that I’d rather avoid.

So, instead, I’ll write 2k as (2k+1)-1 and then distribute and split into two different sums:

g(x) = \displaystyle \frac{1}{2} \sum_{k=1}^\infty (-1)^k \cdot 2k \frac{x^{2k+1}}{(2k+1)!}

= \displaystyle \frac{1}{2} \sum_{k=1}^\infty (-1)^k (2k+1-1) \frac{x^{2k+1}}{(2k+1)!}

= \displaystyle \frac{1}{2} \sum_{k=1}^\infty \left[ (-1)^k (2k+1) \frac{x^{2k+1}}{(2k+1)!} - (-1)^k \cdot 1 \frac{x^{2k+1}}{(2k+1)!} \right]

= \displaystyle \frac{1}{2} \sum_{k=1}^\infty (-1)^k (2k+1) \frac{x^{2k+1}}{(2k+1)!} - \frac{1}{2} \sum_{k=1}^\infty (-1)^k  \frac{x^{2k+1}}{(2k+1)!}

= \displaystyle \frac{1}{2} \sum_{k=1}^\infty (-1)^k (2k+1) \frac{x^{2k+1}}{(2k+1)(2k)!} - \frac{1}{2} \sum_{k=1}^\infty (-1)^k \frac{x^{2k+1}}{(2k+1)!}

= \displaystyle \frac{1}{2} \sum_{k=1}^\infty (-1)^k \frac{x^{2k+1}}{(2k)!} - \frac{1}{2} \sum_{k=1}^\infty (-1)^k \frac{x^{2k+1}}{(2k+1)!}.

At this point, I factored out a power of x from the first sum. In this way, the two sums are the Taylor series expansions of \cos x and \sin x, except that each is missing its k=0 term:

g(x) = \displaystyle \frac{x}{2} \sum_{k=1}^\infty (-1)^k \cdot \frac{x^{2k}}{(2k)!} - \frac{1}{2} \sum_{k=1}^\infty (-1)^k \frac{x^{2k+1}}{(2k+1)!}

= \displaystyle \frac{x}{2} (\cos x - 1) - \frac{1}{2} (\sin x - x)

= \displaystyle \frac{x}{2} \cos x - \frac{1}{2} \sin x

= \displaystyle \frac{x \cos x - \sin x}{2}.

This was sufficiently complicated that I was unable to guess this solution by experimenting with Mathematica; nevertheless, Mathematica can give graphical confirmation of the solution since the graphs of the two expressions overlap perfectly.
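The same confirmation can be done numerically instead of graphically. Here is a minimal sketch in Python (my substitution for Mathematica, with a function name and term count of my own choosing) that adds up the first several summands of g(x) directly from the problem statement and compares the total with the closed form.

```python
import math

def g_partial(x, terms=40):
    """Sum over n = 0..terms-1 of sin(x) minus its Taylor polynomial of degree 2n+1."""
    total = 0.0
    partial = 0.0
    term = x   # (-1)^k x^(2k+1) / (2k+1)! at k = 0
    for n in range(terms):
        partial += term                  # Taylor polynomial through degree 2n+1
        total += math.sin(x) - partial   # the n-th summand of g(x)
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))  # advance k = n to k = n+1
    return total

for x in (-4.0, 1.0, 2.5, 4.0):
    print(x, g_partial(x), (x * math.cos(x) - math.sin(x)) / 2.0)
```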

Solving Problems Submitted to MAA Journals (Part 5b)

The following problem appeared in Volume 96, Issue 3 (2023) of Mathematics Magazine.

Evaluate the following sums in closed form:

f(x) =  \displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{x^2}{2!} - \frac{x^4}{4!} \dots + (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right)

and

g(x) = \displaystyle \sum_{n=0}^\infty \left( \sin x - x + \frac{x^3}{3!} - \frac{x^5}{5!} \dots + (-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!} \right).

We start with f(x) and the Taylor series

\cos x = \displaystyle \sum_{k=0}^\infty (-1)^k \frac{x^{2k}}{(2k)!}.

With this, f(x) can be written as

f(x) = \displaystyle \sum_{n=0}^\infty \left( \sum_{k=0}^\infty (-1)^k \frac{x^{2k}}{(2k)!} - \sum_{k=0}^n (-1)^k \frac{x^{2k}}{(2k)!} \right)

= \displaystyle \sum_{n=0}^\infty \sum_{k=n+1}^\infty (-1)^k \frac{x^{2k}}{(2k)!}.

At this point, my immediate thought was one of my favorite techniques from the bag of tricks: reversing the order of summation. (Two or three chapters of my Ph.D. thesis derived from knowing when to apply this technique.) We see that

f(x) = \displaystyle \sum_{k=1}^\infty \sum_{n=0}^{k-1} (-1)^k \frac{x^{2k}}{(2k)!}.

At this point, the summand is independent of n, and so the inner sum is simply equal to the summand times the number of terms. Since there are k terms for the inner sum (n = 0, 1, \dots, k-1), we see

f(x) =  \displaystyle \sum_{k=1}^\infty (-1)^k \cdot k \frac{x^{2k}}{(2k)!}.

To simplify, we multiply top and bottom by 2 so that the leading factor 2k of (2k)! cancels:

f(x) = \displaystyle \frac{1}{2} \sum_{k=1}^\infty (-1)^k \cdot 2k \frac{x^{2k}}{(2k)(2k-1)!}

= \displaystyle \frac{1}{2} \sum_{k=1}^\infty (-1)^k \frac{x^{2k}}{(2k-1)!}

At this point, I factored out a (-1) and a power of x to make the sum match the Taylor series for \sin x:

f(x) = \displaystyle -\frac{x}{2} \sum_{k=1}^\infty (-1)^{k-1} \frac{x^{2k-1}}{(2k-1)!} = -\frac{x \sin x}{2}.

I was unsurprised but comforted that this matched the guess I had made by experimenting with Mathematica.
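The single series obtained after reversing the order of summation is also easy to test numerically. The Python sketch below is only an illustration, with a function name and term count that I made up; it sums k (-1)^k x^{2k}/(2k)! for k \ge 1 and compares the result with -\frac{1}{2} x \sin x.

```python
import math

def f_single_series(x, terms=40):
    """Sum over k = 1..terms of (-1)^k * k * x^(2k) / (2k)!."""
    total = 0.0
    term = 1.0   # (-1)^k x^(2k) / (2k)! at k = 0
    for k in range(1, terms + 1):
        term *= -x * x / ((2 * k - 1) * (2 * k))  # advance from k-1 to k
        total += k * term                         # the inner sum contributed k copies
    return total

for x in (-3.0, 1.0, 2.0, 4.0):
    print(x, f_single_series(x), -0.5 * x * math.sin(x))
```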

Solving Problems Submitted to MAA Journals (Part 5a)

The following problem appeared in Volume 96, Issue 3 (2023) of Mathematics Magazine.

Evaluate the following sums in closed form:

\displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{x^2}{2!} - \frac{x^4}{4!} \dots + (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right)

and

\displaystyle \sum_{n=0}^\infty \left( \sin x - x + \frac{x^3}{3!} - \frac{x^5}{5!} \dots + (-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!} \right).

When I first read this problem, I immediately noticed that

\displaystyle 1 - \frac{x^2}{2!} + \frac{x^4}{4!} \dots - (-1)^{n-1} \frac{x^{2n}}{(2n)!}

is a Taylor polynomial of \cos x and

\displaystyle x - \frac{x^3}{3!} + \frac{x^5}{5!} \dots - (-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!}

is a Taylor polynomial of \sin x. In other words, the given expressions are the sums of the tail-sums of the Taylor series for \cos x and \sin x.

As usual when stumped, I used technology to guide me. Here’s the graph of the first sum, adding the first 50 terms.

I immediately notice that the function oscillates, which makes me suspect that the answer involves either \cos x or \sin x. I also notice that the sizes of oscillations increase as |x| increases, so that the answer should have the form g(x) \cos x or g(x) \sin x, where g is an increasing function. I also notice that the graph is symmetric about the y-axis, so that the function is even. I also notice that the graph passes through the origin.

So, taking all of that in, one of my first guesses was y = x \sin x, which satisfies all of the above criteria.

That’s not it, but it’s not far off. The oscillations of my guess in orange are too big and they’re inverted from the actual graph in blue. After some guessing, I eventually landed on y = -\frac{1}{2} x \sin x.

That was a very good sign… the two graphs were pretty much on top of each other. That’s not a proof that -\frac{1}{2} x \sin x is the answer, of course, but it’s certainly a good indicator.
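The same experiment can be sketched without a graph. The short Python program below is an illustration only (it is not the tool I used, and the function name and sample points are arbitrary): it adds the first 50 summands of the first sum and prints the result next to the two guesses y = x \sin x and y = -\frac{1}{2} x \sin x.

```python
import math

def first_sum(x, terms=50):
    """Add the first `terms` summands: cos(x) minus its Taylor polynomial of degree 2n."""
    total = 0.0
    partial = 0.0
    term = 1.0   # (-1)^k x^(2k) / (2k)! at k = 0
    for n in range(terms):
        partial += term                  # Taylor polynomial through degree 2n
        total += math.cos(x) - partial   # the n-th summand
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))  # advance k = n to k = n+1
    return total

# Columns: x, partial sum, first guess x sin x, second guess -(1/2) x sin x.
for x in (-3.0, -1.0, 0.0, 1.0, 2.0, 5.0):
    print(x, first_sum(x), x * math.sin(x), -0.5 * x * math.sin(x))
```

At each sample point, the second column should track the last column and not the third.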

I didn’t have the same luck with the other sum; I could graph it but wasn’t able to just guess what the curve could be.

Confirming Einstein’s General Theory of Relativity with Calculus: Index

I’m doing something that I should have done a long time ago: collecting a series of posts into one single post. The links below show my series on general relativity and the precession of Mercury’s orbit.

Part 1: Introduction

Part 2: Precession, polar coordinates, and conic sections

  • Part 2a: Graphically exploring precession
  • Part 2b: Polar coordinates and ellipses
  • Part 2c: Polar coordinates, circles, and parabolas
  • Part 2d: Polar coordinates and hyperbolas

Part 3: Method of successive approximations

Part 4: Principles from physics

  • Part 4a: Angular momentum
  • Part 4b: Acceleration in polar coordinates
  • Part 4c: Newton’s Second Law and Newton’s Law of Gravitation

Part 5: Orbits under Newtonian mechanics

  • Part 5a: Confirmation of solution
  • Part 5b: Derivation with calculus
  • Part 5c: Derivation with differential equations and the method of undetermined coefficients
  • Part 5d: Derivation with differential equations and variation of parameters

Part 6: Orbits under general relativity

  • Part 6a: New differential equation under general relativity
  • Part 6b: Confirmation of solution
  • Part 6c: Derivation with variation of parameters
  • Parts 6d, 6e, 6f, 6g, 6h, 6i, 6j: Rationale for the method of undetermined coefficients
  • Part 6k: Derivation with undetermined coefficients

Part 7: Computing precession

Part 8: Second- and third-order solutions with the method of successive approximations

Part 9: Pedagogical thoughts

Earlier this year, I presented these ideas for the UNT Math Department’s Undergraduate Mathematics Colloquium Series. The video of my lecture is below.

Confirming Einstein’s Theory of General Relativity With Calculus, Part 9: Pedagogical Thoughts

At long last, we have reached the end of this series of posts.

The derivation is elementary; I’m confident that I could have understood this derivation had I seen it when I was in high school. That said, the word “elementary” in mathematics can be a bit loaded — this means that it is based on simple ideas that are perhaps used in a profound and surprising way. Perhaps my favorite quote along these lines was this understated gem from the book Three Pearls of Number Theory after the conclusion of a very complicated proof in Chapter 1:

You see how complicated an entirely elementary construction can sometimes be. And yet this is not an extreme case; in the next chapter you will encounter just as elementary a construction which is considerably more complicated.

Here are the elementary ideas from calculus, precalculus, and high school physics that were used in this series:

  • Physics
    • Conservation of angular momentum
    • Newton’s Second Law
    • Newton’s Law of Gravitation
  • Precalculus
    • Completing the square
    • Quadratic formula
    • Factoring polynomials
    • Complex roots of polynomials
    • Bounds on \cos \theta and \sin \theta
    • Period of \cos \theta and \sin \theta
    • Zeroes of \cos \theta and \sin \theta
    • Trigonometric identities (Pythagorean, sum and difference, double-angle)
    • Conic sections
    • Graphing in polar coordinates
    • Two-dimensional vectors
    • Dot products of two-dimensional vectors (especially perpendicular vectors)
    • Euler’s equation
  • Calculus
    • The Chain Rule
    • Derivatives of \cos \theta and \sin \theta
    • Linearizations of \cos x, \sin x, and 1/(1-x) near x \approx 0 (or, more generally, their Taylor series approximations)
    • Derivative of e^x
    • Solving initial-value problems
    • Integration by u-substitution

While these ideas from calculus are elementary, they were certainly used in clever and unusual ways throughout the derivation.

I should add that although the derivation was elementary, certain parts of the derivation could be made easier by appealing to standard concepts from differential equations.

One more thought. While this series of posts was inspired by a calculation that appeared in an undergraduate physics textbook, I had thought that this series might be worthy of publication in a mathematical journal as an historical example of an important problem that can be solved by elementary tools. Unfortunately for me, Hieu D. Nguyen’s terrific article Rearing Its Ugly Head: The Cosmological Constant and Newton’s Greatest Blunder in The American Mathematical Monthly is already in the record.