Solving Problems Submitted to MAA Journals (Part 7c)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

Not quite knowing how to start, I decided to begin by simplifying the problem, assuming that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. After solving this special case, I then made a small generalization by allowing \hbox{SD}(Y) to be arbitrary. Solving these two special cases boosted my confidence that I would eventually be able to tackle the general case, which I begin to consider in this post.

We suppose that E(X) = \mu_1, \hbox{SD}(X) = \sigma_1, E(Y) = \mu_2, and \hbox{SD}(Y) = \sigma_2. The goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < \hbox{Var}(X) = \sigma_1^2.

Based on the experience of the special cases, it seems likely that I’ll eventually need to integrate over the joint probability density function of X and Y. However, it’s a bit easier to work with standard normal random variables than general ones, and so I wrote X = \mu_1 + \sigma_1 Z_1 and Y = \mu_2 + \sigma_2 Z_2, where Z_1 and Z_2 are independent standard normal random variables.

We recall that

E(X \mid X > Y) = \displaystyle \frac{E (X I_{X>Y})}{P(X>Y)},

and so we’ll have to compute P(X>Y). We switch to Z_1 and Z_2:

P(X>Y) = P(\mu_1 + \sigma_1 Z_1 > \mu_2 + \sigma_2 Z_2)

= P(\sigma_1 Z_1 > \mu_2 - \mu_1 + \sigma_2 Z_2)

= \displaystyle P \left( Z_1 > \frac{\mu_2 - \mu_1}{\sigma_1} + \frac{\sigma_2}{\sigma_1} Z_2 \right)

= \displaystyle P(Z_1 > a + b Z_2)

= \displaystyle P(bZ_2 - Z_1 < -a),

where we define a = (\mu_2 - \mu_1)/\sigma_1 and b = \sigma_2/\sigma_1 for the sake of simplicity. Since Z_1 and Z_2 are independent, we know that W = bZ_2 - Z_1 will also be a normal random variable with

E(W) = b E(Z_2) - E(Z_1) = 0

and

\hbox{Var}(W) = \hbox{Var}(bZ_2) + \hbox{Var}(-Z_1) = b^2 \hbox{Var}(Z_2) + (-1)^2 \hbox{Var}(Z_1) = b^2+1.

Therefore, converting to standard units,

P(W < -a) = \displaystyle \Phi \left( \frac{-a - 0}{\sqrt{b^2+1}} \right) = \Phi(c),

where c = -a/\sqrt{b^2+1} and \Phi is the cumulative distribution function of the standard normal distribution.

We already see that the general case is more complicated than the two special cases we previously considered, for which P(X>Y) was simply equal to \frac{1}{2}.
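Though not needed for the proof, a quick Monte Carlo check in Python reassured me that the formula P(X>Y) = \Phi(c) is correct. (The parameter values below are arbitrary choices for illustration, not part of the problem.)

```python
import math
import random

# Arbitrary illustrative parameters (not from the problem statement)
mu1, sigma1 = 2.0, 1.5
mu2, sigma2 = 3.0, 0.5

a = (mu2 - mu1) / sigma1
b = sigma2 / sigma1
c = -a / math.sqrt(b * b + 1)

def Phi(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Estimate P(X > Y) directly and compare with Phi(c)
random.seed(1)
n = 200_000
hits = sum(random.gauss(mu1, sigma1) > random.gauss(mu2, sigma2)
           for _ in range(n))
estimate = hits / n
```

With these parameters, c = -2/\sqrt{10} \approx -0.632, and the simulated frequency should land within Monte Carlo error of \Phi(c) \approx 0.264.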

In future posts, we take up the computation of E(X I_{X>Y}) and E(X^2 I_{X>Y}).

Solving Problems Submitted to MAA Journals (Part 7b)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

Not quite knowing how to start, I decided to begin by simplifying the problem, assuming that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn’t solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I solved this special case in the previous post.

Next, to work on a special case that was somewhere between the general case and the first special case, I kept X as a standard normal distribution but changed Y to have a nonzero mean. As it turned out, this significantly complicated the problem (as we’ll see in the next post), and I got stuck.

So I changed course: for a second attempt, I kept X as a standard normal distribution but changed Y so that E(Y) = 0 and \hbox{SD}(Y) = \sigma, where \sigma could be something other than 1. The goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = \sigma^2+1, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next, I wrote Y = \sigma Z, where Z has a standard normal distribution. Then

E(X I_{X>\sigma Z}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x e^{-x^2/2} e^{-z^2/2} \, dx dz,

where we have used the joint probability density function for the independent random variables X and Z. The region of integration is \{(x,z) \in \mathbb{R}^2 \mid x > \sigma z \}, matching the requirement X > \sigma Z. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_{\sigma z}^\infty e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-\sigma^2 z^2/2} \right] e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-(\sigma^2+1) z^2/2} \, dz.

At this point, I rewrote the integrand to be the probability density function of a random variable:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{\sigma^2+1}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{ \frac{1}{\sigma^2+1}}} \exp \left[ -\frac{z^2}{2 \cdot \frac{1}{\sigma^2+1}} \right] \, dz.

The integrand is the probability density function of a normal random variable with mean 0 and variance \frac{1}{\sigma^2+1}, and so the integral must be equal to 1. We conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{\sqrt{2\pi(\sigma^2+1)}} }{ \frac{1}{2} } = \sqrt{\frac{2}{\pi(\sigma^2+1)}}.

Next, we compute the other conditional expectation:

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{E(X^2 I_{X>\sigma Z})}{P(X>\sigma Z)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x^2 e^{-x^2/2} e^{-z^2/2} \, dx dz.

The inner integral can be computed using integration by parts:

\displaystyle \int_{\sigma z}^\infty x^2 e^{-x^2/2} \, dx = \int_{\sigma z}^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_{\sigma z}^\infty + \int_{\sigma z}^\infty e^{-x^2/2} \, dx

= \sigma z e^{-\sigma^2 z^2/2} + \displaystyle \int_{\sigma z}^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-\sigma^2 z^2/2} e^{-z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz

= \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-(\sigma^2+1) z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz.

We could calculate the first integral, but we can immediately see that it’s going to be equal to 0 since the integrand \sigma z e^{-(\sigma^2+1) z^2/2} is an odd function of z. The double integral is equal to P(X>\sigma Z), which we’ve already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \frac{2}{\pi(\sigma^2 + 1)},

which is indeed less than 1. If \sigma = 1, we recover the conditional variance found in the first special case.
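Though a simulation proves nothing, a quick Monte Carlo check in Python (with an arbitrarily chosen value of \sigma) supports both the conditional mean and the conditional variance found above:

```python
import math
import random

sigma = 2.0  # arbitrary illustrative choice of SD(Y)
random.seed(2)

# Collect samples of X ~ N(0,1) conditioned on the event X > Y
xs = []
for _ in range(400_000):
    x = random.gauss(0.0, 1.0)
    y = random.gauss(0.0, sigma)
    if x > y:
        xs.append(x)

mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)

# The values derived in this post
predicted_mean = math.sqrt(2.0 / (math.pi * (sigma ** 2 + 1)))
predicted_var = 1.0 - 2.0 / (math.pi * (sigma ** 2 + 1))
```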

After tackling these two special cases, we start the general case with the next post.

Solving Problems Submitted to MAA Journals (Part 7a)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

I admit I did a double-take when I first read this problem. If X and Y are independent, then the event X>Y contains almost no information. How then, so I thought, could the conditional distribution of X given X>Y be narrower than the unconditional distribution of X?

Then I thought: I can believe that E(X \mid X > Y) is greater than E(X): if we’re given that X>Y, then we know that X must be larger than something. So maybe it’s possible for \hbox{Var}(X \mid X>Y) to be less than \hbox{Var}(X).

Still, not quite knowing how to start, I decided to begin by simplifying the problem, assuming that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn’t solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I also hoped that solving this special case might give me some psychological confidence that I would eventually be able to solve the general case.

For the special case, the goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = 2, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next,

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_y^\infty x e^{-x^2/2} e^{-y^2/2} \, dx dy,

where we have used the joint probability density function for X and Y. The region of integration is \{(x,y) \in \mathbb{R}^2 \mid x > y \}, taking care of the requirement X > Y. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_y^\infty e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-y^2/2} \right] e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-y^2} \, dy.

At this point, I used a standard technique/trick of integration by rewriting the integrand to be the probability density function of a random variable. In this case, the random variable is normally distributed with mean 0 and variance 1/2:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{2}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{1/2}} \exp \left[ -\frac{y^2}{2 \cdot \frac{1}{2}} \right] \, dy.

The integral must be equal to 1, and so we conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{2\sqrt{\pi}} }{ \frac{1}{2} } = \frac{1}{\sqrt{\pi}}.

We parenthetically note that E(X \mid X>Y) > 0, matching my initial intuition.

Next, we compute the other conditional expectation:

E(X^2 \mid X > Y) = \displaystyle \frac{E(X^2 I_{X>Y})}{P(X>Y)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_y^\infty x^2 e^{-x^2/2} e^{-y^2/2} \, dx dy.

The inner integral can be computed using integration by parts:

\displaystyle \int_y^\infty x^2 e^{-x^2/2} \, dx = \int_y^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_y^\infty + \int_y^\infty e^{-x^2/2} \, dx

= y e^{-y^2/2} + \displaystyle \int_y^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > Y) = \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2/2} e^{-y^2/2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy

= \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy.

We could calculate the first integral, but we can immediately see that it’s going to be equal to 0 since the integrand y e^{-y^2} is an odd function. The double integral is equal to P(X>Y), which we’ve already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \left( \frac{1}{\sqrt{\pi}} \right)^2 = 1 - \frac{1}{\pi},

which is indeed less than 1.
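Though not part of the proof, the key intermediate value E(X I_{X>Y}) can also be checked by approximating the double integral numerically. Here's a sketch using a midpoint rule over the truncated square [-6,6]^2 (my own choice of truncation; the neglected tails are vanishingly small):

```python
import math

# Midpoint-rule approximation of
#   (1/(2*pi)) * double integral of x * exp(-x^2/2) * exp(-y^2/2)
# over the region x > y, truncated to [-6, 6]^2
n = 600
h = 12.0 / n
total = 0.0
for i in range(n):
    x = -6.0 + (i + 0.5) * h
    fx = x * math.exp(-x * x / 2)
    for j in range(n):
        y = -6.0 + (j + 0.5) * h
        if x > y:
            total += fx * math.exp(-y * y / 2)
integral = total * h * h / (2 * math.pi)

exact = 1 / (2 * math.sqrt(math.pi))  # the value derived above
```

The approximation agrees with 1/(2\sqrt{\pi}) \approx 0.2821 to a couple of decimal places; the jump of the indicator x > y along the diagonal is the main source of error.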

This solves the problem for the special case of two independent standard normal random variables. This of course does not yet solve the general case, but my hope was that solving this problem might give me some intuition about the general case, which I’ll develop as this series progresses.

Solving Problems Submitted to MAA Journals (Part 6e)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

Let D_r be the interior of the circle centered at the origin O with radius r. Also, let C(P,Q) denote the circle with diameter \overline{PQ}, and let R = OP be the distance of P from the origin.

In the previous post, we showed that

\hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) = \sqrt{1-r^2}.

To find \hbox{Pr}(C(P,Q) \subset D_1), I will integrate over this conditional probability:

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr,

where F(r) is the cumulative distribution function of R. For 0 \le r \le 1,

F(r) = \hbox{Pr}(R \le r) = \hbox{Pr}(P \in D_r) = \displaystyle \frac{\hbox{area}(D_r)}{\hbox{area}(D_1)} = \frac{\pi r^2}{\pi} = r^2.

Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr

= \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr.

To calculate this integral, I’ll use the substitution u = 1-r^2. Then the endpoints r=0 and r=1 become u = 1-0^2 = 1 and u = 1-1^2 = 0. Also, du = -2r \, dr. Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr

= \displaystyle \int_1^0 -\sqrt{u} \, du

= \displaystyle \int_0^1 \sqrt{u} \, du

= \displaystyle \frac{2}{3} \left[  u^{3/2} \right]_0^1

=\displaystyle  \frac{2}{3}\left[ (1)^{3/2} - (0)^{3/2} \right]

= \displaystyle \frac{2}{3},

confirming the answer I had guessed from simulations.
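As one more sanity check on this last integral, a midpoint rule with an arbitrarily chosen fine step size reproduces 2/3:

```python
import math

# Midpoint-rule approximation of the integral of 2*r*sqrt(1-r^2) on [0, 1]
n = 100_000
total = 0.0
for i in range(n):
    r = (i + 0.5) / n
    total += 2 * r * math.sqrt(1 - r * r)
total /= n
```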

Solving Problems Submitted to MAA Journals (Part 6d)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

As discussed in a previous post, I guessed from simulation that the answer is 2/3. Naturally, simulation is not a proof, and so I started thinking about how to prove this.

My first thought was to make the problem simpler by letting only one point be chosen at random instead of two. Suppose that the point P is fixed at a distance t from the origin. What is the probability that the point Q, chosen at random, uniformly, from the interior of the unit circle, has the desired property?

My second thought is that, by radial symmetry, I could rotate the figure so that the point P is located at (t,0). In this way, the probability in question is ultimately going to be a function of t.

There is a very nice way to compute such probabilities since Q is chosen uniformly from the interior of the unit circle. Let A_t be the set of all points Q within the unit circle that have the desired property. Since the area of the unit circle is \pi(1)^2 = \pi, the probability of the desired property occurring is

\displaystyle \frac{\hbox{area}(A_t)}{\pi}.

Based on the simulations discussed in the previous post, my guess was that A_t was the interior of an ellipse centered at the origin with a semimajor axis of length 1 and a semiminor axis of length \sqrt{1-t^2}. Now I had to think about how to prove this.

As noted earlier in this series, the circle with diameter \overline{PQ} will lie within the unit circle exactly when MO+MP < 1, where M is the midpoint of \overline{PQ}. So suppose that P has coordinates (t,0), where t is known, and let the coordinates of Q be (x,y). Then the coordinates of M will be

\displaystyle \left( \frac{x+t}{2}, \frac{y}{2} \right),

so that

MO = \displaystyle \sqrt{ \left( \frac{x+t}{2} \right)^2 + \left( \frac{y}{2} \right)^2}

and

MP = \displaystyle \sqrt{ \left( \frac{x+t}{2} - t\right)^2 + \left( \frac{y}{2} \right)^2} =  \sqrt{ \left( \frac{x-t}{2} \right)^2 + \left( \frac{y}{2} \right)^2}.

Therefore, the condition MO+MP < 1 (again, equivalent to the condition that the circle with diameter \overline{PQ} lies within the unit circle) becomes

\displaystyle \sqrt{ \left( \frac{x+t}{2} \right)^2 + \left( \frac{y}{2} \right)^2} + \sqrt{ \left( \frac{x-t}{2} \right)^2 + \left( \frac{y}{2} \right)^2} < 1,

which simplifies to

\displaystyle \sqrt{ \frac{1}{4} \left[ (x+t)^2 + y^2 \right]} + \sqrt{ \frac{1}{4} \left[ (x-t)^2 + y^2 \right]} < 1

\displaystyle \frac{1}{2}\sqrt{   (x+t)^2 + y^2} +  \frac{1}{2}\sqrt{  (x-t)^2 + y^2} < 1

\displaystyle \sqrt{   (x+t)^2 + y^2} +  \sqrt{  (x-t)^2 + y^2} < 2.

When I saw this, light finally dawned. Given two points F_1 and F_2, called the foci, an ellipse is defined to be the set of all points Q so that QF_1 + QF_2 = 2a, where a is a constant. If the coordinates of Q, F_1, and F_2 are (x,y), (c,0), and (-c,0), then this becomes

\displaystyle \sqrt{   (x+c)^2 + y^2} +  \sqrt{  (x-c)^2 + y^2} = 2a.

Therefore, the set A_t is the interior of an ellipse centered at the origin with a = 1 and c = t. Furthermore, a = 1 is the semimajor axis of the ellipse, while the semiminor axis is equal to b = \sqrt{a^2-c^2} = \sqrt{1-t^2}.

At last, I could now return to the original question. Suppose that the point P is fixed at a distance t from the origin. What is the probability that the point Q, chosen at random, uniformly, from the interior of the unit circle, has the property that the circle with diameter \overline{PQ} lies within the unit circle? Since A_t is a subset of the interior of the unit circle, we see that this probability is equal to

\displaystyle \frac{\hbox{area}(A_t)}{\hbox{area of unit circle}} = \frac{\pi \cdot 1 \cdot \sqrt{1-t^2}}{\pi (1)^2} = \sqrt{1-t^2}.
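Both the equivalence of the two conditions and the conditional probability \sqrt{1-t^2} can be spot-checked numerically. Here's a short Python sketch with a hypothetical value of t:

```python
import math
import random

t = 0.6  # hypothetical distance of P from the origin
random.seed(3)

n = 200_000
inside = 0
for _ in range(n):
    # Uniform point Q in the unit disk by rejection from the square
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y < 1:
            break
    # M is the midpoint of PQ with P = (t, 0)
    mo = math.hypot((x + t) / 2, y / 2)   # MO
    mp = math.hypot((x - t) / 2, y / 2)   # MP
    in_circle_cond = mo + mp < 1
    in_ellipse = x * x + y * y / (1 - t * t) < 1
    # The two conditions should agree (away from the boundary,
    # where floating-point rounding could flip either side)
    if abs(mo + mp - 1) > 1e-9:
        assert in_circle_cond == in_ellipse
    if in_circle_cond:
        inside += 1
ratio = inside / n
```

The observed fraction should be within Monte Carlo error of \sqrt{1-0.6^2} = 0.8.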

In the next post, I’ll use this intermediate step to solve the original question.

Solving Problems Submitted to MAA Journals (Part 6c)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

As discussed in the previous post, I guessed from simulation that the answer is 2/3. Naturally, simulation is not a proof, and so I started thinking about how to prove this.

My first thought was to make the problem simpler by letting only one point be chosen at random instead of two. Suppose that the point P is fixed at a distance t from the origin. What is the probability that the point Q, chosen at random, uniformly, from the interior of the unit circle, has the desired property?

My second thought is that, by radial symmetry, I could rotate the figure so that the point P is located at (t,0). In this way, the probability in question is ultimately going to be a function of t.

There is a very nice way to compute such probabilities since Q is chosen uniformly from the interior of the unit circle. Let A_t be the set of all points Q within the unit circle that have the desired property. Since the area of the unit circle is \pi(1)^2 = \pi, the probability of the desired property occurring is

\displaystyle \frac{\hbox{area}(A_t)}{\pi}.

So, if I could figure out the shape of A_t, I could compute this conditional probability given the location of the point P.

But, once again, I initially had no idea of what this shape would look like. So, once again, I turned to simulation with Mathematica. As noted earlier in this series, the circle with diameter \overline{PQ} will lie within the unit circle exactly when MO+MP < 1, where M is the midpoint of \overline{PQ}. For my initial simulation, I chose P to have coordinates (0.5,0).

To my surprise, I immediately recognized that the points had the shape of an ellipse centered at the origin. Indeed, with a little playing around, it looked like this ellipse had a semimajor axis of 1 and a semiminor axis of about 0.87.

My next thought was to attempt to find the relationship between the length of the semiminor axis and the distance t of P from the origin. I thought I’d draw a few of these simulations for different values of t and then try to see if there was some natural function connecting t to my guesses. My next attempt was t = 0.6; as it turned out, it looked like the semiminor axis now had a length of 0.8.

At this point, something clicked: (6,8,10) is a Pythagorean triple, meaning that

6^2 + 8^2 = 10^2

(0.6)^2 + (0.8)^2 = 1^2

(0.8)^2 = 1 - (0.6)^2

0.8 = \sqrt{1 - (0.6)^2}

Also, 0.87 is very close to \sqrt{3}/2, a very familiar number from trigonometry:

\displaystyle \frac{\sqrt{3}}{2} = \sqrt{1 - (0.5)^2}

So I had a guess: the semiminor axis has length \sqrt{1-t^2}. A few more simulations with different values of t confirmed this guess. For instance, here’s the picture with t = 0.9.

Now that I was psychologically certain of the answer for A_t, all that remained was proving that this guess actually worked. That’ll be the subject of the next post.

Solving Problems Submitted to MAA Journals (Part 6b)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

As discussed in the previous post, I guessed from simulation that the answer is 2/3. Naturally, simulation is not a proof, and so I started thinking about how to prove this.

My first thought was to make the problem simpler by letting only one point be chosen at random instead of two. Suppose that the point P is fixed at a distance t from the origin. What is the probability that the point Q, chosen at random, uniformly, from the interior of the unit circle, has the desired property?

My second thought is that, by radial symmetry, I could rotate the figure so that the point P is located at (t,0). In this way, the probability in question is ultimately going to be a function of t.

There is a very nice way to compute such probabilities since Q is chosen uniformly from the interior of the unit circle. Let A_t be the set of all points Q within the unit circle that have the desired property. Since the area of the unit circle is \pi(1)^2 = \pi, the probability of the desired property occurring is

\displaystyle \frac{\hbox{area}(A_t)}{\pi}.

So, if I could figure out the shape of A_t, I could compute this conditional probability given the location of the point P.

But, once again, I initially had no idea of what this shape would look like. So, once again, I turned to simulation with Mathematica.

First, a technical detail that I ignored in the previous post. To generate points (x,y) at random inside the unit circle, one might think to let x = r \cos \theta and y = r \sin \theta, where the distance from the origin r is chosen at random between 0 and 1 and the angle \theta is chosen at random from 0 to 2\pi. Unfortunately, this simple simulation generates too many points that are close to the origin and not enough that are close to the circle.

To see why this happened, let R denote the distance of a randomly chosen point from the origin. Then the event R < r is the same as saying that the point lies inside the circle centered at the origin with radius r, so that the probability of this event should be

F(r) = P(R < r) = \displaystyle \frac{\pi r^2}{\pi (1)^2} = r^2.

However, in the above simulation, R was chosen uniformly from [0,1], so that P(R < r) = r. All this to say, the above simulation did not produce points uniformly chosen from the unit circle.

To remedy this, we employ the standard technique of using the inverse of the above function F(r), which is clearly F^{-1}(u) = \sqrt{u}. In other words, we will choose the radius to have the form R = \sqrt{U}, where U is chosen uniformly on [0,1]. In this way,

P(R < r) = P( \sqrt{U} < r) = P(U < r^2) = r^2,

as required. Making this modification (highlighted in yellow) produces points that are more evenly distributed in the unit circle; any bunching of points or empty spaces are simply due to the luck of the draw.
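For readers who prefer Python to Mathematica, here's a sketch of the corrected sampler, along with an empirical check that P(R < 1/2) comes out near 1/4, as the formula P(R < r) = r^2 predicts:

```python
import math
import random

random.seed(4)

def uniform_point_in_disk():
    """Uniform point in the unit disk: R = sqrt(U) gives P(R < r) = r^2."""
    r = math.sqrt(random.random())
    theta = random.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)

# Empirical check: P(R < 1/2) should be (1/2)^2 = 1/4
n = 100_000
count = sum(1 for _ in range(n)
            if math.hypot(*uniform_point_in_disk()) < 0.5)
fraction = count / n
```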

In the next post, I’ll turn to the simulation of A_t.

Solving Problems Submitted to MAA Journals (Part 6a)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

It took me a while to wrap my head around the statement of the problem. In the figure, the points P and Q are chosen from inside the unit circle (blue). Then the circle (pink) with diameter \overline{PQ} has center M, the midpoint of \overline{PQ}. Also, the radius of the pink circle is MP=MQ.

The pink circle will lie entirely within the blue circle exactly when the green line containing the origin O, the point M, and a radius of the pink circle lies within the blue circle. Said another way, the condition is that the distance MO plus the radius of the pink circle is less than 1, or

MO + MP < 1.

As a first step toward wrapping my head around this problem, I programmed a simple simulation in Mathematica to count the number of times that MO + MP < 1 when points P and Q were chosen at random from the unit circle.

In the above simulation, out of about 61,000,000 attempts, 66.6644% of the attempts were successful. This leads to the natural guess that the true probability is 2/3. Indeed, the 95% confidence interval (0.666524, 0.666764) contains 2/3, so that the difference of 0.666644 from 2/3 can be plausibly attributed to chance.

I end with a quick programming note. This certainly isn’t the ideal way to perform the simulation. First, for a fast simulation, I should have programmed in C++ or Python instead of Mathematica. Second, the coordinates of P and Q are chosen from the unit square, so it’s quite possible for P or Q or both to lie outside the unit circle. Indeed, the chance that both P and Q lie in the unit disk in this simulation is (\pi/4)^2 \approx 0.617, meaning that about 38.3\% of the simulations were simply wasted. So the only sense in which this was a quick simulation was that I could type it quickly in Mathematica and then let the computer churn out a result. (I’ll talk about a better way to perform the simulation in the next post.)
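Here's a Python sketch of the same naive simulation (a reconstruction, not the original Mathematica code), including the wasteful step of discarding draws that land outside the unit circle:

```python
import math
import random

random.seed(5)
successes = trials = 0
for _ in range(500_000):
    # Coordinates drawn from the square, as in the original simulation;
    # draws with P or Q outside the unit circle are wasted
    px, py = random.uniform(-1, 1), random.uniform(-1, 1)
    qx, qy = random.uniform(-1, 1), random.uniform(-1, 1)
    if px * px + py * py >= 1 or qx * qx + qy * qy >= 1:
        continue
    trials += 1
    mx, my = (px + qx) / 2, (py + qy) / 2   # midpoint M
    mo = math.hypot(mx, my)                 # distance MO
    mp = math.hypot(mx - px, my - py)       # radius MP
    if mo + mp < 1:
        successes += 1
prob = successes / trials
```

On a run of this size the success fraction lands within Monte Carlo error of 2/3.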

Solving Problems Submitted to MAA Journals (Part 5e)

The following problem appeared in Volume 96, Issue 3 (2023) of Mathematics Magazine.

Evaluate the following sums in closed form:

f(x) = \displaystyle \sum_{n=0}^\infty \left( \cos x - 1 + \frac{x^2}{2!} - \frac{x^4}{4!} \dots + (-1)^{n-1} \frac{x^{2n}}{(2n)!} \right)

and

g(x) = \displaystyle \sum_{n=0}^\infty \left( \sin x - x + \frac{x^3}{3!} - \frac{x^5}{5!} \dots + (-1)^{n-1} \frac{x^{2n+1}}{(2n+1)!} \right).

By using the Taylor series expansions of \sin x and \cos x and flipping the order of a double sum, I was able to show that

f(x) = -\displaystyle \frac{x \sin x}{2} \qquad \hbox{and} \qquad g(x) = \frac{x\cos x - \sin x}{2}.

I immediately got to thinking: there’s nothing particularly special about \sin x and \cos x for this analysis. Is there a way of generalizing this result to all functions with a Taylor series expansion?

Suppose

h(x) = \displaystyle \sum_{k=0}^\infty a_k x^k,

and let’s use the same technique to evaluate

\displaystyle \sum_{n=0}^\infty \left( h(x) - \sum_{k=0}^n a_k x^k \right) = \sum_{n=0}^\infty \sum_{k=n+1}^\infty a_k x^k

= \displaystyle \sum_{k=1}^\infty \sum_{n=0}^{k-1} a_k x^k

= \displaystyle \sum_{k=1}^\infty k a_k x^k

= x \displaystyle \sum_{k=1}^\infty k a_k x^{k-1}

= x \displaystyle \sum_{k=1}^\infty \left(a_k x^k \right)'

= x \displaystyle \left[ (a_0)' +  \sum_{k=1}^\infty \left(a_k x^k \right)' \right]

= x \displaystyle \sum_{k=0}^\infty \left(a_k x^k \right)'

= x \displaystyle \left( \sum_{k=0}^\infty a_k x^k \right)'

= x h'(x).
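Before specializing back to sine and cosine, here's a quick numerical check of the identity \sum_{n=0}^\infty \left( h(x) - \sum_{k=0}^n a_k x^k \right) = x h'(x) for h(x) = \cos x at an arbitrarily chosen point. The tail is truncated at n = 60, by which point the remaining terms are negligible.

```python
import math

def partial_cos(x, n):
    """Partial sum of the cosine Taylor series through the x^n term."""
    return sum((-1) ** (k // 2) * x ** k / math.factorial(k)
               for k in range(0, n + 1, 2))

x = 1.3  # arbitrary test point
total = sum(math.cos(x) - partial_cos(x, n) for n in range(60))
expected = x * (-math.sin(x))  # x * h'(x) for h = cos
```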

To see why this matches our above results, let’s start with h(x) = \cos x and write out the full Taylor series expansion, including zero coefficients:

\cos x = 1 + 0x - \displaystyle \frac{x^2}{2!} + 0x^3 + \frac{x^4}{4!} + 0x^5 - \frac{x^6}{6!} \dots,

so that

x (\cos x)' = \displaystyle \sum_{n=0}^\infty \left( \cos x - \sum_{k=0}^n a_k x^k \right)

or

-x \sin x= \displaystyle \left(\cos x - 1 \right) + \left(\cos x - 1 + 0x \right) + \left( \cos x -1 + 0x + \frac{x^2}{2!} \right) + \left( \cos x -1 + 0x + \frac{x^2}{2!} + 0x^3 \right)

\displaystyle + \left( \cos x -1 + 0x + \frac{x^2}{2!} + 0x^3 - \frac{x^4}{4!} \right) + \left( \cos x -1 + 0x + \frac{x^2}{2!} + 0x^3 - \frac{x^4}{4!} + 0x^5 \right) \dots

After dropping the zero terms and collecting, we obtain

-x \sin x= \displaystyle 2 \left(\cos x - 1 \right) + 2 \left( \cos x -1 + \frac{x^2}{2!} \right) + 2 \left( \cos x -1 + \frac{x^2}{2!} - \frac{x^4}{4!} \right) \dots

-x \sin x = 2 f(x)

\displaystyle -\frac{x \sin x}{2} = f(x).

A similar calculation would apply to any even function h(x).

We repeat for

h(x) = \sin x = 0 + x + 0x^2 - \displaystyle \frac{x^3}{3!} + 0x^4 + \frac{x^5}{5!} + 0x^6 - \frac{x^7}{7!} \dots,

so that

x (\sin x)' = (\sin x - 0) + (\sin x - 0 - x) + (\sin x - 0 - x + 0x^2)

+ \displaystyle \left( \sin x - 0 - x + 0x^2 + \frac{x^3}{3!} \right) + \left( \sin x - 0 - x + 0x^2 + \frac{x^3}{3!} + 0x^4 \right)

+ \displaystyle \left( \sin x - 0 - x + 0x^2 + \frac{x^3}{3!} + 0x^4 - \frac{x^5}{5!} \right) + \left( \sin x - 0 - x + 0x^2 + \frac{x^3}{3!} + 0x^4 - \frac{x^5}{5!} + 0 x^6 \right) \dots,

or

x\cos x - \sin x = 2(\sin x - x) + \displaystyle 2\left(\sin x - x + \frac{x^3}{3!} \right) + 2 \left( \sin x - x + \frac{x^3}{3!} - \frac{x^5}{5!} \right) \dots

or

x \cos x - \sin x = 2 g(x)

\displaystyle \frac{x \cos x - \sin x}{2} = g(x).

A similar argument applies for any odd function h(x).