Solving Problems Submitted to MAA Journals (Part 7b)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

Not quite knowing how to start, I decided to begin by simplifying the problem, assuming that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn't solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I solved this special case in the previous post.

Next, to work on a special case that was somewhere between the general case and the first special case, I kept X as a standard normal distribution but changed Y to have a nonzero mean. As it turned out, this significantly complicated the problem (as we’ll see in the next post), and I got stuck.

So I changed course: for a second attempt, I kept X as a standard normal distribution but changed Y so that E(Y) = 0 and \hbox{SD}(Y) = \sigma, where \sigma could be something other than 1. The goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = \sigma^2+1, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next, I wrote Y = \sigma Z, where Z has a standard normal distribution. Then

E(X I_{X>\sigma Z}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x e^{-x^2/2} e^{-z^2/2} \, dx dz,

where we have used the joint probability density function for the independent random variables X and Z. The region of integration is \{(x,z) \in \mathbb{R}^2 \mid x > \sigma z \}, matching the requirement X > \sigma Z. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_{\sigma z}^\infty e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-\sigma^2 z^2/2} \right] e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-(\sigma^2+1) z^2/2} \, dz.

At this point, I rewrote the integrand as the probability density function of a random variable:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{\sigma^2+1}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{ \frac{1}{\sigma^2+1}}} \exp \left[ -\frac{z^2}{2 \cdot \frac{1}{\sigma^2+1}} \right] \, dz.

The integrand is the probability density function of a normal random variable with mean 0 and variance \frac{1}{\sigma^2+1}, and so the integral must be equal to 1. We conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{\sqrt{2\pi(\sigma^2+1)}} }{ \frac{1}{2} } = \sqrt{\frac{2}{\pi(\sigma^2+1)}}.

Next, we compute the other conditional expectation:

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{E(X^2 I_{X>\sigma Z})}{P(X>\sigma Z)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x^2 e^{-x^2/2} e^{-z^2/2} \, dx dz.

The inner integral can be computed using integration by parts:

\displaystyle \int_{\sigma z}^\infty x^2 e^{-x^2/2} \, dx = \int_{\sigma z}^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_{\sigma z}^\infty + \int_{\sigma z}^\infty e^{-x^2/2} \, dx

= \sigma z e^{-\sigma^2 z^2/2} + \displaystyle \int_{\sigma z}^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-\sigma^2 z^2/2} e^{-z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz

= \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-(\sigma^2+1) z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz.

We could calculate the first integral, but we can immediately see that it's going to be equal to 0 since the integrand z e^{-(\sigma^2+1) z^2/2} is an odd function. The double integral is equal to P(X>\sigma Z), which we've already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \frac{2}{\pi(\sigma^2 + 1)},

which is indeed less than 1. If \sigma = 1, we recover the conditional variance found in the first special case.
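As a quick numerical sanity check, here is a minimal Monte Carlo sketch (assuming Python with numpy; the variable names are mine) that estimates the conditional mean and variance by simulation and compares them to the closed forms derived above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000_000

for sigma in (0.5, 1.0, 2.0):
    x = rng.standard_normal(n)          # X ~ N(0, 1)
    y = sigma * rng.standard_normal(n)  # Y ~ N(0, sigma^2)
    xc = x[x > y]                       # samples of X conditioned on X > Y

    # Compare simulated moments to the closed forms derived above
    print(f"sigma = {sigma}:")
    print("  mean:", xc.mean(), "vs", np.sqrt(2/(np.pi*(sigma**2 + 1))))
    print("  var: ", xc.var(),  "vs", 1 - 2/(np.pi*(sigma**2 + 1)))
```

With this many samples, the simulated values should agree with the formulas to roughly three decimal places for each \sigma.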

After tackling these two special cases, we start the general case with the next post.

Solving Problems Submitted to MAA Journals (Part 7a)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

I admit I did a double-take when I first read this problem. If X and Y are independent, then the event X>Y contains almost no information. How then, so I thought, could the conditional distribution of X given X>Y be narrower than the unconditional distribution of X?

Then I thought: I can believe that E(X \mid X > Y) is greater than E(X): if we’re given that X>Y, then we know that X must be larger than something. So maybe it’s possible for \hbox{Var}(X \mid X>Y) to be less than \hbox{Var}(X).

Still, not quite knowing how to start, I decided to begin by simplifying the problem, assuming that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn't solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I also hoped that solving this special case might give me some psychological confidence that I would eventually be able to solve the general case.

For the special case, the goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = 2, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next,

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_y^\infty x e^{-x^2/2} e^{-y^2/2} \, dx dy,

where we have used the joint probability density function for X and Y. The region of integration is \{(x,y) \in \mathbb{R}^2 \mid x > y \}, taking care of the requirement X > Y. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_y^\infty e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-y^2/2} \right] e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-y^2} \, dy.

At this point, I used a standard integration trick: rewriting the integrand as the probability density function of a random variable. In this case, the random variable is normally distributed with mean 0 and variance 1/2:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{2}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{1/2}} \exp \left[ -\frac{y^2}{2 \cdot \frac{1}{2}} \right] \, dy.

The integral must be equal to 1, and so we conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{2\sqrt{\pi}} }{ \frac{1}{2} } = \frac{1}{\sqrt{\pi}}.

We parenthetically note that E(X \mid X>Y) > 0, matching my initial intuition.

Next, we compute the other conditional expectation:

E(X^2 \mid X > Y) = \displaystyle \frac{E(X^2 I_{X>Y})}{P(X>Y)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_y^\infty x^2 e^{-x^2/2} e^{-y^2/2} \, dx dy.

The inner integral can be computed using integration by parts:

\displaystyle \int_y^\infty x^2 e^{-x^2/2} \, dx = \int_y^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_y^\infty + \int_y^\infty e^{-x^2/2} \, dx

= y e^{-y^2/2} + \displaystyle \int_y^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > Y) = \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2/2} e^{-y^2/2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy

= \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy.

We could calculate the first integral, but we can immediately see that it’s going to be equal to 0 since the integrand y e^{-y^2} is an odd function. The double integral is equal to P(X>Y), which we’ve already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \left( \frac{1}{\sqrt{\pi}} \right)^2 = 1 - \frac{1}{\pi},

which is indeed less than 1.
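It's easy to check this result numerically as well. Here is a minimal sketch (assuming Python with scipy; the variable names are mine) that evaluates the same double integrals over the region x > y by quadrature:

```python
import numpy as np
from scipy.integrate import dblquad

# Joint pdf of two independent standard normal random variables
pdf = lambda x, y: np.exp(-x**2/2) * np.exp(-y**2/2) / (2*np.pi)

# Integrate over the region x > y: outer variable y over the whole line,
# inner variable x running from y to infinity
EXI,  _ = dblquad(lambda x, y: x * pdf(x, y),
                  -np.inf, np.inf, lambda y: y, lambda y: np.inf)
EX2I, _ = dblquad(lambda x, y: x**2 * pdf(x, y),
                  -np.inf, np.inf, lambda y: y, lambda y: np.inf)

EX, EX2 = EXI / 0.5, EX2I / 0.5        # divide by P(X > Y) = 1/2
print(EX,  1/np.sqrt(np.pi))           # both approximately 0.5642
print(EX2 - EX**2, 1 - 1/np.pi)        # both approximately 0.6817
```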

This solves the problem for the special case of two independent standard normal random variables. This of course does not yet solve the general case, but my hope was that solving this problem might give me some intuition about the general case, which I’ll develop as this series progresses.

Engaging students: Graphing and symmetry

In my capstone class for future secondary math teachers, I ask my students to come up with ideas for engaging their students with different topics in the secondary mathematics curriculum. In other words, the point of the assignment is not to devise a full-blown lesson plan on the topic. Instead, I ask my students to think about three different ways of getting their students interested in the topic in the first place.

I plan to share some of the best of these ideas on this blog (after asking my students’ permission, of course).

This student submission comes from my former student Dorathy Scrudder. Her topic, from Precalculus: finding symmetry when graphing a function.


As a dancer, I love movement and I know my students would be appreciative of not sitting still the entire class period. Therefore, I would have my students get into groups of three or four and have one of them do a back bend (pictured below). The other students would then plot points at the first student's hands, shoulders, stomach, knees, and feet. The students will have to work as a team to connect the points and find a function that fits the graph. Theoretically, the graph should be symmetrical if the student is flexible enough to do a back bend. As a class, we will look at the different graphs drawn and functions created and determine which graphs are symmetrical and which graphs are not. We will then discuss what makes a graph symmetrical versus asymmetrical. The picture is found at http://www.dreamstime.com/stock-photo-woman-back-bend-image18008780

[Photo: a woman doing a back bend]


How can this topic be used in your students’ future courses in mathematics or science?

Finding symmetry when graphing a function will help my students in their future physics classes and math classes. Symmetry is used in physics when talking about projectile motion. When an object is thrown up into the air, it has a constant horizontal velocity and a constant vertical acceleration. This creates a symmetrical parabola when graphed. By covering symmetry when graphing a function with my students in pre-calculus, they will be better prepared to understand the concepts being introduced in their physics classes. Symmetry in functions is also used in calculus classes when discussing trigonometric functions such as sine, cosine, and tangent. Symmetry is also found in statistics classes when talking about normal bell curves. By introducing the concept of symmetry in graphing functions in pre-calculus, my students will have an easier time understanding trigonometric functions in their calculus classes and bell curves in their statistics classes as well as higher level math classes.


How has this topic appeared in the news?

Weather has always been a touchy subject, especially for us here in Texas. We love claiming that we have the hottest summers and we “never see snow” (although we all know we have seen it multiple times over the past few years – including the recent ice-pocalypse). In an article by Ricochet Science, the extreme weather temperatures are analyzed. The article is titled “Extreme Weather: Are High Temperatures the New Normal?” It takes a look at the weather patterns over a series of years since the 1950s. In the graph below, we can see how the temperatures changed over the years and how the normal distribution from the first decade needs to be adjusted to fit the “new normal.”

[Graph: temperature distributions by decade, showing the shift toward higher temperatures]

This information was found at http://ricochetscience.com/extreme-weather-are-high-temperatures-the-new-normal/ .

 

Reminding students about Taylor series (Part 6)

Sadly, at least at my university, Taylor series is the topic that is least retained by students years after taking Calculus II. They can remember the rules for integration and differentiation, but their command of Taylor series seems to slip through the cracks. In my opinion, the reason for this lack of retention is completely understandable from a student’s perspective: Taylor series is usually the last topic covered in a semester, and so students learn them quickly for the final and quickly forget about them as soon as the final is over.

Of course, when I need to use Taylor series in an advanced course but my students have completely forgotten this prerequisite knowledge, I have to get them up to speed as soon as possible. Here’s the sequence that I use to accomplish this task. Covering this sequence usually takes me about 30 minutes of class time.

I should emphasize that I present this sequence in an inquiry-based format: I ask leading questions of my students so that the answers of my students are driving the lecture. In other words, I don’t ask my students to simply take dictation. It’s a little hard to describe a question-and-answer format in a blog, but I’ll attempt to do this below.

In the previous posts, I described how I lead students to the definition of the Maclaurin series

f(x) = \displaystyle \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} x^k,

which converges to f(x) within some radius of convergence for all functions that commonly appear in the secondary mathematics curriculum.
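As a quick illustration of the definition, here is a short sketch (assuming Python with sympy; this is just a check on the side, not part of the in-class review sequence) that computes the first few Maclaurin coefficients of e^x directly from the formula f^{(k)}(0)/k!:

```python
import sympy as sp

x = sp.symbols('x')
f = sp.exp(x)

# f^{(k)}(0) / k! for k = 0, 1, ..., 5
coeffs = [sp.diff(f, x, k).subs(x, 0) / sp.factorial(k) for k in range(6)]
print(coeffs)                    # [1, 1, 1/2, 1/6, 1/24, 1/120]
print(sp.series(f, x, 0, 6))     # sympy's built-in expansion agrees
```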


Step 7. Let’s now turn to trigonometric functions, starting with f(x) = \sin x.

What’s f(0)? Plugging in, we find f(0) = \sin 0 = 0.

As before, we continue until we find a pattern. Next, f'(x) = \cos x, so that f'(0) = 1.

Next, f''(x) = -\sin x, so that f''(0) = 0.

Next, f'''(x) = -\cos x, so that f'''(0) = -1.

No pattern yet. Let’s keep going.

Next, f^{(4)}(x) = \sin x, so that f^{(4)}(0) = 0.

Next, f^{(5)}(x) = \cos x, so that f^{(5)}(0) = 1.

Next, f^{(6)}(x) = -\sin x, so that f^{(6)}(0) = 0.

Next, f^{(7)}(x) = -\cos x, so that f^{(7)}(0) = -1.

OK, it looks like we have a pattern… albeit more awkward than the patterns for e^x and \displaystyle \frac{1}{1-x}. Plugging into the series, we find that

\displaystyle \sin x= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \dots

If we stare at the pattern of terms long enough, we can write this more succinctly as

\sin x = \displaystyle \sum_{n=0}^\infty (-1)^n \frac{x^{2n+1}}{(2n+1)!}

The (-1)^n term accounts for the alternating signs (starting on positive with n=0), while the 2n+1 is needed to ensure that each exponent and factorial is odd.
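To see the series in action, here is a minimal sketch (assuming Python; the helper name sin_series is mine) that sums the first few terms and compares them to math.sin:

```python
import math

def sin_series(x, terms):
    """Partial sum of x - x^3/3! + x^5/5! - ... with the given number of terms."""
    return sum((-1)**n * x**(2*n + 1) / math.factorial(2*n + 1) for n in range(terms))

for terms in (1, 2, 3, 5):
    print(terms, sin_series(1.0, terms), math.sin(1.0))
```

Since the series alternates, the error of each partial sum is bounded by the first omitted term; at x = 1, five terms should already agree with math.sin to within roughly 10^{-8}.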

Let’s see… \sin x has a Taylor expansion that only has odd exponents. In what other sense are the words “sine” and “odd” associated?

In Precalculus, a function f(x) is called odd if f(-x) = -f(x) for all numbers x. For example, f(x) = x^9 is odd since f(-x) = (-x)^9 = -x^9, as 9 is (you guessed it) an odd number. Also, \sin(-x) = -\sin x, and so \sin x is also an odd function. So we shouldn't be that surprised to see only odd exponents in the Taylor expansion of \sin x.

A pedagogical note: In my opinion, it’s better (for review purposes) to avoid the \displaystyle \sum notation and simply use the “dot, dot, dot” expression instead. The point of this exercise is to review a topic that’s been long forgotten so that these Taylor series can be used for other purposes. My experience is that the \displaystyle \sum adds a layer of abstraction that students don’t need to overcome for the time being.


Step 8. Let's now try f(x) = \cos x.

What's f(0)? Plugging in, we find f(0) = \cos 0 = 1.

Next, f'(x) = -\sin x, so that f'(0) = 0.

Next, f''(x) = -\cos x, so that f''(0) = -1.

It looks like the same pattern of numbers as above, except shifted by one derivative. Let’s keep going.

Next, f'''(x) = \sin x, so that f'''(0) = 0.

Next, f^{(4)}(x) = \cos x, so that f^{(4)}(0) = 1.

Next, f^{(5)}(x) = -\sin x, so that f^{(5)}(0) = 0.

Next, f^{(6)}(x) = -\cos x, so that f^{(6)}(0) = -1.

OK, it looks like we have a pattern somewhat similar to that of \sin x, except only involving the even terms. I guess that shouldn't be surprising since, from precalculus, we know that \cos x is an even function since \cos(-x) = \cos x for all x.

Plugging into the series, we find that

\displaystyle \cos x= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} \dots

If we stare at the pattern of terms long enough, we can write this more succinctly as

\cos x = \displaystyle \sum_{n=0}^\infty (-1)^n \frac{x^{2n}}{(2n)!}


Step 9. As we saw with e^x, the above series converge quickest for values of x near 0. In the case of \sin x and \cos x, trigonometric identities can be used to trade a large angle for an equivalent small one, thus accelerating convergence.

For example, the series for \cos 1000^o will converge quite slowly (after converting 1000^o into radians). However, we know that

\cos 1000^o= \cos(1000^o - 720^o) =\cos 280^o

using the periodicity of \cos x. Next, since 280^o is in the fourth quadrant, we can use the reference angle to find an equivalent angle in the first quadrant:

\cos 1000^o = \cos 280^o = \cos(360^o - 80^o) = \cos 80^o

Finally, using the cofunction identity \cos x = \sin(90^o - x), we find

\cos 1000^o = \cos 80^o = \sin(90^o - 80^o) = \sin 10^o.

In this way, the sine or cosine of any angle can be reduced to the sine or cosine of some angle between 0^o and 45^o = \pi/4 radians. Since \pi/4 < 1, the above power series will converge reasonably rapidly.
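To make the payoff concrete, here is a short sketch (assuming Python; the helper names cos_series and sin_series are mine) comparing the direct series for \cos 1000^o against the reduced form \sin 10^o:

```python
import math

def cos_series(x, terms):
    """Partial sum of 1 - x^2/2! + x^4/4! - ... with the given number of terms."""
    return sum((-1)**n * x**(2*n) / math.factorial(2*n) for n in range(terms))

def sin_series(x, terms):
    """Partial sum of x - x^3/3! + x^5/5! - ... with the given number of terms."""
    return sum((-1)**n * x**(2*n + 1) / math.factorial(2*n + 1) for n in range(terms))

big, small = math.radians(1000), math.radians(10)   # ~17.45 and ~0.1745 radians
target = math.cos(big)                              # = sin(10 degrees) = 0.17365...

for terms in (5, 10, 15, 20):
    print(terms,
          abs(cos_series(big, terms) - target),     # direct series: error still huge
          abs(sin_series(small, terms) - target))   # reduced form: tiny error quickly
```

The reduced series is essentially exact after two or three terms, while the direct series at x \approx 17.45 should still be off by a large amount even after twenty terms, since the terms x^{2n}/(2n)! don't start shrinking until 2n exceeds x.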


Step 10. For the final part of this review, let’s take a second look at the Taylor series

e^x = \displaystyle 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \frac{x^6}{6!} + \frac{x^7}{7!} + \dots

Just to be silly, for no apparent reason whatsoever, let's replace x by ix and see what happens:

e^{ix} = \displaystyle 1 + ix + \frac{(ix)^2}{2!} + \frac{(ix)^3}{3!} + \frac{(ix)^4}{4!} + \frac{(ix)^5}{5!} + \dots = 1 + ix - \frac{x^2}{2!} - i \frac{x^3}{3!} + \frac{x^4}{4!} + i \frac{x^5}{5!} - \dots,

since the powers of i cycle through i, -1, -i, 1. Separating the terms that do and don't have an i, we find

e^{ix} = \displaystyle 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} \dots + i \left[\displaystyle x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} \dots \right].

Hmmmm… looks familiar….

So it makes sense to define

e^{ix} = \cos x + i \sin x,

which is called Euler's formula, thus revealing an unexpected connection between e^x and the trigonometric functions.
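As one last sanity check, here is a tiny sketch (assuming Python; cmath and math are in the standard library) verifying the formula numerically at a few points:

```python
import cmath, math

for x in (0.0, 1.0, math.pi/3, math.pi):
    lhs = cmath.exp(1j * x)                    # e^{ix}
    rhs = complex(math.cos(x), math.sin(x))    # cos x + i sin x
    print(x, lhs, rhs, abs(lhs - rhs))         # difference is ~0 each time
```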