Integration Using Schwinger Parametrization

I recently read the terrific article Integration Using Schwinger Parametrization, by David M. Bradley, Albert Natian, and Sean M. Stewart in the American Mathematical Monthly. I won’t reproduce the entire article here, but I’ll hit a couple of early highlights.

The basic premise of the article is that a complicated integral can become tractable by changing it into an apparently more complicated double integral. The idea stems from the gamma integral

\Gamma(p) = \displaystyle \int_0^\infty t^{p-1} e^{-t} \, dt,

where \Gamma(p) = (p-1)! if p is a positive integer. If we perform the substitution t = \phi u in the above integral, where \phi is a positive quantity independent of t, we obtain

\Gamma(p) = \displaystyle \int_0^\infty (\phi u)^{p-1} e^{-\phi u} \phi \, du = \displaystyle \int_0^\infty \phi^p u^{p-1} e^{-\phi u} \, du,

which may be rewritten as

\displaystyle \frac{1}{\phi^p} = \displaystyle \frac{1}{\Gamma(p)} \int_0^\infty t^{p-1} e^{-\phi t} \, dt

after changing the dummy variable back to t.
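As a quick sanity check (not part of the article), the identity can be verified numerically; here is a minimal sketch in Python using SciPy, where the particular values of p and \phi are arbitrary test choices:

```python
# Numerical check of 1/phi^p = (1/Gamma(p)) * integral of t^(p-1) e^(-phi t) over (0, inf)
# (arbitrary test values; requires SciPy)
from math import exp, gamma
from scipy.integrate import quad

p, phi = 2.5, 3.0
integral, _ = quad(lambda t: t**(p - 1) * exp(-phi * t), 0, float("inf"))
print(integral / gamma(p))   # right-hand side, computed numerically
print(phi**(-p))             # left-hand side; the two values agree
```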

A simple (!) application of this method is the famous Dirichlet integral

I = \displaystyle \int_0^\infty \frac{\sin x}{x} \, dx

which is pretty much unsolvable using techniques from freshman calculus. However, by setting \phi = x and p=1 in the identity above, and using the fact that \Gamma(1) = 0! = 1, we obtain

I = \displaystyle \int_0^\infty \sin x \int_0^\infty e^{-xt} \, dt \, dx

= \displaystyle \int_0^\infty \int_0^\infty e^{-xt} \sin x \, dx \, dt

after interchanging the order of integration. The inner integral can be found by integration by parts and is often included in tables of integrals:

I = \displaystyle \int_0^\infty -\left[ \frac{e^{-xt} (\cos x + t \sin x)}{1+t^2} \right]_{x=0}^{x=\infty} \, dt

= \displaystyle \int_0^\infty \left[0 +\frac{e^{0} (\cos 0 + t \sin 0)}{1+t^2} \right] \, dt

= \displaystyle \int_0^\infty \frac{1}{1+t^2} \, dt.

At this point, the integral is now a standard one from freshman calculus:

I = \displaystyle \left[ \tan^{-1} t \right]_0^\infty = \displaystyle \frac{\pi}{2} - 0 = \displaystyle \frac{\pi}{2}.
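Both of the key integrals in this evaluation can be checked numerically; here is a minimal sketch in Python using SciPy (the fixed value of t is an arbitrary choice):

```python
# Numerical check of the two key integrals above (requires SciPy)
from math import exp, sin, pi
from scipy.integrate import quad

t = 1.0                                             # arbitrary fixed parameter value
inner, _ = quad(lambda x: exp(-x * t) * sin(x), 0, float("inf"))
print(inner, 1 / (1 + t**2))                        # inner integral equals 1/(1+t^2)

outer, _ = quad(lambda s: 1 / (1 + s**2), 0, float("inf"))
print(outer, pi / 2)                                # outer integral equals pi/2
```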

In the article, the authors give many more applications of this method to other integrals, thus illustrating the famous quote, “An idea which can be used only once is a trick. If one can use it more than once it becomes a method.” The authors also add, “We present some examples to illustrate the utility of this technique in the hope that by doing so we may convince the reader that it makes a valuable addition to one’s integration toolkit.” I’m sold.

Horrible False Analogy

I had forgotten the precise assumptions on uniform convergence that guarantee that an infinite series can be differentiated term by term, so that one can safely conclude

\displaystyle \frac{d}{dx} \sum_{n=1}^\infty f_n(x) = \sum_{n=1}^\infty f_n'(x).

This was part of my studies in real analysis as a student, so I remembered there was a theorem but I had forgotten the details.

So, like just about everyone else on the planet, I went to Google to refresh my memory even though I knew that searching for mathematical results on Google can be iffy at best.

And I was not disappointed. Behold this laughably horrible false analogy (and even worse graphic) that I found on chegg.com:

Suppose Arti has to plan a birthday party and has lots of work to do like arranging stuff for decorations, planning venue for the party, arranging catering for the party, etc. All these tasks can not be done in one go and so need to be planned. Once the order of the tasks is decided, they are executed step by step so that all the arrangements are made in time and the party is a success.

Similarly, in Mathematics when a long expression needs to be differentiated or integrated, the calculation becomes cumbersome if the expression is considered as a whole but if it is broken down into small expressions, both differentiation and the integration become easy.

Pedagogically, I’m all for using whatever technique an instructor might deem necessary to “sell” abstract mathematical concepts to students. Nevertheless, I’m pretty sure that this particular party-planning analogy has no potency for students who have progressed far enough to rigorously study infinite series.

Solving Problems Submitted to MAA Journals (Part 7i)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

In previous posts, we reduced the problem to showing that if f(x) = 1 + \sqrt{2\pi} x e^{x^2/2} \Phi(x), then f(x) is always positive, where

\Phi(z) = \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^z e^{-t^2/2} \, dt

is the cumulative distribution function of the standard normal distribution. If we can prove this, then the original problem will be true.

Motivated by the graph of f(x), I thought of a two-step method for showing f must be positive: show that f is an increasing function, and show that \displaystyle \lim_{x \to -\infty} f(x) = 0. If I could prove both of these claims, then that would prove that f must always be positive.

I was able to show the second step by demonstrating that, if x<0,

\displaystyle f(x) = |x| e^{x^2/2} \int_{-\infty}^x \frac{1}{t^2} e^{-t^2/2} \, dt.

As discussed in the last post, the limit \displaystyle \lim_{x \to -\infty} f(x) = 0 follows from this equality. However, I just couldn’t figure out the first step.

So I kept trying.

And trying.

And trying.

Until it finally hit me: I’m working too hard! The goal is to show that f(x) is positive. Clearly, clearly, the right-hand side of the last equation is positive! So that’s the entire proof for x<0… there was no need to prove that f is increasing!

For x \ge 0, it’s even easier. If x is non-negative, then

f(x) = 1 + \sqrt{2\pi} x e^{x^2/2} \Phi(x) \ge 1 + \sqrt{2\pi} \cdot 0 \cdot 1 \cdot \frac{1}{2} = 1 > 0.

So, in either case, f(x) must be positive. Following the logical thread in the previous posts, this demonstrates that \hbox{Var}(Z_1 \mid Z_1 > a+bZ_2) < 1, so that \hbox{Var}(X \mid X > Y) < \hbox{Var}(X), thus concluding the solution.
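(For good measure, both branches of the argument can be checked numerically. Here is a minimal sketch in Python using NumPy/SciPy — not part of the solution, and the grid and the test value x = -2 are arbitrary choices.)

```python
# Numerical sanity checks for the positivity argument (requires NumPy/SciPy)
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def f(x):
    # f(x) = 1 + sqrt(2*pi)*x*exp(x^2/2)*Phi(x), computed via logs to avoid overflow
    return 1 + np.sign(x) * np.sqrt(2 * np.pi) * np.abs(x) * np.exp(x**2 / 2 + norm.logcdf(x))

xs = np.linspace(-30, 5, 701)
print(np.all(f(xs) > 0))                            # True: f is positive on the grid

x = -2.0                                            # arbitrary negative test value
integral, _ = quad(lambda t: np.exp(-t**2 / 2) / t**2, -np.inf, x)
print(f(x), abs(x) * np.exp(x**2 / 2) * integral)   # the two expressions for f agree
```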

And I was really annoyed at myself that I stumbled over the last step for so long, when the solution was literally right in front of me.

Solving Problems Submitted to MAA Journals (Part 7h)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

In previous posts, we reduced the problem to showing that if g(x) = \sqrt{2\pi} x e^{x^2/2} \Phi(x), then f(x) = 1 + g(x) is always positive, where

\Phi(z) = \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^z e^{-t^2/2} \, dt

is the cumulative distribution function of the standard normal distribution. If we can prove this, then the original problem will be true.

When I was solving this problem for the first time, my progress through the first few steps was hindered by algebra mistakes and the like, but I didn’t doubt that I was progressing toward the answer. At this point in the solution, however, I was genuinely stuck: nothing immediately popped to mind for showing that g(x) must be greater than -1.

So I turned to Mathematica, just to make sure I was on the right track. Based on the graph, the function f(x) certainly looks positive.

What’s more, the graph suggests attempting to prove a couple of things: f is an increasing function, and \displaystyle \lim_{x \to -\infty} f(x) = 0 or, equivalently, \displaystyle \lim_{x \to -\infty} g(x) = -1. If I could prove both of these claims, then that would prove that f must always be positive.

I started by trying to show

\displaystyle \lim_{x \to -\infty} g(x) = \lim_{x \to -\infty}  x e^{x^2/2} \int_{-\infty}^x e^{-t^2/2} \, dt = -1.

I vaguely remembered something about the asymptotic expansion of the above integral from a course decades ago, and so I consulted that course’s textbook, by Bender and Orszag, to refresh my memory. To derive the behavior of g(x) as x \to -\infty, we integrate by parts. (This is permissible: the integrands below are well-behaved if x<0, so that 0 is not in the range of integration.)

g(x) = \displaystyle x e^{x^2/2} \int_{-\infty}^x e^{-t^2/2} \, dt

= \displaystyle x e^{x^2/2} \int_{-\infty}^x \frac{1}{t} \frac{d}{dt} \left(-e^{-t^2/2}\right) \, dt

= \displaystyle  x e^{x^2/2} \left[ -\frac{1}{t} e^{-t^2/2} \right]_{-\infty}^x - x e^{x^2/2} \int_{-\infty}^x \frac{d}{dt} \left(\frac{1}{t} \right) \left( -e^{-t^2/2} \right) \, dt

= \displaystyle  x e^{x^2/2} \left[ -\frac{1}{x} e^{-x^2/2} - 0 \right] + |x| e^{x^2/2} \int_{-\infty}^x \frac{1}{t^2} e^{-t^2/2} \, dt

\displaystyle = - 1 + |x| e^{x^2/2} \int_{-\infty}^x \frac{1}{t^2} e^{-t^2/2} \, dt.

This is agonizingly close: the leading term is -1 as expected. However, I was stuck for the longest time trying to show that the second term goes to zero as x \to -\infty.

So, once again, I consulted Bender and Orszag, which outlined how to show this. We note that

\left|g(x) + 1\right| = \displaystyle |x| e^{x^2/2} \int_{-\infty}^x \frac{1}{t^2} e^{-t^2/2} \, dt < \displaystyle |x| e^{x^2/2} \int_{-\infty}^x \frac{1}{x^2} e^{-t^2/2} \, dt = \displaystyle \frac{|g(x)|}{x^2}.

Therefore,

\displaystyle \lim_{x \to -\infty} \left| \frac{g(x)+1}{g(x)} \right| \le \lim_{x \to -\infty} \frac{1}{x^2} = 0,

so that

\displaystyle \lim_{x \to -\infty} \left| \frac{g(x)+1}{g(x)} \right| =\displaystyle \lim_{x \to -\infty} \left| 1 + \frac{1}{g(x)} \right| = 0.

Therefore,

\displaystyle \lim_{x \to -\infty} \frac{1}{g(x)} = -1,

or

\displaystyle \lim_{x \to -\infty} g(x) = -1.
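This limit can also be seen numerically; here is a minimal sketch in Python using SciPy, where the logarithmic form is only there to keep e^{x^2/2} from overflowing for very negative x:

```python
# g(x) = sqrt(2*pi)*x*exp(x^2/2)*Phi(x) should approach -1 as x -> -infinity
import numpy as np
from scipy.stats import norm

def g(x):
    # computed via logs so that exp(x**2/2) does not overflow for large |x|
    return np.sign(x) * np.sqrt(2 * np.pi) * np.abs(x) * np.exp(x**2 / 2 + norm.logcdf(x))

for x in [-2.0, -5.0, -10.0, -30.0]:
    print(x, g(x))      # the values creep toward -1
```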

So (I thought) I was halfway home with the solution, and all that remained was to show that f was an increasing function.

And I was completely stuck at this point for a long time.

Until I realized — much to my utter embarrassment — that showing f was increasing was completely unnecessary, as discussed in the next post.

Solving Problems Submitted to MAA Journals (Part 7g)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

We suppose that E(X) = \mu_1, \hbox{SD}(X) = \sigma_1, E(Y) = \mu_2, and \hbox{SD}(Y) = \sigma_2. With these definitions, we may write X = \mu_1 + \sigma_1 Z_1 and Y = \mu_2 + \sigma_2 Z_2, where Z_1 and Z_2 are independent standard normal random variables.

The goal is to show that \hbox{Var}(X \mid X > Y) < \hbox{Var}(X). In previous posts, we showed that it will be sufficient to show that \hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) < 1, where a = (\mu_2 - \mu_1)/\sigma_1 and b = \sigma_2/\sigma_1. We also showed that P(Z_1 > a + bZ_2) = \Phi(c), where c = -a/\sqrt{b^2+1} and

\Phi(z) = \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^z e^{-t^2/2} \, dt

is the cumulative distribution function of the standard normal distribution.

To compute

\hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) = E(Z_1^2 \mid Z_1 > a + bZ_2) - [E(Z_1 \mid Z_1 > a + bZ_2)]^2,

we showed in the two previous posts that

E(Z_1 \mid Z_1 > a + bZ_2) = \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1} \Phi(c)}

and

E(Z_1^2 \mid Z_1 > a + bZ_2) = 1 - \displaystyle \frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) \Phi(c)}.

Therefore, writing A for the event Z_1 > a + bZ_2,

\hbox{Var}(Z_1 \mid A) = 1 -  \displaystyle\frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) \Phi(c)} - \left( \frac{e^{-c^2/2}}{\sqrt{2\pi (b^2+1)} \Phi(c)} \right)^2

= 1 -  \displaystyle\frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) \Phi(c)} - \frac{e^{-c^2}}{2\pi (b^2+1) [\Phi(c)]^2}

= 1 -  \displaystyle\frac{c}{ \sqrt{2\pi} (b^2+1) \Phi(c) e^{c^2/2}} - \frac{1}{2\pi (b^2+1) [\Phi(c)]^2e^{c^2}}

= 1 -  \displaystyle\frac{\sqrt{2\pi} c e^{c^2/2} \Phi(c) + 1}{2\pi (b^2+1) [\Phi(c)]^2 e^{c^2}}.

To show that \hbox{Var}(Z_1 \mid A) < 1, it suffices to show that the second term is positive. Furthermore, since the denominator of that term is positive, it suffices to show that the numerator f(c) = 1 + \sqrt{2\pi} c e^{c^2/2} \Phi(c) is positive.
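(As a quick sanity check that the variance formula above is on target, here is a minimal Monte Carlo sketch in Python using NumPy/SciPy; the values of a and b are arbitrary test choices, and this is not part of the solution.)

```python
# Monte Carlo check of the closed form for Var(Z1 | Z1 > a + b*Z2) (requires NumPy/SciPy)
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
a, b, n = 0.8, 1.5, 5_000_000                       # arbitrary test values
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
print(z1[z1 > a + b * z2].var())                    # simulated conditional variance

c = -a / np.sqrt(b**2 + 1)
phi_c = norm.cdf(c)
numerator = np.sqrt(2 * np.pi) * c * np.exp(c**2 / 2) * phi_c + 1
print(1 - numerator / (2 * np.pi * (b**2 + 1) * phi_c**2 * np.exp(c**2)))  # closed form
```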

And, to be honest, I was stuck here for the longest time.

At some point, I decided to plot this function in Mathematica to get some ideas flowing:

The function certainly looks like it’s always positive. What’s more, the graph suggests attempting to prove a couple of things: f is an increasing function, and \displaystyle \lim_{x \to -\infty} f(x) = 0. If I could prove both of these claims, then that would prove that f must always be positive.

Spoiler alert: this was almost a dead-end approach to the problem. I managed to prove one of the two claims, but not the other. (I don’t doubt it’s true, but I didn’t find a proof.) I’ll discuss this in the next post.

Solving Problems Submitted to MAA Journals (Part 7f)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

We suppose that E(X) = \mu_1, \hbox{SD}(X) = \sigma_1, E(Y) = \mu_2, and \hbox{SD}(Y) = \sigma_2. With these definitions, we may write X = \mu_1 + \sigma_1 Z_1 and Y = \mu_2 + \sigma_2 Z_2, where Z_1 and Z_2 are independent standard normal random variables.

The goal is to show that \hbox{Var}(X \mid X > Y) < \hbox{Var}(X). In previous posts, we showed that it will be sufficient to show that \hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) < 1, where a = (\mu_2 - \mu_1)/\sigma_1 and b = \sigma_2/\sigma_1. We also showed that P(Z_1 > a + bZ_2) = \Phi(c), where c = -a/\sqrt{b^2+1} and

\Phi(z) = \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^z e^{-t^2/2} \, dt

is the cumulative distribution function of the standard normal distribution.

To compute

\hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) = E(Z_1^2 \mid Z_1 > a + bZ_2) - [E(Z_1 \mid Z_1 > a + bZ_2)]^2,

we showed in the previous post that

E(Z_1 \mid Z_1 > a + bZ_2) = \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1} \Phi(c)}.

We now turn to the second conditional expectation:

E(Z_1^2 \mid Z_1 > a + bZ_2) = \displaystyle \frac{E(Z_1^2 I_{Z_1 > a+b Z_2})}{P(Z_1 > a + bZ_2)} = \frac{E(Z_1^2 I_{Z_1 > a+b Z_2})}{\Phi(c)}.

The expected value in the numerator is a double integral:

E(Z_1^2 I_{Z_1 > a+b Z_2}) = \displaystyle \int_{-\infty}^\infty \int_{-\infty}^\infty z_1^2 I_{z_1 > a + bz_2} f(z_1,z_2) \, dz_1 dz_2 = \displaystyle \int_{-\infty}^\infty \int_{a+bz_2}^\infty z_1^2 f(z_1,z_2) \, dz_1 dz_2,

where f(z_1,z_2) is the joint probability density function of Z_1 and Z_2. Since Z_1 and Z_2 are independent, f(z_1,z_2) is the product of the individual probability density functions:

f(z_1,z_2) = \displaystyle \frac{1}{\sqrt{2\pi}} e^{-z_1^2/2} \frac{1}{\sqrt{2\pi}} e^{-z_2^2/2} = \frac{1}{2\pi} e^{-z_1^2/2} e^{-z_2^2/2}.

Therefore, we must compute

E(Z_1^2 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_{a+bz_2}^\infty z_1^2 e^{-z_1^2/2} e^{-z_2^2/2} \, dz_1 dz_2,

where I wrote A for the event Z_1 > a + bZ_2.

I’m not above admitting that I first stuck this into Mathematica to make sure that this was doable. To begin, we use integration by parts on the inner integral:

\displaystyle \int_{a+bz_2}^\infty z_1^2 e^{-z_1^2/2} \, dz_1 = \int_{a+bz_2}^\infty z_1 \frac{d}{dz_1} \left(-e^{-z_1^2/2} \right) \, dz_1

=\displaystyle \left[ -z_1 e^{-z_1^2/2} \right]_{a+bz_2}^\infty + \int_{a+bz_2}^\infty e^{-z_1^2/2} \, dz_1

= (a+bz_2) \displaystyle \exp \left[-\frac{(a+bz_2)^2}{2} \right] + \int_{a+bz_2}^\infty e^{-z_1^2/2} \, dz_1

Therefore,

E(Z_1^2 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a+bz_2) \exp \left[-\frac{(a+bz_2)^2}{2} \right] \exp \left[ -\frac{z_2^2}{2} \right] \, dz_2 + \int_{-\infty}^\infty \int_{a+bz_2}^\infty \frac{1}{2\pi} e^{-z_1^2/2} e^{-z_2^2/2} \, dz_1 dz_2.

The second term is equal to \Phi(c) since the double integral is P(Z_1 > a+bZ_2). For the first integral, we complete the square as before:

E(Z_1^2 I_A) = \Phi(c) + \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a+bz_2) \exp \left[-\frac{(b^2+1)z_2^2 + 2abz_2 + a^2}{2} \right] \, dz_2

= \Phi(c) + \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a + bz_2) \exp \left[ -\frac{b^2+1}{2} \left( z_2^2 + \frac{2abz_2}{b^2+1} \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \right) \right] \exp \left[ -\frac{1}{2} \left(a^2 \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \right) \right] dz_2

= \Phi(c) +\displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a + bz_2)\exp \left[ -\frac{b^2+1}{2} \left( z_2^2 + \frac{2abz_2}{b^2+1} + \frac{a^2b^2}{(b^2+1)^2} \right) \right] \exp \left[ -\frac{1}{2} \left(a^2 - \frac{a^2b^2}{b^2+1} \right) \right] dz_2

= \Phi(c) +\displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a + bz_2)\exp \left[ -\frac{b^2+1}{2} \left( z_2 + \frac{ab}{b^2+1} \right)^2 \right] \exp \left[ -\frac{1}{2} \left( \frac{a^2}{b^2+1} \right) \right] dz_2

= \Phi(c) +\displaystyle \frac{e^{-c^2/2}}{2\pi} \int_{-\infty}^\infty (a + bz_2)\exp \left[ -\frac{b^2+1}{2} \left( z_2 + \frac{ab}{b^2+1} \right)^2 \right] dz_2.

I now rewrite the integrand so that it has the form of the probability density function of a normal distribution, writing 2\pi = \sqrt{2\pi} \sqrt{2\pi} and multiplying and dividing by \sqrt{b^2+1} in the denominator:

E(Z_1^2 I_A) = \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1}} \int_{-\infty}^\infty (a+bz_2) \frac{1}{\sqrt{2\pi} \sqrt{ \displaystyle \frac{1}{b^2+1}}} \exp \left[ - \frac{\left(z_2 + \displaystyle \frac{ab}{b^2+1} \right)^2}{2 \cdot \displaystyle \frac{1}{b^2+1}} \right] dz_2.

This is an example of making a problem easier by apparently making it harder. The integrand is (a + bz_2) multiplied by the probability density function of a normally distributed random variable V with E(V) = -ab/(b^2+1) and \hbox{Var}(V) = 1/(b^2+1). Therefore, the integral is equal to E(a + bV), so that

E(Z_1^2 I_A) = \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1}} \left(a - b \cdot \frac{ab}{b^2+1} \right),

=  \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi (b^2+1)}} \left( a -\frac{ab^2}{b^2+1} \right)

= \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi (b^2+1)}} \cdot \frac{a}{b^2+1}

= \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi} (b^2+1) } \cdot \frac{a}{\sqrt{b^2+1}}

= \Phi(c) - \displaystyle \frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) }.

Therefore,

E(Z_1^2 \mid Z_1 > a + bZ_2) = \displaystyle \frac{E(Z_1^2 I_A)}{\Phi(c)} = 1 - \displaystyle \frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) \Phi(c)}.

We note that this reduces to what we found in the second special case: if \mu_1=\mu_2=0, then a = 0 and c = 0, so that E(Z_1^2 \mid Z_1 > a + bZ_2) = 1, matching what we found earlier.
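Here is a quick Monte Carlo check of this conditional second moment (a minimal sketch in Python using NumPy/SciPy; the values of a and b are arbitrary test choices, not part of the solution):

```python
# Monte Carlo check of E(Z1^2 | Z1 > a + b*Z2) against the closed form (requires NumPy/SciPy)
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
a, b, n = -0.5, 2.0, 5_000_000                      # arbitrary test values
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
print(np.mean(z1[z1 > a + b * z2] ** 2))            # simulated conditional second moment

c = -a / np.sqrt(b**2 + 1)
print(1 - c * np.exp(-c**2 / 2) / (np.sqrt(2 * np.pi) * (b**2 + 1) * norm.cdf(c)))
```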

In the next post, we consider the calculation of \hbox{Var}(Z_1 \mid Z_1 > a + bZ_2).

Solving Problems Submitted to MAA Journals (Part 7e)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

We suppose that E(X) = \mu_1, \hbox{SD}(X) = \sigma_1, E(Y) = \mu_2, and \hbox{SD}(Y) = \sigma_2. With these definitions, we may write X = \mu_1 + \sigma_1 Z_1 and Y = \mu_2 + \sigma_2 Z_2, where Z_1 and Z_2 are independent standard normal random variables.

The goal is to show that \hbox{Var}(X \mid X > Y) < \hbox{Var}(X). In the previous two posts, we showed that it will be sufficient to show that \hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) < 1, where a = (\mu_2 - \mu_1)/\sigma_1 and b = \sigma_2/\sigma_1. We also showed that P(Z_1 > a + bZ_2) = \Phi(c), where c = -a/\sqrt{b^2+1} and

\Phi(z) = \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^z e^{-t^2/2} \, dt

is the cumulative distribution function of the standard normal distribution.

To compute

\hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) = E(Z_1^2 \mid Z_1 > a + bZ_2) - [E(Z_1 \mid Z_1 > a + bZ_2)]^2,

we begin with

E(Z_1 \mid Z_1 > a + bZ_2) = \displaystyle \frac{E(Z_1 I_{Z_1 > a+b Z_2})}{P(Z_1 > a + bZ_2)} = \frac{E(Z_1 I_{Z_1 > a+b Z_2})}{\Phi(c)}.

The expected value in the numerator is a double integral:

E(Z_1 I_{Z_1 > a+b Z_2}) = \displaystyle \int_{-\infty}^\infty \int_{-\infty}^\infty z_1 I_{z_1 > a + bz_2} f(z_1,z_2) \, dz_1 dz_2 = \displaystyle \int_{-\infty}^\infty \int_{a+bz_2}^\infty z_1 f(z_1,z_2) \, dz_1 dz_2,

where f(z_1,z_2) is the joint probability density function of Z_1 and Z_2. Since Z_1 and Z_2 are independent, f(z_1,z_2) is the product of the individual probability density functions:

f(z_1,z_2) = \displaystyle \frac{1}{\sqrt{2\pi}} e^{-z_1^2/2} \frac{1}{\sqrt{2\pi}} e^{-z_2^2/2} = \frac{1}{2\pi} e^{-z_1^2/2} e^{-z_2^2/2}.

Therefore, we must compute

E(Z_1 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_{a+bz_2}^\infty z_1 e^{-z_1^2/2} e^{-z_2^2/2} \, dz_1 dz_2,

where I wrote A for the event Z_1 > a + bZ_2.

I’m not above admitting that I first stuck this into Mathematica to make sure that this was doable. To begin, we compute the inner integral:

E(Z_1 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ - e^{-z_1^2/2} \right]_{a+bz_2}^\infty e^{-z_2^2/2} \, dz_2

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{(a+bz_2)^2}{2} \right] \exp\left[-\frac{z_2^2}{2} \right] dz_2

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{(b^2+1)z_2^2+2abz_2+a^2}{2} \right] \, dz_2.

At this point, I used a standard technique/trick of completing the square to rewrite the integrand as a common pdf.

E(Z_1 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{b^2+1}{2} \left( z_2^2 + \frac{2abz_2}{b^2+1} \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \right) \right] \exp \left[ -\frac{1}{2} \left(a^2 \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \right) \right] dz_2

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{b^2+1}{2} \left( z_2^2 + \frac{2abz_2}{b^2+1} + \frac{a^2b^2}{(b^2+1)^2} \right) \right] \exp \left[ -\frac{1}{2} \left(a^2 - \frac{a^2b^2}{b^2+1} \right) \right] dz_2

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{b^2+1}{2} \left( z_2 + \frac{ab}{b^2+1} \right)^2 \right] \exp \left[ -\frac{1}{2} \left( \frac{a^2}{b^2+1} \right) \right] dz_2

= \displaystyle \frac{e^{-c^2/2}}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{b^2+1}{2} \left( z_2 + \frac{ab}{b^2+1} \right)^2 \right]  dz_2.

I now rewrite the integrand so that it has the form of the probability density function of a normal distribution, writing 2\pi = \sqrt{2\pi} \sqrt{2\pi} and multiplying and dividing by \sqrt{b^2+1} in the denominator:

E(Z_1 I_A) = \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1}} \int_{-\infty}^\infty  \frac{1}{\sqrt{2\pi} \sqrt{ \displaystyle \frac{1}{b^2+1}}} \exp \left[ - \frac{\left(z_2 + \displaystyle \frac{ab}{b^2+1} \right)^2}{2 \cdot \displaystyle \frac{1}{b^2+1}} \right] dz_2.

This is an example of making a problem easier by apparently making it harder. The integrand is the probability density function of a normally distributed random variable V with E(V) = -ab/(b^2+1) and \hbox{Var}(V) = 1/(b^2+1), so the integral is equal to P(-\infty < V < \infty). Since P(-\infty < V < \infty) = 1, we have

E(Z_1 I_A) = \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1}},

and so

E(Z_1 \mid A) = \displaystyle \frac{E(Z_1 I_A)}{\Phi(c)} = \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1} \Phi(c)}.

We note that this reduces to what we found in the second special case: if \mu_1=\mu_2=0, \sigma_1 = 1, and \sigma_2 = \sigma, then a = 0, b = \sigma, and c = 0. Since \Phi(0) = \frac{1}{2}, we have

E(Z_1 \mid A) = \displaystyle \frac{e^0}{\sqrt{2\pi}\sqrt{\sigma^2+1} \cdot \frac{1}{2}} = \sqrt{\frac{2}{\pi(\sigma^2+1)}},

matching what we found earlier.
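Here is a quick Monte Carlo check of this conditional expectation (a minimal sketch in Python using NumPy/SciPy; the values of a and b are arbitrary test choices, not part of the solution):

```python
# Monte Carlo check of E(Z1 | Z1 > a + b*Z2) against the formula above (requires NumPy/SciPy)
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
a, b, n = 1.0, 0.5, 5_000_000                       # arbitrary test values
z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
print(z1[z1 > a + b * z2].mean())                   # simulated conditional mean

c = -a / np.sqrt(b**2 + 1)
print(np.exp(-c**2 / 2) / (np.sqrt(2 * np.pi) * np.sqrt(b**2 + 1) * norm.cdf(c)))
```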

In the next post, we consider the calculation of E(Z_1^2 \mid Z_1 > a + bZ_2).

Solving Problems Submitted to MAA Journals (Part 7b)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

Not quite knowing how to start, I decided to begin by simplifying the problem and assume that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn’t solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I solved this special case in the previous post.

Next, to work on a special case that was somewhere between the general case and the first special case, I kept X as a standard normal distribution but changed Y to have a nonzero mean. As it turned out, this significantly complicated the problem (as we’ll see in the next post), and I got stuck.

So I changed course: for a second attempt, I kept X as a standard normal distribution but changed Y so that E(Y) = 0 and \hbox{SD}(Y) = \sigma, where \sigma could be something other than 1. The goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = \sigma^2+1, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next, I wrote Y = \sigma Z, where Z has a standard normal distribution. Then

E(X I_{X>\sigma Z}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x e^{-x^2/2} e^{-z^2/2} \, dx dz,

where we have used the joint probability density function for the independent random variables X and Z. The region of integration is \{(x,z) \in \mathbb{R}^2 \mid x > \sigma z \}, matching the requirement X > \sigma Z. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_{\sigma z}^\infty e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-\sigma^2 z^2/2} \right] e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-(\sigma^2+1) z^2/2} \, dz.

At this point, I rewrote the integrand to be the probability density function of a random variable:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{\sigma^2+1}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{ \frac{1}{\sigma^2+1}}} \exp \left[ -\frac{z^2}{2 \cdot \frac{1}{\sigma^2+1}} \right] \, dz.

The integrand is the probability density function of a normal random variable with mean 0 and variance \frac{1}{\sigma^2+1}, and so the integral must be equal to 1. We conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{\sqrt{2\pi(\sigma^2+1)}} }{ \frac{1}{2} } = \sqrt{\frac{2}{\pi(\sigma^2+1)}}.

Next, we compute the other conditional expectation:

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{E(X^2 I_{X>\sigma Z})}{P(X>\sigma Z)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x^2 e^{-x^2/2} e^{-z^2/2} \, dx dz.

The inner integral can be computed using integration by parts:

\displaystyle \int_{\sigma z}^\infty x^2 e^{-x^2/2} \, dx = \int_{\sigma z}^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_{\sigma z}^\infty + \int_{\sigma z}^\infty e^{-x^2/2} \, dx

= \sigma z e^{-\sigma^2 z^2/2} + \displaystyle \int_{\sigma z}^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-\sigma^2 z^2/2} e^{-z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz

= \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-(\sigma^2+1) z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz.

We could calculate the first integral, but we can immediately see that it’s going to be equal to 0 since the integrand z e^{-(\sigma^2+1) z^2/2} is an odd function. The double integral is equal to P(X>\sigma Z), which we’ve already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \frac{2}{\pi(\sigma^2 + 1)},

which is indeed less than 1. If \sigma = 1, we recover the conditional variance found in the first special case.
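A quick simulation backs up this formula (a minimal sketch in Python using NumPy; \sigma = 2 is an arbitrary test value, and this is not part of the solution):

```python
# Simulation check of Var(X | X > Y) = 1 - 2/(pi*(sigma^2+1)) for this special case
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 2.0, 5_000_000                           # arbitrary test value of sigma
x = rng.standard_normal(n)
y = sigma * rng.standard_normal(n)
print(x[x > y].var())                               # simulated conditional variance
print(1 - 2 / (np.pi * (sigma**2 + 1)))             # closed form; the two agree
```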

After tackling these two special cases, we start the general case with the next post.

Solving Problems Submitted to MAA Journals (Part 7a)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

I admit I did a double-take when I first read this problem. If X and Y are independent, then the event X>Y contains almost no information. How then, so I thought, could the conditional distribution of X given X>Y be narrower than the unconditional distribution of X?

Then I thought: I can believe that E(X \mid X > Y) is greater than E(X): if we’re given that X>Y, then we know that X must be larger than something. So maybe it’s possible for \hbox{Var}(X \mid X>Y) to be less than \hbox{Var}(X).

Still, not quite knowing how to start, I decided to begin by simplifying the problem and assume that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn’t solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I also hoped that solving this special case might give me some psychological confidence that I would eventually be able to solve the general case.

For the special case, the goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = 2, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next,

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_y^\infty x e^{-x^2/2} e^{-y^2/2} \, dx dy,

where we have used the joint probability density function for X and Y. The region of integration is \{(x,y) \in \mathbb{R}^2 \mid x > y \}, taking care of the requirement X > Y. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_y^\infty e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-y^2/2} \right] e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-y^2} \, dy.

At this point, I used a standard technique/trick of integration by rewriting the integrand to be the probability density function of a random variable. In this case, the random variable is normally distributed with mean 0 and variance 1/2:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{2}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{1/2}} \exp \left[ -\frac{y^2}{2 \cdot \frac{1}{2}} \right] \, dy.

The integral must be equal to 1, and so we conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{2\sqrt{\pi}} }{ \frac{1}{2} } = \frac{1}{\sqrt{\pi}}.

We parenthetically note that E(X \mid X>Y) > 0, matching my initial intuition.

Next, we compute the other conditional expectation:

E(X^2 \mid X > Y) = \displaystyle \frac{E(X^2 I_{X>Y})}{P(X>Y)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_y^\infty x^2 e^{-x^2/2} e^{-y^2/2} \, dx dy.

The inner integral can be computed using integration by parts:

\displaystyle \int_y^\infty x^2 e^{-x^2/2} \, dx = \int_y^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_y^\infty + \int_y^\infty e^{-x^2/2} \, dx

= y e^{-y^2/2} + \displaystyle \int_y^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > Y) = \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2/2} e^{-y^2/2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy

= \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy.

We could calculate the first integral, but we can immediately see that it’s going to be equal to 0 since the integrand y e^{-y^2} is an odd function. The double integral is equal to P(X>Y), which we’ve already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \left( \frac{1}{\sqrt{\pi}} \right)^2 = 1 - \frac{1}{\pi},

which is indeed less than 1.
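A quick simulation supports both conditional moments in this special case (a minimal sketch in Python using NumPy, not part of the solution):

```python
# Simulation check of E(X | X > Y) = 1/sqrt(pi) and Var(X | X > Y) = 1 - 1/pi
import numpy as np

rng = np.random.default_rng(0)
n = 5_000_000
x, y = rng.standard_normal(n), rng.standard_normal(n)
conditioned = x[x > y]
print(conditioned.mean(), 1 / np.sqrt(np.pi))       # conditional mean vs 1/sqrt(pi)
print(conditioned.var(), 1 - 1 / np.pi)             # conditional variance vs 1 - 1/pi
```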

This solves the problem for the special case of two independent standard normal random variables. This of course does not yet solve the general case, but my hope was that solving this problem might give me some intuition about the general case, which I’ll develop as this series progresses.

Solving Problems Submitted to MAA Journals (Part 6e)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

Let D_r be the interior of the circle centered at the origin O with radius r. Also, let C(P,Q) denote the circle with diameter \overline{PQ}, and let R = OP be the distance of P from the origin.

In the previous post, we showed that

\hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) = \sqrt{1-r^2}.

To find \hbox{Pr}(C(P,Q) \subset D_1), I will integrate this conditional probability against the probability density function of R:

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr,

where F(r) is the cumulative distribution function of R. For 0 \le r \le 1,

F(r) = \hbox{Pr}(R \le r) = \hbox{Pr}(P \in D_r) = \displaystyle \frac{\hbox{area}(D_r)}{\hbox{area}(D_1)} = \frac{\pi r^2}{\pi} = r^2.

Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr

= \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr.

To calculate this integral, I’ll use the substitution u = 1-r^2. Then the endpoints r=0 and r=1 become u = 1-0^2 = 1 and u = 1-1^2 = 0. Also, du = -2r \, dr. Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr

= \displaystyle \int_1^0 -\sqrt{u} \, du

= \displaystyle \int_0^1 \sqrt{u} \, du

= \displaystyle \frac{2}{3} \left[  u^{3/2} \right]_0^1

=\displaystyle  \frac{2}{3}\left[ (1)^{3/2} - (0)^{3/2} \right]

= \displaystyle \frac{2}{3},

confirming the answer I had guessed from simulations.
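For completeness, here is roughly the kind of simulation I have in mind (a minimal sketch in Python using NumPy, not the original code; the circle C(P,Q) lies inside the unit circle exactly when the distance from its center to the origin plus its radius is less than 1):

```python
# Monte Carlo estimate of Pr(C(P,Q) lies inside the unit circle)
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000

def random_points_in_disk(n):
    # uniform sampling in the unit disk: r = sqrt(U) corrects for area
    r = np.sqrt(rng.random(n))
    theta = 2 * np.pi * rng.random(n)
    return r * np.cos(theta), r * np.sin(theta)

px, py = random_points_in_disk(n)
qx, qy = random_points_in_disk(n)
mx, my = (px + qx) / 2, (py + qy) / 2               # center of the circle on diameter PQ
rho = np.hypot(px - qx, py - qy) / 2                # its radius
print(np.mean(np.hypot(mx, my) + rho < 1))          # ~ 0.6667 = 2/3
```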