Solving Problems Submitted to MAA Journals (Part 7h)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

In previous posts, we reduced the problem to showing that if g(x) = \sqrt{2\pi} x e^{x^2/2} \Phi(x), then f(x) = 1 + g(x) is always positive, where

\Phi(z) = \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^z e^{-t^2/2} \, dt

is the cumulative distribution function of the standard normal distribution. If we can prove this, then the original claim will follow.

When I was solving this problem for the first time, my progress through the first few steps was hindered by algebra mistakes and the like, but I didn’t doubt that I was progressing toward the answer. At this point in the solution, however, I was genuinely stuck: nothing immediately popped to mind for showing that g(x) must be greater than -1.

So I turned to Mathematica, just to make sure I was on the right track. Based on the graph, the function f(x) certainly looks positive.

What’s more, the graph suggests attempting to prove a couple of things: f is an increasing function, and \displaystyle \lim_{x \to -\infty} f(x) = 0 or, equivalently, \displaystyle \lim_{x \to -\infty} g(x) = -1. If I could prove both of these claims, then that would prove that f must always be positive.
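
What the graph suggests can also be checked numerically. Here is a quick spot-check, sketched in Python with SciPy rather than Mathematica (the test grid is an arbitrary choice); it evaluates e^{x^2/2} \Phi(x) as \exp(x^2/2 + \log \Phi(x)) so that \Phi(x) does not underflow at very negative x:

```python
import numpy as np
from scipy.stats import norm

def f(x):
    # f(x) = 1 + sqrt(2*pi) * x * exp(x^2/2) * Phi(x), with exp(x^2/2) * Phi(x)
    # computed as exp(x^2/2 + log Phi(x)) to avoid underflow of Phi(x)
    return 1 + np.sqrt(2 * np.pi) * x * np.exp(x**2 / 2 + norm.logcdf(x))

xs = np.linspace(-30, 10, 401)                      # arbitrary test grid
print(f(np.array([-30.0, -10.0, -2.0, 0.0, 2.0])))
print(f(xs).min())                                  # smallest sampled value
```

Every sampled value comes out positive, with the smallest values occurring at the most negative x, which is consistent with the two claims above.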

I started by trying to show

\displaystyle \lim_{x \to -\infty} g(x) = \lim_{x \to -\infty}  x e^{x^2/2} \int_{-\infty}^x e^{-t^2/2} \, dt = -1.

I vaguely remembered something about the asymptotic expansion of the above integral from a course decades ago, and so I consulted that course’s textbook, by Bender and Orszag, to refresh my memory. To derive the behavior of g(x) as x \to -\infty, we integrate by parts. (This is permissible: the integrands below are well-behaved if x<0, so that 0 is not in the range of integration.)

g(x) = \displaystyle x e^{x^2/2} \int_{-\infty}^x e^{-t^2/2} \, dt

= \displaystyle x e^{x^2/2} \int_{-\infty}^x \frac{1}{t} \frac{d}{dt} \left(-e^{-t^2/2}\right) \, dt

= \displaystyle  x e^{x^2/2} \left[ -\frac{1}{t} e^{-t^2/2} \right]_{-\infty}^x - x e^{x^2/2} \int_{-\infty}^x \frac{d}{dt} \left(\frac{1}{t} \right) \left( -e^{-t^2/2} \right) \, dt

= \displaystyle  x e^{x^2/2} \left[ -\frac{1}{x} e^{-x^2/2} - 0 \right] + |x| e^{x^2/2} \int_{-\infty}^x \frac{1}{t^2} e^{-t^2/2} \, dt

\displaystyle = -1+ |x| e^{x^2/2} \int_{-\infty}^x \frac{1}{t^2} e^{-t^2/2} \, dt.

This is agonizingly close: the leading term is -1 as expected. However, I was stuck for the longest time trying to show that the second term goes to zero as x \to -\infty.

So, once again, I consulted Bender and Orszag, which outlined how to show this. We note that

\left|g(x) + 1\right| = \displaystyle |x| e^{x^2/2} \int_{-\infty}^x \frac{1}{t^2} e^{-t^2/2} \, dt < \displaystyle |x| e^{x^2/2} \int_{-\infty}^x \frac{1}{x^2} e^{-t^2/2} \, dt = \displaystyle \frac{|g(x)|}{x^2}.

Therefore, dividing both sides by |g(x)|,

\displaystyle \lim_{x \to -\infty} \left| \frac{g(x)+1}{g(x)} \right| \le \lim_{x \to -\infty} \frac{1}{x^2} = 0,

so that

\displaystyle \lim_{x \to -\infty} \left| \frac{g(x)+1}{g(x)} \right| =\displaystyle \lim_{x \to -\infty} \left| 1 + \frac{1}{g(x)} \right| = 0.

Therefore,

\displaystyle \lim_{x \to -\infty} \frac{1}{g(x)} = -1,

or

\displaystyle \lim_{x \to -\infty} g(x) = -1.
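
As a numerical cross-check of this limit (again a sketch in Python rather than Mathematica; the test points are arbitrary), we can watch g(x) + 1 shrink at roughly the 1/x^2 rate suggested by the bound above:

```python
import numpy as np
from scipy.stats import norm

def g(x):
    # g(x) = sqrt(2*pi) * x * exp(x^2/2) * Phi(x), evaluated in log space
    return np.sqrt(2 * np.pi) * x * np.exp(x**2 / 2 + norm.logcdf(x))

for x in (-5.0, -10.0, -20.0, -40.0):
    print(x, g(x) + 1, 1 / x**2)   # g(x) + 1 is comparable to 1/x^2
```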

So (I thought) I was halfway home with the solution, and all that remained was to show that f was an increasing function.

And I was completely stuck at this point for a long time.

Until I realized — much to my utter embarrassment — that showing f was increasing was completely unnecessary, as discussed in the next post.

Solving Problems Submitted to MAA Journals (Part 7g)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

We suppose that E(X) = \mu_1, \hbox{SD}(X) = \sigma_1, E(Y) = \mu_2, and \hbox{SD}(Y) = \sigma_2. With these definitions, we may write X = \mu_1 + \sigma_1 Z_1 and Y = \mu_2 + \sigma_2 Z_2, where Z_1 and Z_2 are independent standard normal random variables.

The goal is to show that \hbox{Var}(X \mid X > Y) < \hbox{Var}(X). In previous posts, we showed that it will be sufficient to show that \hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) < 1, where a = (\mu_2 - \mu_1)/\sigma_1 and b = \sigma_2/\sigma_1. We also showed that P(Z_1 > a + bZ_2) = \Phi(c), where c = -a/\sqrt{b^2+1} and

\Phi(z) = \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^z e^{-t^2/2} \, dt

is the cumulative distribution function of the standard normal distribution.

To compute

\hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) = E(Z_1^2 \mid Z_1 > a + bZ_2) - [E(Z_1 \mid Z_1 > a + bZ_2)]^2,

we showed in the two previous posts that

E(Z_1 \mid Z_1 > a + bZ_2) = \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1} \Phi(c)}

and

E(Z_1^2 \mid Z_1 > a + bZ_2) = 1 -\displaystyle \frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) \Phi(c)}.

Therefore, writing A for the event Z_1 > a + bZ_2,

\hbox{Var}(Z_1 \mid A) = 1 -  \displaystyle\frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) \Phi(c)} - \left( \frac{e^{-c^2/2}}{\sqrt{2\pi (b^2+1)} \Phi(c)} \right)^2

= 1 -  \displaystyle\frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) \Phi(c)} - \frac{e^{-c^2}}{2\pi (b^2+1) [\Phi(c)]^2}

= 1 -  \displaystyle\frac{c}{ \sqrt{2\pi} (b^2+1) \Phi(c) e^{c^2/2}} - \frac{1}{2\pi (b^2+1) [\Phi(c)]^2e^{c^2}}

= 1 -  \displaystyle\frac{\sqrt{2\pi} c e^{c^2/2} \Phi(c) + 1}{2\pi (b^2+1) [\Phi(c)]^2 e^{c^2}}.
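
Before going further, it is worth sanity-checking this closed form against a simulation. The sketch below is in Python rather than Mathematica; the values a = 0.7 and b = 1.3, the seed, and the sample size are arbitrary choices for the test:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
a, b = 0.7, 1.3                        # arbitrary test values
c = -a / np.sqrt(b**2 + 1)

# closed form for Var(Z1 | Z1 > a + b*Z2) derived above
closed = 1 - (np.sqrt(2 * np.pi) * c * np.exp(c**2 / 2) * norm.cdf(c) + 1) / (
    2 * np.pi * (b**2 + 1) * norm.cdf(c)**2 * np.exp(c**2))

# Monte Carlo estimate of the same conditional variance
z1 = rng.standard_normal(2_000_000)
z2 = rng.standard_normal(2_000_000)
mc = z1[z1 > a + b * z2].var()

print(closed, mc)   # the closed form and the simulation should agree closely
```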

To show that \hbox{Var}(Z_1 \mid A) < 1, it suffices to show that the subtracted term is positive. Furthermore, since its denominator is positive, it suffices to show that its numerator f(c) = 1 + \sqrt{2\pi} c e^{c^2/2} \Phi(c) is also positive.

And, to be honest, I was stuck here for the longest time.

At some point, I decided to plot this function in Mathematica to see if that would get some ideas flowing:

The function certainly looks like it’s always positive. What’s more, the graph suggests attempting to prove a couple of things: f is an increasing function, and \displaystyle \lim_{x \to -\infty} f(x) = 0. If I could prove both of these claims, then that would prove that f must always be positive.

Spoiler alert: this was almost a dead-end approach to the problem. I managed to prove one of the claims, but not the other. (I don’t doubt it’s true, but I didn’t find a proof.) I’ll discuss this in the next post.

Solving Problems Submitted to MAA Journals (Part 7f)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

We suppose that E(X) = \mu_1, \hbox{SD}(X) = \sigma_1, E(Y) = \mu_2, and \hbox{SD}(Y) = \sigma_2. With these definitions, we may write X = \mu_1 + \sigma_1 Z_1 and Y = \mu_2 + \sigma_2 Z_2, where Z_1 and Z_2 are independent standard normal random variables.

The goal is to show that \hbox{Var}(X \mid X > Y) < \hbox{Var}(X). In previous posts, we showed that it will be sufficient to show that \hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) < 1, where a = (\mu_2 - \mu_1)/\sigma_1 and b = \sigma_2/\sigma_1. We also showed that P(Z_1 > a + bZ_2) = \Phi(c), where c = -a/\sqrt{b^2+1} and

\Phi(z) = \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^z e^{-t^2/2} \, dt

is the cumulative distribution function of the standard normal distribution.

To compute

\hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) = E(Z_1^2 \mid Z_1 > a + bZ_2) - [E(Z_1 \mid Z_1 > a + bZ_2)]^2,

we showed in the previous post that

E(Z_1 \mid Z_1 > a + bZ_2) = \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1} \Phi(c)}.

We now turn to the second conditional expectation:

E(Z_1^2 \mid Z_1 > a + bZ_2) = \displaystyle \frac{E(Z_1^2 I_{Z_1 > a+b Z_2})}{P(Z_1 > a + bZ_2)} = \frac{E(Z_1^2 I_{Z_1 > a+b Z_2})}{\Phi(c)}.

The expected value in the numerator is a double integral:

E(Z_1^2 I_{Z_1 > a+b Z_2}) = \displaystyle \int_{-\infty}^\infty \int_{-\infty}^\infty z_1^2 I_{z_1 > a + bz_2} f(z_1,z_2) \, dz_1 dz_2 = \displaystyle \int_{-\infty}^\infty \int_{a+bz_2}^\infty z_1^2 f(z_1,z_2) \, dz_1 dz_2,

where f(z_1,z_2) is the joint probability density function of Z_1 and Z_2. Since Z_1 and Z_2 are independent, f(z_1,z_2) is the product of the individual probability density functions:

f(z_1,z_2) = \displaystyle \frac{1}{\sqrt{2\pi}} e^{-z_1^2/2} \frac{1}{\sqrt{2\pi}} e^{-z_2^2/2} = \frac{1}{2\pi} e^{-z_1^2/2} e^{-z_2^2/2}.

Therefore, we must compute

E(Z_1^2 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_{a+bz_2}^\infty z_1^2 e^{-z_1^2/2} e^{-z_2^2/2} \, dz_1 dz_2,

where I wrote A for the event Z_1 > a + bZ_2.

I’m not above admitting that I first stuck this into Mathematica to make sure that this was doable. To begin, we use integration by parts on the inner integral:

\displaystyle \int_{a+bz_2}^\infty z_1^2 e^{-z_1^2/2} \, dz_1 = \int_{a+bz_2}^\infty z_1 \frac{d}{dz_1} \left(-e^{-z_1^2/2} \right) \, dz_1

=\displaystyle \left[ -z_1 e^{-z_1^2/2} \right]_{a+bz_2}^\infty + \int_{a+bz_2}^\infty e^{-z_1^2/2} \, dz_1

= (a+bz_2) \displaystyle \exp \left[-\frac{(a+bz_2)^2}{2} \right] + \int_{a+bz_2}^\infty e^{-z_1^2/2} \, dz_1.

Therefore,

E(Z_1^2 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a+bz_2) \exp \left[-\frac{(a+bz_2)^2}{2} \right] \exp \left[ -\frac{z_2^2}{2} \right] \, dz_2 + \int_{-\infty}^\infty \int_{a+bz_2}^\infty \frac{1}{2\pi} e^{-z_1^2/2} e^{-z_2^2/2} \, dz_1 dz_2.

The second term is equal to \Phi(c) since the double integral is P(Z_1 > a+bZ_2). For the first integral, we complete the square as before:

E(Z_1^2 I_A) = \Phi(c) + \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a+bz_2) \exp \left[-\frac{(b^2+1)z_2^2 + 2abz_2 + a^2}{2} \right] \, dz_2

= \Phi(c) + \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a + bz_2) \exp \left[ -\frac{b^2+1}{2} \left( z_2^2 + \frac{2abz_2}{b^2+1} \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \right) \right] \exp \left[ -\frac{1}{2} \left(a^2 \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \right) \right] dz_2

= \Phi(c) +\displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a + bz_2)\exp \left[ -\frac{b^2+1}{2} \left( z_2^2 + \frac{2abz_2}{b^2+1} + \frac{a^2b^2}{(b^2+1)^2} \right) \right] \exp \left[ -\frac{1}{2} \left(a^2 - \frac{a^2b^2}{b^2+1} \right) \right] dz_2

= \Phi(c) +\displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty (a + bz_2)\exp \left[ -\frac{b^2+1}{2} \left( z_2 + \frac{ab}{b^2+1} \right)^2 \right] \exp \left[ -\frac{1}{2} \left( \frac{a^2}{b^2+1} \right) \right] dz_2

= \Phi(c) +\displaystyle \frac{e^{-c^2/2}}{2\pi} \int_{-\infty}^\infty (a + bz_2)\exp \left[ -\frac{b^2+1}{2} \left( z_2 + \frac{ab}{b^2+1} \right)^2 \right] dz_2.

I now rewrite the integrand so that it has the form of the probability density function of a normal distribution, writing 2\pi = \sqrt{2\pi} \sqrt{2\pi} and multiplying and dividing by \sqrt{b^2+1} in the denominator:

E(Z_1^2 I_A) = \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1}} \int_{-\infty}^\infty (a+bz_2) \frac{1}{\sqrt{2\pi} \sqrt{ \displaystyle \frac{1}{b^2+1}}} \exp \left[ - \frac{\left(z_2 + \displaystyle \frac{ab}{b^2+1} \right)^2}{2 \cdot \displaystyle \frac{1}{b^2+1}} \right] dz_2.

This is an example of making a problem easier by apparently making it harder. The integrand contains the probability density function of a normally distributed random variable V with E(V) = -ab/(b^2+1) and \hbox{Var}(V) = 1/(b^2+1). Therefore, the integral is equal to E(a + bV), so that

E(Z_1^2 I_A) = \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1}} \left(a - b \cdot \frac{ab}{b^2+1} \right),

=  \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi (b^2+1)}} \left( a -\frac{ab^2}{b^2+1} \right)

= \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi (b^2+1)}} \cdot \frac{a}{b^2+1}

= \Phi(c) + \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi} (b^2+1) } \cdot \frac{a}{\sqrt{b^2+1}}

= \Phi(c) - \displaystyle \frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) }.

Therefore,

E(Z_1^2 \mid Z_1 > a + bZ_2) = \displaystyle \frac{E(Z_1^2 I_A)}{\Phi(c)} = 1 - \displaystyle \frac{c e^{-c^2/2}}{ \sqrt{2\pi} (b^2+1) \Phi(c)}.

We note that this reduces to what we found in the second special case: if \mu_1=\mu_2=0, then a = 0 and c = 0, so that E(Z_1^2 \mid Z_1 > a + bZ_2) = 1, matching what we found earlier.
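
As an additional check on the algebra (a sketch in Python with SciPy; the test values a = 0.7 and b = 1.3 are arbitrary), the closed form for E(Z_1^2 I_A) can be compared against a direct numerical evaluation of the double integral:

```python
import numpy as np
from scipy import integrate
from scipy.stats import norm

a, b = 0.7, 1.3                        # arbitrary test values
c = -a / np.sqrt(b**2 + 1)

# closed form for E(Z1^2 * I_A) derived above
closed = norm.cdf(c) - c * np.exp(-c**2 / 2) / (np.sqrt(2 * np.pi) * (b**2 + 1))

# direct numerical evaluation of the defining double integral
integrand = lambda z1, z2: z1**2 * np.exp(-z1**2 / 2) * np.exp(-z2**2 / 2) / (2 * np.pi)
numeric, _ = integrate.dblquad(integrand, -np.inf, np.inf,
                               lambda z2: a + b * z2, np.inf)

print(closed, numeric)   # the two values should agree closely
```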

In the next post, we consider the calculation of \hbox{Var}(Z_1 \mid Z_1 > a + bZ_2).

Solving Problems Submitted to MAA Journals (Part 7e)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

We suppose that E(X) = \mu_1, \hbox{SD}(X) = \sigma_1, E(Y) = \mu_2, and \hbox{SD}(Y) = \sigma_2. With these definitions, we may write X = \mu_1 + \sigma_1 Z_1 and Y = \mu_2 + \sigma_2 Z_2, where Z_1 and Z_2 are independent standard normal random variables.

The goal is to show that \hbox{Var}(X \mid X > Y) < \hbox{Var}(X). In the previous two posts, we showed that it will be sufficient to show that \hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) < 1, where a = (\mu_2 - \mu_1)/\sigma_1 and b = \sigma_2/\sigma_1. We also showed that P(Z_1 > a + bZ_2) = \Phi(c), where c = -a/\sqrt{b^2+1} and

\Phi(z) = \displaystyle \frac{1}{\sqrt{2\pi}} \int_{-\infty}^z e^{-t^2/2} \, dt

is the cumulative distribution function of the standard normal distribution.

To compute

\hbox{Var}(Z_1 \mid Z_1 > a + bZ_2) = E(Z_1^2 \mid Z_1 > a + bZ_2) - [E(Z_1 \mid Z_1 > a + bZ_2)]^2,

we begin with

E(Z_1 \mid Z_1 > a + bZ_2) = \displaystyle \frac{E(Z_1 I_{Z_1 > a+b Z_2})}{P(Z_1 > a + bZ_2)} = \frac{E(Z_1 I_{Z_1 > a+b Z_2})}{\Phi(c)}.

The expected value in the numerator is a double integral:

E(Z_1 I_{Z_1 > a+b Z_2}) = \displaystyle \int_{-\infty}^\infty \int_{-\infty}^\infty z_1 I_{z_1 > a + bz_2} f(z_1,z_2) \, dz_1 dz_2 = \displaystyle \int_{-\infty}^\infty \int_{a+bz_2}^\infty z_1 f(z_1,z_2) \, dz_1 dz_2,

where f(z_1,z_2) is the joint probability density function of Z_1 and Z_2. Since Z_1 and Z_2 are independent, f(z_1,z_2) is the product of the individual probability density functions:

f(z_1,z_2) = \displaystyle \frac{1}{\sqrt{2\pi}} e^{-z_1^2/2} \frac{1}{\sqrt{2\pi}} e^{-z_2^2/2} = \frac{1}{2\pi} e^{-z_1^2/2} e^{-z_2^2/2}.

Therefore, we must compute

E(Z_1 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_{a+bz_2}^\infty z_1 e^{-z_1^2/2} e^{-z_2^2/2} \, dz_1 dz_2,

where I wrote A for the event Z_1 > a + bZ_2.

I’m not above admitting that I first stuck this into Mathematica to make sure that this was doable. To begin, we compute the inner integral:

E(Z_1 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ - e^{-z_1^2/2} \right]_{a+bz_2}^\infty e^{-z_2^2/2} \, dz_2

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{(a+bz_2)^2}{2} \right] \exp\left[-\frac{z_2^2}{2} \right] dz_2

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{(b^2+1)z_2^2+2abz_2+a^2}{2} \right] dz_2.

At this point, I used a standard technique/trick of completing the square to rewrite the integrand as a common pdf.

E(Z_1 I_A) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{b^2+1}{2} \left( z_2^2 + \frac{2abz_2}{b^2+1} \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \right) \right] \exp \left[ -\frac{1}{2} \left(a^2 \,\,\,\,\,\,\,\,\,\,\,\,\,\,\,\, \right) \right] dz_2

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{b^2+1}{2} \left( z_2^2 + \frac{2abz_2}{b^2+1} + \frac{a^2b^2}{(b^2+1)^2} \right) \right] \exp \left[ -\frac{1}{2} \left(a^2 - \frac{a^2b^2}{b^2+1} \right) \right] dz_2

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{b^2+1}{2} \left( z_2 + \frac{ab}{b^2+1} \right)^2 \right] \exp \left[ -\frac{1}{2} \left( \frac{a^2}{b^2+1} \right) \right] dz_2

= \displaystyle \frac{e^{-c^2/2}}{2\pi} \int_{-\infty}^\infty \exp \left[ -\frac{b^2+1}{2} \left( z_2 + \frac{ab}{b^2+1} \right)^2 \right]  dz_2.
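
Completing the square in the exponent is easy to fumble, so here is a quick symbolic check of that step (a sketch using SymPy; not part of the original derivation):

```python
import sympy as sp

a, b, z = sp.symbols('a b z', real=True)

original  = (b**2 + 1) * z**2 + 2 * a * b * z + a**2
completed = (b**2 + 1) * (z + a * b / (b**2 + 1))**2 + a**2 / (b**2 + 1)

# prints 0, so the two forms of the exponent agree
print(sp.simplify(original - completed))
```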

I now rewrite the integrand so that it has the form of the probability density function of a normal distribution, writing 2\pi = \sqrt{2\pi} \sqrt{2\pi} and multiplying and dividing by \sqrt{b^2+1} in the denominator:

E(Z_1 I_A) = \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1}} \int_{-\infty}^\infty  \frac{1}{\sqrt{2\pi} \sqrt{ \displaystyle \frac{1}{b^2+1}}} \exp \left[ - \frac{\left(z_2 + \displaystyle \frac{ab}{b^2+1} \right)^2}{2 \cdot \displaystyle \frac{1}{b^2+1}} \right] dz_2.

This is an example of making a problem easier by apparently making it harder. The integrand is the probability density function of a normally distributed random variable V with E(V) = -ab/(b^2+1) and \hbox{Var}(V) = 1/(b^2+1), and so the integral is equal to P(-\infty < V < \infty) = 1. Therefore,

E(Z_1 I_A) = \displaystyle \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1}},

and so

E(Z_1 \mid A) = \displaystyle \frac{E(Z_1 I_A)}{\Phi(c)} = \frac{e^{-c^2/2}}{\sqrt{2\pi}\sqrt{b^2+1} \Phi(c)}.

We note that this reduces to what we found in the second special case: if \mu_1=\mu_2=0, \sigma_1 = 1, and \sigma_2 = \sigma, then a = 0, b = \sigma, and c = 0. Since \Phi(0) = \frac{1}{2}, we have

E(Z_1 \mid A) = \displaystyle \frac{e^0}{\sqrt{2\pi}\sqrt{\sigma^2+1} \frac{1}{2}} = \sqrt{\frac{2}{\pi(\sigma^2+1)}},

matching what we found earlier.

In the next post, we consider the calculation of E(Z_1^2 \mid A).

Solving Problems Submitted to MAA Journals (Part 7b)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

Not quite knowing how to start, I decided to begin by simplifying the problem and assume that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn’t solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I solved this special case in the previous post.

Next, to work on a special case that was somewhere between the general case and the first special case, I kept X as a standard normal distribution but changed Y to have a nonzero mean. As it turned out, this significantly complicated the problem (as we’ll see in the next post), and I got stuck.

So I changed course: for a second attempt, I kept X as a standard normal distribution but changed Y so that E(Y) = 0 and \hbox{SD}(Y) = \sigma, where \sigma could be something other than 1. The goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = \sigma^2+1, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next, I wrote Y = \sigma Z, where Z has a standard normal distribution. Then

E(X I_{X>\sigma Z}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x e^{-x^2/2} e^{-z^2/2} \, dx dz,

where we have used the joint probability density function for the independent random variables X and Z. The region of integration is \{(x,z) \in \mathbb{R}^2 \mid x > \sigma z \}, matching the requirement X > \sigma Z. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_{\sigma z}^\infty e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-\sigma^2 z^2/2} \right] e^{-z^2/2} \, dz

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-(\sigma^2+1) z^2/2} \, dz.

At this point, I rewrote the integrand to be the probability density function of a random variable:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{\sigma^2+1}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{ \frac{1}{\sigma^2+1}}} \exp \left[ -\frac{z^2}{2 \cdot \frac{1}{\sigma^2+1}} \right] \, dz.

The integrand is the probability density function of a normal random variable with mean 0 and variance \sigma^2+1, and so the integral must be equal to 1. We conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{\sqrt{2\pi(\sigma^2+1)}} }{ \frac{1}{2} } = \sqrt{\frac{2}{\pi(\sigma^2+1)}}.

Next, we compute the other conditional expectation:

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{E(X^2 I_{X>\sigma Z})}{P(X>\sigma Z)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_{\sigma z}^\infty x^2 e^{-x^2/2} e^{-z^2/2} \, dx dz.

The inner integral can be computed using integration by parts:

\displaystyle \int_{\sigma z}^\infty x^2 e^{-x^2/2} \, dx = \int_{\sigma z}^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_{\sigma z}^\infty + \int_{\sigma z}^\infty e^{-x^2/2} \, dx

= \sigma z e^{-\sigma^2 z^2/2} + \displaystyle \int_{\sigma z}^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > \sigma Z) = \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-\sigma^2 z^2/2} e^{-z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz

= \displaystyle \frac{1}{\pi} \int_{-\infty}^\infty \sigma z e^{-(\sigma^2+1) z^2/2} \, dz + 2 \int_{-\infty}^\infty \int_{\sigma z}^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-z^2/2} \, dx dz.

We could calculate the first integral, but we can immediately see that it’s going to be equal to 0 since the integrand \sigma z e^{-(\sigma^2+1) z^2/2} is an odd function. The double integral is equal to P(X>\sigma Z), which we’ve already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \frac{2}{\pi(\sigma^2 + 1)},

which is indeed less than 1. If \sigma = 1, we recover the conditional variance found in the first special case.
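
A quick simulation check of this formula is easy to run (a sketch in Python; the values of \sigma, the seed, and the sample size are arbitrary). The case \sigma = 1 should reproduce 1 - 1/\pi from the first special case:

```python
import numpy as np

rng = np.random.default_rng(0)
for sigma in (0.5, 1.0, 2.0):                    # arbitrary test values
    x = rng.standard_normal(2_000_000)           # X ~ N(0, 1)
    y = sigma * rng.standard_normal(2_000_000)   # Y ~ N(0, sigma^2)
    # empirical conditional variance vs. the formula 1 - 2/(pi*(sigma^2+1))
    print(sigma, x[x > y].var(), 1 - 2 / (np.pi * (sigma**2 + 1)))
```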

After tackling these two special cases, we start the general case with the next post.

Solving Problems Submitted to MAA Journals (Part 7a)

The following problem appeared in Volume 131, Issue 9 (2024) of The American Mathematical Monthly.

Let X and Y be independent normally distributed random variables, each with its own mean and variance. Show that the variance of X conditioned on the event X>Y is smaller than the variance of X alone.

I admit I did a double-take when I first read this problem. If X and Y are independent, then the event X>Y contains almost no information. How then, so I thought, could the conditional distribution of X given X>Y be narrower than the unconditional distribution of X?

Then I thought: I can believe that E(X \mid X > Y) is greater than E(X): if we’re given that X>Y, then we know that X must be larger than something. So maybe it’s possible for \hbox{Var}(X \mid X>Y) to be less than \hbox{Var}(X).

Still, not quite knowing how to start, I decided to begin by simplifying the problem and assume that both X and Y follow a standard normal distribution, so that E(X) = E(Y) = 0 and \hbox{SD}(X)=\hbox{SD}(Y) = 1. This doesn’t solve the original problem, of course, but I hoped that solving this simpler case might give me some guidance about tackling the general case. I also hoped that solving this special case might give me some psychological confidence that I would eventually be able to solve the general case.

For the special case, the goal is to show that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 < 1.

We begin by computing E(X \mid X > Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)}. The denominator is straightforward: since X and Y are independent normal random variables, we also know that X-Y is normally distributed with E(X-Y) = E(X)-E(Y) = 0. (Also, \hbox{Var}(X-Y) = \hbox{Var}(X) + (-1)^2 \hbox{Var}(Y) = 2, but that’s really not needed for this problem.) Therefore, P(X>Y) = P(X-Y>0) = \frac{1}{2} since the distribution of X-Y is symmetric about its mean of 0.

Next,

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \int_y^\infty x e^{-x^2/2} e^{-y^2/2} \, dx dy,

where we have used the joint probability density function for X and Y. The region of integration is \{(x,y) \in \mathbb{R}^2 \mid x > y \}, taking care of the requirement X > Y. The inner integral can be directly evaluated:

E(X I_{X>Y}) = \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ -e^{-x^2/2} \right]_y^\infty e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty \left[ 0 + e^{-y^2/2} \right] e^{-y^2/2} \, dy

= \displaystyle \frac{1}{2\pi} \int_{-\infty}^\infty e^{-y^2} \, dy.

At this point, I used a standard technique/trick of integration by rewriting the integrand to be the probability density function of a random variable. In this case, the random variable is normally distributed with mean 0 and variance 1/2:

E(X I_{X>Y}) = \displaystyle \frac{1}{\sqrt{2\pi} \sqrt{2}} \int_{-\infty}^\infty \frac{1}{\sqrt{2\pi} \sqrt{1/2}} \exp \left[ -\frac{y^2}{2 \cdot \frac{1}{2}} \right] \, dy.

The integral must be equal to 1, and so we conclude

E(X \mid X>Y) = \displaystyle \frac{E(X I_{X>Y})}{P(X>Y)} = \frac{ \frac{1}{2\sqrt{\pi}} }{ \frac{1}{2} } = \frac{1}{\sqrt{\pi}}.

We parenthetically note that E(X \mid X>Y) > 0, matching my initial intuition.
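
A quick simulation also backs this up (a sketch in Python; the seed and sample size are arbitrary): the empirical conditional mean lands near 1/\sqrt{\pi} \approx 0.5642.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(2_000_000)
y = rng.standard_normal(2_000_000)
print(x[x > y].mean(), 1 / np.sqrt(np.pi))   # both approximately 0.564
```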

Next, we compute the other conditional expectation:

E(X^2 \mid X > Y) = \displaystyle \frac{E(X^2 I_{X>Y})}{P(X>Y)} = \displaystyle \frac{2}{2\pi} \int_{-\infty}^\infty \int_y^\infty x^2 e^{-x^2/2} e^{-y^2/2} \, dx dy.

The inner integral can be computed using integration by parts:

\displaystyle \int_y^\infty x^2 e^{-x^2/2} \, dx = \int_y^\infty x \frac{d}{dx} \left( -e^{-x^2/2} \right) \, dx

= \displaystyle \left[-x e^{-x^2/2} \right]_y^\infty + \int_y^\infty e^{-x^2/2} \, dx

= y e^{-y^2/2} + \displaystyle \int_y^\infty e^{-x^2/2} \, dx.

Therefore,

E(X^2 \mid X > Y) = \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2/2} e^{-y^2/2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy

= \displaystyle \frac{1}{\pi}  \int_{-\infty}^\infty y e^{-y^2} \, dy + 2 \int_{-\infty}^\infty \int_y^\infty \frac{1}{2\pi} e^{-x^2/2} e^{-y^2/2} \, dx dy.

We could calculate the first integral, but we can immediately see that it’s going to be equal to 0 since the integrand y e^{-y^2} is an odd function. The double integral is equal to P(X>Y), which we’ve already shown is equal to \frac{1}{2}. Therefore, E(X^2 \mid X > Y) = 0 + 2 \cdot \frac{1}{2} = 1.

We conclude that

\hbox{Var}(X \mid X > Y) = E(X^2 \mid X > Y) - [E(X \mid X > Y)]^2 = 1 - \displaystyle \left( \frac{1}{\sqrt{\pi}} \right)^2 = 1 - \frac{1}{\pi},

which is indeed less than 1.

This solves the problem for the special case of two independent standard normal random variables. This of course does not yet solve the general case, but my hope was that solving this problem might give me some intuition about the general case, which I’ll develop as this series progresses.

Solving Problems Submitted to MAA Journals (Part 6e)

The following problem appeared in Volume 97, Issue 3 (2024) of Mathematics Magazine.

Two points P and Q are chosen at random (uniformly) from the interior of a unit circle. What is the probability that the circle whose diameter is segment \overline{PQ} lies entirely in the interior of the unit circle?

Let D_r be the interior of the circle centered at the origin O with radius r. Also, let C(P,Q) denote the circle with diameter \overline{PQ}, and let R = OP be the distance of P from the origin.

In the previous post, we showed that

\hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) = \sqrt{1-r^2}.

To find \hbox{Pr}(C(P,Q) \subset D_1), I will integrate over this conditional probability:

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr,

where F(r) is the cumulative distribution function of R. For 0 \le r \le 1,

F(r) = \hbox{Pr}(R \le r) = \hbox{Pr}(P \in D_r) = \displaystyle \frac{\hbox{area}(D_r)}{\hbox{area}(D_1)} = \frac{\pi r^2}{\pi} = r^2.

Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 \hbox{Pr}(C(P,Q) \subset D_1 \mid R = r) F'(r) \, dr

= \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr.

To calculate this integral, I’ll use the substitution u = 1-r^2. Then the endpoints r=0 and r=1 become u = 1-0^2 = 1 and u = 1-1^2 = 0. Also, du = -2r \, dr. Therefore,

\hbox{Pr}(C(P,Q) \subset D_1) = \displaystyle \int_0^1 2 r \sqrt{1-r^2} \, dr

= \displaystyle \int_1^0 -\sqrt{u} \, du

= \displaystyle \int_0^1 \sqrt{u} \, du

= \displaystyle \frac{2}{3} \left[  u^{3/2} \right]_0^1

=\displaystyle  \frac{2}{3}\left[ (1)^{3/2} - (0)^{3/2} \right]

= \displaystyle \frac{2}{3},

confirming the answer I had guessed from simulations.
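
Since the answer was originally guessed from simulations, here is one way such a simulation might look (a sketch in Python; not the original code). The circle with diameter \overline{PQ} has center M = (P+Q)/2 and radius |PQ|/2, and it lies inside the unit disk exactly when |OM| + |PQ|/2 < 1:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000

def sample_disk(n):
    # uniform points in the unit disk: radius sqrt(U) keeps the density uniform
    r = np.sqrt(rng.random(n))
    t = 2 * np.pi * rng.random(n)
    return r * np.cos(t), r * np.sin(t)

px, py = sample_disk(n)
qx, qy = sample_disk(n)

mx, my = (px + qx) / 2, (py + qy) / 2    # center of the circle on diameter PQ
rad = np.hypot(px - qx, py - qy) / 2     # its radius
inside = np.hypot(mx, my) + rad < 1      # contained in the unit disk

print(inside.mean(), 2 / 3)              # both approximately 0.667
```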

Confirming Einstein’s Theory of General Relativity With Calculus, Part 5d: Deriving Orbits under Newtonian Mechanics Using Variation of Parameters

In this series, I’m discussing how ideas from calculus and precalculus (with a touch of differential equations) can predict the precession in Mercury’s orbit and thus confirm Einstein’s theory of general relativity. The origins of this series came from a class project that I assigned to my Differential Equations students maybe 20 years ago.

We previously showed that if the motion of a planet around the Sun is expressed in polar coordinates (r,\theta), with the Sun at the origin, then under Newtonian mechanics (i.e., without general relativity) the motion of the planet follows the differential equation

u''(\theta) + u(\theta) = \displaystyle \frac{1}{\alpha},

where u = 1/r and \alpha is a certain constant. We will also impose the initial condition that the planet is at perihelion (i.e., is closest to the Sun), at a distance of P, when \theta = 0. This means that u attains its maximum value of 1/P when \theta = 0. This leads to the two initial conditions

u(0) = \displaystyle \frac{1}{P} \qquad \hbox{and} \qquad u'(0) = 0;

the second equation arises since u has a local extremum at \theta = 0.

We now take the perspective of a student who is taking a first-semester course in differential equations. There are two standard techniques for solving a second-order nonhomogeneous differential equation with constant coefficients. One of these is the method of variation of parameters. First, we solve the associated homogeneous differential equation

u''(\theta) + u(\theta) = 0.

The characteristic equation of this differential equation is r^2 + 1 = 0, which clearly has the two imaginary roots r = \pm i. Therefore, two linearly independent solutions of the associated homogeneous equation are u_1(\theta) = \cos \theta and u_2(\theta) = \sin \theta.

(As an aside, this is one answer to the common question, “What are complex numbers good for?” The answer is naturally above the heads of Algebra II students when they first encounter the mysterious number i, but complex numbers provide a way of solving the differential equations that model multiple problems in statics and dynamics.)

According to the method of variation of parameters, the general solution of the original nonhomogeneous differential equation

u''(\theta) + u(\theta) = g(\theta)

is

u(\theta) = f_1(\theta) u_1(\theta) + f_2(\theta) u_2(\theta),

where

f_1(\theta) = -\displaystyle \int \frac{u_2(\theta) g(\theta)}{W(\theta)} d\theta ,

f_2(\theta) = \displaystyle \int \frac{u_1(\theta) g(\theta)}{W(\theta)} d\theta ,

and W(\theta) is the Wronskian of u_1(\theta) and u_2(\theta), defined by the determinant

W(\theta) = \displaystyle \begin{vmatrix} u_1(\theta) & u_2(\theta) \\ u_1'(\theta) & u_2'(\theta) \end{vmatrix}  = u_1(\theta) u_2'(\theta) - u_1'(\theta) u_2(\theta).

Well, that’s a mouthful.

Fortunately, for the example at hand, these computations are pretty easy. First, since u_1(\theta) = \cos \theta and u_2(\theta) = \sin \theta, we have

W(\theta) = (\cos \theta)(\cos \theta) - (\sin \theta)(-\sin \theta) = \cos^2 \theta + \sin^2 \theta = 1

from the usual Pythagorean trigonometric identity. Therefore, the denominators in the integrals for f_1(\theta) and f_2(\theta) essentially disappear.

Since g(\theta) = \displaystyle \frac{1}{\alpha}, the integrals for f_1(\theta) and f_2(\theta) are straightforward to compute:

f_1(\theta) = -\displaystyle \int u_2(\theta) \frac{1}{\alpha} d\theta = -\displaystyle \frac{1}{\alpha} \int \sin \theta \, d\theta = \displaystyle \frac{1}{\alpha}\cos \theta + a,

where we use +a for the constant of integration instead of the usual +C. Second,

f_2(\theta) = \displaystyle \int u_1(\theta)  \frac{1}{\alpha} d\theta = \displaystyle \frac{1}{\alpha} \int \cos \theta \, d\theta = \displaystyle \frac{1}{\alpha}\sin \theta + b,

using +b for the constant of integration. Therefore, by variation of parameters, the general solution of the nonhomogeneous differential equation is

u(\theta) = f_1(\theta) u_1(\theta) + f_2(\theta) u_2(\theta)

= \left( \displaystyle \frac{1}{\alpha}\cos \theta + a \right) \cos \theta + \left( \displaystyle \frac{1}{\alpha}\sin\theta + b \right) \sin \theta

= a \cos \theta + b\sin \theta + \displaystyle \frac{\cos^2 \theta + \sin^2 \theta}{\alpha}

= a \cos \theta + b \sin \theta + \displaystyle \frac{1}{\alpha}.

Unsurprisingly, this matches the answer in the previous post that was found by the method of undetermined coefficients.

For the sake of completeness, I repeat the argument used in the previous two posts to determine a and b. This requires using the initial conditions u(0) = \displaystyle \frac{1}{P} and u'(0) = 0. From the first initial condition,

u(0) = a \cos 0 + b \sin 0 + \displaystyle \frac{1}{\alpha}

\displaystyle \frac{1}{P} = a + \frac{1}{\alpha}

\displaystyle \frac{1}{P} - \frac{1}{\alpha} = a

\displaystyle \frac{\alpha - P}{\alpha P} = a

From the second initial condition,

u'(\theta) = -a \sin \theta + b \cos \theta

u'(0) = -a \sin 0 + b \cos 0

0 = b.

From these two constants, we obtain

u(\theta) = \displaystyle \frac{\alpha - P}{\alpha P}  \cos \theta + 0 \sin \theta + \displaystyle \frac{1}{\alpha}

= \displaystyle \frac{1}{\alpha} \left(  1 + \frac{\alpha-P}{P} \cos \theta \right)

= \displaystyle \frac{1 + \epsilon \cos \theta}{\alpha},

where \epsilon = \displaystyle \frac{\alpha - P}{P}.
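
For an independent check of this solution (a sketch using SymPy; not part of the original derivation), the initial value problem can be solved symbolically and compared with the formula above:

```python
import sympy as sp

theta = sp.symbols('theta')
alpha, P = sp.symbols('alpha P', positive=True)
u = sp.Function('u')

# solve u'' + u = 1/alpha with u(0) = 1/P and u'(0) = 0
sol = sp.dsolve(sp.Eq(u(theta).diff(theta, 2) + u(theta), 1 / alpha), u(theta),
                ics={u(0): 1 / P, u(theta).diff(theta).subs(theta, 0): 0})

eps = (alpha - P) / P
# prints 0, so dsolve's answer matches (1 + eps*cos(theta))/alpha
print(sp.simplify(sol.rhs - (1 + eps * sp.cos(theta)) / alpha))
```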

Finally, since r = 1/u, we see that the planet’s orbit satisfies

r = \displaystyle \frac{\alpha}{1 + \epsilon \cos \theta},

so that, as shown earlier in this series, the orbit is an ellipse with eccentricity \epsilon.

Thoughts on Numerical Integration (Part 23): The normalcdf function on TI calculators

I end this series about numerical integration by returning to the most common (if hidden) application of numerical integration in the secondary mathematics curriculum: finding the area under the normal curve. This is a critically important tool for problems in both probability and statistics; however, the antiderivative of \displaystyle \frac{1}{\sqrt{2\pi}} e^{-x^2/2} cannot be expressed using finitely many elementary functions. Therefore, we must resort to numerical methods instead.

In days of old, of course, students relied on tables in the back of the textbook to find areas under the bell curve, and I suppose that such tables are still being printed. For students with access to modern scientific calculators, of course, there’s no need for tables because this is a built-in function on many calculators. For the line of TI calculators, the command is normalcdf.

Unfortunately, it’s a sad (but not well-known) fact of life that the TI-83 and TI-84 calculators are not terribly accurate at computing these areas. For example:

TI-84: \displaystyle \int_0^1 \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx \approx 0.3413447\underline{399}

Correct answer, with Mathematica: 0.3413447\underline{467}\dots

TI-84: \displaystyle \int_1^2 \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx \approx 0.1359051\underline{975}

Correct answer, with Mathematica: 0.1359051\underline{219}\dots

TI-84: \displaystyle \int_2^3 \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx \approx 0.021400\underline{0948}

Correct answer, with Mathematica: 0.021400\underline{2339}\dots

TI-84: \displaystyle \int_3^4 \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx \approx 0.0013182\underline{812}

Correct answer, with Mathematica: 0.0013182\underline{267}\dots

TI-84: \displaystyle \int_4^5 \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx \approx 0.0000313\underline{9892959}

Correct answer, with Mathematica: 0.0000313\underline{84590261}\dots

TI-84: \displaystyle \int_5^6 \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx \approx 2.8\underline{61148776} \times 10^{-7}

Correct answer, with Mathematica: 2.8\underline{56649842}\dots \times 10^{-7}

I don’t presume to know the proprietary algorithm used to implement normalcdf on TI-83 and TI-84 calculators. My honest if brutal assessment is that it’s probably not worth knowing: in the best case (when the endpoints are close to 0), the calculator provides an answer that is accurate to only 7 significant digits while presenting the illusion of a higher degree of accuracy. I can say that Simpson’s Rule with only n = 26 subintervals provides a better approximation to \displaystyle \int_0^1 \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx than the normalcdf function.
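
That comparison is easy to reproduce (a sketch in Python, with SciPy used only for the reference value; the TI-84 figure is the one quoted above):

```python
import numpy as np
from scipy.stats import norm

def phi(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def simpson(f, a, b, n):
    # composite Simpson's rule with n subintervals (n must be even)
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    return h / 3 * (f(x[0]) + f(x[-1])
                    + 4 * f(x[1:-1:2]).sum() + 2 * f(x[2:-1:2]).sum())

reference = norm.cdf(1) - norm.cdf(0)          # high-accuracy value of the integral
print(simpson(phi, 0.0, 1.0, 26) - reference)  # error of Simpson's rule with n = 26
print(0.3413447399 - reference)                # error of the TI-84 value quoted above
```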

For what it’s worth, I also looked at the accuracy of the NORMSDIST function in Microsoft Excel. This is much better, almost always producing answers that are accurate to 11 or 12 significant digits, which is about all that can be realistically expected once rounding errors accumulate in double-precision floating-point arithmetic (in which numbers are stored to roughly 15 or 16 significant digits).

Thoughts on Numerical Integration (Part 22): Comparison to theorems about magnitudes of errors

Numerical integration is a standard topic in first-semester calculus. From time to time, I have received questions from students on various aspects of this topic, including:

  • Why is numerical integration necessary in the first place?
  • Where do these formulas come from (especially Simpson’s Rule)?
  • How can I do all of these formulas quickly?
  • Is there a reason why the Midpoint Rule is better than the Trapezoid Rule?
  • Is there a reason why both the Midpoint Rule and the Trapezoid Rule converge quadratically?
  • Is there a reason why Simpson’s Rule converges like the fourth power of the number of subintervals?

In this series, I hope to answer these questions. While these are standard questions in an introductory college course in numerical analysis, and full and rigorous proofs can be found on Wikipedia and Mathworld, I will approach these questions from the point of view of a bright student who is currently enrolled in calculus and hasn’t yet taken real analysis or numerical analysis.

In this series, we have shown the following approximations of errors when using various numerical approximations for \int_a^b x^k \, dx. We obtained these approximations using only techniques within the reach of a talented high school student who has mastered Precalculus — especially the Binomial Theorem — and elementary techniques of integration.

As we now present, the formulas that we derived are (of course) easily connected to known theorems for the convergence of these techniques. These proofs, however, require some fairly advanced techniques from calculus. So, while the formulas derived in this series of posts only apply to f(x) = x^k (and, by an easy extension, any polynomial), the formulas that we do obtain easily foreshadow the actual formulas found on Wikipedia or Mathworld or calculus textbooks, thus (hopefully) taking some of the mystery out of these formulas.

Left and right endpoints: Our formula was

E \approx \displaystyle \frac{k}{2} x_*^{k-1} (b-a)h,

where x_* is some number between a and b. By comparison, the actual formula for the error is

E = \displaystyle \frac{f'(x_*) (b-a)^2}{2n} = \frac{f'(x_*)}{2} (b-a)h.

This reduces to the formula that we derived since f'(x) = kx^{k-1}.
 

Midpoint Rule: Our formula was

E \approx \displaystyle \frac{k(k-1)}{24} x_*^{k-2} (b-a)h^2,

where x_* is some number between a and b. By comparison, the actual formula for the error is

E = \displaystyle \frac{f''(x_*) (b-a)^3}{24n^2} = \frac{f''(x_*)}{24} (b-a)h^2.

This reduces to the formula that we derived since f''(x) = k(k-1)x^{k-2}.

Trapezoid Rule: Our formula was

E \approx \displaystyle \frac{k(k-1)}{12} x_*^{k-2} (b-a)h^2,

where x_* is some number between a and b. By comparison, the actual formula for the error is

E = \displaystyle \frac{f''(x_*) (b-a)^3}{12n^2} = \frac{f''(x_*)}{12} (b-a)h^2.

This reduces to the formula that we derived since f''(x) = k(k-1)x^{k-2}.

Simpson’s Rule: Our formula was

E \approx \displaystyle \frac{k(k-1)(k-2)(k-3)}{180} x_*^{k-4} (b-a)h^4,

where x_* is some number between a and b. By comparison, the actual formula for the error is

E = \displaystyle \frac{f^{(4)}(x_*)}{180} (b-a)h^4.

This reduces to the formula that we derived since f^{(4)}(x) = k(k-1)(k-2)(k-3)x^{k-4}.
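
These convergence rates are also easy to observe numerically. The sketch below (in Python; the test integrand x^6 on [0,1] and the values of n are arbitrary) doubles the number of subintervals and prints the factor by which each rule’s error shrinks: roughly 4 for the Midpoint and Trapezoid Rules (the h^2 behavior) and roughly 16 for Simpson’s Rule (the h^4 behavior):

```python
import numpy as np

def midpoint(f, a, b, n):
    h = (b - a) / n
    x = a + h * (np.arange(n) + 0.5)
    return h * f(x).sum()

def trapezoid(f, a, b, n):
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    return h * (f(x).sum() - (f(x[0]) + f(x[-1])) / 2)

def simpson(f, a, b, n):
    # n must be even
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    return h / 3 * (f(x[0]) + f(x[-1])
                    + 4 * f(x[1:-1:2]).sum() + 2 * f(x[2:-1:2]).sum())

f = lambda x: x**6
exact = 1 / 7                       # integral of x^6 over [0, 1]

for rule in (midpoint, trapezoid, simpson):
    e1 = abs(rule(f, 0.0, 1.0, 10) - exact)
    e2 = abs(rule(f, 0.0, 1.0, 20) - exact)
    print(rule.__name__, e1 / e2)   # ~4 for midpoint/trapezoid, ~16 for Simpson
```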