# My Favorite One-Liners: Part 71

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

Some of the algorithms that I teach are pretty lengthy. For example, consider the calculation of a $100(1-\alpha)\%$ confidence interval for a proportion:

$\displaystyle \frac{\hat{p} + \displaystyle \frac{z_{\alpha/2}^2}{2n}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } - z_{\alpha/2} \frac{\sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z_{\alpha/2}^2}{4n^2}}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } < p < \displaystyle \frac{\hat{p} + \displaystyle \frac{z_{\alpha/2}^2}{2n}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } + z_{\alpha/2} \frac{\sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z_{\alpha/2}^2}{4n^2}}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} }$.

Wow.

Proficiency with this formula definitely requires practice, so I’ll typically give a couple of problems for my students to work through in class. After the last example, when I think that my students have the hang of this very long calculation, I’ll give my one-liner to hopefully boost their confidence (no pun intended):

By now, you probably think that this calculation is dull, uninteresting, repetitive, and boring. If so, then I’ve done my job right.
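For readers who would rather let a computer do the dull part, here is a minimal sketch of the interval above in Python. The function name `score_interval` and the sample numbers (40 successes in 100 trials) are made up for illustration; the arithmetic follows the formula term by term.

```python
from statistics import NormalDist


def score_interval(successes, n, confidence=0.95):
    """Confidence interval for a proportion, using the long formula above."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}

    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    margin = z * (p_hat * q_hat / n + z**2 / (4 * n**2)) ** 0.5 / denom
    return center - margin, center + margin


# Hypothetical data: 40 successes out of 100 trials.
low, high = score_interval(successes=40, n=100)
print(f"{low:.4f} < p < {high:.4f}")
```

Note that the interval is centered slightly away from $\hat{p}$ and toward 1/2, which is what the extra $z_{\alpha/2}^2/(2n)$ term in the numerator does.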

# My Favorite One-Liners: Part 65

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

I’ll use today’s one-liner just before I begin some enormous, complicated, and tedious calculation that’s going to take more than a few minutes to complete. To give a specific example of such a calculation: consider the derivation of the Agresti confidence interval for proportions. According to the central limit theorem, if $n$ is large enough, then

$Z = \displaystyle \frac{ \hat{p} - p}{ \displaystyle \sqrt{ \frac{p(1-p) }{n} } }$

is approximately normally distributed, where $p$ is the true population proportion and $\hat{p}$ is the sample proportion from a sample of size $n$. By unwrapping this equation and solving for $p$, we obtain the formula for the confidence interval for a proportion:

$z \displaystyle \sqrt{\frac{p(1-p)}{n} } = \hat{p} - p$

$\displaystyle \frac{z^2 p(1-p)}{n} = \left( \hat{p} - p \right)^2$

$z^2p - z^2 p^2 = n \hat{p}^2 - 2 n \hat{p} p + n p^2$

$0 = p^2 (z^2 + n) - p (2n \hat{p} + z^2) + n \hat{p}^2$

We now use the quadratic formula to solve for $p$:

$p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{ \left(2n\hat{p} + z^2 \right)^2 - 4n\hat{p}^2 (z^2+n)}}{2(z^2+n)}$

$p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{4n^2 \hat{p}^2 + 4n \hat{p} z^2 + z^4 - 4n\hat{p}^2 z^2 - 4n^2 \hat{p}^2}}{2(z^2 + n)}$

$p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{4n (\hat{p}-\hat{p}^2) z^2 + z^4}}{2(z^2 + n)}$

$p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{4n \hat{p}(1-\hat{p}) z^2 + z^4}}{2(z^2 + n)}$

$p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{4n \hat{p} \hat{q} z^2 + z^4}}{2(z^2 + n)}$

$p = \displaystyle \frac{2n \hat{p} + z^2 \pm z \sqrt{4n \hat{p} \hat{q} + z^2}}{2(z^2 + n)}$

$p = \displaystyle \frac{2n \hat{p} + z^2 \pm z \sqrt{4n^2 \displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle 4n^2 \frac{z^2}{4n^2}}}{2(z^2 + n)}$

$p = \displaystyle \frac{2n \hat{p} + z^2 \pm 2nz \sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z^2}{4n^2}}}{2(z^2 + n)}$

$p = \displaystyle \frac{2n \hat{p} + 2n \displaystyle \frac{z^2}{2n} \pm 2nz \sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} +\displaystyle \frac{z^2}{4n^2}}}{2n \displaystyle \left(1 + \frac{z^2}{n} \right)}$

$p = \displaystyle \frac{\hat{p} + \displaystyle \frac{z^2}{2n} \pm z \sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z^2}{4n^2}}}{\displaystyle 1 + \frac{z^2}{n} }$

From this we finally obtain the $100(1-\alpha)\%$ confidence interval

$\displaystyle \frac{\hat{p} + \displaystyle \frac{z_{\alpha/2}^2}{2n}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } - z_{\alpha/2} \frac{\sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z_{\alpha/2}^2}{4n^2}}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } < p < \displaystyle \frac{\hat{p} + \displaystyle \frac{z_{\alpha/2}^2}{2n}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } + z_{\alpha/2} \frac{\sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z_{\alpha/2}^2}{4n^2}}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} }$.

Whew.

So, before I start such an incredibly long calculation, I’ll warn my students that this is going to take some time and we need to prepare… and I’ll start doing jumping jacks, shadow boxing, and other “exercise” in preparation for doing all of this writing.
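After that much algebra, a numerical sanity check can be reassuring. The sketch below solves the quadratic $0 = p^2 (z^2 + n) - p (2n \hat{p} + z^2) + n \hat{p}^2$ directly and compares its roots with the last line of the derivation. The function names and the sample values ($\hat{p} = 0.4$, $n = 100$, $z = 1.96$) are assumptions chosen only for illustration.

```python
import math


def roots_from_quadratic(p_hat, n, z):
    """Solve 0 = p^2 (z^2 + n) - p (2 n p_hat + z^2) + n p_hat^2 for p."""
    a = z**2 + n
    b = -(2 * n * p_hat + z**2)
    c = n * p_hat**2
    disc = math.sqrt(b**2 - 4 * a * c)
    return (-b - disc) / (2 * a), (-b + disc) / (2 * a)


def roots_from_final_form(p_hat, n, z):
    """Evaluate the simplified expression from the end of the derivation."""
    q_hat = 1 - p_hat
    denom = 1 + z**2 / n
    center = p_hat + z**2 / (2 * n)
    spread = z * math.sqrt(p_hat * q_hat / n + z**2 / (4 * n**2))
    return (center - spread) / denom, (center + spread) / denom


# Hypothetical values, just to compare the two forms.
direct = roots_from_quadratic(0.4, 100, 1.96)
simplified = roots_from_final_form(0.4, 100, 1.96)
print(direct)
print(simplified)
```

If the two printed pairs agree to many decimal places, none of the intermediate steps dropped a factor of $2n$ along the way.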

# My Favorite One-Liners: Part 52

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them. Today’s story is a continuation of yesterday’s post.

When I teach regression, I typically use this example to illustrate the regression effect:

Suppose that the heights of fathers and their adult sons both have mean 69 inches and standard deviation 3 inches. Suppose also that the correlation between the heights of the fathers and sons is 0.5. Predict the height of a son whose father is 63 inches tall. Repeat if the father is 78 inches tall.

Using the formula for the regression line

$y = \overline{y} + r \displaystyle \frac{s_y}{s_x} (x - \overline{x})$,

we obtain the equation

$y = 69 + 0.5(x-69) = 0.5x + 34.5$,

so that the predicted height of the son is 66 inches if the father is 63 inches tall. However, the prediction would be 73.5 inches if the father is 78 inches tall. As expected, tall fathers tend to have tall sons, and short fathers tend to have short sons. Then, I’ll tell my class:

However, to the psychological comfort of us short people, tall fathers tend to have sons who are not quite as tall, and short fathers tend to have sons who are not quite as short.

This was first observed by Francis Galton (see the Wikipedia article for more details), a particularly brilliant but aristocratic (read: snobbish) mathematician who had high hopes for breeding a race of super-tall people with the proper use of genetics, only to discover that the laws of statistics naturally prevented this from occurring. Defeated, he called this phenomenon “regression toward the mean,” and so we’re stuck with calling fitting data to a straight line “regression” to this day.
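The regression effect in the father-and-son example can be sketched in a few lines of Python. The function name `predict_son_height` is made up for illustration; the default values are the means, standard deviations, and correlation from the problem statement above.

```python
def predict_son_height(father_height, mean_x=69.0, mean_y=69.0,
                       sd_x=3.0, sd_y=3.0, r=0.5):
    """Regression-line prediction: y = mean_y + r * (sd_y / sd_x) * (x - mean_x)."""
    return mean_y + r * (sd_y / sd_x) * (father_height - mean_x)


for father in (63, 78):
    son = predict_son_height(father)
    # Distance from the mean, for father and predicted son, in inches.
    print(father, son, father - 69, son - 69)
```

In each case the son's predicted distance from the mean is half the father's, because $r = 0.5$ and the two standard deviations are equal.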

# My Favorite One-Liners: Part 51

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

When I teach regression, I typically use this example to illustrate the regression effect:

Suppose that the heights of fathers and their adult sons both have mean 69 inches and standard deviation 3 inches. Suppose also that the correlation between the heights of the fathers and sons is 0.5. Predict the height of a son whose father is 63 inches tall. Repeat if the father is 78 inches tall.

Using the formula for the regression line

$y = \overline{y} + r \displaystyle \frac{s_y}{s_x} (x - \overline{x})$,

we obtain the equation

$y = 69 + 0.5(x-69) = 0.5x + 34.5$,

so that the predicted height of the son is 66 inches if the father is 63 inches tall. However, the prediction would be 73.5 inches if the father is 78 inches tall.
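The same predictions may be easier to see in standard units. Rearranging the regression line formula above gives

$\displaystyle \frac{y - \overline{y}}{s_y} = r \, \frac{x - \overline{x}}{s_x}$,

so the son's predicted $z$-score is $r$ times the father's $z$-score. A father who is 63 inches tall has $z = (63-69)/3 = -2$, so the prediction for his son is $z = 0.5(-2) = -1$, or $69 - 3 = 66$ inches. A father who is 78 inches tall has $z = (78-69)/3 = 3$, so the prediction for his son is $z = 0.5(3) = 1.5$, or $69 + 1.5(3) = 73.5$ inches.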

To make this more memorable for students, I’ll observe:

As expected, tall fathers tend to have tall sons, and short fathers tend to have short sons. For example, my uncle was 6’6″. His two sons, my cousins, were 6’4″ and 6’5″ and were high school basketball stars.

My father was 5’3″. I became a math nerd.

# My Favorite One-Liners: Part 36

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

Not everything in mathematics works out the way we’d prefer it to. For example, in statistics, a Type I error, whose probability is denoted by $\alpha$, is rejecting the null hypothesis even though the null hypothesis is true. Conversely, a Type II error, whose probability is denoted by $\beta$, is retaining the null hypothesis even though the null hypothesis is false.

Ideally, we’d like $\alpha = 0$ and $\beta = 0$, so there’s no chance of making a mistake. I’ll tell my students:

There are actually two places in the country where this can happen. One’s in California, and the other is in Florida. And that place is called Fantasyland.
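To see why both error probabilities cannot be driven to zero at once, here is a small sketch of the trade-off. Everything about the scenario is an assumption made for illustration: a right-tailed $z$-test of $H_0: \mu = 0$ with known $\sigma = 1$ and $n = 25$, when the true mean is actually $0.5$. The function name `type_ii_error` is likewise made up.

```python
from statistics import NormalDist


def type_ii_error(alpha, mu0=0.0, mu1=0.5, sigma=1.0, n=25):
    """beta for a right-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    se = sigma / n**0.5
    # Reject H0 when the sample mean exceeds this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = chance the sample mean falls at or below the cutoff under mu1.
    return NormalDist(mu1, se).cdf(cutoff)


for alpha in (0.10, 0.05, 0.01, 0.001):
    print(f"alpha = {alpha:<6} beta = {type_ii_error(alpha):.3f}")
```

With the sample size held fixed, shrinking $\alpha$ pushes the cutoff to the right and makes $\beta$ larger.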

# My Favorite One-Liners: Part 33

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

Perhaps one of the more difficult things that I try to instill in my students is numeracy, or a sense of feeling if an answer to a calculation is plausible. As an initial step toward this goal, I’ll try to teach my students some basic pointers about whether an answer is even possible.

For example, when calculating a standard deviation, students have to compute $E(X)$ and $E(X^2)$:

$E(X) = \sum x p(x) \qquad \hbox{or} \qquad E(X) = \int_a^b x f(x) \, dx$

$E(X^2) = \sum x^2 p(x) \qquad \hbox{or} \qquad E(X^2) = \int_a^b x^2 f(x) \, dx$

After these are computed — which could take some time — the variance is then calculated:

$\hbox{Var}(X) = E(X^2) - [E(X)]^2$.

Finally, the standard deviation is found by taking the square root of the variance.

So, I’ll ask my students, what do you do if you calculate the variance and it’s negative, so that it’s impossible to take the square root? After a minute of students hemming and hawing, I’ll tell them emphatically what they should do:

It’s wrong… do it again.

The same principle applies when computing probabilities, which always have to be between 0 and 1. So, if ever a student computes a probability that’s either negative or else greater than 1, they can be assured that the answer is wrong and that there’s a mistake someplace in their computation that needs to be found.
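The same sanity check can be sketched in code. The example below uses a fair six-sided die as a made-up distribution and a hypothetical helper `variance_from_moments` that refuses to return a negative variance. Exact fractions are used so that rounding cannot be blamed for a bad answer.

```python
from fractions import Fraction


def variance_from_moments(ex, ex2):
    """Var(X) = E(X^2) - [E(X)]^2, with a plausibility check on the result."""
    var = ex2 - ex**2
    if var < 0:
        raise ValueError("Negative variance: it's wrong... do it again.")
    return var


# Made-up example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}
ex = sum(x * p for x, p in die.items())       # E(X)
ex2 = sum(x**2 * p for x, p in die.items())   # E(X^2)

var = variance_from_moments(ex, ex2)
print(ex, ex2, var, float(var) ** 0.5)
```

A common slip is forgetting to square $x$ when computing $E(X^2)$. For the die, that mistake would give $7/2 - (7/2)^2$, which is negative, and the check above would flag it.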

# My Favorite One-Liners: Part 23

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

Here are some sage words of wisdom that I give in my statistics class:

If the alternative hypothesis has the form $p > p_0$, then the rejection region lies to the right of $p_0$. On the other hand, if the alternative hypothesis has the form $p < p_0$, then the rejection region lies to the left of $p_0$.

Finally, if the alternative hypothesis has the form $p \ne p_0$, then the rejection region has two parts: one part to the left of $p_0$, and another part to the right. So it’s kind of like my single days. Back then, my rejection region had two parts: Friday night and Saturday night.
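The three cases can be sketched as a small function that returns the rejection region in terms of the standardized test statistic. The function name `rejection_region` and the labels `"greater"`, `"less"`, and `"two-sided"` are assumptions made for this illustration, not notation from any particular textbook.

```python
from statistics import NormalDist


def rejection_region(alternative, alpha=0.05):
    """Return the rejection region as a list of (low, high) intervals for z."""
    z = NormalDist()
    inf = float("inf")
    if alternative == "greater":      # H1: p > p0, right tail only
        return [(z.inv_cdf(1 - alpha), inf)]
    if alternative == "less":         # H1: p < p0, left tail only
        return [(-inf, z.inv_cdf(alpha))]
    if alternative == "two-sided":    # H1: p != p0, alpha split between tails
        cutoff = z.inv_cdf(1 - alpha / 2)
        return [(-inf, -cutoff), (cutoff, inf)]
    raise ValueError(f"unknown alternative: {alternative!r}")


for alt in ("greater", "less", "two-sided"):
    print(alt, rejection_region(alt))
```

In the two-sided case each tail gets only $\alpha/2$, so the cutoffs sit farther from the center than in either one-sided case.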

# My Favorite One-Liners: Part 22

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them. Today’s example might be the most cringe-worthy pun that I use in any class that I teach.

In my statistics classes, I try to emphasize to my students that a high value of the correlation coefficient $r$ is not the same thing as causation. To hopefully drive home this point, I’ll use the following picture.

Conclusion: If we want to stop global warming, we should all become pirates.

Obviously, I tell my class, there isn’t a cause-and-effect relationship here, even though there is a strong positive correlation. So, I tell my class, in my best pirate voice, “Correlation is not the same thing as causation, even if you get a large value of ARRRRRRR.”

Without fail, my students love this awful wisecrack.

While I’m on the topic, this is too good not to share:

For further reading, see my series on correlation and causation.

# Engaging students: Defining a function of one variable

In my capstone class for future secondary math teachers, I ask my students to come up with ideas for engaging their students with different topics in the secondary mathematics curriculum. In other words, the point of the assignment was not to devise a full-blown lesson plan on this topic. Instead, I asked my students to think about three different ways of getting their students interested in the topic in the first place.

I plan to share some of the best of these ideas on this blog (after asking my students’ permission, of course).

This student submission again comes from my former student Matthew Garza. His topic, from Algebra: defining a function of one variable.

How can this topic be used in your students’ future courses in mathematics and science?

Being able to define a function of one variable is necessary for creating a model that describes the most basic phenomena in math and science. In math, understanding these parent functions is crucial to understanding more complicated functions and, by considering some variables as temporarily fixed, multivariable equations and systems of equations can be easier to understand. In science, we often observe functions of a single variable.  In fact, even if there are multiple variables coming into play, a good lab will likely control all but one variable, so that we can understand the relationship with respect to that single variable – a function.

Consider in science, for example, the ideal gas law: PV = nRT, where P is pressure, V is volume, n is the quantity in moles of a gas, R is the gas constant, and T is temperature.  This law, taught in high school chemistry, is not taught from scratch.  The proportional, single-variable functions that make up the equation are observed individually before the ideal gas law is introduced. Students will probably be taught Boyle’s, Charles’, Gay-Lussac’s, and Avogadro’s laws first. Boyle’s law states pressure and volume are inversely proportional (for a fixed temperature and quantity of gas).  This law can be demonstrated in one lab by clamping a pipette with some water and air inside, thus fixing all but two variables.  Pressure is applied to the pipette and the volume of air is measured using the length of the air column in the pipette.  Students must then evaluate volume V as a function of the single variable pressure P.  It should be noted that the length of the air column is measured, while the diameter of the pipette is fixed, thus volume must be calculated as a function of the single variable length.  Understanding the single variable, proportional and inversely proportional relationships is crucial to understanding the ideal gas law itself.
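The two single-variable functions in the pipette lab can be sketched as follows. The function names, the pipette diameter of 0.5 cm, and the constant `k` are hypothetical values chosen only to illustrate the idea.

```python
import math


def volume_from_length(length_cm, diameter_cm=0.5):
    """Air-column volume (cm^3) as a function of the single variable length.

    The diameter is held fixed; 0.5 cm is a made-up value for illustration.
    """
    radius_cm = diameter_cm / 2
    return math.pi * radius_cm**2 * length_cm


def boyle_volume(pressure, k):
    """Boyle's law as a function of the single variable pressure: V = k / P.

    k is constant for a fixed temperature and quantity of gas.
    """
    return k / pressure


# With the diameter fixed, doubling the length doubles the volume.
print(volume_from_length(2.0), volume_from_length(4.0))
# With k fixed, doubling the pressure halves the volume.
print(boyle_volume(1.0, k=10.0), boyle_volume(2.0, k=10.0))
```

The first function is directly proportional to its input and the second is inversely proportional, which are the two kinds of relationships the gas laws are built from.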

How can technology (YouTube, Khan Academy [khanacademy.org], Vi Hart, Geometers Sketchpad, graphing calculators, etc.) be used to effectively engage students with this topic? Note: It’s not enough to say “such-and-such is a great website”; you need to explain in some detail why it’s a great website.

Generally speaking, Khan Academy has great videos to help understand math concepts.  Although it’s a little dry, this “Introduction to Functions” video is clear, concise, and touches on several ideas that I was having trouble tying in to every example.  This introductory video begins with the basic concept of a function as a mapping from one value to another single value.  The first examples it uses are a piece-wise function and a less computational function that returns the next highest number beginning with the same letter.  At first I didn’t like that these functions were discontinuous, but this actually gives something else to discuss.  The video links back to prior knowledge, explaining that the dependent variable y that students are familiar with is actually a function of x, and represents the two in a table.  The last couple of minutes of the video address the fundamental property that a function must produce exactly one output for each x; otherwise, it is only a relation.

How could you as a teacher create an activity or project that involves your topic?

One idea might be to examine any function in which time is the independent variable.  Basic concepts of motion in physics can supplement an activity – have groups evaluate position and speed with respect to time of, say, a marble or Hot Wheels car rolling down a ramp.  Using a stopwatch and marking distance on an inclined plane, students could time how long it took to reach certain points and create a graph over time of displacement.  This method might result in some students graphing time as a function of displacement, which could lead to an interesting discussion on independence and dependence, and why it might be useful to view change as a function of time.

Technology could supplement such a lesson so as to avoid confusion over whether distance is a function of time or vice versa.  Using motion sensor devices to collect data, such as the CBR2, students can spend less time collecting and plotting data and more time examining it.  Different trials resulting in different graphs can lead to discussion on how to model such motion as a function of time – letting an object sit still would result in a constant graph, while something rolling down an incline will give a parabolic graph (until the object gets too close to a terminal velocity).

To add variety, students can examine what a graph looks like if they move toward and away from the CBR2 or try to reproduce given position graphs.  This may result in the same position at different times, but since an object can be in only one position at a given time, the utility of using position as a function of time can be represented. Sporadic motion, including changes in speed and direction (like moving back and forth and standing still), also allows discussion of piecewise functions, and of the idea that functions don’t necessarily have to have a “rule” as long as only one output is assigned per value in the domain.
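The "only one output per input" test can be sketched in code. The helper name `is_function` and the walking data below are made up for illustration: a student walks away from the sensor and back again, so the same position occurs at two different times.

```python
def is_function(pairs):
    """Return True if no input in (input, output) pairs has two different outputs."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


# Hypothetical (time in seconds, position in meters) readings.
position_vs_time = [(0, 0.0), (1, 0.5), (2, 1.0), (3, 0.5), (4, 0.0)]
# The same readings with the roles of the two variables swapped.
time_vs_position = [(pos, t) for t, pos in position_vs_time]

print(is_function(position_vs_time))   # position as a function of time
print(is_function(time_vs_position))   # time as a function of position
```

Position passes the test because each time has a single position. Time fails it because the position 0.5 m is reached at both 1 s and 3 s. No formula or "rule" is needed for the check, only the table of pairs.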

# Engaging students: Standard Deviation

In my capstone class for future secondary math teachers, I ask my students to come up with ideas for engaging their students with different topics in the secondary mathematics curriculum. In other words, the point of the assignment was not to devise a full-blown lesson plan on this topic. Instead, I asked my students to think about three different ways of getting their students interested in the topic in the first place.

I plan to share some of the best of these ideas on this blog (after asking my students’ permission, of course).

This student submission comes from my former student Jillian Greene. Her topic, from statistics: standard deviations.

How could you as a teacher create an activity or project that involves your topic?

An activity that I’ve seen presented to introduce the idea of standard deviation requires students to explore the information given to them before actually being taught the math behind standard deviation. As the students settle into their seats, prompt them to work with their shoulder partner and help to measure the width of their left thumbnail (or length of their pointer finger, width of their hand, etc.) and write it on a sticky note. Once the data is collected, the students will calculate the mean of all of the measurements. The mean is then written on the board in the center, and the students are asked to go and stack their sticky notes either in the center if their measurement is exactly the mean, or on the right or left if it’s bigger or smaller, respectively. Have them find the mean of the signed distances (measurement minus mean) of each measurement from the mean. When they discover this should be zero, have them discuss with each other and then in the big group what that means. If time permits, it might even be fun to ask deeper understanding questions like what would happen if everyone lost half of their thumbnail, or what if just Student A’s thumbnail tripled in size. This will provide a meaningful segue into the sometimes confusing world of standard deviation and distances from the mean.
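Here is a minimal sketch of the arithmetic behind this activity, using made-up thumbnail widths in millimeters. It treats the class as the whole population, so it uses `statistics.pstdev`; a class treated as a sample would use `statistics.stdev` instead.

```python
from statistics import mean, pstdev

# Hypothetical thumbnail widths in millimeters for six students.
widths = [12, 14, 15, 15, 16, 18]

center = mean(widths)
deviations = [w - center for w in widths]   # signed distances from the mean

print(center)               # the class mean
print(mean(deviations))     # signed deviations always average to zero
print(pstdev(widths))       # typical distance from the mean

# "What if everyone lost half of their thumbnail?"
halved = [w / 2 for w in widths]
print(mean(halved), pstdev(halved))

# "What if just Student A's thumbnail tripled in size?"
tripled = [3 * widths[0]] + widths[1:]
print(mean(tripled), pstdev(tripled))
```

Halving every measurement halves both the mean and the standard deviation, while tripling a single measurement pulls the mean upward and inflates the standard deviation considerably.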

How has this topic appeared in pop culture (movies, TV, current music, video games, etc.)?

http://www.dailymotion.com/video/x3lc0rx

This is a full episode of Everybody Loves Raymond but the clip in reference starts at about 8:45 and lasts a minute or so.

This clip shows a scenario where the couple, Ray and Deborah, is comparing their scores on an IQ test (a very common use for standard deviation). Deborah comments on how her score is very close to Ray’s, being only 15 points higher. The brother that proctored the exam corrected her by saying that 15 points is a standard deviation higher and puts her in a “whole new class” of genius. Have students discuss and explain what it means for Deborah to be one standard deviation higher. Use the information given in the episode (100 is average, 115 is one standard deviation higher) to construct the bell curve for IQ scores. Then use the bell curve to introduce percentiles. Since Ray is the average, center-of-the-bell score, then he is in the 50th percentile. The students can then attempt to discover on their own (or with a group) what percentile Deborah’s score puts her in.
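For checking students' answers, the percentile calculation can be sketched with Python's standard library. It assumes, as the bell-curve discussion above does, that IQ scores are normally distributed with mean 100 and standard deviation 15; the function name `percentile` is made up for illustration.

```python
from statistics import NormalDist

# Assumption: IQ scores follow a normal curve with mean 100 and SD 15.
iq = NormalDist(mu=100, sigma=15)


def percentile(score):
    """Percent of the population scoring at or below the given score."""
    return 100 * iq.cdf(score)


print(round(percentile(100), 1))   # a score at the center of the bell curve
print(round(percentile(115), 1))   # a score one standard deviation higher
```

A score one standard deviation above the mean lands at roughly the 84th percentile, which matches the 50% below the mean plus about half of the 68% that falls within one standard deviation.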

How can this topic be used in your students’ future courses in mathematics or science?

Standard deviation is a topic that pervades almost all sciences. In biology classes, students are asked to study the weather and climate of various habitats. In differentiating between the two, one must look at the overall picture. If the student is presented with the information that place A and place B both have average temperatures of 60 degrees, this information might not be good to take at face value. Place A might have a range from 40 to 80 degrees throughout the year while place B might range from 0 to 100 and then have one or two extremely hot outliers that even the average out to 60. Looking at not only the skew of the bell curve, but also what the standard deviation is for each place, might save a student from forgetting to bring a fan to hypothetical place B, or writing that the climate of that place is cool year round.  In addition to biology, standard deviation is a very necessary operation in psychology, which is a very statistics-based science. This can easily be seen in representing IQ scores, as we found earlier!