In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them. Today’s story is a continuation of yesterday’s post.
When I teach regression, I typically use this example to illustrate the regression effect:
Suppose that the heights of fathers and their adult sons both have mean 69 inches and standard deviation 3 inches. Suppose also that the correlation between the heights of the fathers and sons is 0.5. Predict the height of a son whose father is 63 inches tall. Repeat if the father is 78 inches tall.
Using the formula for the regression line
we obtain the equation
so that the predicted height of the son is 66 inches if the father is 63 inches tall. However, the prediction would be 73.5 inches if the father is 76 inches tall. As expected, tall fathers tend to have tall sons, and short fathers tend to have short sons. Then, I’ll tell my class:
However, to the psychological comfort of us short people, tall fathers tend to have sons who are not quite as tall, and short fathers tend to have sons who are not quite as short.
This was first observed by Francis Galton (see the Wikipedia article for more details), a particularly brilliant but aristocratic (read: snobbish) mathematician who had high hopes for breeding a race of super-tall people with the proper use of genetics, only to discover that the laws of statistics naturally prevented this from occurring. Defeated, he called this phenomenon “regression toward the mean,” and so we’re stuck with called fitting data to a straight line “regression” to this day.