My Favorite One-Liners: Part 71

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

Some of the algorithms that I teach are pretty lengthy. For example, consider the calculation of a 100(1-\alpha)\% confidence interval for a proportion:

\displaystyle \frac{\hat{p} + \displaystyle \frac{z_{\alpha/2}^2}{2n}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } - z_{\alpha/2} \frac{\sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z_{\alpha/2}^2}{4n^2}}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } < p < \displaystyle \frac{\hat{p} + \displaystyle \frac{z_{\alpha/2}^2}{2n}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } + z_{\alpha/2} \frac{\sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z_{\alpha/2}^2}{4n^2}}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} }.
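For readers who would rather let a computer grind through the arithmetic, here is a minimal Python sketch of the formula above (the interval usually known as the Wilson score interval). The function name `score_interval` and the sample counts are illustrative assumptions, not something from my lesson; the critical value z_{\alpha/2} comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def score_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for the confidence interval displayed above.

    `successes` and `n` are the observed count and the sample size;
    the name and signature are hypothetical, chosen for illustration.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    p_hat = successes / n
    q_hat = 1 - p_hat
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * q_hat / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Made-up data: 40 successes in a sample of 100.
low, high = score_interval(40, 100)
print(f"{low:.4f} < p < {high:.4f}")
```

For these made-up numbers the interval runs from roughly 0.31 to 0.50, and it is not centered at \hat{p} = 0.4 because of the z_{\alpha/2}^2/(2n) shift in the numerator.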


Proficiency with this formula definitely requires practice, so I’ll typically give a couple of problems for my students to work through in class. After the last example, when I think that my students have the hang of this very long calculation, I’ll give my one-liner to hopefully boost their confidence (no pun intended):

By now, you probably think that this calculation is dull, uninteresting, repetitive, and boring. If so, then I’ve done my job right.

My Favorite One-Liners: Part 65

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

I’ll use today’s one-liner just before I begin some enormous, complicated, and tedious calculation that’s going to take more than a few minutes to complete. To give a specific example of such a calculation: consider the derivation of the Agresti confidence interval for proportions. According to the central limit theorem, if n is large enough, then

Z = \displaystyle \frac{ \hat{p} - p}{ \displaystyle \sqrt{ \frac{p(1-p) }{n} } }

is approximately normally distributed, where p is the true population proportion and \hat{p} is the sample proportion from a sample of size n. Setting Z equal to a critical value z and solving this equation for p, we obtain the formula for the confidence interval for a proportion:

z \displaystyle \sqrt{\frac{p(1-p)}{n} } = \hat{p} - p

\displaystyle \frac{z^2 p(1-p)}{n} = \left( \hat{p} - p \right)^2

z^2p - z^2 p^2 = n \hat{p}^2 - 2 n \hat{p} p + n p^2

0 = p^2 (z^2 + n) - p (2n \hat{p} + z^2) + n \hat{p}^2

We now use the quadratic formula to solve for p:

p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{ \left(2n\hat{p} + z^2 \right)^2 - 4n\hat{p}^2 (z^2+n)}}{2(z^2+n)}

p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{4n^2 \hat{p}^2 + 4n \hat{p} z^2 + z^4 - 4n\hat{p}^2 z^2 - 4n^2 \hat{p}^2}}{2(z^2 + n)}

p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{4n (\hat{p}-\hat{p}^2) z^2 + z^4}}{2(z^2 + n)}

p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{4n \hat{p}(1-\hat{p}) z^2 + z^4}}{2(z^2 + n)}

p = \displaystyle \frac{2n \hat{p} + z^2 \pm \sqrt{4n \hat{p} \hat{q} z^2 + z^4}}{2(z^2 + n)}

p = \displaystyle \frac{2n \hat{p} + z^2 \pm z \sqrt{4n \hat{p} \hat{q} + z^2}}{2(z^2 + n)}

p = \displaystyle \frac{2n \hat{p} + z^2 \pm z \sqrt{4n^2 \displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle 4n^2 \frac{z^2}{4n^2}}}{2(z^2 + n)}

p = \displaystyle \frac{2n \hat{p} + z^2 \pm 2nz \sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z^2}{4n^2}}}{2(z^2 + n)}

p = \displaystyle \frac{2n \hat{p} + 2n \displaystyle \frac{z^2}{2n} \pm 2nz \sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} +\displaystyle \frac{z^2}{4n^2}}}{2n \displaystyle \left(1 + \frac{z^2}{n} \right)}

p = \displaystyle \frac{\hat{p} + \displaystyle \frac{z^2}{2n} \pm z \sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z^2}{4n^2}}}{\displaystyle 1 + \frac{z^2}{n} }

From this we finally obtain the 100(1-\alpha)\% confidence interval

\displaystyle \frac{\hat{p} + \displaystyle \frac{z_{\alpha/2}^2}{2n}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } - z_{\alpha/2} \frac{\sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z_{\alpha/2}^2}{4n^2}}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } < p < \displaystyle \frac{\hat{p} + \displaystyle \frac{z_{\alpha/2}^2}{2n}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} } + z_{\alpha/2} \frac{\sqrt{\displaystyle \frac{ \hat{p} \hat{q}}{n} + \displaystyle \frac{z_{\alpha/2}^2}{4n^2}}}{\displaystyle 1 + \frac{z_{\alpha/2}^2}{n} }.
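A derivation this long invites algebra slips, so a quick numerical sanity check can be reassuring. The sketch below, with hypothetical function names and made-up inputs, solves the quadratic 0 = p^2 (z^2 + n) - p (2n \hat{p} + z^2) + n \hat{p}^2 directly, compares its roots with the closed form, and confirms that plugging each endpoint back into the formula for Z returns +z or -z.

```python
from math import sqrt, isclose


def quadratic_roots(p_hat, n, z):
    """Roots of 0 = p^2 (z^2 + n) - p (2 n p_hat + z^2) + n p_hat^2."""
    a = z**2 + n
    b = -(2 * n * p_hat + z**2)
    c = n * p_hat**2
    root = sqrt(b**2 - 4 * a * c)
    return (-b - root) / (2 * a), (-b + root) / (2 * a)


def closed_form(p_hat, n, z):
    """Endpoints from the final simplified formula in the derivation."""
    q_hat = 1 - p_hat
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * q_hat / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def z_score(p_hat, p, n):
    """The standardized statistic Z from the central limit theorem."""
    return (p_hat - p) / sqrt(p * (1 - p) / n)


# Hypothetical inputs, chosen only for illustration.
p_hat, n, z = 0.4, 100, 1.96
lower, upper = quadratic_roots(p_hat, n, z)
cf_lower, cf_upper = closed_form(p_hat, n, z)

print(isclose(lower, cf_lower), isclose(upper, cf_upper))
print(z_score(p_hat, lower, n), z_score(p_hat, upper, n))
```

The lower endpoint gives Z = +z and the upper endpoint gives Z = -z, which is exactly what squaring both sides of the equation in the second step of the derivation allows.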


So, before I start such an incredibly long calculation, I’ll warn my students that this is going to take some time and we need to prepare… and I’ll start doing jumping jacks, shadow boxing, and other “exercise” in preparation for doing all of this writing.

Statistics and percussion

I recently had a flash of insight when teaching statistics. I had completed my lectures on finding confidence intervals and conducting hypothesis tests for one-sample problems (both for averages and for proportions), and I was about to start my lectures on two-sample problems (like the difference of two means or the difference of two proportions).

On the one hand, this section of the course is considerably more complicated because the formulas are considerably longer and hence harder to remember (and more conducive to careless mistakes when using a calculator). The formula for the standard error is longer, and (in the case of small samples) the Welch-Satterthwaite formula is especially cumbersome to use.

On the other hand, students who have mastered statistical techniques for one sample can easily extend this knowledge to the two-sample case. The test statistic (either z or t) can be found by using the formula (Observed – Expected)/(Standard Error), where the standard error formula has changed, and the critical values of the normal or t distribution are used as before.
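To make the comparison concrete, here is a minimal Python sketch under stated assumptions: the function names are hypothetical, the summary statistics are made up, and the two-sample case is taken to be a difference of two means with unequal variances. The same (Observed – Expected)/(Standard Error) function serves both cases; only the standard error and the degrees of freedom change.

```python
from math import sqrt


def standardized_statistic(observed, expected, standard_error):
    """(Observed - Expected) / (Standard Error), used in every case."""
    return (observed - expected) / standard_error


def one_sample_se(s, n):
    """Standard error of a single sample mean."""
    return s / sqrt(n)


def two_sample_se(s1, n1, s2, n2):
    """Standard error of the difference of two sample means."""
    return sqrt(s1**2 / n1 + s2**2 / n2)


def welch_satterthwaite_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the two-sample t statistic."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


# Made-up summary statistics for two small samples.
mean1, s1, n1 = 5.2, 1.1, 12
mean2, s2, n2 = 4.5, 1.4, 15

# One sample: is the first mean different from a hypothesized 5.0?
t_one = standardized_statistic(mean1, 5.0, one_sample_se(s1, n1))

# Two samples: is the difference of means different from 0?
t_two = standardized_statistic(mean1 - mean2, 0.0, two_sample_se(s1, n1, s2, n2))
df_two = welch_satterthwaite_df(s1, n1, s2, n2)

print(t_one, t_two, df_two)
```

When the two samples have equal sizes and equal standard deviations, the Welch-Satterthwaite formula collapses to 2(n - 1), which is a handy check on a calculator result.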

I hadn’t prepared this ahead of time, but while I was lecturing to my students I remembered a story that I heard a music professor tell about students learning how to play percussion instruments. As opposed to other musicians, the budding percussionist only has a few basic techniques to learn and master. The trick for the percussionist is not memorizing hundreds of different techniques but correctly applying a few techniques to dozens of different kinds of instruments (drums, xylophones, bells, cymbals, etc.).

It hit me that this was an apt analogy for the student of statistics. Once the techniques of the one-sample case are learned, these same techniques are applied, with slight modifications, to the two-sample case.

I’ve been using this analogy ever since, and it seems to resonate (pun intended) with my students as they learn and practice the avalanche of formulas for two-sample statistics problems.

Issues when conducting political polls (Part 3)

The classic application of confidence intervals is political polling: the science of sampling relatively few people to predict the opinions of a large population. However, in the 2010s, the art of political polling (constructing representative samples from a large population) has become more and more difficult. A nice feature covered problems that pollsters face today that were not issues a generation ago. A sampling:

The problem is simple but daunting. The foundation of opinion research has historically been the ability to draw a random sample of the population. That’s become much harder to do, at least in the United States. Response rates to telephone surveys have been declining for years and are often in the single digits, even for the highest-quality polls. The relatively few people who respond to polls may not be representative of the majority who don’t. Last week, the Federal Communications Commission proposed new guidelines that could make telephone polling even harder by enabling phone companies to block calls placed by automated dialers, a tool used in almost all surveys.

What about Internet-based surveys? They’ll almost certainly be a big part of polling’s future. But there’s not a lot of agreement on the best practices for online surveys. It’s fundamentally challenging to “ping” a random voter on the Internet in the same way that you might by giving her an unsolicited call on her phone. Many pollsters that do Internet surveys eschew the concept of the random sample, instead recruiting panels that they claim are representative of the population.

Previous posts in this series: Part 1 and Part 2.

Issues when conducting political polls (Part 2)

The classic application of confidence intervals is political polling: the science of sampling relatively few people to predict the opinions of a large population. However, in the 2010s, the art of political polling — constructing representative samples from a large population — has become more and more difficult.

The Washington Post recently ran a feature about how the prevalence of cellphones was once feared as a potential cause of bias when conducting a political poll… and what’s happened ever since.

Previous post: Issues when conducting political polls.

Student t distribution

One of my favorite anecdotes that I share with my statistics students is why the Student t distribution is called the t distribution and not the Gosset distribution.

From Wikipedia:

In the English-language literature it takes its name from William Sealy Gosset’s 1908 paper in Biometrika under the pseudonym “Student”. Gosset worked at the Guinness Brewery in Dublin, Ireland, and was interested in the problems of small samples, for example the chemical properties of barley where sample sizes might be as low as 3. One version of the origin of the pseudonym is that Gosset’s employer preferred staff to use pen names when publishing scientific papers instead of their real name, therefore he used the name “Student” to hide his identity. Another version is that Guinness did not want their competitors to know that they were using the t-test to test the quality of raw material.

Gosset’s paper refers to the distribution as the “frequency distribution of standard deviations of samples drawn from a normal population”. It became well-known through the work of Ronald A. Fisher, who called the distribution “Student’s distribution” and referred to the value as t.

From the 1963 book Experimentation and Measurement (see pages 68-69 of the PDF, which are marked as pages 69-70 on the original):

The mathematical solution to this problem was first discovered by an Irish chemist who wrote under the pen name of “Student.” Student worked for a company that was unwilling to reveal its connection with him lest its competitors discover that Student’s work would also be advantageous to them. It now seems extraordinary that the author of this classic paper on measurements was not known for more than twenty years. Eventually it was learned that his real name was William Sealy Gosset (1876-1937).

Mathematical Christmas gifts

Now that Christmas is over, I can safely share the Christmas gifts that I gave to my family this year thanks to Nausicaa Distribution.

Euler’s equation pencil pouch:

Box-and-whisker snowflakes to hang on our Christmas tree:

And, for me, a wonderfully and subtly punny “Confidence and Power” T-shirt.

Thanks to FiveThirtyEight for pointing me in this direction.

For the sake of completeness, here are the math-oriented gifts that I received for Christmas:



Issues when conducting political polls

The classic application of confidence intervals is political polling: the science of sampling relatively few people to predict the opinions of a large population. However, in the 2010s, the art of political polling (constructing representative samples from a large population) has become more and more difficult. A recent article, Is The Polling Industry in Stasis or in Crisis?, covers the nuts and bolts of conducting a survey and should provide valuable background information for anyone teaching a course in statistics. From the opening paragraphs:

There is no shortage of reasons to worry about the state of the polling industry. Response rates to political polls are dismal. Even polls that make every effort to contact a representative sample of voters now get no more than 10 percent to complete their surveys — down from about 35 percent in the 1990s.

And there are fewer high-quality polls than there used to be. The cost to commission one can run well into five figures, and it has increased as response rates have declined. Under budgetary pressure, many news organizations have understandably preferred to trim their polling budgets rather than lay off newsroom staff.

Cheaper polling alternatives exist, but they come with plenty of problems. “Robopolls,” which use automated scripts rather than live interviewers, often get response rates in the low to mid-single digits. Most are also prohibited by law from calling cell phones, which means huge numbers of people are excluded from their surveys.

How can a poll come close to the outcome when so few people respond to it?

Nuts and Bolts of Political Polls

A standard topic in my statistics class is political polling, which is the canonical example of constructing a confidence interval with a relatively small sample to (hopefully) project the opinions of a large population. Of course, polling is only valid if the sample represents the population at large. This is a natural engagement activity in the fall semester preceding a presidential election.
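As a rough illustration of why a relatively small sample can work, here is a minimal Python sketch of the usual large-sample margin of error for a proportion, z_{\alpha/2} \sqrt{\hat{p}\hat{q}/n}. The function name and the poll numbers are made up, and the sketch assumes a simple random sample, which is exactly the assumption that falling response rates put in doubt.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Large-sample margin of error for a sample proportion.

    Assumes a simple random sample; the population size does not
    appear anywhere in the formula.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical poll: 1,000 respondents split evenly.
moe = margin_of_error(0.5, 1000)
print(f"plus or minus {100 * moe:.1f} percentage points")
```

A sample of 1,000 gives a margin of about 3 percentage points whether the population is a city or a country, provided that the sample really is representative.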

A recent article, “Are Bad Pollsters Copying Good Pollsters,” does a nice job of explaining some of the nuts and bolts of political polling in an age when selected participants are increasingly unlikely to participate… and also raises the specter of how pollsters using nontraditional methods might be consciously or subconsciously cheating. A sample (pun intended) from the article:

What’s a nontraditional poll? One that doesn’t abide by the industry’s best practices. So, a survey is nontraditional if it:

  • doesn’t follow probability sampling;
  • doesn’t use live interviewers;
  • is released by a campaign or campaign groups (because these only selectively release data);
  • doesn’t disclose (i.e. doesn’t release raw data to the Roper Archives, isn’t a member of the National Council on Public Polls, or hasn’t signed onto the American Association for Public Opinion Research transparency initiative).

Everything else is a gold-standard poll…

Princeton University graduate student Steven Rogers and Vanderbilt University professor of political science Joshua Clinton [studied] interactive voice response (IVR) surveys in the 2012 Republican presidential primary. (IVR pollsters are in our nontraditional group.) Rogers and Clinton found that IVR pollsters were about as accurate as live-interview pollsters in races where live-interview pollsters surveyed the electorate. IVR pollsters were considerably less accurate when no live-interview poll was conducted. This effect held true even when controlling for a slew of different variables. Rogers and Clinton suggested that the IVR pollsters were taking a “cue” from the live pollsters in order to appear more accurate.

My own analysis hints at the same possibility. The nontraditional pollsters did worse in races without a live pollster.

Statistical Inference for the General Education student

From the opening and closing paragraphs:

Many mathematics departments around the country offer an introductory statistics course for the general education student. Typically these students come to the mathematics classroom with minimal skills in arithmetic and algebra. In addition it is not unusual for these students to have very poor attitudes toward mathematics.

With this target population in mind one can design courses of study, called statistics, that will differ radically depending on what priorities are held. Many people choose to teach arithmetic through statistics and thereby build most of the course around descriptive statistics with some combinatorics. Others build most of the course around combinatorics and probabilities with some descriptive statistics. Few courses offered at this level spend much time or effort on statistical inference.

We believe that for the general education student the ideas of statistical inference and the resulting decision rules are of prime importance. This belief is based on the assumption that general education courses are included in the curriculum in order to help students to gain an understanding of their own essence, of their relationship to others, of the world around them, and of how man goes about knowing.

If you inspect most of the texts on the market today, you will find that they generally require that a student spend approximately a semester of study of descriptive statistics and probability theory before attempting statistical inference. This makes it very difficult to get to the general education portion of the subject in the time allotted most general education courses. If you agree with the analysis of the problem to this point the logical question is ‘Is there a way to teach statistical inference without the traditional work in descriptive statistics and probability?’. The remainder of this article describes an approach that allows one to answer this question with a yes…

It should be pointed out that there are some unusual difficulties in this approach to statistics [since] one trades traditional weakness in arithmetic and algebra for deficiencies in writing since the write-ups of the simulations demand clear and logical exposition on the part of the student. However, if you feel that the importance of ‘statistics for the general education student’ lies in the areas of inference and decision rules, then you should try this approach. You will like it.

This article won the 1978 George Polya award for expository excellence. Several techniques described in this article would probably be replaced by computer simulation today, but they are still worth reading.
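To give a flavor of what simulation-based inference might look like with a modern computer, here is a minimal Python sketch. It is not taken from the article; the coin-flipping scenario, the function name, and the numbers are all my own assumptions. The question is whether 60 heads in 100 flips would be surprising for a fair coin, and the answer comes from counting simulated outcomes instead of using a probability formula.

```python
import random


def simulated_p_value(observed_heads, flips, trials=10_000, seed=0):
    """Estimate P(heads >= observed_heads) for a fair coin by simulation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    at_least_as_extreme = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        if heads >= observed_heads:
            at_least_as_extreme += 1
    return at_least_as_extreme / trials


# Hypothetical experiment: 60 heads observed in 100 flips.
estimate = simulated_p_value(60, 100)
print(f"Estimated one-sided p-value: {estimate:.3f}")
```

A student who can explain in writing why a small estimate casts doubt on the fair-coin assumption has grasped the decision rule without a semester of preliminary probability theory.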