The classic application of confidence intervals is political polling: the science of sampling relatively few people to predict the opinions of a large population. However, in the 2010s, the art of political polling — constructing representative samples from a large population — has become more and more difficult.

FiveThirtyEight.com recently published an article, Is The Polling Industry in Stasis or in Crisis?, about the nuts and bolts of conducting a survey; it should provide valuable background information for anyone teaching a course in statistics. From the opening paragraphs:

There is no shortage of reasons to worry about the state of the polling industry. Response rates to political polls are dismal. Even polls that make every effort to contact a representative sample of voters now get no more than 10 percent to complete their surveys — down from about 35 percent in the 1990s.

And there are fewer high-quality polls than there used to be. The cost to commission one can run well into five figures, and it has increased as response rates have declined. Under budgetary pressure, many news organizations have understandably preferred to trim their polling budgets rather than lay off newsroom staff.

Cheaper polling alternatives exist, but they come with plenty of problems. “Robopolls,” which use automated scripts rather than live interviewers, often get response rates in the low to mid-single digits. Most are also prohibited by law from calling cell phones, which means huge numbers of people are excluded from their surveys.

How can a poll come close to the outcome when so few people respond to it?

In my capstone class for future secondary math teachers, I ask my students to come up with ideas for engaging their students with different topics in the secondary mathematics curriculum. In other words, the point of the assignment was not to devise a full-blown lesson plan on this topic. Instead, I asked my students to think about three different ways of getting their students interested in the topic in the first place.

I plan to share some of the best of these ideas on this blog (after asking my students’ permission, of course).

This student submission comes from my former student Michael Dixon. His topic, from Precalculus: mathematical induction.

How can this topic be used in your students’ future courses in mathematics or science?

The first time students are introduced to mathematical proofs is probably in high school geometry class, proving theorems using the axiomatic method. They work to prove things about Euclidean geometry with step-by-step deductive reasoning, as did Euclid himself in the Elements. They prove things about concrete objects that they can see and draw on paper, such as circles, angles, lines, and triangles. But then they move on to Algebra II, where they are taught more abstract ways of dealing with numbers and expressions. Is there any way to prove things about numbers themselves? It’s not as easy to visualize, that’s for sure. What is a number? Is it something I can see and feel; is it the shape we write on the page? Or is it something beyond that? This aspect is one of the challenges that algebra students face as they are exposed to more and more mathematics. Mathematical induction is one way to prove things about numbers using solid deductive reasoning that cannot be refuted. And not just about a few numbers; high school students would be more accepting of that. Mathematical induction is usually used to prove something about ALL of the natural numbers, starting from one and going on without end. Induction can be used to prove what students might intuitively think about the natural numbers, such as that there are an infinite number of primes, or it can be used to prove less obvious things about numbers, such as 1 + 3 + 5 + 7 + … + (2n – 1) = n^{2}. We can prove these and more without having to compute billions and billions of cases. In just a few lines of mathematical logic, we can prove that something is true for every natural number. This is more than just telling the students something and having them accept it: this technique PROVES that some statements are always true for any number we want to choose, no matter how large it is. That’s some powerful stuff.
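For readers who want to see the argument from n to n + 1 written out, here is a sketch of the standard induction proof that the sum of the first n odd numbers is n^{2}. The statement label P(n) and the index k are notation introduced here for illustration; they do not come from the student's submission.

```latex
\textbf{Claim.} For every natural number $n \ge 1$,
\[ P(n):\quad 1 + 3 + 5 + \cdots + (2n - 1) = n^2 . \]

\textbf{Base case.} For $n = 1$, the left side is $1$ and the right side is $1^2 = 1$.

\textbf{Inductive step.} Assume $P(k)$ holds for some $k \ge 1$. Then
\begin{align*}
1 + 3 + \cdots + (2k - 1) + (2k + 1) &= k^2 + (2k + 1) \\
                                     &= (k + 1)^2 ,
\end{align*}
which is exactly $P(k + 1)$. By induction, $P(n)$ holds for all $n \ge 1$.
```

The inductive step never computes a particular sum. It only shows that the truth of one case forces the truth of the next, which is why a few lines cover every natural number.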

How was this topic adopted by the mathematical community?

Mathematical induction has been around for thousands of years. While not in the same form as we see it today, induction can be seen all the way back to Euclid’s proof that there are an infinite number of primes, or in the writings of Aristotle. They used this logic to prove a lot of things, but not in the formal way of proving something about n and n + 1. A more formal treatment did not show up until around 1575, when Maurolycus showed that 1 + 3 + 5 + 7 + … + (2n – 1) = n^{2}, though he did not yet argue using n and n + 1. Several mathematicians, Pascal among them, began using this formal method soon after, though no one had a name for it. Bernoulli then was one of the first to begin using the method of arguing from n to n + 1. Since then, mathematicians have been heavily using this method to prove countless things about the natural numbers. And eventually, around the 20th century, the name itself, mathematical induction, finally became the standard term for a method whose roots go back over two thousand years.

How can technology be used to effectively engage students with this topic?

These videos cover mathematical induction in a way I hadn’t seen before, and they cleared up a misconception that I had. I had always thought (because of the name) that mathematical induction was not the same kind of reasoning that is used in other axiomatic proofs. However, mathematical induction happens to actually be deductive reasoning, rather than inductive reasoning. The only similarity is that both mathematical induction and inductive reasoning deal with occurring patterns. The first video is more the engage part, while the second one goes a little further into the content. For the engage, showing the video at the beginning of the class is probably better, while the second might be given to the students as homework to watch on their own.

I’m taking a one-day break from my usual posts on mathematics and mathematics education to note a symbolic milestone: meangreenmath.com has had more than 25,000 total page views since its inception in June 2013. Many thanks to the followers of this blog, and I hope that you’ll continue to find this blog to be a useful resource to you.

Some other (probably useless) statistics: this blog has been viewed by readers from 145 different countries. Top viewership: the United States, India, the Philippines, Canada, the United Kingdom, Australia, Brazil, Pakistan, Singapore, and France.

Textbooks have included the occasional awful problem ever since Pebbles Flintstone and Bamm-Bamm Rubble chiseled their homework on slate tablets while attending Bedrock Elementary. But even with the understanding that children have been doing awful homework problems since the dawn of time (and long before the advent of the Common Core), this gem that was assigned to Texas 5th graders is a doozy.

I get what the textbook wants the student to do: rounding to the nearest $10 and developing the skill of approximating a sum without actually laboriously computing the sum exactly. According to this logic, $54.26 gets rounded to $50 and $34.34 gets rounded to $30. So Fran spends about $80, and (according to this logic) she has about $20 left. So the textbook wants the student to answer B.

But this is wrong on so many levels, and it is destined only to confuse parents and children alike.

First, the actual answer, without using approximations, is $11.40. Undeniably, $5 (answer C) is closer to $11.40 than $20 (answer B).

Second, it’s entirely reasonable and appropriate for students to approximate to either the nearest dollar or else the nearest $5. Indeed, nothing in this problem says that the rounding must occur to the nearest $10… I’d imagine that this could only be inferred from the context of other problems on the page. By rounding to the nearest dollar, Fran would have about $12 left. By rounding to the nearest $5, Fran would have about $10 left. And there’s nothing “wrong” with either of these approximations.
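The three estimates can be compared side by side with a short script. This is only a sketch: the starting amount of $100 is an assumption inferred from the exact answer of $11.40 and the two prices, and the helper names `round_to` and `estimated_change` are made up for this illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Assumed starting amount: inferred from the exact answer ($11.40) and the two prices.
BUDGET = Decimal("100")
PRICES = [Decimal("54.26"), Decimal("34.34")]


def round_to(amount: Decimal, step: Decimal) -> Decimal:
    """Round amount to the nearest multiple of step (halves round up)."""
    return (amount / step).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * step


def estimated_change(step: Decimal) -> Decimal:
    """Estimate the money left after rounding each price to the nearest step."""
    return BUDGET - sum(round_to(price, step) for price in PRICES)


exact_change = BUDGET - sum(PRICES)

for step in (Decimal("1"), Decimal("5"), Decimal("10")):
    print(f"rounding to the nearest ${step}: about ${estimated_change(step)} left")
print(f"exact: ${exact_change} left")
```

Rounding to the nearest $1 or $5 lands within a dollar or two of the exact change, while rounding to the nearest $10 is off by $8.60, which is why the intended answer is not the closest choice.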

Third, in real life, Fran would not say that it would cost about $80 to buy the sneakers and shirt. In real life, Fran would always round up to be sure that she has enough money to complete the transaction. If Fran keeps rounding to the nearest $10, she’ll end up short of money at the cash register sooner or later. So while rounding up or down may be appropriate for some problems, it probably shouldn’t be advocated for the sake of financial literacy.

In short, this problem does little except confuse students and get them to hate math. I do advocate that children should be able to estimate a sum without finding it. This is one of the standards for teaching Texas 5th graders mathematics:

Number, operation, and quantitative reasoning. The student estimates to determine reasonable results. The student is expected to use strategies, including rounding and compatible numbers to estimate solutions to addition, subtraction, multiplication, and division problems. Source: http://ritter.tea.state.tx.us/rules/tac/chapter111/ch111a.html

That said, this particular problem is an exceptionally poor way of determining whether students have acquired that skill. It’s hard to believe that this problem survived the proofreading process before the textbook was published.

Prime numbers have inspired great intrigue over the last centuries, and one of the most basic unanswered questions has been the spacing between two consecutive prime numbers, or the twin prime conjecture, which states that there are infinitely many pairs of primes that differ by two. Despite many efforts at proving this conjecture, mathematicians were not able to rule out the possibility that the gaps between primes continue to expand, eventually exceeding any particular bound. Zhang’s work shows that there are infinitely many consecutive primes, or pairs of primes, closer than 70 million. In other words, as you go to larger and larger numbers, the primes will not become further and further apart—you will keep finding prime pairs that differ by less than 70 million.

His work has generated significant collaborations across the community to expand on his effort, and within months of his discovery that number was reduced from 70 million to less than 5,000.
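To get a feel for what a gap between consecutive primes looks like, here is a small exploratory sketch. It has nothing to do with the techniques in Zhang's proof; it simply lists the gaps among small primes using a sieve of Eratosthenes, and the function names are chosen for this illustration.

```python
def primes_below(limit: int) -> list[int]:
    """Return all primes less than limit using the sieve of Eratosthenes."""
    if limit < 3:
        return []
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for multiple in range(i * i, limit, i):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def consecutive_prime_gaps(limit: int) -> list[tuple[int, int, int]]:
    """Return (p, q, q - p) for each pair of consecutive primes below limit."""
    primes = primes_below(limit)
    return [(p, q, q - p) for p, q in zip(primes, primes[1:])]


gaps = consecutive_prime_gaps(100)
twin_pairs = [(p, q) for p, q, gap in gaps if gap == 2]
largest_gap = max(gap for _, _, gap in gaps)

print("twin prime pairs below 100:", twin_pairs)
print("largest gap below 100:", largest_gap)
```

Raising the limit shows larger gaps appearing while pairs with a gap of 2 keep turning up, which is the tension the twin prime conjecture is about. No finite computation of this kind can settle it.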

A standard topic in my statistics class is political polling, which is the canonical example of constructing a confidence interval with a relatively small sample to (hopefully) project the opinions of a large population. Of course, polling is only valid if the sample represents the population at large. This is a natural engagement activity in the fall semester preceding a presidential election.
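As a reminder of the computation behind such a poll, here is a minimal sketch of the usual normal-approximation confidence interval for a proportion. The poll numbers are hypothetical, the function name is chosen for this illustration, and the formula assumes a simple random sample, which is exactly the assumption that low response rates call into question.

```python
import math


def proportion_confidence_interval(successes: int, n: int, z: float = 1.96):
    """Normal-approximation interval for a population proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical poll: 520 of 1,000 respondents favor a candidate.
low, high = proportion_confidence_interval(520, 1000)
print(f"95% confidence interval: {low:.3f} to {high:.3f}")
```

With a sample of 1,000, the margin of error is about three percentage points regardless of how large the population is, provided the sample really is representative.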

A recent article on FiveThirtyEight.com, “Are Bad Pollsters Copying Good Pollsters,” does a nice job of explaining some of the nuts and bolts of political polling in an age when selected participants are increasingly unlikely to participate… and also raises the specter of how pollsters using nontraditional methods might be consciously or subconsciously cheating. A sample (pun intended) from the article:

What’s a nontraditional poll? One that doesn’t abide by the industry’s best practices. So, a survey is nontraditional if it:

doesn’t follow probability sampling;

doesn’t use live interviewers;

is released by a campaign or campaign groups (because these only selectively release data);

Princeton University graduate student Steven Rogers and Vanderbilt University professor of political science Joshua Clinton [studied] interactive voice response (IVR) surveys in the 2012 Republican presidential primary. (IVR pollsters are in our nontraditional group.) Rogers and Clinton found that IVR pollsters were about as accurate as live-interview pollsters in races where live-interview pollsters surveyed the electorate. IVR pollsters were considerably less accurate when no live-interview poll was conducted. This effect held true even when controlling for a slew of different variables. Rogers and Clinton suggested that the IVR pollsters were taking a “cue” from the live pollsters in order to appear more accurate.

My own analysis hints at the same possibility. The nontraditional pollsters did worse in races without a live pollster.

I’m doing something that I should have done a long time ago: collecting a past series of posts into a single, easy-to-reference post. The following posts formed my series on different ways (correct and incorrect) to solve a two-part probability problem.

Part 1: Two different and correct ways of solving the following problem: “Two cards are dealt from a well-shuffled deck. Find the probability that the first is an ace or the second is an ace.”

Part 2: Two different ways — one correct, one incorrect — of solving the following problem: “Two cards are dealt from a well-shuffled deck. Find the probability that the first is an ace or the second is a jack.”

Part 3: Explaining the incorrect solution, and salvaging the solution to obtain the correct answer.

In yesterday’s post, I gave two solutions that a student gave to the following probability problem. One of the solutions was correct, and one of the solutions was incorrect.

Two cards are dealt from a well-shuffled deck. Find the probability that the first is an ace or the second is a jack.

What follows is the incorrect (but, as we’ll see later, salvageable) solution.

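Method #2 (incorrect): There are three ways that the first card could be an ace or the second card could be a jack:

1. The first card is an ace and the second card is a jack.

2. The first card is an ace and the second card is not a jack.

3. The first card is not an ace and the second card is a jack.

Each of these can be computed using the rule $P(A \cap B) = P(A) P(B \mid A)$ in much the same way as above:

1. P(first an ace) P(second a jack, given first an ace) $= \frac{4}{52} \times \frac{4}{51}$

2. P(first an ace) P(second not a jack, given first an ace) $= \frac{4}{52} \times \frac{47}{51}$

3. P(first not an ace) P(second a jack, given first not an ace) $= \frac{48}{52} \times \frac{4}{51}$

Adding these together, we obtain the answer:

$\frac{4}{52} \times \frac{4}{51} + \frac{4}{52} \times \frac{47}{51} + \frac{48}{52} \times \frac{4}{51} = \frac{396}{2652} = \frac{33}{221}$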

When my student presented this to me, I must admit that it took me a couple of minutes before I found the hole in the student’s logic.

This answer is wrong because the second probability in Case 3 above was not calculated correctly. If we only know that the first card is not an ace, then we don’t have enough information to know how many of the remaining 51 cards are jacks: there are 3 if the first card was a jack and 4 otherwise. So the conditional probability is incorrect in Case 3.
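The size of the error can be checked by brute force. The sketch below is an illustration rather than part of the student's work. It models the deck by rank only, enumerates every ordered two-card deal whose first card is not an ace, and compares the true conditional probability with the value 4/51 that the flawed Case 3 implicitly assumes.

```python
from fractions import Fraction
from itertools import permutations

# Model a deck by rank only: 4 aces, 4 jacks, and 44 other cards.
deck = ["ace"] * 4 + ["jack"] * 4 + ["other"] * 44

# All ordered (first card, second card) deals whose first card is not an ace.
deals = [deal for deal in permutations(deck, 2) if deal[0] != "ace"]
favorable = [deal for deal in deals if deal[1] == "jack"]

p_jack_given_not_ace = Fraction(len(favorable), len(deals))
assumed_value = Fraction(4, 51)

print("true P(second a jack | first not an ace):", p_jack_given_not_ace)
print("value assumed in the flawed Case 3:", assumed_value)
```

The true value comes out slightly smaller than 4/51, because some of the deals that begin with a non-ace begin with a jack and so leave only 3 jacks among the remaining 51 cards.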

Even though the above logic is slightly incorrect, it can be salvaged by splitting Case 3 above into two subcases.

Method #2: There are four ways that the first card could be an ace or the second card could be a jack:

1. The first card is an ace and the second card is a jack.

2. The first card is an ace and the second card is not a jack.

3. The first card is a jack, and the second card is a jack.

4. The first card is neither an ace nor a jack, and the second card is a jack.

Each of these can be computed using the rule $P(A \cap B) = P(A) P(B \mid A)$:

1. P(first an ace) P(second a jack, given first an ace) $= \frac{4}{52} \times \frac{4}{51}$

2. P(first an ace) P(second not a jack, given first an ace) $= \frac{4}{52} \times \frac{47}{51}$

3. P(first a jack) P(second a jack, given first a jack) $= \frac{4}{52} \times \frac{3}{51}$

4. P(first neither an ace nor a jack) P(second a jack, given first neither an ace nor a jack) $= \frac{44}{52} \times \frac{4}{51}$

By splitting the original Case 3 into the new Cases 3 and 4, there is no longer any ambiguity about how many jacks remain when the second card is chosen. Adding these together, we obtain the answer:

$\frac{4}{52} \times \frac{4}{51} + \frac{4}{52} \times \frac{47}{51} + \frac{4}{52} \times \frac{3}{51} + \frac{44}{52} \times \frac{4}{51} = \frac{392}{2652} = \frac{98}{663}$

Not surprisingly, this matches the answer obtained when the formula $P(A \cup B) = P(A) + P(B) - P(A \cap B)$ was employed in yesterday’s post.
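For anyone who would like to double-check the arithmetic, here is a short sketch using exact fractions. It adds up the four cases of the salvaged method and, as an independent check, enumerates every ordered two-card deal. The deck is modeled by rank only, and the variable names are chosen for this illustration.

```python
from fractions import Fraction
from itertools import permutations

# The four cases of the salvaged method, each as P(first) * P(second | first).
cases = [
    Fraction(4, 52) * Fraction(4, 51),    # first an ace, second a jack
    Fraction(4, 52) * Fraction(47, 51),   # first an ace, second not a jack
    Fraction(4, 52) * Fraction(3, 51),    # first a jack, second a jack
    Fraction(44, 52) * Fraction(4, 51),   # first neither, second a jack
]
by_cases = sum(cases)

# Independent check: enumerate every ordered two-card deal, tracking ranks only.
deck = ["ace"] * 4 + ["jack"] * 4 + ["other"] * 44
deals = list(permutations(deck, 2))
hits = [deal for deal in deals if deal[0] == "ace" or deal[1] == "jack"]
by_enumeration = Fraction(len(hits), len(deals))

print("sum of the four cases:", by_cases)
print("brute-force enumeration:", by_enumeration)
```

Both computations agree, which is some reassurance that the four cases are mutually exclusive and together cover the event.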

Here is a standard problem that could appear in an elementary probability class.

Two cards are dealt from a well-shuffled deck. Find the probability that the first is an ace or the second is a jack.

In yesterday’s post, I considered two different ways of solving a similar-looking problem, except the final word jack was replaced by ace. Yesterday, I showed that there were two legitimate ways of solving that problem, resulting (of course) in the same answer.

About a year ago, a student came into my office using these two different techniques to solve the ace/jack problem. Except she arrived at two different answers!

Method #1: One law for probability states that

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Another law of probability states that

$P(A \cap B) = P(A) P(B \mid A)$.

Combining these, we find that

$P(A \cup B) = P(A) + P(B) - P(A) P(B \mid A)$.

Written more colloquially,

P(first an ace or second a jack)

= P(first an ace) + P(second a jack) – P(first an ace AND second a jack)

= P(first an ace) + P(second a jack) – P(first an ace) P(second a jack, given first an ace)

Let’s look at these three probabilities on the last line separately.

1. P(first an ace) is $\frac{4}{52}$.

2. P(second a jack) is also $\frac{4}{52}$. No information about the first card appears between the two parentheses, and so this is similar to pulling a card out of the middle of a deck. Since no information is given about the preceding card(s), the answer is still $\frac{4}{52}$.

3. P(second a jack, given first an ace) is different than the answer to #2 above. For this problem, the first card is known to be an ace, and the question is, given this knowledge, what is the probability that the second card is a jack? Since the first card is known to be an ace, there are still 4 jacks left out of 51 possible cards. Therefore, the answer is $\frac{4}{51}$.

Putting these together, we find the final solution of

$\frac{4}{52} + \frac{4}{52} - \frac{4}{52} \times \frac{4}{51} = \frac{392}{2652} = \frac{98}{663}$
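Method #1 can be checked with exact fractions. The sketch below also verifies the step that tends to surprise students, namely that P(second a jack) equals 4/52 when nothing is known about the first card. It does so by conditioning on whether the first card is a jack. The variable names are chosen for this illustration.

```python
from fractions import Fraction

# P(second a jack) via the law of total probability, conditioning on the first card.
p_second_jack = (
    Fraction(4, 52) * Fraction(3, 51)      # first card is a jack
    + Fraction(48, 52) * Fraction(4, 51)   # first card is not a jack
)

p_first_ace = Fraction(4, 52)
p_second_jack_given_first_ace = Fraction(4, 51)

# Inclusion-exclusion, with the multiplication rule supplying the overlap.
p_union = p_first_ace + p_second_jack - p_first_ace * p_second_jack_given_first_ace

print("P(second a jack):", p_second_jack)
print("P(first an ace or second a jack):", p_union)
```

The first printed value reduces to the same fraction as 4/52, and the second is the Method #1 answer in lowest terms.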

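Here was the student’s second solution.

Method #2: There are three ways that the first card could be an ace or the second card could be a jack:

1. The first card is an ace and the second card is a jack.

2. The first card is an ace and the second card is not a jack.

3. The first card is not an ace and the second card is a jack.

Each of these can be computed using the rule $P(A \cap B) = P(A) P(B \mid A)$ in much the same way as above:

1. P(first an ace) P(second a jack, given first an ace) $= \frac{4}{52} \times \frac{4}{51}$

2. P(first an ace) P(second not a jack, given first an ace) $= \frac{4}{52} \times \frac{47}{51}$

3. P(first not an ace) P(second a jack, given first not an ace) $= \frac{48}{52} \times \frac{4}{51}$

Adding these together, we obtain the answer:

$\frac{4}{52} \times \frac{4}{51} + \frac{4}{52} \times \frac{47}{51} + \frac{48}{52} \times \frac{4}{51} = \frac{396}{2652} = \frac{33}{221}$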

So, my student asked me, “Which one is the right answer? And why is the wrong answer wrong?” I must admit that it took me a couple of minutes before I found the student’s mistake. After all, the student’s logic perfectly paralleled the correct logic given in yesterday’s post.

I’ll discuss the mistake in tomorrow’s post. Until then, here’s a green thought cloud so that you also can think about what the student did wrong.

Here is a standard problem that could appear in an elementary probability class.

Two cards are dealt from a well-shuffled deck. Find the probability that the first is an ace or the second is an ace.

There are at least two legitimate ways to solve this problem:

Method #1: One law for probability states that

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Another law of probability states that

$P(A \cap B) = P(A) P(B \mid A)$.

Combining these, we find that

$P(A \cup B) = P(A) + P(B) - P(A) P(B \mid A)$.

Written more colloquially,

P(first an ace or second an ace)

= P(first an ace) + P(second an ace) – P(first an ace AND second an ace)

= P(first an ace) + P(second an ace) – P(first an ace) P(second an ace, given first an ace)

Let’s look at these three probabilities on the last line separately.

1. P(first an ace) is $\frac{4}{52}$.

2. P(second an ace) is also $\frac{4}{52}$. No information about the first card appears between the two parentheses, and so this is similar to pulling a card out of the middle of a deck. Since no information is given about the preceding card(s), the answer is still $\frac{4}{52}$.

3. P(second an ace, given first an ace) is different than the answer to #2 above. For this problem, the first card is known to be an ace, and the question is, given this knowledge, what is the probability that the second card is an ace? Since the first card is known to be an ace, there are only 3 aces left out of 51 possible cards. Therefore, the answer is $\frac{3}{51}$.

Putting these together, we find the final solution of

$\frac{4}{52} + \frac{4}{52} - \frac{4}{52} \times \frac{3}{51} = \frac{396}{2652} = \frac{33}{221}$

Here’s a second legitimate solution, though it does take a little more work.

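Method #2: There are three ways that either the first or second card could be an ace:

1. The first card is an ace and the second card is an ace.

2. The first card is an ace and the second card is not an ace.

3. The first card is not an ace and the second card is an ace.

Each of these can be computed using the rule $P(A \cap B) = P(A) P(B \mid A)$ in much the same way as above:

1. P(first an ace) P(second an ace, given first an ace) $= \frac{4}{52} \times \frac{3}{51}$

2. P(first an ace) P(second not an ace, given first an ace) $= \frac{4}{52} \times \frac{48}{51}$

3. P(first not an ace) P(second an ace, given first not an ace) $= \frac{48}{52} \times \frac{4}{51}$

Adding these together, we obtain the answer:

$\frac{4}{52} \times \frac{3}{51} + \frac{4}{52} \times \frac{48}{51} + \frac{48}{52} \times \frac{4}{51} = \frac{396}{2652} = \frac{33}{221}$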

Not surprisingly, the two answers are the same.
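A third check, offered here only as a sketch, is to enumerate all ordered two-card deals and count those in which the first or the second card is an ace. The complement rule gives one more confirmation. Modeling the deck by rank only, and the variable names, are choices made for this illustration.

```python
from fractions import Fraction
from itertools import permutations

# Model a deck by rank only: 4 aces and 48 other cards.
deck = ["ace"] * 4 + ["other"] * 48

# Every ordered (first card, second card) deal; positions in the deck are distinct.
deals = list(permutations(deck, 2))
hits = [deal for deal in deals if deal[0] == "ace" or deal[1] == "ace"]
by_enumeration = Fraction(len(hits), len(deals))

# Complement rule: 1 - P(neither card is an ace).
by_complement = 1 - Fraction(48, 52) * Fraction(47, 51)

print("brute-force enumeration:", by_enumeration)
print("complement rule:", by_complement)
```

Both values agree with the 33/221 that the two methods give when their pieces are added up with 52 cards, 4 aces, and 51 cards remaining for the second draw.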

In tomorrow’s post, I’ll describe the time that a student came to me with a similar-looking probability problem, but she obtained two different answers using these two different methods.