# Facebook Birthday Problem: Part 5

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let $I_k$ be an indicator random variable for “no friend has a birthday on day $k$,” where $k = 366$ stands for February 29 and $k = 1, \dots, 365$ stand for the “usual” 365 days of the year. Therefore, the quantity $N$, representing the number of days of the year on which no friend has a birthday, can be written as

$N = I_1 + \dots + I_{365} + I_{366}$

In yesterday’s post, I began the calculation of the standard deviation of $N$ by first computing its variance. This calculation is complicated by the fact that $I_1, \dots, I_{366}$ are dependent. Yesterday, I showed that

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle (365)(364) \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366})$.

To complete this calculation, I’ll now find $\hbox{Cov}(I_k,I_{366})$, where $1 \le k \le 365$. I’ll use the usual computation formula for a covariance,

$\hbox{Cov}(I_k,I_{366}) = E(I_k I_{366}) - E(I_k) E(I_{366})$.

We have calculated $E(I_k)$ earlier in this series. In any four-year span, there are $4 \times 365 + 1 = 1461$ days, of which only one is February 29. Assuming that birthdays are evenly distributed (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day $k$ is

$\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461}$,

so that the probability that no friend has a birthday on day $k$ is

$\displaystyle \left( \frac{1457}{1461} \right)^n$.

Therefore, since the expected value of an indicator random variable is the probability that the event happens, we have

$E(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n$

for $k = 1, \dots, 365$. Similarly,

$E(I_{366}) = \displaystyle \left( \frac{1460}{1461} \right)^n$,

so that

$\hbox{Cov}(I_k,I_{366}) = E(I_k I_{366}) - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1460}{1461} \right)^n$.

To find $E(I_k I_{366})$, we note that since $I_k$ is equal to either 0 or 1 and $I_{366}$ is equal to either 0 or 1, the product $I_k I_{366}$ can only equal 0 or 1 as well. Therefore, $I_k I_{366}$ is itself an indicator random variable. Furthermore, $I_k I_{366} = 1$ if and only if $I_k = 1$ and $I_{366} = 1$, which means that no friend has a birthday on either day $k$ or day $366$ (that is, February 29). The chance that someone doesn’t have a birthday on day $k$ or February 29 is

$\displaystyle 1 - \frac{4}{1461} - \frac{1}{1461} = \displaystyle \frac{1456}{1461}$,

so that the probability that no friend has a birthday on day $k$ or February 29 is

$\displaystyle \left( \frac{1456}{1461} \right)^n$.

Therefore, as before,

$E(I_k I_{366}) = \displaystyle \left( \frac{1456}{1461} \right)^n$,

so that

$\hbox{Cov}(I_k,I_{366}) = \displaystyle \left( \frac{1456}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1460}{1461} \right)^n$.
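As a sanity check on this covariance, here is a minimal Python sketch using exact fractions. The function name `cov_day_feb29` is my own label for illustration. With a single friend ($n = 1$), the birthday can’t fall on both day $k$ and February 29, so the covariance should reduce to $-(4/1461)(1/1461)$.

```python
from fractions import Fraction


def cov_day_feb29(n):
    """Cov(I_k, I_366) for an ordinary day k and n friends, as an exact fraction."""
    neither = Fraction(1456, 1461) ** n      # E(I_k I_366)
    not_day_k = Fraction(1457, 1461) ** n    # E(I_k)
    not_feb29 = Fraction(1460, 1461) ** n    # E(I_366)
    return neither - not_day_k * not_feb29


# One friend: the covariance should equal -(4/1461)(1/1461).
print(cov_day_feb29(1) == -Fraction(4, 1461) * Fraction(1, 1461))

# The covariance is negative for each n checked here.
print(all(cov_day_feb29(n) < 0 for n in range(1, 201)))
```

The negative sign makes intuitive sense: knowing that nobody has a birthday on day $k$ leaves slightly more room for somebody to have a birthday on February 29.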

Therefore,

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle (365)(364) \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2(365) \left[ \left( \frac{1456}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1460}{1461} \right)^n \right]$,

and we find the standard deviation of $N$ using

$\hbox{SD}(N) = \sqrt{\hbox{Var}(N)}$.

The graph below shows the expected value of $N$, which was shown earlier to be

$E(N) = 365 \displaystyle \left( \frac{1457}{1461} \right)^n + \left( \frac{1460}{1461} \right)^n$,

along with error bars representing two standard deviations.

Interestingly, the standard deviation of $N$ changes for different values of $n$; a direct calculation shows that $\hbox{SD}(N)$ is maximized at $n = 459$ with a maximum value of approximately $6.1$. Accordingly, for $n = 450$ and $n = 500$, the error bars in the above figure have a total width of approximately 24 days (two standard deviations both above and below the expected value).
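For readers who want to reproduce the direct calculation, here is a minimal Python sketch of the formulas for $E(N)$ and $\hbox{Var}(N)$ above. The function names and the search range of 1 to 1000 friends are my own choices for illustration. The variance is clamped at zero before taking the square root because, for very small $n$, floating-point rounding can push a variance that is essentially zero slightly below zero.

```python
from math import sqrt


def expected_n(n):
    p = (1457 / 1461) ** n   # P(no friend has a birthday on an ordinary day k)
    q = (1460 / 1461) ** n   # P(no friend has a birthday on February 29)
    return 365 * p + q


def variance_n(n):
    p = (1457 / 1461) ** n
    q = (1460 / 1461) ** n
    pair = (1453 / 1461) ** n    # E(I_j I_k) for two ordinary days
    leap = (1456 / 1461) ** n    # E(I_k I_366)
    return (365 * p * (1 - p) + q * (1 - q)
            + 365 * 364 * (pair - p * p)
            + 2 * 365 * (leap - p * q))


def sd_n(n):
    # Clamp at zero to guard against rounding error when the variance is tiny.
    return sqrt(max(variance_n(n), 0.0))


# Search for the number of friends that maximizes SD(N).
best = max(range(1, 1001), key=sd_n)
print("SD(N) is largest at n =", best, "with value", round(sd_n(best), 2))

# Expected value and total error-bar width (four standard deviations).
for n in (450, 500):
    print(n, round(expected_n(n), 1), round(4 * sd_n(n), 1))
```

The printed values can be compared against the maximizing $n$, the maximum standard deviation, and the error-bar width quoted above.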

# Facebook Birthday Problem: Part 4

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let $I_k$ be an indicator random variable for “no friend has a birthday on day $k$,” where $k = 366$ stands for February 29 and $k = 1, \dots, 365$ stand for the “usual” 365 days of the year. Therefore, the quantity $N$, representing the number of days of the year on which no friend has a birthday, can be written as

$N = I_1 + \dots + I_{365} + I_{366}$

In yesterday’s post, I began the calculation of the standard deviation of $N$ by first computing its variance. This calculation is complicated by the fact that $I_1, \dots, I_{366}$ are dependent. Yesterday, I showed that

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})$

To complete this calculation, I’ll now find the covariances. I’ll begin with $\hbox{Cov}(I_j,I_k)$ if $1 \le j < k \le 365$; that is, if $j$ and $k$ are days other than February 29. I’ll use the usual computation formula for a covariance,

$\hbox{Cov}(I_j,I_k) = E(I_j I_k) - E(I_j) E(I_k)$.

We have calculated $E(I_k)$ earlier in this series. In any four-year span, there are $4 \times 365 + 1 = 1461$ days, of which only one is February 29. Assuming that birthdays are evenly distributed (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day $k$ is

$\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461}$,

so that the probability that no friend has a birthday on day $k$ is

$\displaystyle \left( \frac{1457}{1461} \right)^n$.

Therefore, since the expected value of an indicator random variable is the probability that the event happens, we have

$E(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n$

for $k = 1, \dots, 365$. Therefore,

$\hbox{Cov}(I_j,I_k) = E(I_j I_k) - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1457}{1461} \right)^n = E(I_j I_k) - \displaystyle \left( \frac{1457}{1461} \right)^{2n}$.

To find $E(I_j I_k)$, we note that since $I_j$ is equal to either 0 or 1 and $I_k$ is equal to either 0 or 1, the product $I_j I_k$ can only equal 0 or 1 as well. Therefore, $I_j I_k$ is itself an indicator random variable, which I’ll call $I_{jk}$. Furthermore, $I_{jk} = 1$ if and only if $I_j = 1$ and $I_k = 1$, which means that no friend has a birthday on either day $j$ or day $k$. The chance that someone doesn’t have a birthday on day $j$ or day $k$ is

$\displaystyle 1 - \frac{4}{1461} - \frac{4}{1461} = \displaystyle \frac{1453}{1461}$,

so that the probability that no friend has a birthday on day $j$ or $k$ is

$\displaystyle \left( \frac{1453}{1461} \right)^n$.

Therefore, as before,

$E(I_j I_k) = \displaystyle \left( \frac{1453}{1461} \right)^n$,

so that

$\hbox{Cov}(I_j,I_k) = \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n}$.
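As a quick sanity check on this formula, consider the special case $n = 1$ of a single friend. Since $1453 = 1457 - 4$ and $1461 = 1457 + 4$, we have $(1453)(1461) = 1457^2 - 16$, so that

$\hbox{Cov}(I_j,I_k) = \displaystyle \frac{1453}{1461} - \left( \frac{1457}{1461} \right)^2 = \frac{(1453)(1461) - 1457^2}{1461^2} = -\frac{16}{1461^2} = -\left( \frac{4}{1461} \right)^2$.

This is what we should expect. With one friend, $I_j = 1 - A_j$, where $A_j$ indicates that the friend’s birthday is on day $j$ (the symbol $A_j$ is introduced only for this check). The birthday can’t fall on both days, so $E(A_j A_k) = 0$ and

$\hbox{Cov}(I_j,I_k) = \hbox{Cov}(A_j,A_k) = 0 - E(A_j) E(A_k) = -\displaystyle \left( \frac{4}{1461} \right)^2$.

The same arithmetic shows that $\displaystyle \frac{1453}{1461} < \left( \frac{1457}{1461} \right)^2$, so this covariance is negative for every $n \ge 1$.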

Since there are $\displaystyle {365 \choose 2} = \displaystyle \frac{365\times 364}{2}$ pairs $(j,k)$ such that $1 \le j < k \le 365$, we have

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle 2 \times \displaystyle \frac{365\times 364}{2} \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366})$,

or

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle (365)(364) \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366})$.

The calculation of $\hbox{Cov}(I_k,I_{366})$ is similar to the above calculation; I’ll write this up in tomorrow’s post.

# Facebook Birthday Problem: Part 3

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let $I_k$ be an indicator random variable for “no friend has a birthday on day $k$,” where $k = 366$ stands for February 29 and $k = 1, \dots, 365$ stand for the “usual” 365 days of the year. Therefore, the quantity $N$, representing the number of days of the year on which no friend has a birthday, can be written as

$N = I_1 + \dots + I_{365} + I_{366}$

In yesterday’s post, I showed that

$E(N) = E(I_1) + \dots + E(I_{365}) + E(I_{366}) = 365 \displaystyle \left( \frac{1457}{1461} \right)^n + \left( \frac{1460}{1461} \right)^n$.

The calculation of the standard deviation of $N$ is considerably more complicated, however, since the $I_1, \dots, I_{366}$ are dependent. So we will begin by computing the variance of $N$:

$\hbox{Var}(N) = \displaystyle \sum_{k=1}^{366} \hbox{Var}(I_k) + 2 \!\!\!\!\! \sum_{1 \le j < k \le 366} \!\!\!\!\! \hbox{Cov}(I_j,I_k)$,

or

$\hbox{Var}(N) = \displaystyle \sum_{k=1}^{365} \hbox{Var}(I_k) + \hbox{Var}(I_{366}) + 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})$

For the first term, we recognize that, in any four-year span, there are $4 \times 365 + 1 = 1461$ days, of which only one is February 29. Assuming that birthdays are evenly distributed (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day $k$ is

$\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461}$.

Therefore, the chance that none of the $n$ friends has a birthday on day $k$ is

$\displaystyle \left( \frac{1457}{1461} \right)^n$.

Using the formula $\hbox{Var}(I) = p(1-p)$ for the variance of an indicator random variable, we see that

$\hbox{Var}(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right]$

for $k = 1, \dots, 365$. Similarly, for the second term,

$\hbox{Var}(I_{366}) = \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

Therefore, so far we have shown that

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})$
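Before tackling the covariances, a simulation offers a rough check of the model. The sketch below is only an illustration: the function name `simulate_empty_days`, the choice of $n = 100$ friends, the 2000 trials, and the random seed are all my own assumptions. Each ordinary day gets weight 4 and February 29 gets weight 1, matching the 1461-day cycle used above.

```python
import random


def simulate_empty_days(n, trials, seed=0):
    """Simulate N (the number of days with no birthday) for groups of n friends."""
    rng = random.Random(seed)
    days = list(range(1, 367))       # day 366 stands for February 29
    weights = [4] * 365 + [1]        # 1461 equally likely slots in four years
    results = []
    for _ in range(trials):
        birthdays = rng.choices(days, weights=weights, k=n)
        results.append(366 - len(set(birthdays)))
    return results


def expected_n(n):
    """E(N) from the formula derived earlier in this series."""
    return 365 * (1457 / 1461) ** n + (1460 / 1461) ** n


n = 100
sample = simulate_empty_days(n, trials=2000)
sample_mean = sum(sample) / len(sample)
sample_var = sum((x - sample_mean) ** 2 for x in sample) / (len(sample) - 1)
print("simulated mean:", sample_mean, "formula:", expected_n(n))
print("simulated variance:", sample_var)
```

The simulated mean should land close to the formula for $E(N)$, and the simulated variance gives a target for the finished formula for $\hbox{Var}(N)$ to hit.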

In tomorrow’s post, I’ll complete this calculation by finding the covariances.

# My Favorite One-Liners: Part 100

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

Today’s quip is one that I’ll use surprisingly often:

If you ever meet a mathematician at a bar, ask him or her, “What is your favorite application of the Cauchy-Schwarz inequality?”

The point is that the Cauchy-Schwarz inequality arises surprisingly often in the undergraduate mathematics curriculum, and so I make a point to highlight it when I use it. For example, off the top of my head:

1. In trigonometry, the Cauchy-Schwarz inequality states that

$|{\bf u} \cdot {\bf v}| \le \; \parallel \!\! {\bf u} \!\! \parallel \cdot \parallel \!\! {\bf v} \!\! \parallel$

for all vectors ${\bf u}$ and ${\bf v}$. Consequently,

$-1 \le \displaystyle \frac{ {\bf u} \cdot {\bf v} } {\parallel \!\! {\bf u} \!\! \parallel \cdot \parallel \!\! {\bf v} \!\! \parallel} \le 1$,

which means that the angle

$\theta = \cos^{-1} \left( \displaystyle \frac{ {\bf u} \cdot {\bf v} } {\parallel \!\! {\bf u} \!\! \parallel \cdot \parallel \!\! {\bf v} \!\! \parallel} \right)$

is defined. This is the measure of the angle between the two vectors ${\bf u}$ and ${\bf v}$.

2. In probability and statistics, the standard deviation of a random variable $X$ is defined as

$\hbox{SD}(X) = \sqrt{E(X^2) - [E(X)]^2}$.

The Cauchy-Schwarz inequality ensures that the quantity under the square root is nonnegative, so that the standard deviation is actually defined. Also, the Cauchy-Schwarz inequality can be used to show that $\hbox{SD}(X) = 0$ implies that $X$ is a constant almost surely.

3. Also in probability and statistics, the correlation between two random variables $X$ and $Y$ must satisfy

$-1 \le \hbox{Corr}(X,Y) \le 1$.

Furthermore, if $\hbox{Corr}(X,Y)=1$, then $Y= aX +b$ for some constants $a$ and $b$, where $a > 0$. On the other hand, if $\hbox{Corr}(X,Y)=-1$, then $Y= aX +b$ for some constants $a$ and $b$, where $a < 0$.
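The first application in the list can be sketched numerically. This is a minimal Python illustration with helper names (`dot`, `norm`, `angle_between`) and sample vectors of my own choosing. In exact arithmetic the Cauchy-Schwarz inequality guarantees that the ratio passed to the inverse cosine lies between $-1$ and $1$; the clamp below only absorbs floating-point rounding.

```python
from math import acos, degrees, sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return sqrt(dot(u, u))


def angle_between(u, v):
    """Angle in radians between two nonzero vectors u and v."""
    cosine = dot(u, v) / (norm(u) * norm(v))
    # Cauchy-Schwarz puts the exact ratio in [-1, 1]; clamp to absorb rounding.
    cosine = max(-1.0, min(1.0, cosine))
    return acos(cosine)


u, v = (1.0, 0.0), (1.0, 1.0)
print(abs(dot(u, v)) <= norm(u) * norm(v))   # the inequality itself
print(degrees(angle_between(u, v)))          # angle between u and v, in degrees
```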

Since I’m a mathematician, I guess my favorite application of the Cauchy-Schwarz inequality appears in my first professional article, where the inequality was used to confirm some new bounds that I derived with my graduate adviser.

# My Favorite One-Liners: Part 33

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

Perhaps one of the more difficult things that I try to instill in my students is numeracy, or a sense of feeling if an answer to a calculation is plausible. As an initial step toward this goal, I’ll try to teach my students some basic pointers about whether an answer is even possible.

For example, when calculating a standard deviation, students have to compute $E(X)$ and $E(X^2)$:

$E(X) = \sum x p(x) \qquad \hbox{or} \qquad E(X) = \int_a^b x f(x) \, dx$

$E(X^2) = \sum x^2 p(x) \qquad \hbox{or} \qquad E(X^2) = \int_a^b x^2 f(x) \, dx$

After these are computed — which could take some time — the variance is then calculated:

$\hbox{Var}(X) = E(X^2) - [E(X)]^2$.

Finally, the standard deviation is found by taking the square root of the variance.
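The discrete version of this recipe can be sketched in a few lines of Python. The function name `mean_and_variance` and the fair-die example are my own illustrations, and exact fractions are used so that rounding can’t blur the check for a negative variance.

```python
from fractions import Fraction
from math import sqrt


def mean_and_variance(pmf):
    """pmf maps each value x to its probability p(x); returns (E(X), Var(X))."""
    ex = sum(x * p for x, p in pmf.items())        # E(X)
    ex2 = sum(x * x * p for x, p in pmf.items())   # E(X^2)
    variance = ex2 - ex ** 2
    if variance < 0:
        # A negative variance is impossible, so some input must be wrong.
        raise ValueError("negative variance: check the probabilities and sums")
    return ex, variance


# Illustrative example: one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean, variance = mean_and_variance(die)
print(mean, variance, sqrt(variance))
```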

So, I’ll ask my students, what do you do if you calculate the variance and it’s negative, so that it’s impossible to take the square root? After a minute of students hemming and hawing, I’ll tell them emphatically what they should do:

It’s wrong… do it again.

The same principle applies when computing probabilities, which always have to be between 0 and 1. So, if ever a student computes a probability that’s either negative or else greater than 1, they can be assured that the answer is wrong and that there’s a mistake someplace in their computation that needs to be found.

# My Favorite One-Liners: Part 15

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them. Let me describe a one-liner that I’ll use when I want my class to figure out a pattern, thus developing a theorem by inductive logic rather than deductive logic.

Today’s one-liner is easily stated: “Gosh, I’ve seen that somewhere before.”

For example, in my statistics class, here’s the very first illustration that I show to demonstrate how to compute a standard deviation:

Find the standard deviation of the following data set: 1, 4, 6, 7, 8, 10.

The first step is finding the average:

$\overline{x} = \displaystyle \frac{1+4+6+7+8+10}{6} = 6$.

We then find the deviations from average by subtracting 6 from all of the original data values:

Deviations from average = -5, -2, 0, 1, 2, 4

With these numbers, we can compute the standard deviation:

$s = \displaystyle \sqrt {\frac{ (-5)^2 + (-2)^2 + 0^2 + 1^2 + 2^2 + 4^2}{5} } = \sqrt{10}$.

After asking if there are any questions of clarification about the nuts and bolts of this calculation, I’ll proceed to the next example:

Find the standard deviation of the following data set: 5, 8, 10, 11, 12, 14.

The first step is finding the average:

$\overline{x} = \displaystyle \frac{5+8+10+11+12+14}{6} = 10$.

We then find the deviations from average by subtracting 10 from all of the original data values and then constructing the square root as before:

$s = \displaystyle \sqrt {\frac{ (-5)^2 + (-2)^2 + 0^2 + 1^2 + 2^2 + 4^2}{5} } = \sqrt{10}$.

Then, in a loud obvious voice, I’ll declare, “Gosh, I’ve seen that somewhere before” and then wait a few seconds for the answer. Students can obviously see that the two answers are the same — which gets them thinking about why that happened.

The real conceptual question that I want my students to figure out is why the two answers are the same. Eventually, someone will come up with the correct answer: the second data set was made by adding 4 to all the values in the first data set, which may change the average but does not change how spread out the numbers are… so the standard deviation should be unchanged.
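The two examples can be checked with Python’s standard library. This is just a quick sketch; `statistics.stdev` computes the sample standard deviation (dividing by one less than the number of data values), which matches the calculation above.

```python
from statistics import mean, stdev

first = [1, 4, 6, 7, 8, 10]
second = [x + 4 for x in first]   # 5, 8, 10, 11, 12, 14

# Shifting every value by 4 moves the mean but leaves the spread alone.
for data in (first, second):
    print(data, mean(data), stdev(data))
```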

I love the “Gosh, I’ve seen that somewhere before” line after a couple of carefully chosen examples, as it cues my class that they really need to think a little harder than the dull and mechanical operations toward a deeper conceptual understanding of what’s really happening.

# Engaging students: Standard Deviation

In my capstone class for future secondary math teachers, I ask my students to come up with ideas for engaging their students with different topics in the secondary mathematics curriculum. In other words, the point of the assignment was not to devise a full-blown lesson plan on this topic. Instead, I asked my students to think about three different ways of getting their students interested in the topic in the first place.

I plan to share some of the best of these ideas on this blog (after asking my students’ permission, of course).

This student submission comes from my former student Jillian Greene. Her topic, from statistics: standard deviations.

How could you as a teacher create an activity or project that involves your topic?

An activity that I’ve seen presented to introduce the idea of standard deviation requires students to explore the information given to them before actually being taught the math behind standard deviation. As the students settle into their seats, prompt them to work with their shoulder partner and help to measure the width of their left thumbnail (or length of their pointer finger, width of their hand, etc.) and write it on a sticky note. Once the data is collected, the students will calculate the mean of all of the measurements. The mean is then written on the board in the center, and the students are asked to go and stack their sticky notes in either the center if they are perfectly the mean, or on the right or left if it’s bigger or smaller, respectively. Have them find the mean of the signed distances of each measurement from the mean. When they discover this should be zero, have them discuss with each other and then in the big group what that means. If time permits, it might even be fun to ask deeper understanding questions like what would happen if everyone lost half of their thumbnail, or what if just Student A’s thumbnail tripled in size. This will provide a meaningful segue into the sometimes confusing world of standard deviation and distances from the mean.

How has this topic appeared in pop culture (movies, TV, current music, video games, etc.)?

http://www.dailymotion.com/video/x3lc0rx

This is a full episode of Everybody Loves Raymond but the clip in reference starts at about 8:45 and lasts a minute or so.

This clip shows a scenario where the couple, Ray and Deborah, is comparing their scores on an IQ test (a very common use for standard deviation). Deborah comments on how her score is very close to Ray’s, being only 15 points higher. The brother who proctored the exam corrects her by saying that 15 points is a standard deviation higher and puts her in a “whole new class” of genius. Have students discuss and explain what it means for Deborah to be one standard deviation higher. Use the information given in the episode (100 is average, 115 is one standard deviation higher) to construct the bell curve for IQ scores. Then use the bell curve to introduce percentiles. Since Ray is the average, center-of-the-bell score, he is in the 50th percentile. The students can then attempt to discover on their own (or with a group) what percentile Deborah’s score puts her in.
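For a teacher who wants to check the students’ answers, here is a minimal Python sketch. It assumes that IQ scores follow a normal bell curve with mean 100 and standard deviation 15, using the numbers from the episode; treating Ray’s score as exactly 100 is likewise an assumption taken from the description above.

```python
from statistics import NormalDist

# Assumed model: IQ scores are normal with mean 100 and standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

for name, score in (("Ray", 100), ("Deborah", 115)):
    percentile = 100 * iq.cdf(score)   # percent of scores at or below this one
    print(name, score, round(percentile, 1))
```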

How can this topic be used in your students’ future courses in mathematics or science?

Standard deviation is a topic that pervades almost all sciences. In biology classes, students are asked to study the weather and climate of various habitats. In differentiating between the two, one must look at the overall picture. If the student is presented with the information that place A and place B both have average temperatures of 60 degrees, this information might not be good to take at face value. Place A might have a range from 40 to 80 degrees throughout the year, while place B might range from 0 to 100 and then have one or two extremely hot outliers that even out the average to 60. Looking at not only the skew of the bell curve, but also what the standard deviation is for each place, might save a student from forgetting to bring a fan to hypothetical place B, or writing that the climate of that place is cool year round. In addition to biology, standard deviation is a very necessary operation in psychology, which is a very statistics-based science. This can easily be seen in how IQ scores are represented, as we found earlier!

# Deceiving with Statistics

I really enjoyed a recent Math With Bad Drawings post on how descriptive statistics can be used to deceive. For example:

See the rest of the post for similar pictures for mean, median, mode, and variance (equivalent to standard deviation); I’ll be using these in my future statistics classes.

# Why Not to Trust Statistics

Math with Bad Drawings has an excellent post on how the blind use of descriptive statistics can be deceptive. I’ll definitely be sharing a few of these with my students. Here’s one of several examples; I recommend reading the whole thing.