Facebook Birthday Problem: Part 3

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to any of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let I_k be an indicator random variable for the event “no friend has a birthday on day k,” where k = 366 stands for February 29 and k = 1, \dots, 365 stand for the “usual” 365 days of the year. Therefore, the quantity N, representing the number of days of the year on which no friend has a birthday, can be written as

N = I_1 + \dots + I_{365} + I_{366}

In yesterday’s post, I showed that

E(N) = E(I_1) + \dots + E(I_{365}) + E(I_{366}) = 365 \displaystyle \left( \frac{1457}{1461} \right)^n + \left( \frac{1460}{1461} \right)^n.
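To get a feel for this formula, here is a minimal Python sketch that evaluates it exactly with rational arithmetic. The function name `expected_empty_days` is my own choice for illustration and does not come from the earlier posts.

```python
from fractions import Fraction


def expected_empty_days(n: int) -> float:
    """Evaluate E(N) = 365 (1457/1461)^n + (1460/1461)^n for n friends."""
    # Chance that none of the n friends has a birthday on a given usual day.
    p_regular = Fraction(1457, 1461) ** n
    # Chance that none of the n friends has a birthday on February 29.
    p_leap = Fraction(1460, 1461) ** n
    return float(365 * p_regular + p_leap)


print(expected_empty_days(0))  # 366.0: with no friends, every day is empty
print(expected_empty_days(1))  # 365.0: one friend occupies exactly one day
```

The case n = 1 is a handy sanity check: a single friend always uses up exactly one of the 366 days, so the expected value must be exactly 365.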

The calculation of the standard deviation of N is considerably more complicated, however, since the I_1, \dots, I_{366} are dependent. So we will begin by computing the variance of N:

\hbox{Var}(N) = \displaystyle \sum_{k=1}^{366} \hbox{Var}(I_k) + 2 \!\!\!\!\! \sum_{1 \le j < k \le 366} \!\!\!\!\! \hbox{Cov}(I_j,I_k),

or

\hbox{Var}(N) = \displaystyle \sum_{k=1}^{365} \hbox{Var}(I_k) + \hbox{Var}(I_{366}) + 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})
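This decomposition can be checked by brute force on a toy example. The sketch below is not the real calendar: it assumes a hypothetical three-day “year” with unequal day probabilities and a small number of friends, enumerates every possible assignment of birthdays, and compares the variance of N computed directly with the sum of variances plus twice the sum of covariances. The function name `variance_two_ways` is made up for this illustration.

```python
from fractions import Fraction
from itertools import product


def variance_two_ways(day_probs, n):
    """Return (Var(N) computed directly, Var(N) via variances and covariances)."""
    days = range(len(day_probs))
    # Enumerate every assignment of birthdays to the n friends with its probability.
    outcomes = []
    for birthdays in product(days, repeat=n):
        prob = Fraction(1)
        for d in birthdays:
            prob *= day_probs[d]
        # indicators[k] is 1 exactly when no friend has a birthday on day k.
        indicators = [0 if k in birthdays else 1 for k in days]
        outcomes.append((prob, indicators))

    def expect(f):
        return sum(p * f(ind) for p, ind in outcomes)

    # Direct computation: Var(N) = E(N^2) - E(N)^2 with N = sum of the indicators.
    mean_n = expect(lambda ind: sum(ind))
    direct = expect(lambda ind: sum(ind) ** 2) - mean_n ** 2

    # Decomposition: sum of Var(I_k) plus twice the sum of Cov(I_j, I_k) over j < k.
    means = [expect(lambda ind, k=k: ind[k]) for k in days]
    var_sum = sum(m * (1 - m) for m in means)
    cov_sum = sum(
        expect(lambda ind, j=j, k=k: ind[j] * ind[k]) - means[j] * means[k]
        for j in days
        for k in days
        if j < k
    )
    return direct, var_sum + 2 * cov_sum


toy_year = [Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)]
direct, decomposed = variance_two_ways(toy_year, 2)
print(direct == decomposed)  # True: both routes give the same exact fraction
```

Because everything is done with exact fractions, the two answers agree exactly rather than only up to rounding.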

For the first term, we recognize that, in any four-year span, there are 4 \times 365 + 1 = 1461 days, of which only one is February 29, while each of the usual 365 days occurs four times. Assuming that birthdays are evenly distributed over these 1461 days (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day k, for k = 1, \dots, 365, is

\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461}.

Therefore, assuming that the friends’ birthdays are independent of each other, the chance that none of the n friends has a birthday on day k is

\displaystyle \left( \frac{1457}{1461} \right)^n.
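This probability can be spot-checked with a quick simulation. The following is only a rough sketch under the same assumption of birthdays spread evenly over a 1461-day cycle; the function name, the 0-based day numbering (0 through 364 for the usual days, 365 for February 29), the number of trials, and the seed are all my own choices for illustration.

```python
import random


def estimate_no_birthday_on_day(n, day=0, trials=5000, seed=0):
    """Estimate the chance that none of n friends has a birthday on `day`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        empty = True
        for _ in range(n):
            # Pick one of the 1461 days in a four-year cycle uniformly at random.
            r = rng.randrange(1461)
            # The last slot is February 29; the rest cycle through the usual 365 days.
            friend_day = 365 if r == 1460 else r % 365
            if friend_day == day:
                empty = False
                break
        hits += empty
    return hits / trials


n = 50
print(estimate_no_birthday_on_day(n), (1457 / 1461) ** n)
print(estimate_no_birthday_on_day(n, day=365), (1460 / 1461) ** n)
```

With only a few thousand trials, the estimates should land near the exact values but will not match them digit for digit.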

Using the formula \hbox{Var}(I) = p(1-p) for the variance of an indicator random variable, we see that

\hbox{Var}(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right]

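for k = 1, \dots, 365. Similarly, for the second term, February 29 is only one of the 1461 days, so the chance that none of the n friends has a birthday on that day is (1460/1461)^n, and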

\hbox{Var}(I_{366}) = \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]

Therefore, so far we have shown that

\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]

+ \displaystyle 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})
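The two terms found so far are easy to evaluate numerically. Here is a small sketch that does so with exact fractions; note that it computes only the sum of the individual variances, not Var(N) itself, since the covariance sums are still missing. The function name `indicator_variance_terms` is a name I made up for this illustration.

```python
from fractions import Fraction


def indicator_variance_terms(n):
    """Sum of Var(I_k) over all 366 days; the covariance terms are NOT included."""
    p_regular = Fraction(1457, 1461) ** n
    p_leap = Fraction(1460, 1461) ** n
    # Var(I) = p(1 - p) for an indicator: 365 usual days plus February 29.
    return 365 * p_regular * (1 - p_regular) + p_leap * (1 - p_leap)


for n in (10, 100, 1000):
    print(n, float(indicator_variance_terms(n)))
```

Since these numbers leave out the covariances, they should not be read as the variance of N.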

In tomorrow’s post, I’ll complete this calculation by finding the covariances.
