# Facebook Birthday Problem: Part 3

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let $I_k$ be an indicator random variable for “no friend has a birthday on day $k$, where $k = 366$ stands for February 29 and $k = 1, \dots, 365$ stand for the “usual” 365 days of the year. Therefore, the quantity $N$, representing the number of days of the year on which no friend has a birthday, can be written as

$N = I_1 + \dots + I_{365} + I_{366}$

In yesterday’s post, I showed that

$E(N) = E(I_1) + \dots + E(I_{365}) + E(I_{366}) = 365 \displaystyle \left( \frac{1457}{1461} \right)^n + \left( \frac{1460}{1461} \right)^n$.

The calculation of the standard deviation of $N$ is considerably more complicated, however, since the $I_1, \dots, I_{366}$ are dependent. So we will begin by computing the variance of $N$:

$\hbox{Var}(N) = \displaystyle \sum_{k=1}^{366} \hbox{Var}(I_k) + 2 \!\!\!\!\! \sum_{1 \le j < k \le 366} \!\!\!\!\! \hbox{Cov}(I_j,I_k)$,

or

$\hbox{Var}(N) = \displaystyle \sum_{k=1}^{365} \hbox{Var}(I_k) + \hbox{Var}(I_{366}) + 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})$

For the first term, we recognize that, in any four-year span, there are $4 \times 365 + 1 = 1461$ days, of which only one is February 29. Assuming the birthday’s are evenly distributed (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day $k$ is

$\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461}$.

Therefore, the chance that all $n$ friends don’t have a birthday on day $k$ is

$\displaystyle \left( \frac{1457}{1461} \right)^n$.

Using the formula $\hbox{Var}(I) = p(1-p)$ for the variance of an indicator random variable, we see that

$\hbox{Var}(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right]$

for $k = 1, \dots, 365$. Similarly, for the second term,

$\hbox{Var}(I_{366}) = \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

Therefore, so far we have shown that

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})$

In tomorrow’s post, I’ll complete this calculation by finding the covariances.