# Facebook Birthday Problem: Part 4

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let $I_k$ be an indicator random variable for “no friend has a birthday on day $k$, where $k = 366$ stands for February 29 and $k = 1, \dots, 365$ stand for the “usual” 365 days of the year. Therefore, the quantity $N$, representing the number of days of the year on which no friend has a birthday, can be written as

$N = I_1 + \dots + I_{365} + I_{366}$

In yesterday’s post, I began the calculation of the standard deviation of $N$ by first computing its variance. This calculation is complicated by the fact that $I_1, \dots, I_{366}$ are dependent. Yesterday, I showed that

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})$

To complete this calculation, I’ll now find the covariances. I’ll begin with $\hbox{Cov}(I_j,I_k)$ if $1 \le j < k \le 365$; that is, if $j$ and $k$ are days other than February 29. I’ll use the usual computation formula for a covariance,

$\hbox{Cov}(I_j,I_k) = E(I_j I_k) - E(I_j) E(I_k)$.

We have calculated $E(I_k)$ earlier in this series. In any four-year span, there are $4 \times 365 + 1 = 1461$ days, of which only one is February 29. Assuming the birthday’s are evenly distributed (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day $k$ is

$\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461}$,

so that the probability that no friend has a birthday on day $k$ is

$\displaystyle \left( \frac{1457}{1461} \right)^n$.

Therefore, since the expected value of an indicator random variable is the probability that the event happens, we have

$E(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n$

for $k = 1, \dots, 365$. Therefore,

$\hbox{Cov}(I_j,I_k) = E(I_j I_k) - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1457}{1461} \right)^n = E(I_j I_k) - \displaystyle \left( \frac{1457}{1461} \right)^{2n}$.

To find $E(I_j I_k)$, we note that since $I_j$ is equal to either 0 or 1 and $I_k$ is equal to either 0 or 1, the product $I_j I_k$ can only equal 0 and 1 as well. Therefore, $I_j I_k$ is itself an indicator random variable, which I’ll call $I_{jk}$. Furthermore, $I_{jk}$ if and only if $I_j = 1$ and $I_k = 1$, which means that no friends has a birthday on either day $j$ or day $k$. The chance that someone doesn’t have a birthday on day $j$ or day $k$ is

$\displaystyle 1 - \frac{4}{1461} - \frac{4}{1461} = \displaystyle \frac{1453}{1461}$,

so that the probability that no friend has a birthday on day $j$ or $k$ is

$\displaystyle \left( \frac{1453}{1461} \right)^n$.

Therefore, as before,

$E(I_j I_k) = \displaystyle \left( \frac{1453}{1461} \right)^n$,

so that

$\hbox{Cov}(I_j,I_k) = \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n}$.

Since there are $\displaystyle {365 \choose 2} = \displaystyle \frac{365\times 364}{2}$ pairs $(j,k)$ so that $1 \le j < k \le 365$, we have

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle 2 \times \displaystyle \frac{365\times 364}{2} \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366})$,

or

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle (365)(364) \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366})$.

The calculation of $\hbox{Cov}(I_k,I_{366})$ is similar to the above calculation; I’ll write this up in tomorrow’s post.