Facebook Birthday Problem: Part 4

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let I_k be an indicator random variable for “no friend has a birthday on day k, where k = 366 stands for February 29 and k = 1, \dots, 365 stand for the “usual” 365 days of the year. Therefore, the quantity N, representing the number of days of the year on which no friend has a birthday, can be written as

N = I_1 + \dots + I_{365} + I_{366}

In yesterday’s post, I began the calculation of the standard deviation of N by first computing its variance. This calculation is complicated by the fact that I_1, \dots, I_{366} are dependent. Yesterday, I showed that

\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]

+ \displaystyle 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})

To complete this calculation, I’ll now find the covariances. I’ll begin with \hbox{Cov}(I_j,I_k) if 1 \le j < k \le 365; that is, if j and k are days other than February 29. I’ll use the usual computation formula for a covariance,

\hbox{Cov}(I_j,I_k) = E(I_j I_k) - E(I_j) E(I_k).

We have calculated E(I_k) earlier in this series. In any four-year span, there are 4 \times 365 + 1 = 1461 days, of which only one is February 29. Assuming the birthday’s are evenly distributed (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day k is

\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461},

so that the probability that no friend has a birthday on day k is

\displaystyle \left( \frac{1457}{1461} \right)^n.

Therefore, since the expected value of an indicator random variable is the probability that the event happens, we have

E(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n

for k = 1, \dots, 365. Therefore,

\hbox{Cov}(I_j,I_k) = E(I_j I_k) - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1457}{1461} \right)^n = E(I_j I_k) - \displaystyle \left( \frac{1457}{1461} \right)^{2n}.

To find E(I_j I_k), we note that since I_j is equal to either 0 or 1 and I_k is equal to either 0 or 1, the product I_j I_k can only equal 0 and 1 as well. Therefore, I_j I_k is itself an indicator random variable, which I’ll call I_{jk}. Furthermore, I_{jk} if and only if I_j = 1 and I_k = 1, which means that no friends has a birthday on either day j or day k. The chance that someone doesn’t have a birthday on day j or day k is

\displaystyle 1 - \frac{4}{1461} - \frac{4}{1461} = \displaystyle \frac{1453}{1461},

so that the probability that no friend has a birthday on day j or k is

\displaystyle \left( \frac{1453}{1461} \right)^n.

Therefore, as before,

E(I_j I_k) = \displaystyle \left( \frac{1453}{1461} \right)^n,

so that

\hbox{Cov}(I_j,I_k) = \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n}.

Since there are \displaystyle {365 \choose 2} = \displaystyle \frac{365\times 364}{2} pairs (j,k) so that 1 \le j < k \le 365, we have

\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]

+ \displaystyle 2 \times \displaystyle \frac{365\times 364}{2} \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366}),

or

\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]

+ \displaystyle (365)(364) \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366}).

The calculation of \hbox{Cov}(I_k,I_{366}) is similar to the above calculation; I’ll write this up in tomorrow’s post.

Leave a comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: