Recently, I devised the following problem:

Suppose that you have $n$ friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you *not* say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the *Facebook birthday problem* in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let $I_k$ be an indicator random variable for “no friend has a birthday on day $k$,” where $k = 366$ stands for February 29 and $k = 1, \dots, 365$ stand for the “usual” 365 days of the year. Therefore, the quantity $N$, representing the number of days of the year on which no friend has a birthday, can be written as

$$N = I_1 + I_2 + \dots + I_{366}.$$

In yesterday’s post, I began the calculation of the standard deviation of $N$ by first computing its variance. This calculation is complicated by the fact that $I_1, \dots, I_{366}$ are dependent. Yesterday, I showed that

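$$\operatorname{Var}(N) = 365 \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right] + 365 \cdot 364 \left[ \left( \frac{1453}{1461} \right)^n - \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{j=1}^{365} \operatorname{Cov}(I_j, I_{366}).$$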

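To complete this calculation, I’ll now find $\operatorname{Cov}(I_j, I_{366})$, where $1 \le j \le 365$. I’ll use the usual computation formula for a covariance,

$$\operatorname{Cov}(I_j, I_{366}) = E(I_j I_{366}) - E(I_j) E(I_{366}).$$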

We have calculated $E(I_j)$ earlier in this series. In any four-year span, there are $4 \times 365 + 1 = 1461$ days, of which only one is February 29. Assuming that birthdays are evenly distributed over those days (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day $j$ is

$$1 - \frac{4}{1461} = \frac{1457}{1461},$$

so that the probability that no friend has a birthday on day $j$ is

$$\left( \frac{1457}{1461} \right)^n.$$

Therefore, since the expected value of an indicator random variable is the probability that the event happens, we have

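$\displaystyle E(I_j) = \left( \frac{1457}{1461} \right)^n$ for $1 \le j \le 365$. Similarly, since only one of the 1461 days is February 29,

$$E(I_{366}) = \left( \frac{1460}{1461} \right)^n,$$

so that

$$E(I_j) E(I_{366}) = \left( \frac{1457}{1461} \right)^n \left( \frac{1460}{1461} \right)^n.$$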
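
As a quick check of these two expected values, here is a minimal Python sketch that evaluates them with exact fractions. The function names and the sample value `n = 200` are my own choices for illustration; the only modeling assumption is the one above, that birthdays are spread evenly over the 1461 days of a four-year span.

```python
from fractions import Fraction

CYCLE = 4 * 365 + 1  # 1461 days in a four-year span, exactly one of them February 29


def expected_usual_day(n):
    """E(I_j) for 1 <= j <= 365: none of n friends has a birthday on usual day j."""
    return Fraction(CYCLE - 4, CYCLE) ** n


def expected_leap_day(n):
    """E(I_366): none of n friends has a birthday on February 29."""
    return Fraction(CYCLE - 1, CYCLE) ** n


n = 200  # hypothetical number of friends, chosen only for illustration
product = expected_usual_day(n) * expected_leap_day(n)
print(float(expected_usual_day(n)), float(expected_leap_day(n)), float(product))
```

Using `Fraction` keeps the arithmetic exact until the final conversion to a float, which avoids any doubt about rounding when these quantities are later subtracted from one another.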

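To find $E(I_j I_{366})$, we note that since $I_j$ is equal to either 0 or 1 and $I_{366}$ is equal to either 0 or 1, the product $I_j I_{366}$ can only equal 0 or 1 as well. Therefore, $I_j I_{366}$ is itself an indicator random variable. Furthermore, $I_j I_{366} = 1$ if and only if $I_j = 1$ and $I_{366} = 1$, which means that no friend has a birthday on either day $j$ or day $366$ (that is, February 29). The chance that someone doesn’t have a birthday on day $j$ or February 29 is

$$1 - \frac{4}{1461} - \frac{1}{1461} = \frac{1456}{1461},$$

so that the probability that no friend has a birthday on day $j$ or February 29 is

$$\left( \frac{1456}{1461} \right)^n.$$

Therefore, as before,

$$E(I_j I_{366}) = \left( \frac{1456}{1461} \right)^n,$$

so that

$$\operatorname{Cov}(I_j, I_{366}) = \left( \frac{1456}{1461} \right)^n - \left( \frac{1457}{1461} \right)^n \left( \frac{1460}{1461} \right)^n.$$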
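
The covariance between a usual day and February 29 can be evaluated exactly with a short Python sketch. The function name `cov_with_leap_day` and the sample values of `n` are hypothetical choices of mine, not something from earlier in the series.

```python
from fractions import Fraction

CYCLE = 1461  # days in a four-year span


def cov_with_leap_day(n):
    """Cov(I_j, I_366) for a usual day j, computed as E(I_j I_366) - E(I_j) E(I_366)."""
    # No friend on day j or on February 29: 5 of the 1461 days are excluded.
    joint = Fraction(CYCLE - 5, CYCLE) ** n
    # Product of the two separate expected values.
    separate = (Fraction(CYCLE - 4, CYCLE) * Fraction(CYCLE - 1, CYCLE)) ** n
    return joint - separate


for n in (1, 10, 100):
    print(n, float(cov_with_leap_day(n)))
```

Since $1456 \cdot 1461 = 2127216$ is smaller than $1457 \cdot 1460 = 2127220$, this covariance is negative for every $n \ge 1$. That matches intuition: knowing that nobody was born on February 29 makes it slightly more likely that day $j$ is taken.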

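Therefore,

$$\operatorname{Var}(N) = 365 \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right] + 365 \cdot 364 \left[ \left( \frac{1453}{1461} \right)^n - \left( \frac{1457}{1461} \right)^{2n} \right] + 730 \left[ \left( \frac{1456}{1461} \right)^n - \left( \frac{1457}{1461} \right)^n \left( \frac{1460}{1461} \right)^n \right],$$

and we find the standard deviation of $N$ using

$$\operatorname{SD}(N) = \sqrt{\operatorname{Var}(N)}.$$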
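
One way to sanity-check a variance like this is to compare it against a simulation. The sketch below is only an illustration under the same even-distribution assumption: it writes the variance as the sum of the 366 individual variances plus twice the covariance of every pair of days, then estimates the same quantity by drawing random birthdays from a 1461-day cycle. The function names, the seed, the number of trials, and the choice `n = 100` are all mine.

```python
import math
import random


def variance_of_n(n):
    """Var(N): sum of the 366 variances plus twice every pairwise covariance."""
    a, b = 1457 / 1461, 1460 / 1461   # miss one usual day / miss February 29
    c, d = 1453 / 1461, 1456 / 1461   # miss two usual days / miss a usual day and February 29
    return (365 * a**n * (1 - a**n)
            + b**n * (1 - b**n)
            + 365 * 364 * (c**n - a**(2 * n))
            + 730 * (d**n - a**n * b**n))


def simulate_empty_days(n, rng):
    """Draw n birthdays uniformly from a 1461-day cycle; count calendar days nobody claims."""
    seen = set()
    for _ in range(n):
        day = rng.randrange(1461)
        # Index 1460 plays the role of February 29 (calendar day 366);
        # the other 1460 indices fold onto the 365 usual days, four apiece.
        seen.add(366 if day == 1460 else day % 365 + 1)
    return 366 - len(seen)


def sample_variance(n, trials, seed=0):
    """Unbiased sample variance of the simulated number of empty days."""
    rng = random.Random(seed)
    counts = [simulate_empty_days(n, rng) for _ in range(trials)]
    mean = sum(counts) / trials
    return sum((x - mean) ** 2 for x in counts) / (trials - 1)


n = 100  # hypothetical number of friends
print(variance_of_n(n), sample_variance(n, trials=5000))
print(math.sqrt(variance_of_n(n)))  # standard deviation of N
```

Two exact checks are also available without any simulation: with $n = 0$ friends every day is empty, and with $n = 1$ friend exactly 365 days are empty, so the variance should vanish in both cases.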

The graph below shows the expected value of $N$, which was shown earlier to be

$$E(N) = 365 \left( \frac{1457}{1461} \right)^n + \left( \frac{1460}{1461} \right)^n,$$

along with error bars representing two standard deviations.

Interestingly, the standard deviation of $N$ changes for different values of $n$; a direct calculation shows that the standard deviation reaches its maximum at an intermediate value of $n$. Near that maximum, the error bars in the above figure have a total width of approximately 24 days (two standard deviations both above and below the expected value), which corresponds to a standard deviation of about 6 days.
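
The location of the maximum can be found by brute force. The following Python sketch is one possible way to do the direct calculation, not necessarily how it was done for this post: it evaluates the standard deviation for each $n$ in a range and reports the maximizer along with the total error-bar width. The search range of 0 to 2000 friends and the function names are assumptions of mine.

```python
import math


def expected_n(n):
    """E(N): expected number of days on which no friend has a birthday."""
    return 365 * (1457 / 1461) ** n + (1460 / 1461) ** n


def std_of_n(n):
    """SD(N), built from the 366 variances and all pairwise covariances."""
    a, b = 1457 / 1461, 1460 / 1461
    c, d = 1453 / 1461, 1456 / 1461
    var = (365 * a**n * (1 - a**n)
           + b**n * (1 - b**n)
           + 365 * 364 * (c**n - a**(2 * n))
           + 730 * (d**n - a**n * b**n))
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(var, 0.0))


# Assumed search range; the standard deviation shrinks toward 0 well before n = 2000.
best_n = max(range(0, 2001), key=std_of_n)
sd = std_of_n(best_n)
print("n with the largest standard deviation:", best_n)
print("expected value there:", expected_n(best_n))
print("standard deviation:", sd, "total error-bar width:", 4 * sd)
```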