# Engaging students: Probability and odds

In my capstone class for future secondary math teachers, I ask my students to come up with ideas for engaging their students with different topics in the secondary mathematics curriculum. In other words, the point of the assignment was not to devise a full-blown lesson plan on this topic. Instead, I asked my students to think about three different ways of getting their students interested in the topic in the first place.

I plan to share some of the best of these ideas on this blog (after asking my students’ permission, of course).

This student submission again comes from my former student Victor Acevedo. His topic, from Pre-Algebra: probability and odds.

How can technology be used to effectively engage students with this topic?

There is an online interactive game in which students practice their knowledge of probability. The game is called “Beat the Odds,” and it is on PBS’s learning media website. There are two game modes: training and competition. In training mode, students must answer questions about finding the probability of various events (rolling a die, picking from a deck of cards, etc.). For each correct answer, students earn digital money, and the questions scale in difficulty. After the students feel that they have earned enough money, they can switch over to competition mode. Competition mode allows students to bet money against bot players to see who can answer questions the most accurately. Students are asked various questions, and whoever is the closest to the correct answer wins the money in the “pot.” Students can keep playing either until they lose all their money or until they decide to get out while they are ahead.

How has this topic appeared in pop culture (movies, TV, current music, video games, etc.)?

Probability is an integral part of sports analysis. In baseball, a batting average measures a player’s batting ability by dividing the number of hits by the number of at bats. This statistic can be used to estimate the probability that a player gets a hit during their next at bat. For example, a player with a .400 batting average would have roughly a 40% chance of getting a hit during their next at bat. By using a player’s batting average and other stats, teams can decide how to set up their batting lineup. Typically, the players with the highest batting averages take up the first 5 spots in the lineup. The first three players need to be able to make it onto a base, while the fourth player needs to be a heavy hitter who can possibly drive everyone in to score. Coaches consider every player’s batting average, as well as other stats, to help them determine their best lineup and chances of winning.

How can this topic be used in your students’ future courses in mathematics or science?

Quantum theory is a branch of physics that focuses on studying the properties of atoms and particles. The most famous application of probability in quantum theory is the concept of the wave-particle duality of light. A thought experiment with Schrödinger’s cat helps to illustrate this idea in terms that most can comprehend. A cat is trapped in a box with a poison gas that is randomly released. As an observer, you cannot tell whether the cat is dead or alive unless you open the box. Schrödinger theorized that until the box is opened, the cat is neither dead nor alive but rather in between. The concept of wave-particle duality states that light and other quantum-sized particles can behave as either waves or particles depending on the observer. Theoretical physicists have concluded that this idea of fluctuating realities is an underlying truth of all probabilities. Because of this, physicists believe that either we must accept this as truth and hold true the possibility of multiple universes, or that there may be something wrong with the theory as it currently stands.

References

Fell, A. (2013, February 5). Does probability come from quantum physics? Retrieved from https://www.ucdavis.edu/news/does-probability-come-quantum-physics/

Freudenrich, C., Ph.D. (2000, July 10). How Light Works. Retrieved from https://science.howstuffworks.com/light6.htm

# My Favorite One-Liners: Part 114

In this series, I’m compiling some of the quips and one-liners that I’ll use with my students to hopefully make my lessons more memorable for them.

I’ll use today’s one-liner when a step that’s usually necessary in a calculation isn’t needed for a particular example. For example, consider the following problem from probability:

Let $X$ be uniformly distributed on $\{-1,0,1\}$. Find $\hbox{Cov}(X,X^2)$.

The first step is to write $\hbox{Cov}(X,X^2) = E(X \cdot X^2) - E(X) E(X^2) = E(X^3) - E(X) E(X^2)$. Then we start computing the expectations. To begin,

$E(X) = (-1) \cdot \displaystyle \frac{1}{3} + 0 \cdot \displaystyle \frac{1}{3} + 1 \cdot \displaystyle \frac{1}{3} = 0$.

Ordinarily, the next step would be computing $E(X^2)$. However, this computation is unnecessary since $E(X^2)$ will be multiplied by $E(X)$, which we just showed is equal to $0$. While I might calculate $E(X^2)$ if I thought my class needed the extra practice with computing expectations, its value will not ultimately affect the final answer. Hence my one-liner:

To paraphrase the great philosopher The Rock, it doesn’t matter what $E(X^2)$ is.

P.S. This example illustrates that the covariance of two dependent random variables ($X$ and $X^2$) can be zero. If two random variables are independent, then the covariance must be zero. But the reverse implication is false.
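As a quick check of the P.S., here is a minimal Python sketch that computes the three expectations exactly with fractions. The helper `expectation` and the variable names are my own choices for illustration; the dependence check compares $P(X = 0 \hbox{ and } X^2 = 0)$ with $P(X = 0) P(X^2 = 0)$.

```python
from fractions import Fraction

# X is uniform on {-1, 0, 1}: each value has probability 1/3.
values = [-1, 0, 1]
p = Fraction(1, 3)

def expectation(g):
    """Expected value of g(X) under the uniform distribution above."""
    return sum(p * g(x) for x in values)

e_x = expectation(lambda x: x)
e_x2 = expectation(lambda x: x ** 2)
e_x3 = expectation(lambda x: x ** 3)

# Computational formula: Cov(X, X^2) = E(X^3) - E(X) E(X^2).
cov = e_x3 - e_x * e_x2

# Dependence check: X = 0 forces X^2 = 0, so the joint probability is 1/3,
# while the product of the two marginal probabilities is (1/3)(1/3).
p_joint = p
p_product = p * p

print(e_x, e_x2, e_x3, cov)
print(p_joint == p_product)
```

The covariance comes out to zero even though the joint probability does not factor, which is exactly the situation described above.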

# Engaging students: Probability and odds

This student submission again comes from my former student Trent Pope. His topic, from Pre-Algebra: probability and odds.

What interesting (i.e., uncontrived) word problems using this topic can your students do now?

The website linked below contains problems that would be great for teaching odds. Its worksheet has students solve problems about the chances of getting different gumballs from a gumball machine and the chances of winning gift cards in a drawing. These worksheets would be great because the examples are real-life applications. On the worksheet, students find the odds of drawing each color of gumball from the machine, which gives them a visual representation of their odds. In order to find the odds, they must know all the required information, such as the total number of gumballs and the number of each color. Then the instructor can ask the students any question about what they can draw. The other problem involves gift cards, coupons, and free admission to a theme park that a student draws from a hat. This would be another great example of how students can find the odds of what they can draw.

http://www.algebra-class.com/odds-and-probability.html

How could you as a teacher create an activity or project that involves your topic?

This project idea comes from the game show Deal or No Deal. The purpose of the project would be for students to see what the odds are of winning more money than the amount offered by the Banker. For instance, the Banker will offer you $100,000 to leave the show without seeing what is in your briefcase. The contestant would then look to see how many briefcases are left that could contain an amount greater than $100,000. If five of the twenty remaining briefcases contain more than the offer, the student would have a 5/20 chance, or 25% chance, to win more money. With only a 25% chance of beating the offer, the contestant might want to take the deal; if most of the remaining briefcases held more than the offer, saying no deal would look more attractive. Students could go multiple rounds of this and see if their chances increase as the game goes on. This would engage students, and they would look forward to winning the game show.
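The briefcase count can be sketched in a few lines of Python. The board below is hypothetical (I made up the amounts so that 5 of 20 remaining briefcases beat a 100,000 offer), and `chance_to_beat_offer` is just an illustrative helper name.

```python
from fractions import Fraction

def chance_to_beat_offer(remaining_amounts, offer):
    """Probability that a randomly chosen remaining briefcase holds more than the offer."""
    better = sum(1 for amount in remaining_amounts if amount > offer)
    return Fraction(better, len(remaining_amounts))

# Hypothetical board: 5 large amounts plus 15 small ones (1,000 through 15,000).
remaining = [200_000, 300_000, 400_000, 500_000, 1_000_000]
remaining += [1_000 * k for k in range(1, 16)]
offer = 100_000

p = chance_to_beat_offer(remaining, offer)
print(p, float(p))
```

Students could rerun this after each round, removing the opened briefcases from the list, to see how the chance changes as the game goes on.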

http://www.teachforever.com/2008/02/lesson-idea-probability-using-deal-or.html

How has this topic appeared in pop culture (movies, TV, current music, video games, etc.)?

# Engaging students: Permutations

This student submission again comes from my former student Sarah McCall. Her topic, from probability: permutations.

What interesting (i.e., uncontrived) word problems using this topic can your students do now?

In high school math, word problems are essentially unavoidable. They can be a pain, but they do help students to be able to see applications of what they are learning as well as good problem solving skills. So, if we must make use of word problems, we might as well make them as engaging/fun as possible. Some examples of ones that I found and would use in my classroom:

1. Permutation Peter went to the grocery store yesterday and met a super cute girl. He was able to get her phone number (written on the back of his receipt), but today when he went to call her he couldn’t find it anywhere! He knows that it consisted of 7 digits, each between 0 and 9. Help Permutation Peter by figuring out how many possible phone numbers there are.
2. Every McDonald’s Big Mac consists of 10 layers: 2 patties, 3 buns, lettuce, cheese, onions, special sauce, and pickles. How many different ways are there to arrange a Big Mac?
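For reference, here is a rough Python sketch of how the two counts above might be computed. It assumes that digits may repeat in the phone number, and that the 2 patties are interchangeable and the 3 buns are interchangeable; the problems as stated do not say so, and a different reading gives a different count.

```python
from math import factorial

# Problem 1: 7 positions, each filled with one of 10 digits (repeats allowed).
phone_numbers = 10 ** 7

# Problem 2: 10 layers, treating the 2 patties as identical and the 3 buns
# as identical (an assumption), so we divide out their rearrangements.
big_mac_arrangements = factorial(10) // (factorial(2) * factorial(3))

# If instead all 10 layers were considered distinct, the count would be 10!.
all_distinct = factorial(10)

print(phone_numbers)
print(big_mac_arrangements)
print(all_distinct)
```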

How has this topic appeared in pop culture?

Many students are easily confused when they first learn the difference between permutations and combinations, because for most of them permutations are an unfamiliar concept. One way to show students that they have actually seen permutations before in everyday life is with a Rubik’s cube. To use this in class, I would have students pass around a Rubik’s cube while I explained that each of the possible arrangements of the Rubik’s cube is a permutation. I would also present to them (and explain) the equation that allows you to find the total number of possibilities (linked below), which yields approximately 43 quintillion permutations. This means it would be virtually impossible for someone to solve it just by randomly turning the faces. Who says you won’t use math in the real world?
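The 43 quintillion figure can be reproduced with a short Python sketch. It uses the standard counting argument for the 3×3×3 cube (8 corners, 12 edges, with the usual orientation and parity constraints); the linked page may present the formula in a slightly different form, and the variable names here are my own.

```python
from math import factorial

# 8 corner pieces can be permuted in 8! ways; the twists of 7 corners
# determine the twist of the last one, giving 3^7 orientations.
corner_arrangements = factorial(8) * 3 ** 7

# 12 edge pieces can be permuted in 12! ways; the flips of 11 edges
# determine the flip of the last one, giving 2^11 orientations.
edge_arrangements = factorial(12) * 2 ** 11

# Only half of the combined corner/edge permutations are reachable
# (the permutation parities must agree), hence the division by 2.
total = corner_arrangements * edge_arrangements // 2

print(f"{total:,}")
```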

How can technology be used to effectively engage students with this topic?

In a day and age where a majority of our population is absorbed in technology, I believe that one of the most effective ways to reach high school students is to encourage the constructive use of technology in the classroom instead of fighting it. Khan Academy is one of the best resources out there for confusing mathematics topics, because it engages students in a format that is familiar to them (YouTube); not to mention it may be effective for students’ learning to hear a different voice explaining topics other than their normal teacher. In my classroom, I would have my students use their phones, laptops, or tablets to work through Khan Academy’s permutation videos, examples, and practice problems (link listed below).

References

https://www.quora.com/How-are-permutations-applied-in-real-life

https://prezi.com/q3aaem0k2xie/permutations-in-the-real-world

https://ruwix.com/the-rubiks-cube/mathematics-of-the-rubiks-cube-permutation-group

# Engaging students: Expressing probability as a fraction and as a percentage

This student submission again comes from my former student Jenna Sieling. Her topic, from probability: expressing a probability as a fraction and as a percentage.

How could you as a teacher create an activity or project that involves your topic?

This topic can be applied in many places. Especially in sports, weather, and economics, probabilities as fractions and percentages are used daily. This can become very relatable to high school students no matter what they are interested in or plan to study in college. An activity that can be used in the classroom is starting a fake fantasy football league. Although I have never played in a fantasy football league, I know that to win in your group you need to look at the statistics on how likely each player is to do well. Given a class of hopefully around 30 students, we can start a week-long activity of our own fantasy football league in the classroom, and the students can be given different statistics each day to calculate the probability that their players will help their team. This is just one activity that could catch the interest of students who may not usually be interested in probabilities.

How can this topic be used in your students’ future courses in mathematics or science?

One of the most popular majors for young students is business, and probability becomes an important concept to understand if you plan to work in the business world. By making this point to a class, I feel the students will take the importance of this subject to heart. Business is not the only future path that uses probabilities in the form of fractions or percentages. Fields like meteorology and economics use them, and even education majors would use the concept of probability to teach elementary school students the basics. If a student goes on to study history, at some point he or she will have to look at economic history and understand the probability of certain events happening and the probability of them happening again. The student would need to know how to multiply integers by fractions or percentages to gain conceptual knowledge of probability and its use.

How can technology (YouTube, Khan Academy [khanacademy.org], Vi Hart, Geometers Sketchpad, graphing calculators, etc.) be used to effectively engage students with this topic?

I googled different online probability games, and the most useful ones I found were on Mathwire.com. Most games on this website were dice-based probability games, but I think these are fun, easy games that could be assigned as homework. One game on the website is named SKUNK. The aim of the game is to guess the probability that a pair of dice will give you the highest number of points. Each letter in the name SKUNK counts as one round, and at the end of all the rounds, the person with the most points wins. Each player has to roll the dice once within one round and calculate the probability of getting the highest amount on each round. After looking at this game and others on this website, I realized that I could also explain the probability you need to understand to play poker, if it was a popular game among the students’ friends and family. I could easily find a website to create a mock poker game and show students the idea of probability within poker.

# Engaging students: Independent and dependent events

This student submission comes from my former student Danielle Pope. Her topic, from Probability: independent and dependent events.

What interesting (i.e., uncontrived) word problems using this topic can your students do now?

Students use the idea of independent and dependent events in their lives without even realizing it. Many of the word problems used to introduce probability are basic concepts that students can understand. The basic definition of independent events is “the probability that one event occurs in no way affects the probability of the other event occurring”. Word problems can be used to demonstrate this. Asking if the probability of flipping heads on a coin changes if you were to roll a die as well is a prime example. These two acts can be easily implemented in the classroom, and the technical definition can be taught. Students can then help come up with more scenarios and teach themselves the terms. Similarly, this idea can be used for dependent events with a few changes. If the “probability of one event occurring influences the likelihood of the other event”, then the events are defined as dependent. A word problem could be, “If you were to draw two cards from a deck of 52 cards, and if on your first draw you had an ace and you put that aside, would the probability of drawing another ace change?” These card questions could be made more challenging by taking out more cards each time.
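The card question can be worked out exactly with a short Python sketch. It assumes a standard 52-card deck with 4 aces; the variable names are only for illustration.

```python
from fractions import Fraction

# First draw: 4 aces among 52 cards.
p_first_ace = Fraction(4, 52)

# Second draw after an ace was set aside: 3 aces among 51 cards (dependent).
p_second_ace_without_replacement = Fraction(3, 51)

# Second draw if the ace had been put back: still 4 among 52 (independent).
p_second_ace_with_replacement = Fraction(4, 52)

# Probability of two aces in a row when the first card is set aside.
p_two_aces = p_first_ace * p_second_ace_without_replacement

print(p_first_ace, p_second_ace_without_replacement, p_second_ace_with_replacement)
print(p_two_aces)
```

The second-draw probability drops from 1/13 to 1/17 once the first ace is set aside, which is what makes the two draws dependent.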

How can this topic be used in your students’ future courses in mathematics or science?

The topic of independent and dependent events can later be translated into variables when used with functions in Algebra class. Knowing and understanding the difference will help students know how to classify an event and use the correct variable and axis if asked to sketch a list of data. Within a probability course, students will learn about conditional probability, which will use the idea of dependence. Other terminology, like “with replacement” and “without replacement,” will be used to distinguish independent from dependent events in probability. This topic can even be translated into a physics classroom when talking about time, position, velocity, acceleration, etc. For example, when calculating velocity, students will either find or be given the displacement and the change in time. If students do not know that the dependent variable is divided by the independent variable (for velocity, displacement is divided by the change in time), then the numbers may not be plugged in correctly, which will lead to the wrong answer.

How has this topic appeared in the news?

In the news, independent and dependent events show up every day. The most common example is weather. One of the longest debates that we have been having is whether global warming and climate change have influenced the world. According to an article in the Smithsonian Magazine, scientists “couldn’t prove that global warming had ‘caused’ the heat wave of 2003, (they) did assert that warming from human emissions had doubled the risk of extreme weather events.” This observation can then be taken to a student’s science class, where they can research the risk of continuing this pattern of damage to Earth. Natural disasters can also have a say in many events around the world. For example, just recently, “Gas prices spiked in the Baltimore-area, and nationwide, in recent days and are expected to continue to rise after a major pipeline that runs from Texas to the East Coast had to be shut down following Hurricane Harvey”. This is a prime example of a dependent event. This shortage of gas, specifically in Texas, then led to many people rushing to fill up their cars, resulting in gas stations running out of gas, which is just another example of dependent events.

References:

https://www.wyzant.com/resources/lessons/math/statistics_and_probability/probability/further_concepts_in_probability

http://www.smithsonianmag.com/science-nature/does-climate-change-cause-extreme-weather-events-180964506/

# Facebook Birthday Problem: Part 5

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let $I_k$ be an indicator random variable for “no friend has a birthday on day $k$,” where $k = 366$ stands for February 29 and $k = 1, \dots, 365$ stand for the “usual” 365 days of the year. Therefore, the quantity $N$, representing the number of days of the year on which no friend has a birthday, can be written as

$N = I_1 + \dots + I_{365} + I_{366}$

In yesterday’s post, I began the calculation of the standard deviation of $N$ by first computing its variance. This calculation is complicated by the fact that $I_1, \dots, I_{366}$ are dependent. Yesterday, I showed that

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle (365)(364) \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366})$.

To complete this calculation, I’ll now find $\hbox{Cov}(I_k,I_{366})$, where $1 \le k \le 365$. I’ll use the usual computation formula for a covariance,

$\hbox{Cov}(I_k,I_{366}) = E(I_k I_{366}) - E(I_k) E(I_{366})$.

We have calculated $E(I_k)$ earlier in this series. In any four-year span, there are $4 \times 365 + 1 = 1461$ days, of which only one is February 29. Assuming that birthdays are evenly distributed (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day $k$ is

$\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461}$,

so that the probability that no friend has a birthday on day $k$ is

$\displaystyle \left( \frac{1457}{1461} \right)^n$.

Therefore, since the expected value of an indicator random variable is the probability that the event happens, we have

$E(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n$

for $k = 1, \dots, 365$. Similarly,

$E(I_{366}) = \displaystyle \left( \frac{1460}{1461} \right)^n$,

so that

$\hbox{Cov}(I_k,I_{366}) = E(I_k I_{366}) - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1460}{1461} \right)^n$.

To find $E(I_k I_{366})$, we note that since $I_k$ is equal to either 0 or 1 and $I_{366}$ is equal to either 0 or 1, the product $I_k I_{366}$ can only equal 0 or 1 as well. Therefore, $I_k I_{366}$ is itself an indicator random variable. Furthermore, $I_k I_{366} = 1$ if and only if $I_k = 1$ and $I_{366} = 1$, which means that no friend has a birthday on either day $k$ or day $366$ (that is, February 29). The chance that someone doesn’t have a birthday on day $k$ or February 29 is

$\displaystyle 1 - \frac{4}{1461} - \frac{1}{1461} = \displaystyle \frac{1456}{1461}$,

so that the probability that no friend has a birthday on day $k$ or February 29 is

$\displaystyle \left( \frac{1456}{1461} \right)^n$.

Therefore, as before,

$E(I_k I_{366}) = \displaystyle \left( \frac{1456}{1461} \right)^n$,

so that

$\hbox{Cov}(I_k,I_{366}) = \displaystyle \left( \frac{1456}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1460}{1461} \right)^n$.

Therefore,

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle (365)(364) \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2(365) \left[ \left( \frac{1456}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1460}{1461} \right)^n \right]$,

and we find the standard deviation of $N$ using

$\hbox{SD}(N) = \sqrt{\hbox{Var}(N)}$.

The graph below shows the expected value of $N$, which was shown earlier to be

$E(N) = 365 \displaystyle \left( \frac{1457}{1461} \right)^n + \left( \frac{1460}{1461} \right)^n$,

along with error bars representing two standard deviations.

Interestingly, the standard deviation of $N$ changes for different values of $n$; a direct calculation shows that $\hbox{SD}(N)$ is maximized at $n = 459$ with maximum value of approximately $6.1$. Accordingly, for $n = 450$ and $n = 500$, the error bars in the above figure have a total width of approximately 24 days (two standard deviations both above and below the expected value).
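For readers who would like to reproduce the direct calculation, here is a minimal Python sketch of the formulas for $E(N)$ and $\hbox{Var}(N)$ above. The function and constant names are my own, and the search range of $n$ from 1 to 2000 is an arbitrary choice. Since $\hbox{Var}(N) = 0$ exactly when $n = 1$, floating-point rounding could produce a tiny negative number there, so the sketch clamps the variance at zero before taking the square root.

```python
from math import sqrt

P_USUAL = 1457 / 1461           # one friend's birthday misses a given usual day
P_LEAP = 1460 / 1461            # one friend's birthday misses February 29
P_TWO_USUAL = 1453 / 1461       # misses two given usual days
P_USUAL_AND_LEAP = 1456 / 1461  # misses a given usual day and February 29

def expected_n(n):
    """E(N) for n friends."""
    return 365 * P_USUAL ** n + P_LEAP ** n

def variance_n(n):
    """Var(N) for n friends, term by term from the formula above."""
    a = P_USUAL ** n
    b = P_LEAP ** n
    return (365 * a * (1 - a)
            + b * (1 - b)
            + 365 * 364 * (P_TWO_USUAL ** n - a ** 2)
            + 2 * 365 * (P_USUAL_AND_LEAP ** n - a * b))

def sd_n(n):
    """SD(N); the variance is clamped at 0 to guard against rounding error."""
    return sqrt(max(variance_n(n), 0.0))

# Search for the number of friends that maximizes the standard deviation.
best_n = max(range(1, 2001), key=sd_n)
print(best_n, round(sd_n(best_n), 2))
print(round(expected_n(best_n), 1))
```

If the formulas are transcribed correctly, the printed maximizer and maximum should land near the values quoted above.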

# Facebook Birthday Problem: Part 4

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let $I_k$ be an indicator random variable for “no friend has a birthday on day $k$,” where $k = 366$ stands for February 29 and $k = 1, \dots, 365$ stand for the “usual” 365 days of the year. Therefore, the quantity $N$, representing the number of days of the year on which no friend has a birthday, can be written as

$N = I_1 + \dots + I_{365} + I_{366}$

In yesterday’s post, I began the calculation of the standard deviation of $N$ by first computing its variance. This calculation is complicated by the fact that $I_1, \dots, I_{366}$ are dependent. Yesterday, I showed that

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})$

To complete this calculation, I’ll now find the covariances. I’ll begin with $\hbox{Cov}(I_j,I_k)$ if $1 \le j < k \le 365$; that is, if $j$ and $k$ are days other than February 29. I’ll use the usual computation formula for a covariance,

$\hbox{Cov}(I_j,I_k) = E(I_j I_k) - E(I_j) E(I_k)$.

We have calculated $E(I_k)$ earlier in this series. In any four-year span, there are $4 \times 365 + 1 = 1461$ days, of which only one is February 29. Assuming that birthdays are evenly distributed (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day $k$ is

$\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461}$,

so that the probability that no friend has a birthday on day $k$ is

$\displaystyle \left( \frac{1457}{1461} \right)^n$.

Therefore, since the expected value of an indicator random variable is the probability that the event happens, we have

$E(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n$

for $k = 1, \dots, 365$. Therefore,

$\hbox{Cov}(I_j,I_k) = E(I_j I_k) - \displaystyle \left( \frac{1457}{1461} \right)^n \left( \frac{1457}{1461} \right)^n = E(I_j I_k) - \displaystyle \left( \frac{1457}{1461} \right)^{2n}$.

To find $E(I_j I_k)$, we note that since $I_j$ is equal to either 0 or 1 and $I_k$ is equal to either 0 or 1, the product $I_j I_k$ can only equal 0 or 1 as well. Therefore, $I_j I_k$ is itself an indicator random variable, which I’ll call $I_{jk}$. Furthermore, $I_{jk} = 1$ if and only if $I_j = 1$ and $I_k = 1$, which means that no friend has a birthday on either day $j$ or day $k$. The chance that someone doesn’t have a birthday on day $j$ or day $k$ is

$\displaystyle 1 - \frac{4}{1461} - \frac{4}{1461} = \displaystyle \frac{1453}{1461}$,

so that the probability that no friend has a birthday on day $j$ or $k$ is

$\displaystyle \left( \frac{1453}{1461} \right)^n$.

Therefore, as before,

$E(I_j I_k) = \displaystyle \left( \frac{1453}{1461} \right)^n$,

so that

$\hbox{Cov}(I_j,I_k) = \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n}$.
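As a sanity check on this covariance, the following Python sketch enumerates every possible assignment of birthdays for a small number of friends and compares the exact result with the formula. The names (`WEIGHTS`, `enumerated_cov`, `formula_cov`) and the choice of days 10 and 200 are mine; the enumeration is only practical for very small $n$.

```python
from fractions import Fraction
from itertools import product

# In a four-year cycle of 1461 days, each usual day (1..365) occurs 4 times
# and February 29 (day 366) occurs once.
WEIGHTS = {day: 4 for day in range(1, 366)}
WEIGHTS[366] = 1
CYCLE = 1461

def enumerated_cov(j, k, n):
    """Cov(I_j, I_k) found by summing over every assignment of n birthdays."""
    total_j = total_k = total_jk = 0
    for birthdays in product(WEIGHTS, repeat=n):
        weight = 1
        for day in birthdays:
            weight *= WEIGHTS[day]
        miss_j = j not in birthdays   # indicator I_j for this assignment
        miss_k = k not in birthdays   # indicator I_k for this assignment
        total_j += weight * miss_j
        total_k += weight * miss_k
        total_jk += weight * (miss_j and miss_k)
    denom = CYCLE ** n
    return (Fraction(total_jk, denom)
            - Fraction(total_j, denom) * Fraction(total_k, denom))

def formula_cov(n):
    """The closed form derived above for two usual days."""
    return Fraction(1453, 1461) ** n - Fraction(1457, 1461) ** (2 * n)

print(enumerated_cov(10, 200, 2) == formula_cov(2))
print(float(formula_cov(2)))
```

The covariance is slightly negative, which makes sense: knowing that nobody has a birthday on day $j$ makes it a little more likely that somebody has a birthday on day $k$.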

Since there are $\displaystyle {365 \choose 2} = \displaystyle \frac{365\times 364}{2}$ pairs $(j,k)$ so that $1 \le j < k \le 365$, we have

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle 2 \times \displaystyle \frac{365\times 364}{2} \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366})$,

or

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle (365)(364) \left[ \displaystyle \left( \frac{1453}{1461} \right)^n - \displaystyle \left( \frac{1457}{1461} \right)^{2n} \right] + 2 \sum_{k=1}^{365}\hbox{Cov}(I_k,I_{366})$.

The calculation of $\hbox{Cov}(I_k,I_{366})$ is similar to the above calculation; I’ll write this up in tomorrow’s post.

# Facebook Birthday Problem: Part 3

Recently, I devised the following problem:

Suppose that you have n friends, and you always say “Happy Birthday” to each friend on his/her birthday. On how many days of the year will you not say “Happy Birthday” to one of your friends?

Until somebody tells me otherwise, I’m calling this the Facebook birthday problem in honor of Facebook’s daily alerts to say “Happy Birthday” to friends.

Here’s how I solved this problem. Let $I_k$ be an indicator random variable for “no friend has a birthday on day $k$,” where $k = 366$ stands for February 29 and $k = 1, \dots, 365$ stand for the “usual” 365 days of the year. Therefore, the quantity $N$, representing the number of days of the year on which no friend has a birthday, can be written as

$N = I_1 + \dots + I_{365} + I_{366}$

In yesterday’s post, I showed that

$E(N) = E(I_1) + \dots + E(I_{365}) + E(I_{366}) = 365 \displaystyle \left( \frac{1457}{1461} \right)^n + \left( \frac{1460}{1461} \right)^n$.
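This expected value can be checked with a small Monte Carlo simulation. The Python sketch below is only an illustration: the function names, the seed, the choice of $n = 100$ friends, and the 2000 trials are arbitrary, and it uses the same assumption that birthdays are evenly distributed over a four-year cycle of 1461 days.

```python
import random

def simulate_n(n, rng):
    """Number of the 366 calendar days on which none of n friends has a birthday."""
    days_with_birthday = set()
    for _ in range(n):
        slot = rng.randrange(1461)  # a uniformly chosen day in a four-year cycle
        # Slot 1460 is the single February 29 (day 366); every other slot
        # maps to one of the usual days 1..365, each hit by exactly 4 slots.
        day = 366 if slot == 1460 else slot % 365 + 1
        days_with_birthday.add(day)
    return 366 - len(days_with_birthday)

def expected_n(n):
    """The formula for E(N) shown above."""
    return 365 * (1457 / 1461) ** n + (1460 / 1461) ** n

rng = random.Random(2024)
n = 100
trials = 2000
simulated_mean = sum(simulate_n(n, rng) for _ in range(trials)) / trials

print(simulated_mean, expected_n(n))
```

The two printed numbers should agree to within ordinary sampling error.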

The calculation of the standard deviation of $N$ is considerably more complicated, however, since the $I_1, \dots, I_{366}$ are dependent. So we will begin by computing the variance of $N$:

$\hbox{Var}(N) = \displaystyle \sum_{k=1}^{366} \hbox{Var}(I_k) + 2 \!\!\!\!\! \sum_{1 \le j < k \le 366} \!\!\!\!\! \hbox{Cov}(I_j,I_k)$,

or

$\hbox{Var}(N) = \displaystyle \sum_{k=1}^{365} \hbox{Var}(I_k) + \hbox{Var}(I_{366}) + 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})$

For the first term, we recognize that, in any four-year span, there are $4 \times 365 + 1 = 1461$ days, of which only one is February 29. Assuming that birthdays are evenly distributed (which actually doesn’t happen in real life), the chance that someone’s birthday is not on day $k$ is

$\displaystyle 1 - \frac{4}{1461} = \displaystyle \frac{1457}{1461}$.

Therefore, the chance that all $n$ friends don’t have a birthday on day $k$ is

$\displaystyle \left( \frac{1457}{1461} \right)^n$.

Using the formula $\hbox{Var}(I) = p(1-p)$ for the variance of an indicator random variable, we see that

$\hbox{Var}(I_k) = \displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right]$

for $k = 1, \dots, 365$. Similarly, for the second term,

$\hbox{Var}(I_{366}) = \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

Therefore, so far we have shown that

$\hbox{Var}(N) = 365\displaystyle \left( \frac{1457}{1461} \right)^n \left[ 1 - \left( \frac{1457}{1461} \right)^n \right] + \displaystyle \left( \frac{1460}{1461} \right)^n \left[ 1 - \left( \frac{1460}{1461} \right)^n \right]$

$+ \displaystyle 2 \!\!\!\!\! \sum_{1 \le j < k \le 365} \!\!\!\!\! \hbox{Cov}(I_j,I_k) + 2 \sum_{k=1}^{365} \hbox{Cov}(I_k,I_{366})$

In tomorrow’s post, I’ll complete this calculation by finding the covariances.