Friday 21 June 2013

Birthday Paradox

This is one of the very interesting problems in Probability theory. So here it goes. Birthday Paradox concerns  the probability, that in a set of n randomly chosen people, some pair of them will have the same birthday.

Consider the following example: Assuming for a moment that birthdays are evenly distributed throughout the year, if you're sitting in a room with forty people in it, what are the chances that two of those people have the same birthday? For simplicity's sake, we'll ignore leap years. A reasonable, intelligent person might point out that the odds don't reach 100% until there are 366 people in the room (the number of days in a year + 1)... and forty is about 11% of 366... so such a person might conclude that the odds of two people in forty sharing a birthday are about 11%. In reality, due to Math's convoluted reasoning, the odds are about 90%. This phenomenon is known as the Birthday Paradox.

It turns out that this phenomenon is very useful in cryptography and hashing algorithms. The birthday paradox is strange, counter-intuitive, and completely true. It’s only a “paradox” because our brains can’t handle the compounding power of exponents. We expect probabilities to be linear and only consider the scenarios we’re involved in (both faulty assumptions, by the way). So, without diverging away lets dive into the main part.

Lets work with some numbers. Suppose there are 23 people in a room. What are the chances that 2 people share a birthday ?
Lets list down all the possible ways, 23 people can have birthdays and count the ways where at least 2 people share a birthday. After this calculating this, the required probability is a cake walk. But since the number of ways are huge, this is hard to do. Lets flip the problem. Instead of finding all the ways where at 2 people share a birthday, lets find all the ways where no 2 people share a birthday. Then we can easily compute the probability that at least 2 people share a birthday by subtracting from 1. 

With 23 people, we have 253 combinations. 
(23 X 22) / 2 = 253
The chance of 2 people having different birthdays is :
1 - (1/365)  = (364 / 365) = 0.997260  --> This should be pretty easy to understand
Makes sense. 364 out of 365 days are Ok.

Now, all the 253 pairs should be different. We use exponents to find the probability. So, the probability that all the 253 pairs have different birthdays is 
(364 / 365) ^ 253 = 0.4995

Now the chances that out of 23 people in a room, no 2 people share a birthday is 1 - 0.4995 = 0.5005 = 50.05%

The generalized formula is :
P(n) = 1 - (364 / 365)^C(n,2), where n is the count of people in the room.

There is another way to look at this problem which may seem easier for some people:
Let's say you have a big wall calendar with all 365 days on it. You walk in and put a big X on your birthday. The next person who walks in has only a 364 possible open days available, so the probability of the two dates not colliding is 364/365. The next person has only 363 open days, so the probability of not colliding is 363/365. If you multiply the probabilities for all 23 people not colliding, then you get:
364/365 × 363/365 × … 365-23+1/365 = Chances of no collisions