The Birthday Problem – in a nutshell.

March 21, 2019

How large should a randomly chosen group of people be, to make it more likely than not that at least two of them share a birthday?

For convenience, assume that all dates in the calendar are equally likely as birthdays, and ignore the leap-year special case of February 29th.

The first thing to look at is the likelihood that two randomly chosen people would share the same birthday.

Let’s call them Felix and Felicity. Say Felicity’s birthday is May 1st. What is the chance that Felix shares this birthday with Felicity? Well, there are 365 days in the year, only one of which is May 1st, and we are assuming that all dates in the calendar are equally likely as birthdays. The sample space is therefore the 365 days of the year, and each particular birthday is an ‘event’ in that sample space.

So, the probability that Felix’s birthday is May 1st is 1/365, and the chance he shares a birthday with Felicity is 1/365.

So what is the probability that Felix’s birthday is not May 1st? It is 364/365. This is the probability that Felix doesn’t share a birthday with Felicity.

More generally, for any randomly chosen group of two people, the probability that the second person has a different birthday to the first is 364/365.

With 3 people, the chance that all three are different is the chance that the first two are different (364/365) multiplied by the chance that the third birthday is different (363/365).

So, the probability that 3 people have different birthdays = (364/365) × (363/365) ≈ 0.992

Now, suppose that the room contains four people. What is the probability that at least two of these people share the same birthday?

The probability that all four people have different birthdays = (364 × 363 × 362) / (365 × 365 × 365)

We can then subtract this probability from 1 to establish the probability that at least two of the four share a birthday.

Probability that no two of the four people share a birthday =

(365 × 364 × 363 × 362) / (365 × 365 × 365 × 365) = 0.984

Probability that at least two of them share a birthday = 1 – 0.984 = 0.016
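The four-person arithmetic above can be checked with a short Python sketch. The function name prob_all_different is our own label for illustration, and the code assumes, as the text does, 365 equally likely birthdays.

```python
def prob_all_different(n, days=365):
    """Probability that n people all have different birthdays,
    assuming `days` equally likely birthdays (no February 29)."""
    p = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p *= (days - i) / days
    return p


p4 = prob_all_different(4)
print(round(p4, 3))      # probability that all four birthdays differ
print(round(1 - p4, 3))  # probability that at least two of the four match
```

Calling prob_all_different(3) gives the three-person product (364/365) × (363/365) in the same way.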

Similarly, it can be calculated that the probability of at least two sharing a birthday increases as n, the number in the room, increases, as below:

n = 16; probability = 0.284

n = 23; probability = 0.507

n = 32; probability = 0.753

n = 40; probability = 0.891

So, the probability that two share a birthday exceeds 0.5 in a room of 23 or more people.

So how large should a randomly chosen group of people be, to make it more likely than not that at least two of them share a birthday? The answer is 23.
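As a rough check on the answer of 23, the sketch below repeats the same product for growing group sizes and stops at the first one where a shared birthday becomes more likely than not. The helper names prob_shared and smallest_group are hypothetical, chosen here for illustration, and 365 equally likely birthdays are assumed.

```python
def prob_shared(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (days - i) / days
    return 1.0 - p_distinct


def smallest_group(threshold=0.5, days=365):
    """Smallest group size whose shared-birthday probability exceeds threshold."""
    n = 1
    while prob_shared(n, days) <= threshold:
        n += 1
    return n


for n in (16, 23, 32, 40):
    print(n, round(prob_shared(n), 3))

print(smallest_group())  # group size at which the probability first passes 0.5
```

Changing the threshold argument (for example to 0.9) shows how quickly the required group size grows.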

The intuition behind this is quite straightforward if we recognise just how many pairs of people there are in a group of 23 people, any pair of which could share a birthday.

In a group of 23 people, there are in fact 253 pairs of people to choose from. Therefore, a group of 23 people generates 253 chances, each of size 1/365, of having at least two people in the group sharing the same birthday.

The Birthday Problem is in this way notable for being a classic example of the Multiple Comparisons Fallacy. This fallacy arises when, in looking at many variables, the number of possible correlations being tested is under-estimated. In particular, multiple comparisons arise when a statistical analysis involves multiple simultaneous statistical tests, each of which has the potential to produce a ‘discovery.’ For example, with a thousand variables, there are almost half a million (1,000×999/2) potential pairs of variables that might appear correlated by chance alone. While each pair is unlikely in itself to show dependence, among half a million pairs it is very possible that a large number will appear to be dependent. Say, for example, that 20 or more comparisons are made, each tested at the 95% confidence level. Each test then has a 5% chance of producing a false positive, so across that many tests we may well get at least one spurious ‘discovery’ by chance alone. This becomes a fallacy when the spurious result is treated as significant rather than as the expected product of chance. The fallacy can be addressed by using statistical tests that adjust for the number of comparisons being made.
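To put a number on the example of 20 comparisons at the 95% confidence level, here is a minimal sketch. It assumes the tests are independent and that no real effect exists in any of them, which is a simplification; the function name prob_false_discovery is our own.

```python
def prob_false_discovery(m, alpha=0.05):
    """Chance of at least one false positive among m independent tests,
    each run at significance level alpha, when no real effect exists."""
    return 1.0 - (1.0 - alpha) ** m


for m in (1, 20, 100):
    print(m, round(prob_false_discovery(m), 3))
```

Under these assumptions, a single test gives a false positive 5% of the time, but with 20 tests the chance of at least one is well over a half.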

To summarize the Birthday Problem, in a group of 23 people (assuming each of their birthdays is an independently chosen day of the year with all days equally likely), there is in fact greater than a 50 per cent chance that at least two of the group share the same birthday. This seems counter-intuitive, since it is rare to meet someone who shares your birthday. Indeed, if you select two random people, the chance that they share a birthday is 1 in 365. With 23 people, however, there are 253 (23×22/2) pairs of people who might have a common birthday. So by looking across the whole group, we are checking whether any one of these 253 pairings, each of which has a tiny chance of coinciding, does indeed match. Because there are so many possible pairs, it is more likely than not that at least one coincidental match arises. For a group of 40 people, say, it is more than eight times as likely that at least two share a birthday as that none do.
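For readers who prefer to see the result emerge from repeated trials, the following is a small Monte Carlo sketch. It draws random birthdays for a group and counts how often at least two coincide. The function name simulate_shared, the number of trials, and the fixed seed are all illustrative choices, and the estimate will vary slightly with them.

```python
import random


def simulate_shared(n, trials=20000, days=365, seed=0):
    """Estimate the chance that at least two of n people share a birthday
    by simulating `trials` random groups (all days equally likely)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(n)]
        # A set drops duplicates, so a shorter set means a shared birthday.
        if len(set(birthdays)) < n:
            hits += 1
    return hits / trials


print(simulate_shared(23))  # should land close to one half
```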

To be technical about it, in a group of 23 people there are, according to the standard formula, ²³C₂ (called 23 choose 2) pairs of people.

Generally, the number of ways k things can be chosen from n is:

ⁿCₖ = n! / ((n – k)! × k!)

Here n! (n factorial) is n × (n – 1) × (n – 2) × … × 1, and similarly for k!

Thus, ²³C₂ = 23! / (21! × 2!) = (23 × 22) / 2 = 253
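The pair count can be confirmed in Python, whose standard library provides math.comb (Python 3.8 and later) for "n choose k". The wrapper count_pairs is a hypothetical helper added here for readability.

```python
import math


def count_pairs(n):
    """Number of distinct pairs that can be formed from n items (n choose 2)."""
    return math.comb(n, 2)


print(count_pairs(23))    # pairs of people in a group of 23
print(count_pairs(1000))  # pairs among a thousand variables

# The same count written out with the factorial formula:
print(math.factorial(23) // (math.factorial(21) * math.factorial(2)))
```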

These chances have some overlap: if A and B have a common birthday, and A and C have a common birthday, then inevitably so do B and C.

So the probability of at least two people sharing a birthday in a group of 23 is less than 253/365 (69.3%).

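If we treat the 253 comparisons as independent (an approximation, given the overlap just described), the probability that no two people in the group of 23 share a birthday is approximately:

(364/365)²⁵³ = 0.4995

Essentially, making 253 comparisons and having them all be different is like tossing a heavily biased coin 253 times and never once getting tails, where tails (a match) comes up with probability 1/365 on each toss.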

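The probability of two people having different birthdays is 1 – 1/365 = 364/365 = 0.99726.

The probability of all 23 people having different birthdays is approximately (364/365)²⁵³ = 0.4995

The probability that at least two of the 23 people share the same birthday is therefore approximately 1 – 0.4995 = 0.5005, or just over 50%. This is slightly below the exact figure given by the step-by-step product used earlier, because the 253 comparisons are not fully independent.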
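To see how close the pairwise shortcut comes to the exact step-by-step product, the two can be put side by side. This is only a sketch under the same assumption of 365 equally likely birthdays; the names exact_shared and pairwise_approx_shared are our own.

```python
import math


def exact_shared(n, days=365):
    """Exact probability that at least two of n people share a birthday."""
    p_distinct = 1.0
    for i in range(n):
        p_distinct *= (days - i) / days
    return 1.0 - p_distinct


def pairwise_approx_shared(n, days=365):
    """Approximation that treats every pair as an independent comparison."""
    pairs = math.comb(n, 2)
    return 1.0 - ((days - 1) / days) ** pairs


for n in (10, 23, 40):
    print(n, round(pairwise_approx_shared(n), 4), round(exact_shared(n), 4))
```

For a group of 23 the two figures agree to within about one percentage point, which is why the shortcut works well as an intuition.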

So the next time you see two football teams line up with the referee (23 people in all), it is more likely than not that two of those on the pitch share the same birthday.

Exercise

What is the probability that at least two people in a randomly selected group of 24 share a birthday? Assume that all dates in the calendar are equally likely as birthdays, and ignore the leap-year date of February 29th.

References and Links

Probability and the Birthday Paradox. Scientific American. March 29, 2012. https://www.scientificamerican.com/article/bring-science-home-probability-birthday-paradox/

Understanding the Birthday Paradox. Better Explained. https://betterexplained.com/articles/understanding-the-birthday-paradox/

Birthday Problem. Wikipedia. https://en.wikipedia.org/wiki/Birthday_problem

Multiple Comparisons Fallacy. In: Paradoxes of Probability and other statistical strangeness. The Conversation. Woodcock, S. April 4, 2017. https://theconversation.com/paradoxes-of-probability-and-other-statistical-strangeness-74440

Multiple Comparisons Fallacy. Logically Fallacious. https://www.logicallyfallacious.com/tools/lp/Bo/LogicalFallacies/130/Multiple-Comparisons-Fallacy

The Multiple Comparisons Fallacy. Fallacy Files. http://www.fallacyfiles.org/multcomp.html

The Misleading Effect of Noise: The Multiple Comparisons Problem. Koehrsen, W. Feb. 7, 2018. https://towardsdatascience.com/the-multiple-comparisons-problem-e5573e8b9578

Multiple Comparisons. https://youtu.be/EMzcZFtGZZE

The Multiple Comparisons Problem. https://youtu.be/dzi1CSvzCoU


Prof. Leighton Vaughan Williams
