The Doomsday argument – in a nutshell.

Can we demonstrate, purely from the way that probability works, that the human race is likely to go extinct in the relatively foreseeable future, regardless of what humanity might do to try to prevent it? Yes, according to the so-called Doomsday argument, and this argument, derived from basic probability theory, has never been refuted.

Here’s how the argument goes. Let’s say you want to estimate how many tanks the enemy has to deploy against you, and you know that the tanks have been manufactured with serial numbers starting at 1 and ascending from there. Now let’s say you identify the serial numbers on five random tanks and they all have serial numbers under 10. Even an intuitive understanding of the workings of probability would lead you to conclude that the number of tanks possessed by the enemy is pretty small. On the other hand, if they are identified as serial numbers 2524, 7866, 5285, 3609 and 8009, you are unlikely to be far out if you estimate that the enemy has something close to 10,000 of them.

Let’s say that you only have one serial number to work with, and that it shows the number 18. On the basis of just this information, you would do well to estimate that the total number of enemy tanks is more likely to be 36 than 360, and far more likely to be 36 than 36,000.
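This intuition matches the standard frequentist estimator for the so-called German tank problem: take the largest observed serial number and scale it up by the average gap between observations. A minimal sketch in Python (the function name is my own):

```python
def tank_estimate(serials):
    """Frequentist estimate of the total number of tanks:
    the maximum observed serial number plus the average gap
    between observations, i.e. m + m/k - 1."""
    m, k = max(serials), len(serials)
    return m + m / k - 1

# A single observed serial of 18 gives an estimate of 2*18 - 1 = 35,
# close to the intuition that 36 is a better guess than 360.
print(tank_estimate([18]))  # 35.0

# Five serials clustered under 10 keep the estimate small.
print(tank_estimate([2, 5, 7, 9, 3]))  # 9.8
```

With the five large serial numbers quoted above (maximum 8009), the same formula gives an estimate just over 9,600, which is why "close to 10,000" is a reasonable guess.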

This way of thinking is an aspect of what is known as the mediocrity principle, which is the notion that an item that is drawn at random from one of several sets or categories is more likely to come from the most numerous category than any of the less numerous categories.

The principle has been used to suggest that, given the existence of life on Earth, life typically exists on Earth-like planets throughout the universe. The idea is to assume mediocrity rather than starting with the assumption that a phenomenon is special, privileged, exceptional or better. As such, it stands in contrast to the anthropic principle, which is the idea that the presence of an intelligent observer (Homo sapiens) restricts the circumstances to those under which intelligent life can be observed to exist, no matter how improbable. Linked to this is the Copernican principle, the idea in cosmology that we are not privileged or special observers of the universe. It derives from the observation of Nicolaus Copernicus in the 16th century that the Earth is not at the centre of the universe, generalised to the idea that the Earth occupies no special place at all.

The principle was notably used by the astrophysicist John Richard Gott when he arrived at the Berlin Wall. He asked himself whether, in the absence of other knowledge, there was any reason to believe that the moment at which he came upon the Wall was likely to be any special point in its lifetime. He decided that there was not, and that because any moment was equally likely, his best estimate was that there was as much time ahead of the Wall as there was behind it. In other words, his best guess as to how long the Wall would last was exactly as long as it had already been in existence. That was eight years. This form of reasoning was termed the ‘Copernican principle’ by Gott.

It is related to the ‘Lindy effect’, the name of which is derived from a New York delicatessen, famous for its cheesecakes, which was frequented by actors playing in Broadway shows. The Lindy effect was the observation that a Broadway show could expect to last for a further period equal to the length of time it had already been playing. So a show that had been on Broadway for three years could, as a best guess, be expected to last another three years before closing. More generally, the Lindy effect has come to represent the idea that the life expectancy going forward of a non-perishable thing such as a technology or an idea is proportional to its current period of existence, so that every additional period of survival implies a greater future life expectancy.

To return to the Copernican principle, in Bayesian terms it can be viewed as Bayes’ Rule with an uninformative prior. When we want to estimate how long something will last, in the absence of other knowledge, this principle suggests assuming we are at the mid-point of the timeline.
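Gott's version of this reasoning can be made quantitative. If the moment of observation is drawn uniformly from an object's lifetime, then with a chosen confidence the remaining lifetime falls inside a simple interval around the past lifetime. A short sketch, assuming only that uniformity (the function name is illustrative):

```python
def gott_interval(age, confidence=0.5):
    """Gott's 'delta t' argument: if the present moment is uniform
    over an object's lifetime, then with the given confidence the
    remaining lifetime lies between age*(1-c)/(1+c) and age*(1+c)/(1-c)."""
    c = confidence
    return age * (1 - c) / (1 + c), age * (1 + c) / (1 - c)

# The Berlin Wall, eight years old when Gott saw it: at 50% confidence
# the remaining lifetime is between about 2.7 and 24 years.
low, high = gott_interval(8, 0.5)
print(low, high)
```

The Wall in fact fell 20 years after Gott's visit, inside his 50 per cent interval.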

Imagine, in another scenario, that you are made aware that a selected box of numbered balls contains either ten balls (numbered 1 to 10) or ten thousand balls (numbered 1 to 10,000), and you are asked to guess which. Before you do so, one ball is drawn for you. It reveals the number seven. That would be a 1 in 10 chance if the box contains ten balls, but a 1 in 10,000 chance if it contains ten thousand. You would be right, on the basis of this information, to conclude that the box very probably contains ten balls, not ten thousand.
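The balls-in-the-box intuition is just Bayes' Rule at work, and is easy to check. A minimal calculation, assuming a 50:50 prior between the two boxes:

```python
def posterior_small_box(prior_small=0.5):
    """Posterior probability that the box holds ten balls, given that
    the drawn ball shows a number (seven) both boxes could produce."""
    like_small = 1 / 10       # P(ball shows 7 | 10-ball box)
    like_big = 1 / 10_000     # P(ball shows 7 | 10,000-ball box)
    num = prior_small * like_small
    return num / (num + (1 - prior_small) * like_big)

print(posterior_small_box())  # 1000/1001, about 0.999
```

Even starting from even odds, a single draw makes the small box a near-certainty.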

Let’s look at the same argument another way. As a thought experiment, imagine a world made up of 100 pods. In each pod there is one human. Ninety of the pods are painted black on the exterior and the other ten are white. This is known information, available to you and all the other humans. You are one of these people and you are asked to estimate the likelihood that you are inside a black pod. A reasonable way to go about this is to adopt what philosophers call the Self-Sampling Assumption: “All other things equal, an observer should reason as if they are randomly selected from the set of all existing observers in their reference class (in this case, humans in pods).” Since nine in ten of all people are in the black pods, and since you don’t have any other relevant information, it seems clear that you should estimate the probability that you are in a black pod as 90 per cent. A good way of testing the good sense of this reasoning is to ask what would happen if everyone bet this way. Well, 90 per cent of the wagers would win and 10 per cent would lose. In contrast, assume that the people ignore the Self-Sampling Assumption and instead assume that (since they don’t know which) they are equally likely to be in a black as a white pod. In this case, they might as well toss a coin and bet on the outcome. If they do so, only 50 per cent (as opposed to 90 per cent) will win the bet. It seems clearly rational here to accept the Self-Sampling Assumption.
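The betting test can be simulated directly. A quick sketch, assuming each observer occupies a pod drawn at random (function and variable names are my own):

```python
import random

def bet_win_rates(n_trials=100_000, rng=random.Random(42)):
    """Compare two betting rules in the 100-pod world (90 black, 10 white):
    always bet 'black' (the self-sampling bet) vs. toss a coin."""
    ss_wins = coin_wins = 0
    for _ in range(n_trials):
        pod_is_black = rng.random() < 0.9   # you occupy a random pod
        ss_wins += pod_is_black             # self-samplers always say black
        guess_black = rng.random() < 0.5    # coin-tossers guess at random
        coin_wins += guess_black == pod_is_black
    return ss_wins / n_trials, coin_wins / n_trials

ss, coin = bet_win_rates()
print(ss, coin)  # roughly 0.9 vs roughly 0.5
```

The always-bet-black rule wins about 90 per cent of the time, the coin-toss rule about 50 per cent, just as the argument in the text predicts.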

Now let’s make the pod example more similar to the tank and ‘balls in the box’ cases. We keep the hundred pods, but this time they are distinguished by being numbered from 1 to 100, painted on the exterior. Then a fair coin is tossed by an external Being. If the coin lands heads, one person is created in each of the hundred pods. If it lands tails, people are created only in pods 1 to 10. Now, you are in one of the pods and must estimate whether ten or a hundred people have been created in total. Since the number was determined by the toss of a fair coin, and since you don’t know the outcome of the toss and have no access to any other relevant information, it could be argued that you should believe there is a probability of 1/2 that it landed heads and thus a probability of 1/2 that there are a hundred people. You can, however, use the Self-Sampling Assumption to assess the conditional probability of a number between 1 and 10 being painted on your pod given how the coin landed. Conditional on it landing heads, the probability that the number on your pod is between 1 and 10 is 1/10, since one person in ten will find themselves in those pods. Conditional on tails, the probability that your pod is numbered 1 through 10 is 1, since everybody created (all ten of them) must be in one of those pods.

Suppose now that you open the door and discover that you are in pod number 6. Again you are asked: how did the coin land? Now you deduce that the probability that it landed tails is much greater than 1/2. With a prior of 1/2, Bayes’ Rule puts it at 10/11.
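That deduction is a one-line Bayesian update: the likelihood of finding yourself in pods 1 to 10 is 1 under tails, but only 1/10 under heads. A minimal check, assuming the fair-coin prior of 1/2:

```python
def p_tails_given_low_pod(prior_tails=0.5):
    """Posterior that the coin landed tails, given that your pod
    number is between 1 and 10 (e.g. pod 6)."""
    like_tails = 1.0     # all ten people created under tails are in pods 1-10
    like_heads = 0.1     # only 10 of the 100 people under heads are there
    num = prior_tails * like_tails
    return num / (num + (1 - prior_tails) * like_heads)

print(p_tails_given_low_pod())  # 10/11, about 0.91
```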

The final step is to transpose this reasoning to our actual situation here on Earth. Let’s assume for simplicity there are just two possibilities. Early extinction: the human race goes extinct in the next century and the total number of humans that will have existed is, say, 200 billion. Late extinction: the human race survives the next century, spreads through the Milky Way and the total number of humans is 200,000 billion. Corresponding to the prior probability of the coin landing heads or tails, we now have some prior probability of early or late extinction, based on current existential threats such as nuclear annihilation. Finally, corresponding to finding you are in pod number 6 we have the fact that you find that your birth rank is about 108 billion (that’s approximately how many humans have lived before you). Just as finding you are in pod 6 increased the probability of the coin having landed tails, so finding you are human number 108 billion (about half way to 200 billion) gives you much more reason, whatever the prior probability of extinction based on other factors, to think that Early Extinction (200 billion humans) is much more probable than Late Extinction (200,000 billion humans).
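The same update can be written out for the extinction scenario. A sketch, treating your birth rank as a uniform draw from all humans who will ever live; the 5 per cent prior on early extinction is purely illustrative, and the function name is my own:

```python
def doomsday_posterior(prior_early, birth_rank=108e9,
                       n_early=200e9, n_late=200_000e9):
    """Posterior probability of early extinction given your birth rank,
    with the rank treated as a uniform draw from all humans ever."""
    like_early = 1 / n_early if birth_rank <= n_early else 0.0
    like_late = 1 / n_late
    num = prior_early * like_early
    return num / (num + (1 - prior_early) * like_late)

# The likelihood ratio is 1,000 to 1 in favour of early extinction,
# so even a modest 5% prior becomes near-certainty:
print(doomsday_posterior(0.05))  # about 0.98
```

The point of the calculation is that whatever prior you start from, a birth rank of 108 billion shifts the odds towards early extinction by a factor of a thousand.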

Essentially, then, the Doomsday Argument transfers the logic of the laws of probability to the survival of the human race. To date there have been about 110 billion humans on Earth, about 7 per cent of whom are alive today. At least, these are indicative estimates. On the same basis as the tank, the balls in the box, and the pods problems, a reasonable estimate, other things equal, is that we are about halfway along the timeline. Projecting demographic trends forward, this makes our best estimate of the termination of the timeline of the human race as we know it fall within this millennium.

That is the Doomsday argument.

References and Links

Nick Bostrom (2002). A Primer on the Doomsday Argument. http://www.anthropic-principle.com/?q=anthropic_principle/doomsday_argument

Doomsday Argument. Lesswrongwiki. https://wiki.lesswrong.com/wiki/Doomsday_argument

Doomsday Argument. RationalWiki. https://rationalwiki.org/wiki/Doomsday_argument

Doomsday Argument. Wikipedia. https://en.wikipedia.org/wiki/Doomsday_argument

The Keynesian Number Puzzle: An exploration in rationality.


Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.

Choose an integer between 0 and 100. You win a prize if your number is equal to, or closest to, 2/3 of the average number chosen by all other participants. What number should you choose?

If you think that the other participants will choose a random number within the range, the average will be 50. Hence you choose 33. That seems right, intuitively, to many people. But hang on. Just as you chose 33, so presumably will other participants, at least on average, based on your same line of reasoning. So if the average number chosen by all participants is 33, then the smart thing to do is to choose 22.

But do you really think you are smarter than the others? Just as you figured out that 22 is the smart choice, so will others, at least on average. So the super-smart thing to do is to choose 15. But … we are heading towards 0 (you get there after about 12 iterations). Zero is the only rational choice if you don’t think you are smarter than the other participants.

You start to get the strong feeling that if you choose 0 you are not going to win the prize. This is because, although you don’t think you are smarter than most, it is reasonable to assume that at least some of the players are not as smart or rational as you. For example, if 10 per cent of players are totally naïve and choose a random number – 50 on average – then the overall average will be 5 and the right answer will be 3. However, if the rest of the players share your thoughts and assumptions, they will also choose 3, thereby increasing the average to 8 and the right answer to 5. Then you answer 5, but so will the rest, thus increasing the right answer to 6.

The process converges to 8. Well, 8 is the right answer if 90 per cent of players are as smart as you are and 10 per cent are totally naïve. If 20 per cent are naïve, the process converges to 14; with 30 per cent it converges to 18, and so on. But then it may also be the case that the less rational players are not totally naïve (Level 0 rationality) but, for example, exhibit Level 1 rationality, where the average answer is 33. In this case, with 10 per cent Level 1 players the process converges to 5; with 20 per cent to 9; with 30 per cent to 12, and so on. Of course, there are plenty more combinations, with varying proportions of players at Level 0, Level 1, Level 2 and so on. The higher the winning number, the larger is the percentage of less rational players in the game.
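These convergence claims can be checked by iterating the best reply. A short sketch, where `frac_other` and `other_avg` describe the proportion and average answer of the less-than-fully-rational players (the names are my own):

```python
def converged_answer(frac_other, other_avg, target=2/3, rounds=1000):
    """Iterate the rational players' best reply when a fixed fraction
    of players always answers other_avg and the rest play the current
    best reply. Returns the (un-rounded) fixed point."""
    x = 0.0
    for _ in range(rounds):
        x = target * (frac_other * other_avg + (1 - frac_other) * x)
    return x

# 10% totally naive (average 50): converges near 8, as in the text.
print(converged_answer(0.10, 50))  # about 8.33
# 20% naive: near 14. 10% at Level 1 (average 33): about 5.5,
# which the text's integer reasoning rounds to 5.
print(converged_answer(0.20, 50))
print(converged_answer(0.10, 33))
```

Solving the fixed-point equation x = (2/3)(f·a + (1−f)·x) directly gives the same answers without iteration; the loop simply mirrors the step-by-step reasoning in the text.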

In an experiment conducted with Financial Times readers by economist Richard Thaler, made up of 1,476 participants, the winning number was in fact 13. This is roughly consistent with:

  1. All players exhibit Level 3 rationality

OR 2. 80% are fully rational and 20% are totally naïve.

OR 3. 70% are fully rational and 30% exhibit Level 1 rationality.

Etc.

John Maynard Keynes, in Chapter 12 of his ‘General Theory of Employment, Interest and Money’, frames the paradox in terms of professional investment, in a more prosaic way:

“Professional investment may be likened to those newspaper competitions in which the competitors have to pick out the six prettiest faces from a hundred photographs, the prize being awarded to the competitor whose choice most nearly corresponds to the average preferences of the competitors as a whole; so that each competitor has to pick, not those faces which he himself finds prettiest, but those which he thinks likeliest to catch the fancy of the other competitors, all of whom are looking at the problem from the same point of view. It is not a case of choosing those which, to the best of one’s judgment, are really the prettiest, not even those which average opinion genuinely thinks the prettiest. We have reached the third degree where we devote our intelligences to anticipating what average opinion expects the average opinion to be. And there are some, I believe, who practice the fourth, fifth and higher degrees.”

In other words, it is those who are best able to out-guess the best guesses of the rest of the crowd who stand to win the prize. Or, put another way, the ten pound note you spot lying on the floor might well be real after all. Nobody has picked it up yet because they have all assumed that someone else would have picked it up if it were real. You realise that everyone else is thinking like this, and you win yourself a tenner. Let’s call that super-rationality.

Exercise

Choose an integer between 0 and 100. You win a prize if your number is equal to, or closest to, 2/3 of the average number chosen by all other participants. What number should you choose?

Reference and Links

Keynes’ Beauty Contest. By Richard Thaler in the Financial Times, July 10, 2015. https://www.ft.com/content/6149527a-25b8-11e5-bd83-71cb60e8f08c

Keynesian Beauty Contest. Wikipedia. https://en.wikipedia.org/wiki/Keynesian_beauty_contest

Why listening to Blaise Pascal might just save the planet.

Blaise Pascal was a 17th century French mathematician and philosopher who laid some of the main foundations of modern probability theory. He is particularly celebrated for his correspondence with the mathematician Pierre Fermat, forever associated with Fermat’s Last Theorem. Schoolchildren learning mathematics are more familiar with him courtesy of Pascal’s Triangle. Increasingly, though, it is Pascal’s Wager, and latterly the Pascal’s Mugging puzzle, that have entertained modern philosophers.

Pascal’s Wager can be stated simply: if God exists and you wager that He does not, your penalty relative to betting correctly is enormous. If God does not exist and you wager that He does, your penalty relative to betting correctly is inconsequential. In other words, there’s a lot to gain if it turns out He does exist and not much lost if He doesn’t. So, unless it can be proved that God does not exist, you should always side with Him existing, and act accordingly.

Put another way, Pascal points out that if a wager offered an equal chance of gaining two lifetimes of happiness or gaining nothing, a person would be foolish to bet on the latter. The same would go if it were three lifetimes of happiness versus nothing. He then argues that it is simply unconscionable by comparison to bet against an eternal life of happiness for the possibility of gaining nothing. The wise decision is to wager that God exists, since “If you gain, you gain all; if you lose, you lose nothing”: one can gain eternal life if God exists, but if not, one will be no worse off in death than by not believing. If, on the other hand, you bet against God, then win or lose, you either gain nothing or lose everything.
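The structure of the Wager can be shown with a crude decision-theoretic sketch, substituting a large finite reward for Pascal's infinite one. All the numbers here are purely illustrative, not claims about the actual probabilities:

```python
def wager_value(p_god, reward=1e9, cost_of_belief=1.0):
    """Expected payoff of believing vs. not believing, with a large
    finite reward standing in for Pascal's infinite one and a small
    cost representing the inconvenience of belief."""
    believe = p_god * reward - cost_of_belief
    not_believe = 0.0   # simplification: nothing gained, nothing staked
    return believe, not_believe

# Even at very long odds, the reward term dominates the stake:
b, nb = wager_value(p_god=1e-6)
print(b > nb)  # True
```

The point Pascal exploits is that for any non-zero probability, a large enough reward makes the believing side of the ledger dominate; with a genuinely infinite reward, no finite odds can tip it back.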

It seems intuitive that there is something wrong with this argument. The problem lies in pinning down what it is. One good try is known as the ‘many gods’ objection. The argument here is that one can in principle come up with multiple different characterisations of a god, including a god that punishes people for siding with his existence. But this assumes that all representations of what God is are equally probable. In fact, some representations must be more plausible than others, if the alternatives are properly investigated. A characterisation that has hundreds of millions of followers, for example, and a strongly developed set of apologetics, is at least somewhat more likely to be true than a theory based on an evil teapot.

Once we begin to drop the equal-probability assumption, we severely weaken the ‘many gods’ objection. Basically, if it is more likely that the God of a major established religion is possibly true (however almost vanishingly unlikely any individual might think that to be) relative to the evil teapot religion, the ‘many gods’ objection very quickly begins to crumble to dust. At that point, one needs to take seriously the stratospherically high rewards of siding with belief (at whatever long odds one might set for that) compared to the stakes.

It is true that infinite rewards swamp decision calculations, but we need not even go as far as positing an infinite reward for the decision problem, weighed against the stakes, to become a relatively straightforward one. It is also true that future rewards tend to be seriously under-weighted by most human decision-makers. In truth, pain suffered in the future will feel just as bad as pain suffered today, but most of us don’t think or behave as if that were so. The attraction of delaying an unwelcome decision is well documented. In the immortal words of St. Augustine of Hippo in his ‘Confessions’, “Lord make me pure – but not yet!”

A second major objection is the ‘inauthentic beliefs’ criticism: that for those who cannot genuinely believe, feigning belief in order to gain an eternal reward invalidates the reward. What such critics are pointing to is the unbeliever who says to Pascal that he cannot make himself believe. Pascal’s response is that if the principle of the wager is valid, then the inability to believe is irrational. “Your inability to believe, because reason compels you to and yet you cannot, [comes] from your passions.” This inability, therefore, can be overcome by diminishing these irrational sentiments: “Learn from those who were bound like you. . . . Follow the way by which they began; by acting as if they believed.”
Even some modern atheist philosophers admit to struggling with the problem set by Blaise Pascal. One attempt to square the circle is to say that, if God as conventionally conceived exists with some non-zero probability, there is a case for pushing a hypothetical button that would make them believe, if offered just one chance and that chance were now or never. Given the option of delaying the decision as long as possible, however, it seems they would side with St. Augustine’s approach to the matter of his purity.

Pascal’s Wager has taken on new life in the last couple of decades as it has come to be applied to existential threats such as climate change. This issue bears a clear similarity to Pascal’s Wager on the existence of God. Let’s say, for example, there is only a one per cent chance that the planet is on course for catastrophic climatic disaster, and that delay means passing a point of no return after which we would be powerless to stop it. In that case, not acting now would seem a kind of madness. It certainly breaches the terms of Pascal’s Wager. This has fittingly been termed Noah’s Law: if an ark may be essential for survival, get building, however sunny the day overhead. When the cost of getting it wrong is just too high, it pays to hedge your bets.

Pascal’s Mugging is a new twist on the problem, which can, if wrongly interpreted, give comfort to the naysayers. It can be put this way. You are offered a proposal by someone who turns up on your doorstep. Give me £10, the door-stepper says, and I will return tomorrow and give you £100. I desperately need the money today, for reasons I’m not at liberty to divulge, but I can easily pay you anything you like tomorrow. You turn down the deal because you don’t believe he will follow through on his promise. So he asks you how likely you think it is that he will honour any deal you are offered. You say 100 to 1. In that case, he says, I will bring you £1,100 tomorrow in return for the £10. You work out the expected value of this proposal to be 1/100 times £1,100, or £11, and hand over the tenner. He never comes back and you have, in a way, been intellectually mugged. But was handing over the note irrational? The mugger won the argument: for any low probability of his being able to pay back a large amount of money, there exists a finite amount that makes it rational to take the bet. In particular, a rational person must admit there is at least some non-zero chance that such a deal would be honoured. However low the probability you assign to being paid out, there exists a potential reward, which need not be monetary, large enough to outweigh it.
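The mugger's arithmetic is a one-line expected-value calculation, easy to verify:

```python
def expected_value(p_payout, payout, stake):
    """Expected profit from handing over the stake:
    probability of being paid times the payout, minus the stake."""
    return p_payout * payout - stake

# The doorstep deal: a 1-in-100 chance of £1,100 for a £10 stake
# has an expected profit of £1, so the naive calculation says take it.
print(expected_value(1 / 100, 1100, 10))  # 1.0
```

The same calculation applies to the exercise below: multiply the offered sum by your probability of being paid and compare it with the stake.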

Pascal’s Mugging has more generally been used to consider the appropriate course of action when confronted more systematically with low-probability, high-stakes events such as existential risk, or charitable interventions with a low probability of success but extremely high rewards. Common sense might seem to suggest that spending money and effort on extremely unlikely scenarios is irrational, but common sense has misled us before, and there is no reason to believe that it serves us any better here.

Blaise Pascal was a very clever guy and those who over the centuries have too quickly dismissed his ideas have paid the intellectual (and perhaps a much bigger) price. Today, in an age when global existential risk is for obvious reasons (nuclear annihilation not least) a whole lot higher up the agenda than it was in Pascal’s day, it is time that we revisit (atheists, agnostics and believers alike) the lessons to be learned from ‘The Wager’, and that we do so with renewed urgency. The future of the planet just might depend on it.

Exercise

In the Pascal’s Mugging Problem you are offered £3,000 tomorrow if you pay the stranger £25 today. You believe that there is a 1 in 100 chance that the stranger will return to pay you.

Is handing over the £25 rational from an economic point of view? Would you hand over the £25? What if the stranger offered to pay you £10,000 tomorrow, and you believe there is a 1 in 125 chance that he will return to pay you?

Would your answer be different if any of the sums involved were different?

References and Links

Nick Bostrom. Pascal’s Mugging. 443-444. https://nickbostrom.com/papers/pascal.pdf

Pascal’s mugging. Wikipedia. https://en.wikipedia.org/wiki/Pascal%27s_mugging

Amanda Askell on Pascal’s Wager and other low risks with high stakes. Rationally Speaking. Podcast. http://rationallyspeakingpodcast.org/show/rs-190-amanda-askell-on-pascals-wager-and-other-low-risks-wi.html

Transcript of Amanda Askell Podcast. http://static1.1.sqspcdn.com/static/f/468275/27648050/1502083126473/rs190transcript.pdf?token=xQdh8%2B1IgicYGsJS5D%2Fa%2BB0sFMo%3D

Is Believing in God Worth It? SALIENT. http://salient.org.nz/2018/03/is-believing-in-god-worth-it/

The Secretary Problem – in a nutshell.

In ‘The Merchant of Venice’, by William Shakespeare, as we have seen in an earlier chapter, Portia sets her suitors a problem to solve to find who is right for her. In the play, there are just three suitors and they are asked to choose between a gold, a silver and a lead casket, one of which contains a portrait which is the key to her hand in marriage.

Let us base a thought experiment around Portia’s quest for love in which she meets the successive suitors in turn. Her problem is when to stop looking and start choosing.

To make the problem of more general interest, let’s say she has 100 suitors to choose from. Each will be presented to her in random order and she has twenty minutes to decide whether he is the one for her. If she turns someone down there is no going back, but the good news is that she is guaranteed not to be turned down by anyone she selects. If she comes to the end of the line and has still not chosen a partner, she will have to take whomever is left, even if he is the worst of the hundred. All she has to go on in guiding her decision are the relative merits of the pool of suitors.

Let’s say that the first presented to her, whom we shall call No.1, is perfectly charming but she has some doubts. Should she choose him anyway, in case those to follow will be worse? With 99 potential matches left, it seems more than possible that there will be at least one who is a better match than No.1.

The problem facing Portia is that she knows that if she dismisses No. 1, he will be gone forever, to be betrothed to someone else.

She decides to move on. The second suitor turns out to be far worse than the first, as does the third and fourth. She starts to think that she may have made a mistake in not accepting the first. Still, there are potentially 96 more to see. This goes on until she sees No. 20, whom she actually prefers to No. 1. Should she now grasp her opportunity before it is too late? Or should she wait for someone even better?

She is looking for the best of the hundred, and this is the best so far. But there are still 80 suitors left, one of whom might be better than No. 20. Should she take a chance?

What is Portia’s optimal strategy in finding Mr. Right?

This is an example of an ‘Optimal Stopping Problem’, which has come to be known as the ‘Secretary Problem’. In this variation, you are interviewing applicants for a secretarial post, your aim being to maximise your chance of hiring the single best applicant in the pool. Your only criterion for measuring suitability is their relative merits, i.e. who is better than whom. As with Portia’s problem, you can offer the post to the current applicant at any time before seeing any more candidates, but you lose the opportunity to hire that applicant if you decide to move on to the next in line.

This sort of stopping strategy can be extended to almost anything: the search for a place to live, a place to eat, the choice of a used car, and so on.

In each of these cases, there are two ways you can fail to meet your goal of finding the best option out there. The first is by stopping too early, and the second is by stopping too late.

By stopping too early, you leave the best option out there. By stopping too late, you have waited for a better option that turns out not to exist. So how do you find the right balance?

Let’s consider the intuition. The first option is necessarily the best seen so far, and the second option (assuming we are taking the options in a random order) has a 50% chance of being the best yet. Likewise, the tenth option has a 10% chance of being the best to that point.

It follows logically that the chance of any given option being the best to that point declines as the number of options seen before it increases. So encounters with a ‘best yet’ option become more and more infrequent as we go through the process.

To see how we might best approach the problem, let’s go back to Portia and her suitors and look at her best strategy when faced with different-sized pools of suitors. Can she do better using some strategy other than choosing at some random position in the order of presentation to her?

It can be shown mathematically that she can certainly expect to do better, given that there are more than two to choose from. Let’s return to the original play where she has three potential matches.

We can look at it this way. If she chooses No. 1, she has no information with which to compare the relative merits of her suitors. On the other hand, by the time she reaches No. 3, she must choose him, even if he’s the worst of the three. In this way, she has maximum information but no choice.

In the case of No. 2, she has more information than she did when she saw No. 1, as she can compare the two. She also has more control over her choice than she will if she leaves it until she meets No. 3.

So she turns down No. 1 to give herself more information about the relative merits of those available. But what if she finds that No. 2 is worse than No. 1? What should she do?

It can in fact be shown that she should wait and take the risk of ending up with No. 3, as she must do if she leaves it to the last. On the other hand, if she finds that she prefers No. 2 to No. 1, she should choose him on the spot and forego the chance that No. 3 would be a better match.

It can also be shown that in the three-suitor scenario, she will succeed in finding her best available match exactly half the time by selecting No. 2 if he is better than No. 1. If she chooses No. 1 or No. 3, on the other hand, she will only have met that aim one time in three.
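The three-suitor claim is small enough to verify by brute force, enumerating all six possible orderings (the helper names are my own):

```python
from itertools import permutations

def wins(order, strategy):
    """Does the strategy pick the best suitor? Rank 3 is the best."""
    return order[strategy(order)] == 3

def skip_first(order):
    """Reject No. 1; take No. 2 if better than No. 1, else take No. 3."""
    return 1 if order[1] > order[0] else 2

for strategy, name in [(lambda o: 0, "always take No. 1"),
                       (skip_first, "skip one, then leap"),
                       (lambda o: 2, "always take No. 3")]:
    n = sum(wins(order, strategy) for order in permutations((1, 2, 3)))
    print(name, n, "of 6")
```

The skip-one strategy wins in 3 of the 6 orderings (a probability of 1/2), while always taking the first or the last wins in only 2 of 6 (1/3), matching the claim in the text.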

If there are four suitors, Portia should use No. 1 to gain information on what she should be measuring her standards against, and select No. 2 if he is a better choice than No. 1. If he is not, do the same with No. 3. If he is still not better than No. 1, go to No. 4 and hope for the best. The same strategy can be applied to any number of people in the pool.

So, in the case of a hundred suitors, how many should she see to gain information before deciding to choose someone?

It can, in fact, be demonstrated mathematically that her optimal ‘stopping point’, before turning from looking to leaping, is 37.

She should meet with 37 of the suitors, then choose the first of those to come after who is better than the best of the first 37. By following this rule, she will find the best of the princely bunch of a hundred with a probability, strangely enough, of 37 per cent.

By choosing randomly, on the other hand, she has a chance of 1 in 100 (1%) of settling upon the best.

This stopping rule of 37% applies to any similar decision, such as the secretary problem or looking for a house in a fast-moving market. It doesn’t matter how many options are on the table. You should always use the first 37% as your baseline, and then select the first of those coming after that is better than any of the first 37 per cent.

The mathematical proof is based on the constant e (sometimes known as Euler’s number), and specifically on 1/e, which can be shown to be the stopping point along a range from 0 to 1, after which it is optimal to choose the first option that is better than any that came before. The value of e is approximately 2.71828, so 1/e is about 0.36788, or 36.788%. This has simply been rounded up to 37 per cent in explaining the stopping rule. It can also be shown that the chance that implementing this stopping rule yields the very best outcome is itself equal to 1/e, i.e. about 37 per cent.
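The 37 per cent claim is easy to test by simulation. A quick sketch, assuming 100 candidates arriving in random order (the function name and trial count are my own choices):

```python
import math
import random

def secretary_success_rate(n=100, trials=20_000, rng=random.Random(1)):
    """Simulate the 1/e stopping rule: observe the first 'cutoff'
    candidates, then take the first later candidate better than all
    of them (or the last candidate if none is)."""
    cutoff = round(n / math.e)          # about 37 for n = 100
    successes = 0
    for _ in range(trials):
        ranks = list(range(n))
        rng.shuffle(ranks)              # rank n-1 is the best candidate
        best_seen = max(ranks[:cutoff])
        chosen = ranks[-1]              # forced to take the last otherwise
        for r in ranks[cutoff:]:
            if r > best_seen:
                chosen = r
                break
        successes += chosen == n - 1
    return successes / trials

print(secretary_success_rate())  # close to 1/e, about 0.37
```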

If there is a chance that your selection might turn out to be unavailable, the rule can be adapted to give a different cut-off, but the principle remains. For example, if there is a 50% chance that your selection might turn out to be unavailable, then the 37% rule becomes a 25% rule. The rest of the strategy remains the same. Following it gives you a 25% chance of finding the best of the options, compared to a 37% chance when you always get to make the final choice. This is still far better than the 1 per cent chance of selecting the best of a hundred options at random. The lower percentage (25% compared to 37%) reflects the additional uncertainty introduced when your choice might not be final. There are other variations on the same theme, for instance where you can go back to an option you initially passed over, with some probability that it is no longer available. Take the case where an immediate proposal is certainly accepted but a belated proposal is accepted half of the time. The cut-off proportion in that scenario rises to 61%, because the possibility of going back is now real.

There is also a rule-of-thumb which can be derived when the aim is to maximise the chance of selecting a good option, if not the very best. This strategy has the advantage of reducing the chance of ending up with one of the worst options. It is the square root rule, which simply replaces the 37% criterion with the square root of the number of options available. In the case of Portia’s choice, she would meet the first ten of the hundred (instead of 37) and choose the first of the remaining 90 who is better than the best of those ten.
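The two cutoffs can be compared with a Monte Carlo sketch (the function name, trial count and seed below are my own illustrative choices):

```python
import random

# Apply a cutoff rule to random orderings of 100 distinct values and
# measure how often it lands on the single best option.
def best_pick_rate(cutoff, n=100, trials=20000, seed=1):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        vals = [rng.random() for _ in range(n)]
        benchmark = max(vals[:cutoff])
        # take the first later option beating the benchmark, else the last one
        pick = next((v for v in vals[cutoff:] if v > benchmark), vals[-1])
        hits += pick == max(vals)
    return hits / trials

print(best_pick_rate(37), best_pick_rate(10))  # roughly 0.37 vs roughly 0.23
```

The 37% cutoff finds the very best option about 37% of the time, against roughly 23% for the square-root cutoff of 10; the square-root rule’s advantage, as noted above, lies in avoiding the worst options rather than maximising the chance of the very best.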

Whatever variation you adopt, the numbers will change but the principle stays the same.

All this assumes that we lack an objective standard against which to measure each of our options, so that we can only compare options with each other. Suppose instead, for example, that Portia is simply interested in choosing the richest of the suitors and knows the distribution of wealth across all potential suitors, which ranges evenly from the bankrupt suitors to those worth 100,000 ducats.

This means that the upper percentile of potential suitors in the whole population are worth upwards of 99,000 ducats. The lowest percentile is worth up to 1,000 ducats. The 50th percentile is worth between 49,000 and 50,000 ducats.

Now Portia is presented with a hundred out of this population of potential suitors, and let’s assume that the suitors presented to her are representative of this population.

Say now that the first to be presented to her is worth 99,500 ducats. Since wealth is her only criterion, and he is in the upper percentile in terms of wealth, her optimal decision is to accept his proposal of marriage. It is possible that one of the next 99 is worth more than 99,500 ducats but that isn’t the way to bet.

On the other hand, say that the first suitor is worth 60,000 ducats. Since there are 99 more to come, it is a good bet that at least one of them will be worth more than this. If she has turned down every suitor until she is presented with the last of the hundred, however, her optimal decision is to accept him. In other words, Portia’s decision on whether to accept a proposal comes down to how many potential matches she has left to see. When down to the last two, she should choose the current suitor if he is above the 50th percentile, in this case worth more than 50,000 ducats. The more suitors there are still to come, the higher the percentile of wealth at which she should accept, i.e. the higher the threshold she can set. She should never accept anyone below the average unless she is out of choices.

In this version of the stopping problem, the probability that Portia will end up with the wealthiest of the available suitors turns out to be 58 per cent. More information, of course, increases the chance of success. Indeed, any criterion that indicates where an option stands relative to the relevant population as a whole will increase the probability of finding the best of the available choices. As such, it seems that if Portia is only interested in the money, she is more likely to find it than if she is looking for love.

References and Links

Optimal Stopping: How to find the perfect apartment, partner and parking spot. Brian Christian. Medium.com https://medium.com/galleys/optimal-stopping-45c54da6d8d0

Mathematics, marriage and finding somewhere to eat. +Plus magazine. David K. Smith. https://plus.maths.org/content/os/issue3/marriage/index

Is there a solution to the Sleeping Beauty problem? In a nutshell.

Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.


Sleeping Beauty volunteers to undergo the following experiment and is told all of the following details: On Sunday she will be put to sleep. Once or twice during the experiment, Beauty will be awakened, interviewed, and put back to sleep with an amnesia-inducing drug that makes her forget that awakening.

A fair coin will be tossed on Sunday evening after she is put to sleep, to determine which experimental procedure to undertake: if the coin comes up heads, Beauty will be awakened and interviewed on Monday only. If the coin comes up tails, she will be awakened and interviewed on Monday and Tuesday. In either case, she will be awakened on Wednesday without interview and the experiment ends.

Any time Sleeping Beauty is awakened and interviewed, she is asked, “What is your belief now, as a percentage, in the proposition that the coin landed heads?”

What should Beauty’s answer be?

To one way of thinking about this, the answer is clear. The coin was tossed once prior to her awakening, however many times she is woken, whether once (if it landed heads) or twice (if it landed tails).

Since the fair coin was tossed just once, and no further information is obtained by Beauty at the time she is awoken and interviewed, the answer she should give should be 50 per cent, i.e. a 1 in 2 chance that the fair coin landed heads.

To another way of thinking about it, she is interviewed just once if it landed heads (on the Monday) but she is interviewed twice if it landed tails (on Monday and Tuesday). She does not know which day it is when she is woken and interviewed but from her point of view there are three possibilities. These are:

  1. It landed heads and it is Monday.
  2. It landed tails and it is Monday.
  3. It landed tails and it is Tuesday.

So there are three possibilities, of equal likelihood, and two of these involve the coin landing tails and just one for the coin landing heads. So the answer she should give should be 33.3 per cent, i.e. a 1 in 3 chance that the fair coin landed heads.

So which answer is correct? The world of probability is by and large divided into those who are adamant that she should go with ½ (the so-called ‘halfers’) and those who are equally adamant that she should go with 1/3 (the so-called ‘thirders’). Are they both right, are they both wrong, or somewhere in between?

A way that I usually advocate to resolve seemingly intractable probability paradoxes is to ask at what odds Beauty should be willing to place a bet.

So, if in this experiment Beauty is offered odds of 1.5 to 1 that the coin landed heads, should she take those odds? If the correct answer is a half, those odds are attractive as the correct odds should be 1 to 1 (evens). If the correct answer is a third, those odds are unattractive as the correct odds should be 2 to 1.

So what should Beauty do if offered odds of 1.5 to 1? Bet or decline the bet?

The simplest way to resolve this is to ask what would happen if she accepted the odds of 1.5 to 1 and placed a bet of £10 each time she was interviewed. When the coin came up heads, she would be awoken just once, place the £10 bet and win £15. When the coin landed tails, however, she would be awoken twice and place two bets of £10, i.e. a total of £20, and lose both.

So her net outcome from this betting strategy, taken across one heads toss and one tails toss, would be a loss of £5.

This suggests that a half is the wrong answer as to the probability that the coin landed heads. At odds of 2 to 1, on the other hand, when the coin came up heads she would place £10 on the one occasion she was awoken, i.e. Monday, and would win £20. When the coin came up tails, she would lose £10 on the Monday and £10 on the Tuesday, i.e. £20. Her expected outcome would in this case be to break even. This suggests that odds of 2 to 1 are the correct odds, which is consistent with a probability of 1/3. Some ‘Halfers’ argue that Beauty should stake a chip worth half as much when the coin lands Tails as when it lands Heads, although she would be unaware of the chip’s value when she stakes it. In that case she would indeed break even betting at even money, but there seems no reasonable case for applying this arbitrary fix to the experiment.
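The betting test can be sketched as a simulation, with the £10 stake and the odds as in the text (the function name, trial count and seed are my own choices):

```python
import random

# Beauty stakes £10 on Heads at the quoted odds at every interview:
# one interview after Heads, two after Tails.
def average_return(odds, tosses=100000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(tosses):
        if rng.random() < 0.5:
            total += 10 * odds   # Heads: a single winning bet
        else:
            total -= 20          # Tails: two losing £10 bets
    return total / tosses        # average profit per coin toss

print(average_return(1.5))  # roughly -2.5 per toss: 1.5 to 1 loses money
print(average_return(2.0))  # roughly 0 per toss: 2 to 1 breaks even
```

The simulation confirms the reasoning above: only odds of 2 to 1, i.e. a probability of 1/3, make the repeated bet break even.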

Applying the ‘betting test’ to this problem, therefore, suggests that Beauty’s answer when she is woken up should be that there is a 1 in 3 chance that the coin landed heads when tossed after she was put to sleep on the Sunday.

But how can this be right, when the fair coin was tossed just once, and we know that the chance of a fair coin landing heads is ½? If this is the ‘prior probability’ Beauty should assign to the coin landing heads, and she is given no further information about what happened to the coin when she is woken and questioned, on what grounds should the probability she assigns change? The only information she acquires is that she has been woken and questioned, but she knew that would happen in advance, so this is not new information. Given she assigns a prior probability of ½ to the coin coming up heads, and she acquires no new information, it is perhaps difficult to see on what grounds she should change her opinion. The posterior probability she assigns (after she acquires all new information) should be identical to the prior probability, because she has acquired no new information after being put to sleep to change anything.

This is the kernel of the conundrum, and it is why there is a long-standing and ongoing debate between fervent so-called ‘Halfers’ and ‘Thirders.’

So the question is whether there is a correct answer and one school of thought is simply wrong, or whether there is no single correct answer and each school is right only under one interpretation of the question.

It seems to me that there is, in fact, a straightforward answer, which resolves the problem. To see this, we need to identify the actual ‘prior probability’ that the coin tossed after Beauty goes to sleep is Heads.

This depends on the question we are seeking to answer, and what information is available to Beauty before she goes to sleep.

If she is simply told that a coin will be tossed after she goes to sleep, and nothing else, then her correct estimate that the fair coin will land on heads is ½. This is the answer to a simple question of how likely a fair coin is to land Heads with no conditions, i.e. the unconditional probability that the coin will land Heads is 1/2.

If she is given the additional information, however, that she will be woken just once if the coin lands Heads but twice if it lands Tails (albeit she will remember just one of the awakenings), then we are posing a very different question.

The new question she is being asked is to estimate the probability, whenever she awakens, that her awakening resulted from the coin landing Heads. Since she has just one awakening when the coin lands Heads, but two awakenings when it lands Tails, the probability that any particular awakening resulted from a Heads flip is 1/3, i.e. the conditional probability that the coin landed Heads, given any particular awakening, is 1/3.

By extension, if she is told she will be woken 1,000 times if the coin lands Tails but only once if the coin lands Heads, then her correct estimate of the probability that any particular awakening resulted from the coin landing Heads is 1/1001.

So the ‘prior probability’ Beauty should assign to the chance of a coin landing Heads after any particular awakening is actually 1/3 within the terms of the experiment, even before she goes to sleep. It is true that she has access to no new information whenever she awakens, but that simply means that her ‘prior probability’ of being awakened by a Heads flip remains at 1/3 after she is woken. This is totally consistent with Bayesian reasoning which states the prior probability of an event will not change unless there is new information.

Given, therefore, that she assigns a prior probability of 1/3 to any particular awakening arising from a Heads flip, this should be the answer she gives whenever she awakens, and also before she goes to sleep.

So the paradox resolves to the question Beauty is being asked to answer. What is the probability that a fair coin will land Heads? Answer = ½. What is the probability that whenever she is woken this awakening has resulted from a Heads flip? Answer = 1/3. She is consistent in these answers both before she goes to sleep and whenever she wakes. In other words, because Beauty knows that she will correctly answer 1/3 whenever she is woken, given the rules of the experiment, of which she is aware, she will answer 1/3 before she goes to sleep.

This, at least, is one seemingly reasonable way of looking at, and providing a solution to, the classic Sleeping Beauty problem.

Exercise

You volunteer to undergo the following experiment and are told all of the following details: On Sunday you will be put to sleep. Once or twice during the experiment, you will be awakened, interviewed, and put back to sleep with an amnesia-inducing drug that makes you forget that awakening.

A fair coin will be tossed on Sunday evening after you are put to sleep, to determine which experimental procedure to undertake: if the coin comes up heads, you will be awakened and interviewed on Monday only. If the coin comes up tails, you will be awakened and interviewed on Monday and Tuesday. In either case, you will be awakened on Wednesday without interview and the experiment ends.

Any time you are awakened and interviewed, you are asked, “What is your belief now, as a percentage, in the proposition that the coin landed heads?”

What should your answer be?

References and Links

Solution: ‘Sleeping Beauty’s Dilemma’. Quanta magazine. Jan. 29, 2016 https://www.quantamagazine.org/solution-sleeping-beautys-dilemma-20160129/

Probably Overthinking It. The Sleeping Beauty Problem. Jan. 12, 2015.

Sleeping Beauty Problem. Wikipedia. https://en.m.wikipedia.org/wiki/Sleeping_Beauty_problem

The Bus and Bus Plus Problems – in a nutshell.

The Bus Problem

Every day, Fred gets the solitary 8 am bus to work. There is no other bus that will get him to his destination.

10 per cent of the time the bus is early and leaves before he arrives at 8 am.

10 per cent of the time the bus is late and leaves after 8.10 am.

The rest of the time the bus departs between 8 am and 8.10 am.

One morning Fred arrives at the bus stop at 8 am, sees no bus, and waits for 10 minutes without the bus arriving.

Now, what is the probability that Fred’s bus will still arrive?

Think about it:

Fred’s bus could yet arrive or he might have missed it. So there are two possibilities. So is it correct to assume that in the absence of further evidence the chance of each must be equal, so the probability at 8.10am that his bus will still arrive is 50 per cent?

But if that is the answer at 8.10am, was it also the correct answer at 8 am?

Or was 50 per cent the correct answer at 8am but not at 8.10am?

Or is it the wrong answer at both times, but was correct at 8.05am?

 

Solution

When Fred arrives at 8am, there is a 10 per cent chance that his bus will have already left. After Fred has waited for 10 minutes, he can eliminate the 80 per cent chance of the bus arriving in the period between 8 am and 8.10 am. So only two possibilities remain.

Either the bus has arrived ahead of schedule or it will arrive more than ten minutes late.

Both outcomes are unusual, but they are mutually exclusive and equally likely (a 10 per cent prior chance of each), and no other possibilities remain. We should therefore update the probability that the bus will still arrive from 10 per cent (the prior probability of the bus running late, before Fred set out) to 50 per cent: once the 80 per cent middle case is eliminated, the remaining 20 per cent is split equally between the bus still turning up and Fred having missed it. So there is a 1 in 2 chance that he will still catch his bus if he has the patience to wait further, and a 1 in 2 chance that he will wait in vain. The follow-up question is how long he should wait. That’s for another day.
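The update can be written out as a direct calculation (a minimal sketch; the variable names are my own):

```python
# Prior probabilities for the three possibilities for the 8 am bus.
p_early, p_on_time, p_late = 0.10, 0.80, 0.10

# By 8.10 the middle case is ruled out, so renormalise what remains.
p_bus_still_coming = p_late / (p_early + p_late)
print(p_bus_still_coming)  # 0.5
```

The same one-liner answers any variant of the problem: just change the three prior probabilities and renormalise.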

 

Exercise

Bus Plus Problem

Every day, Fred gets the solitary 8 am bus to work. There is no other bus that will get him to his destination.

10 per cent of the time the bus is early and leaves before he arrives at 8 am.

30 per cent of the time the bus is late and leaves after 8.10 am.

The rest of the time the bus departs between 8 am and 8.10 am.

One morning Fred arrives at the bus stop at 8 am, sees no bus, and waits for 10 minutes without the bus arriving.

Now, what is the probability that Fred’s bus will still arrive?

How do you choose between reason without evidence and evidence without reason?

You win a quiz show and are offered a choice. You are presented with a transparent box containing £x and an opaque box which contains either £10x or nothing. You can open the opaque box and take what is inside, or you can open both boxes and take the contents of both. Which should you choose? Well, if that’s all the information you have, it’s obvious that you should open both boxes. You certainly cannot win less than by opening just one of the boxes, but you might win a lot more. So far, so good.

But now introduce an additional factor. Before making your decision, you had to undergo a sophisticated computerised psychometric test (a Predictor), which you are now told has been unerring in its prediction of what hundreds of previous contestants would decide. Whenever they chose both boxes, there was nothing inside the opaque box. Whenever they chose just the opaque box, however, they found £10x inside. When you make your decision, the computer’s decision has already been made and the contents of the opaque box have already been placed there. What is happening is that the Predictor informs the game show organisers of its prediction of whether a contestant will choose two boxes or one. Whenever it predicts that the contestant will choose two boxes, no money is placed in the opaque box. Whenever it predicts that the contestant will choose just the opaque box, £10x is deposited in the box.

This is essentially the basis of what is known as Newcomb’s Paradox or Newcomb’s Problem, a thought experiment devised by William Newcomb of the University of California and popularised by the philosopher Robert Nozick in a paper published in 1969.

So what should you do? Open just the opaque box, or open both boxes?

In his paper, Nozick writes that “To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem with large numbers thinking that the opposing half is just being silly.”

The argument of those who advocate opening both boxes (the so-called ‘two-boxers’) is that the money has already been deposited at the time you are asked to make your decision. Taking two boxes can’t change that, so that’s the rational thing to do.

The argument of those who argue for opening just the opaque box (the so-called ‘one-boxers’) is that the psychometric test is either a perfect or near-perfect predictor of what you will do. It has never got it wrong before. Every single previous contestant who has opened two boxes has found the opaque box empty, and every single previous contestant who has opened just the opaque box has won the £10x. So do what all the evidence tells you is the sensible thing to do and open just the opaque box.
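One way to make the two positions concrete is an expected-value sketch, assuming the Predictor is right with some probability p. The 0.99 figure and the function names below are my own illustrative assumptions, not part of the original problem:

```python
# Expected winnings with a transparent box holding x and an opaque box
# holding 10x if (and only if) the Predictor forecast one-boxing.
def ev_one_box(x, p):
    return p * 10 * x              # win 10x when predicted correctly

def ev_two_box(x, p):
    return x + (1 - p) * 10 * x    # keep x; get 10x only on a misprediction

x, p = 100, 0.99
print(ev_one_box(x, p), ev_two_box(x, p))  # roughly 990 vs roughly 110
```

On these assumptions, one-boxing has the higher expected value whenever p exceeds 0.55; the two-boxer’s reply, of course, is that p is irrelevant once the boxes have been filled.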

One way of considering the question is to ask whether your choice in some way determines the choice of the Predictor, and thereby the decision as to whether to place the £10x in the box. Well, there’s no time-travelling retro-causality involved. The predictor is basically a piece of computer software which bases its prediction on a psychometric test. It just so happens that the test is uncannily accurate in knowing what people will do.

Look at it this way. The bottom line is that you have a free choice, so why not open both boxes? The problem is that if you are the type of person who is a two-boxer, the predictor will have found this out from the super-efficient psychometric test. If you are the type of person, however, who is a one-boxer, the predictor will find that out too.

So it’s not that there is any good reason in itself to open one box rather than two. After all, what you decide now can’t change what is already in the box. But there is a good reason why you should be the type of person who only opens one box. And the best way to be the sort of person who only opens one box is to only open one box. For that reason, the way to win the £10x is to agree to open just the opaque box and leave the other box untouched.

But why leave behind that extra £x when the £10x which you are about to win is already in the box?

That’s Newcomb’s Paradox. You decide! Are you a one-boxer or a two-boxer? And does it matter a shred what x is?

Exercise

You are presented with a transparent box containing £100 and an opaque box which contains either £1000 or nothing. Now, you can open the opaque box and take what is inside, or you can open both boxes and take the contents inside both.

Before making your decision, you had to undergo a sophisticated computerised psychometric test (a Predictor), which you are now told has been unerring in its prediction of what all previous contestants would decide. Whenever they chose both boxes, there was nothing inside the opaque box. Whenever they chose just the opaque box, however, they found £1000 inside. When you make your decision, the computer’s decision has already been made and the contents of the opaque box have already been placed there. What is happening is that the Predictor informs the game show organisers of its prediction of whether a contestant will choose two boxes or one. Whenever it predicts that the contestant will choose two boxes, no money is placed in the opaque box. Whenever it predicts that the contestant will choose just the opaque box, £1000 is deposited in the box.

Would you open just the opaque box or both?

 

References and Links

Newcomb’s problem divides philosophers. Which side are you on? Bellos, A. Nov.26, 2016. https://www.theguardian.com/science/alexs-adventures-in-numberland/2016/nov/28/newcombs-problem-divides-philosophers-which-side-are-you-on?CMP=Share_iOSApp_Other

Newcomb’s Problem. Which side won the Guardian’s philosophy poll? Nov. 30, 2016. https://www.theguardian.com/science/alexs-adventures-in-numberland/2016/nov/30/newcombs-problem-which-side-won-the-guardians-philosophy-poll

Newcomb’s Paradox. Brilliant.org. https://brilliant.org/wiki/newcombs-paradox/

Newcomb’s Paradox. Wikipedia. https://en.m.wikipedia.org/wiki/Newcomb%27s_paradox

Collider Bias – in a nutshell.


Collider Bias (also known as Berkson’s bias or Berkson’s Paradox) is a statistical quirk which makes it appear that there is an association between two events or variables which are actually unrelated. Notably, it shows that two values can be negatively correlated in a sample of a population when they are in fact uncorrelated or positively correlated in that population. It arises because of a type of selection bias, which is caused by the observation of some events more than others.

Take the case of a college which admits students based on either musical excellence or sporting excellence. For the sake of argument, assume that there is no link between the two in the total relevant population (say, all students in the country). In other words, a musically talented individual is no more nor less likely to be talented at sport. Because the college admits only students who are excellent at music, or excellent at sport, or both, this creates a group or subset of the population which displays a negative association between musical and sporting excellence.

To illustrate why, let’s make the simplifying assumption that the college admits students who score 9 or 10 out of 10 (on a scale of 0 to 10) on either sporting or musical excellence. In the entire population, the average sporting rating of musicians of any given standard is the same, i.e. 5 out of 10, and vice versa. Yet within the group of student entrants, those admitted for their musical ability have an average musical rating of 9.5 but an average sporting rating of just 5 (the population average). The effect is to imply a negative correlation between sporting and musical ability among the entrants where no such correlation exists in the wider population.
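This can be sketched with a quick simulation. The scores and the 9-or-better admission rule follow the example above; the sample size, seed and variable names are my own choices:

```python
import random

# Independent music and sport scores (0-10) in the population; admission
# on scoring 9+ in at least one subject acts as the collider.
rng = random.Random(42)
population = [(rng.randint(0, 10), rng.randint(0, 10)) for _ in range(100000)]
admitted = [(m, s) for m, s in population if m >= 9 or s >= 9]

def correlation(pairs):
    n = len(pairs)
    mean_m = sum(m for m, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    cov = sum((m - mean_m) * (s - mean_s) for m, s in pairs) / n
    var_m = sum((m - mean_m) ** 2 for m, _ in pairs) / n
    var_s = sum((s - mean_s) ** 2 for _, s in pairs) / n
    return cov / (var_m * var_s) ** 0.5

print(round(correlation(population), 2))  # roughly 0 in the population
print(round(correlation(admitted), 2))    # clearly negative among entrants
```

Conditioning on admission turns two independent scores into a strongly negatively correlated pair within the admitted group.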

This has been shown to have important implications for medical statistics. Say, for example, that a hospital conducts a study which admits patients who are suffering from either eye cataracts or diabetes. In this case, a (spurious) association between cataracts and diabetes will appear in the set of patients included in the study which does not exist in the wider population. The paradox occurs because, among the study patients, anyone who does not have diabetes must have cataracts (and vice versa), since cases where neither occurs are excluded.

Similarly, take the idea that there is a negative association in our minds between the quality of movies based on really good books and the quality of the books themselves. One explanation can be derived from Berkson’s Paradox. We remember the instances where the book is really good, or the movie is really good, or both, but we forget the cases where both the book and the movie were bad. In consequence we find a (spurious) negative correlation between how good the movie is and how good the book is, because the bad-movie/bad-book part of the population is not included in the set of movies and books under analysis.

Perhaps the most famous example of Collider Bias was proposed by Jordan Ellenberg. This is the ‘attractive people are jerks’ example, and it is similar to the movies/books example. Say that someone only associates with people who are either pleasant or attractive or both. That eliminates from the sample pool those who are both unpleasant and unattractive. It leaves a sample containing attractive people who are unpleasant and pleasant people who are unattractive, but excludes those who are neither pleasant nor attractive. So an association is noted between being attractive and being unpleasant, but only because the unattractive people who are also unpleasant are never observed. Even if no link exists between attractiveness and unpleasantness in the population, one appears in an observed world where the counter-examples who exist in the population are avoided and ignored.

To put it more formally, assume there are two independent events, X and Y. These events are not correlated when observed in nature. If one conditions on the fact that either event X or event Y occurred (call this condition Z), however, these events are now correlated. This arises because of selection bias. If we condition on Z (that X OR Y occurs), then if we know that event X did not occur, we know that event Y did occur. This conditioning on Z, what we can call the union of X and Y, leads to a correlation.

Put mathematically, if P(X|Y) = P(X), then P(X|Y, Z) < P(X|Z), where Z = X ∪ Y.

Numerical example of Collider Bias

10% of the population swim and 5% play squash weekly, but there is no correlation between swimming and playing squash in the general population. So someone who plays squash is as likely to swim as any other member of the population and vice-versa.

Of the 200 members of a local health club, 30% swim and 20% play squash.

Based on the health club statistics, is there any evidence of a correlation between those who do not swim and those who play squash?

To answer this, we use the assumption that someone who plays squash is as likely to swim as any other member of the population, i.e. swimming and squash playing can be treated as independent events. On the population rates, the proportion of members who both swim and play squash would be 10% x 5% = 0.5%, i.e. 1 of the 200 members.

A randomly chosen health club member, however, has a 30% chance of swimming and a 20% chance of playing squash. So, 60 out of 200 members will swim and 40 play squash.

Now, what is the chance that a member who is not a swimmer plays squash?

Of the 60 members who swim, we have calculated above that only 1 also plays squash, i.e. of the 200 members in total, 60 swim and one swims and plays squash.

So, of the remaining 140 members who do not swim, 39 play squash, i.e. 40 members in total play squash minus the one who both swims and plays squash.

So 39 of the 140 health club members who do not swim play squash, i.e. 39/140 (27.9%). This is higher than the 20% of health club members overall who play squash.

Even though the two events (swimming and squash) are independent, therefore, the health club statistics make it appear that swimming reduces the likelihood of playing squash, i.e. there is a negative correlation between swimming and playing squash. The reason is that we are excluding from consideration those members of the general population who neither swim nor play squash, and only considering those who either swim or play squash or both.
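The arithmetic above can be reproduced directly (a minimal sketch using the figures from the example; the variable names are my own):

```python
members = 200
swimmers = round(0.30 * members)        # 60 members swim
squash_players = round(0.20 * members)  # 40 members play squash
both = round(0.10 * 0.05 * members)     # population independence -> 1 member

non_swimmers = members - swimmers       # 140 members do not swim
squash_rate_non_swim = (squash_players - both) / non_swimmers
print(round(squash_rate_non_swim, 3))   # 0.279, against a 20% club-wide rate
```

Changing the club's own rates (30% and 20%) shows how the strength of the spurious correlation depends on how selective the club is.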

 Exercise

10% of the population is suffering from a flu virus. Of those in a clinic intake of 100 patients, 30% are suffering from a flu virus. 10% of those in the clinic were admitted for appendicitis. Now, assume that someone suffering from appendicitis is as likely to get flu as any other member of the population, and vice-versa.

Is there any evidence from the clinic statistics that having flu reduces the likelihood of having appendicitis?

References and links

Paradoxes of probability and other statistical strangeness. Berkson’s Paradox. The Conversation. https://theconversation.com/paradoxes-of-probability-and-other-statistical-strangeness-74440

Berkson’s paradox. Physics of Risk. Oct. 9, 2018.  http://rf.mokslasplius.lt/berkson-paradox/

Berkson’s paradox explained. Healthcare Economist. July 9, 2013. https://www.healthcare-economist.com/2013/07/09/berksons-paradox-explained/

Berkson’s Paradox. Mathemathinking. Oct. 5, 2014. http://corysimon.github.io/articles/berksons-paradox-are-handsome-men-really-jerks/

Ellenberg, J. (2014). Why Are Handsome Men Such Jerks? June 3. Slate.com https://slate.com/human-interest/2014/06/berksons-fallacy-why-are-handsome-men-such-jerks.html

Simpson’s Paradox – in a nutshell.

Was the University of California, Berkeley, guilty of discrimination in its entry standards? This was a cause of concern in the early 1970s. To show what was behind the concern, we can highlight the admission figures for the Fall term of 1973. These show that male applicants to the University were significantly more likely to be accepted than female applicants.

                    Applicants      Admitted

Men            8442                44%

Women       4321                35%

Looks pretty damning, until the admission figures are broken down by department. Doing so reveals a paradox.

Dept.              Men                                       Women

Applicants    Admitted                  Applicants    Admitted

A         825                 62%                108                82%

B         560                 63%                 25                68%

C          325                 37%                593                34%

D         417                 33%                375                35%

E          191                 28%                393                24%

F          373                 6%                   341               7%

In other words, four of the six departments admitted a higher proportion of women than of men.

So what was going on? Those with statistical training soon realised that this was a simple example of Simpson’s Paradox. Simpson’s Paradox arises when different groups of frequency data are combined, revealing a different performance rate overall than is the case when examining a breakdown of the performance rate. Put another way, Simpson’s paradox is the appearance of trends within different groups which disappear when data for the groups are combined together.

In the case of Berkeley, a study published in 1975 by Bickel, Hammel and O’Connell, in ‘Science’ reached the conclusion that women tended to apply to the more competitive departments with low rates of admission, such as the English Department, while men tended to apply to less competitive departments with high rates of admission, such as engineering and chemistry. As such the University was not actively discriminating against women, at least not on the basis of the statistics used to make the charge.

Ignorance of the implications of Simpson’s Paradox might also generate false conclusions in the case of medical trials.

Take the following two drugs and their success rates in medical trials over two different days.

Drug A                                                           Drug B

Day 1             63/90 = 70%                         8/10 = 80%

Day 2             4/10 = 40%                          45/90 = 50%

Overall, Drug A = 67% success rate; Drug B = 53% success rate.

But Drug B performs better on both days.

So which is the better drug? In the medical trials, I would certainly choose to be treated by Drug A. Others might differ, but I doubt they would persuade any reasonable judge of the outcome of the trials.
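A few lines of code confirm the reversal directly from the counts in the table above:

```python
# (successes, trials) for each drug on each day, from the table above
drug_a = {"Day 1": (63, 90), "Day 2": (4, 10)}
drug_b = {"Day 1": (8, 10), "Day 2": (45, 90)}

def rate(results):
    successes, trials = results
    return successes / trials

# Drug B has the higher success rate on each day taken separately...
assert rate(drug_b["Day 1"]) > rate(drug_a["Day 1"])  # 80% vs 70%
assert rate(drug_b["Day 2"]) > rate(drug_a["Day 2"])  # 50% vs 40%

# ...yet Drug A wins once the two days are pooled.
def overall(drug):
    total_successes = sum(s for s, t in drug.values())
    total_trials = sum(t for s, t in drug.values())
    return total_successes / total_trials

print(f"{overall(drug_a):.0%}")  # 67%
print(f"{overall(drug_b):.0%}")  # 53%
```

The reversal happens because Drug A was tested mostly on Day 1 (the easier day for both drugs) and Drug B mostly on Day 2, so the pooled figures are weighted averages with very different weights.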

Take another example. In this trial, there are two groups: a control group of 240 patients who are supplied with a placebo, such as a sugar pill, which is known to have no effect on the illness under evaluation, and a test group of 240 patients who are supplied with the real drug. Each set of 240 patients is made up of four groups. Group A is elderly adults, Group B is middle-aged adults, Group C is young adults and Group D is children.

Here are the results, with success rate measured by the proportion recovering from the illness within two days of taking the drug:

Those taking the placebo (number of patients in each group):

Group A: 20; Group B: 40; Group C: 120; Group D: 60

Success rates are:

Group A: 10%; Group B: 20%; Group C: 40%; Group D: 30%

Overall success rate for those taking the placebo = (2 + 8 + 48 + 18)/240 = 76/240 = 31.7%.

Those taking the real drug (number of patients in each group):

Group A: 120; Group B: 60; Group C: 20; Group D: 40

Success rates are:

Group A: 15%; Group B: 30%; Group C: 60%; Group D: 45%

Overall success rate for those taking the real drug = (18 + 18 + 12 + 18)/240 = 66/240 = 27.5%.

This compares with an overall success rate for those taking the placebo of 31.7%.

So the placebo, over the whole sample, produced a higher success rate than the real drug.

Breaking the numbers down by group, however, reveals a discrepancy.

For the real drug

Group A: 15%; Group B: 30%; Group C: 60%; Group D: 45%

For the placebo

Group A: 10%; Group B: 20%; Group C: 40%; Group D: 30%

So, in each individual group (elderly adults, middle-aged adults, young adults, children) the success rate is greater for those taking the real drug, although across the sample as a whole it is less.

How can we resolve the paradox?

The answer lies in the size and age distribution of each group, which differs between those who received the real drug and those who received the placebo. In this study, the placebo arm contains far more young adults than the drug arm, and the natural recovery rate from this illness (as defined in the test) is higher among young adults than in the other groups, whether they receive the real drug or the placebo. Conversely, the elderly (whose recovery rates are lower than average) are much more heavily represented among those taking the real drug than among those taking the placebo.
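Arithmetically, each arm's overall rate is just a weighted average of its group rates, with the group sizes as the weights. The sketch below uses the figures from the trial above, with the real-drug children's group taken as 40 patients so that each arm totals 240:

```python
# Group sizes and per-group recovery rates for each arm, from the text.
# Order: elderly (A), middle-aged (B), young adults (C), children (D).
placebo_sizes = [20, 40, 120, 60]
placebo_rates = [0.10, 0.20, 0.40, 0.30]
drug_sizes = [120, 60, 20, 40]   # children taken as 40, so the arm totals 240
drug_rates = [0.15, 0.30, 0.60, 0.45]

def pooled(sizes, rates):
    # Overall rate = size-weighted average of the per-group rates.
    return sum(n * r for n, r in zip(sizes, rates)) / sum(sizes)

# The drug beats the placebo in every single group...
assert all(d > p for d, p in zip(drug_rates, placebo_rates))

# ...but the placebo arm, weighted towards the fast-recovering
# young adults, comes out ahead overall.
print(round(pooled(placebo_sizes, placebo_rates), 3))  # 0.317
print(round(pooled(drug_sizes, drug_rates), 3))        # 0.275
```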

Take another example, from baseball. In the 1995 and 1996 seasons, fans were divided between those who claimed Derek Jeter was the better-performing player and those who claimed that title for David Justice. It is easy to see why. Here are their batting averages.

1995                                                       1996                           Combined

Derek Jeter             12/48 (.250)             183/582 (.314)       195/630 (.310)

David Justice           104/411 (.253)       45/140 (.321)         149/551 (.270)

Here we see that Jeter has the better overall batting average but Justice records a better average in each of the two years making up that overall average. To anyone conversant with Simpson’s Paradox this is nothing weird. It is certainly possible in theory for one player to score a better batting average in successive years than another, yet record a worse batting average overall. The case of Jeter and Justice is an example where the theory clearly shows up in practice.

Indeed, fast-forward to 1997 and the paradox grows even stronger. In that year, Jeter averaged .291 (190/654), while Justice scored a better average of .329 (163/495). So, in three successive years, Justice recorded a better average than Jeter. Over the whole period, though, the batting average for Derek Jeter was .300 (385/1284), superior to David Justice on .298 (312/1046).
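Using exact fractions, the Jeter/Justice reversal can be verified from the figures quoted above:

```python
from fractions import Fraction

# Hits / at-bats per season, from the figures quoted above.
jeter = {1995: (12, 48), 1996: (183, 582), 1997: (190, 654)}
justice = {1995: (104, 411), 1996: (45, 140), 1997: (163, 495)}

def season_avg(hits, at_bats):
    return Fraction(hits, at_bats)  # exact, avoids rounding issues

def combined_avg(seasons):
    return Fraction(sum(h for h, a in seasons.values()),
                    sum(a for h, a in seasons.values()))

# Justice has the higher batting average in every individual season...
for year in (1995, 1996, 1997):
    assert season_avg(*justice[year]) > season_avg(*jeter[year])

# ...yet Jeter's combined average over the three years is higher.
print(f"{float(combined_avg(jeter)):.3f}")    # 0.300
print(f"{float(combined_avg(justice)):.3f}")  # 0.298
```

The driver is the same as in the Berkeley and drug examples: most of Jeter's at-bats came in his strong seasons, while most of Justice's came in his weakest one, so the combined averages weight the seasons very differently.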

So who is the better baseball player? Were the University of California, Berkeley, discriminating on the basis of gender? Which is the better drug? All of these questions are examples of Simpson’s Paradox.

Exercise

In a cricket match, bowlers propel the ball at the wicket defended by a batsman. The batsman aims to score runs by hitting the ball and running between the wickets. The bowler aims to dismiss ('take the wicket of') the batsman by various means, including hitting the batsman's wicket. The bowling average is the number of runs scored by the batsmen off the bowler divided by the number of wickets taken by the bowler. The lower the bowling average, the better for the bowler.

Now, let’s take the following example of two mythical cricket matches played by legendary bowlers, Harold Larwood and Bill Voce.

First Match:

Harold Larwood takes 3 wickets while bowling but concedes 60 runs off his bowling (an average of 20 runs conceded per wicket).

Bill Voce takes 2 wickets while bowling but concedes 68 runs (an average of 24 runs conceded per wicket).

Second Match:

Harold Larwood takes 1 wicket and concedes 8 runs (an average of 8 runs conceded per wicket).

Bill Voce takes 6 wickets and concedes 60 runs (an average of 10 runs conceded per wicket).

Question: Which bowler has the superior performance in the first match? Which bowler has the superior performance in the second match? Which bowler has the superior performance overall?

 

References and Links

Maths in a minute: Simpson’s Paradox. +Plus magazine. November 5, 2010. https://plus.maths.org/content/maths-minute-simpsons-paradox

All about averages: Simpson’s Paradox. +Plus magazine. January 1, 2005. https://plus.maths.org/content/all-about-averages

Paradoxes of probability and other statistical strangeness. Simpson’s Paradox. The Conversation. https://theconversation.com/paradoxes-of-probability-and-other-statistical-strangeness-74440

Simpson’s Paradox. Wikipedia. https://en.m.wikipedia.org/wiki/Simpson%27s_paradox

The Birthday Problem – in a nutshell.

How large should a randomly chosen group of people be, to make it more likely than not that at least two of them share a birthday?

For convenience, assume that all dates in the calendar are equally likely as birthdays, and ignore the Leap Year special of February 29th.

The first thing to look at is the likelihood that two randomly chosen people would share the same birthday.

Let’s call them Felix and Felicity. Say Felicity’s birthday is May 1st. What is the chance that Felix shares this birthday with Felicity? Well there are 365 days in the year, and only one of these is May 1st and we are assuming that all dates in the calendar are equally likely as birthdays. What we call the sample space is, therefore 365 days and each particular birthday is an ‘event’ in that sample space.

So, the probability that Felix’s birthday is May 1st is 1/365, and the chance he shares a birthday with Felicity is 1/365.

So what is the probability that Felix’s birthday is not May 1st? It is 364/365. This is the probability that Felix doesn’t share a birthday with Felicity.

More generally, for any randomly chosen group of two people, the probability that the second person has a different birthday to the first is 364/365.

With 3 people, the chance that all three are different is the chance that the first two are different (364/365) multiplied by the chance that the third birthday is different (363/365).

So, the probability that 3 people have different birthdays = 364/365 x 363/365

Now, suppose that the room contains four people. What is the probability that at least two of these people share the same birthday?

The probability that 4 people all have different birthdays = (364 x 363 x 362) / (365 x 365 x 365)

We can then subtract this probability from 1 to establish the probability that at least two of the four share a birthday.

Probability that none of the four people share the same birthday =

(365 x 364 x 363 x 362) / (365 x 365 x 365 x 365) = 0.984

Probability that at least two of them share the same birthday = 1 – 0.984 = 0.016

Similarly, it can be calculated that the probability of at least two sharing a birthday increases as n, the number in the room, increases, as below:

n = 16; probability = 0.284

n = 23; probability = 0.507

n = 32; probability = 0.753

n = 40; probability = 0.891

So, the probability that two share a birthday exceeds 0.5 in a room of 23 or more people.

So how large should a randomly chosen group of people be, to make it more likely than not that at least two of them share a birthday? The answer is 23.
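All of these figures come from the same product formula, which a short function computes exactly:

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_different = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_different *= (days - i) / days
    return 1 - p_all_different

for n in (4, 16, 23, 32, 40):
    print(n, round(p_shared_birthday(n), 3))
# 23 is the smallest group size where the probability passes one half.
```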

The intuition behind this is quite straightforward if we recognise just how many pairs of people there are in a group of 23 people, any pair of which could share a birthday.

In a group of 23 people, there are in fact 253 pairs of people to choose from.  Therefore, a group of 23 people generates 253 chances, each of size 1/365, of having at least two people in the group sharing the same birthday.

The Birthday Problem is in this way notable for being a classic example of the Multiple Comparisons Fallacy. This fallacy arises when, in looking at many variables, the number of possible correlations being tested is under-estimated. In particular, multiple comparisons arise when a statistical analysis involves multiple simultaneous statistical tests, each of which has the potential to produce a 'discovery.' For example, with a thousand variables, there are almost half a million (1,000 x 999/2) potential pairs of variables that might appear correlated by chance alone. While each pair is extremely unlikely in itself to show dependence, out of half a million pairs it is very possible that a large number will appear to be dependent. Say, for example, 20 or more comparisons are made, each at a 95% confidence level. In this case, we may well get at least one false positive by chance. This becomes a fallacy when that false positive is treated as a significant finding rather than a statistical near-inevitability. It can be addressed by the use of more sophisticated statistical tests.
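The arithmetic behind that last point can be made explicit. With k independent tests, each at a 95% confidence level, the chance of at least one false positive is 1 - 0.95^k:

```python
# Probability of at least one false positive among k independent tests,
# each run at a 95% confidence level (5% false-positive rate).
for k in (1, 20, 100):
    p_at_least_one = 1 - 0.95 ** k
    print(k, round(p_at_least_one, 3))
# 1 -> 0.05, 20 -> 0.642, 100 -> 0.994
```

With 20 tests a false positive is already more likely than not, and with 100 it is close to certain, which is exactly the birthday-problem mechanism at work.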

To summarize the Birthday Problem: in a group of 23 people (assuming each of their birthdays is an independently chosen day of the year, with all days equally likely), there is in fact a greater than 50 per cent chance that at least two of the group share the same birthday. This seems counter-intuitive, since it is rare to meet someone who shares your birthday. Indeed, if you select two random people, the chance that they share a birthday is about 1 in 365. With 23 people, however, there are 253 (23 x 22/2) pairs of people who might have a common birthday. So by looking across the whole group, we are checking whether any one of these 253 pairings, each of which independently has a tiny chance of coinciding, does indeed match. Because there are so many possible pairs, it becomes more likely than not for a coincidental match to arise. For a group of 40 people, say, it is more than eight times as likely that at least two share a birthday than that none do.

To be technical about it, in a group of 23 people there are, according to the standard formula, 23C2 (read '23 choose 2') pairs of people.

Generally, the number of ways k things can be chosen from n is:

nCk = n! / ((n – k)! k!)

Here n! (n factorial) is n x (n – 1) x (n – 2) x … down to 1. Similarly for k! and (n – k)!.

Thus, 23C2 = 23! / (21! x 2!) = (23 x 22) / 2 = 253
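Python's standard library exposes this formula directly as math.comb, which makes it easy to check these pair counts, including the 'half a million' figure mentioned earlier:

```python
import math

# math.comb(n, k) is n! / ((n - k)! k!), the number of ways to
# choose k items from n.
print(math.comb(23, 2))    # 253 pairs in a group of 23
print(math.comb(40, 2))    # 780 pairs in a group of 40
print(math.comb(1000, 2))  # 499500: the 'half a million' pairs of variables
```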

These chances have some overlap: if A and B have a common birthday, and A and C have a common birthday, then inevitably so do B and C.

So the probability of at least two people sharing a birthday in a group of 23 is less than 253/365 (69.3%).

The probability that no two people in the group of 23 share a birthday is then approximately:

(364/365)^253 = 0.4995

Essentially, making 253 comparisons and having them all come up different is like tossing a coin that lands heads with probability 364/365 and getting heads 253 times in a row.

The probability of two people having different birthdays is 1 – 1/365 = 364/365 = 0.99726.

The probability of 23 people all having different birthdays is therefore approximately (364/365)^253 = 0.4995.

The probability that at least two of the 23 people share the same birthday = 1 – 0.4995 = 0.5005, or just over 50% (the exact calculation gives 50.7%).
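Both the pairwise approximation and the exact product can be checked in a few lines:

```python
# Pairwise approximation: treat the 253 pair comparisons as independent.
approx = 1 - (364 / 365) ** 253
print(round(approx, 4))  # 0.5005

# Exact calculation: the 23 birthdays must all be different.
p_all_different = 1.0
for i in range(23):
    p_all_different *= (365 - i) / 365
exact = 1 - p_all_different
print(round(exact, 4))   # 0.5073
```

The approximation slightly understates the true probability because, as noted above, the 253 pairwise events overlap rather than being fully independent, but both calculations put the answer just above one half.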

So the next time you see two football teams line up, with the referee, it is more likely than not that two of those on the pitch share the same birthday.

Exercise

What is the probability that at least two of a randomly selected group of 24 people share a birthday? Assume that all dates in the calendar are equally likely as birthdays, and ignore the Leap Year date of February 29th.

 

References and Links

Probability and the Birthday Paradox. Scientific American. March 29, 2012. https://www.scientificamerican.com/article/bring-science-home-probability-birthday-paradox/

Understanding the Birthday Paradox. Better Explained. https://betterexplained.com/articles/understanding-the-birthday-paradox/

Birthday Problem. Wikipedia. https://en.wikipedia.org/wiki/Birthday_problem

Multiple Comparisons Fallacy. In: Paradoxes of Probability and other statistical strangeness. The Conversation. Woodcock, S. April 4, 2017. https://theconversation.com/paradoxes-of-probability-and-other-statistical-strangeness-74440

Multiple Comparisons Fallacy. Logically Fallacious. https://www.logicallyfallacious.com/tools/lp/Bo/LogicalFallacies/130/Multiple-Comparisons-Fallacy

The Multiple Comparisons Fallacy. Fallacy Files. http://www.fallacyfiles.org/multcomp.html

The Misleading Effect of Noise: The Multiple Comparisons Problem. Koehrsen, W. Feb. 7, 2018. https://towardsdatascience.com/the-multiple-comparisons-problem-e5573e8b9578

Multiple Comparisons. https://youtu.be/EMzcZFtGZZE

The Multiple Comparisons Problem. https://youtu.be/dzi1CSvzCoU