# The Bobby Smith Problem: Bayes in Action

Bobby Smith, aged 8, is a good schoolboy footballer, but you know that only one in a thousand such 8-year-olds go on to become professional players. So you would like to get an unbiased assessment of his real chance of developing into a top player. A coach tells you there is a test, taken by all good 8-year-old footballers, that can measure the child’s potential. The test, you learn, is 95% accurate in identifying future professional footballers, and these always receive a grade of A+.

Bobby takes the test and is graded A+.

How many of the 8-year-olds who will not develop into top players still get an A+, you ask. Now the coach imparts the good news. All current professional players scored A+ when they took the test in their own school days, and we can take it that anyone who scores below that can be ruled out as a future professional player. And the test is 95% accurate, so only 5% of the children who will not become professional footballers get the A+ grade. So what is the actual chance that Bobby will become a top player?

If you are like most people, you will think the chance is very high.

This is your reasoning: I don’t really know whether Bobby is likely to turn into a professional player or not. But he has taken this test. In fact, no current professional player scored below A+, and the test only very rarely allocates a top grade to a child who will not become a professional footballer. If the test is really this good, therefore, it looks like Bobby will have a bright future as a football star.

Is this true? Think of it this way. If there were no test, you would have asked the coach a very basic question: in your experience, what is the chance that Bobby will become a professional player? The coach would have dampened your enthusiasm: one in a thousand, he would have said. But with the test result in hand, there’s no need to ask this question. It’s irrelevant in the face of a very accurate test result, isn’t it?

In fact, this is a well-known fallacy, which psychologists call the *Inverse Fallacy*, or *Prosecutor’s Fallacy*. The fallacy is to confuse the probability of a hypothesis being true, given some evidence, with the probability of the evidence arising, given the hypothesis is true.

In our example, the hypothesis is that Bobby will become a top player, and the evidence is the high test score. What we want to know is the probability that Bobby will become a top player, given that the test says he will be. What we know, on the other hand, is the probability that the test says Bobby will be a top player, given that he will be. The coach told you this probability, on all available evidence, is 100%: the test is in this sense infallible, in that all professional players score A+ on the test. In answering your other question, the coach also told you the probability of an A+ test score, given that the child will not become a top player, is only 5%. You take this information and conclude that Bobby is very likely to turn into a top player.

In fact, of every thousand children who take the test, only one (statistically speaking) will become a professional footballer. The test is 95% accurate, so 5% of the 1,000 children will score A+ and not become top players, i.e. there will be 50 ‘false positives’ (strictly, 5% of the 999 who will not turn professional, which rounds to 50). Anyone who will become a top player, on the other hand, will score A+ on the test.

So what is the chance that Bobby will become a professional footballer if he scores A+ on the test?

Solution: 50 kids who will not become top footballers score A+ (the 50 ‘false positives’). Only one of the one thousand eight-year-olds who take the test develops into a professional player, and that child will score A+. Look at it this way. A thousand 8-year-olds take the test, and of these 50 of them will receive a letter telling them they have scored A+ on the test but will not develop into top players. One child will receive a letter with a score of A+ and actually will go on to become a professional player. Therefore the probability you will become a top footballer if you score A+ is just 1 in 51, i.e. 1.96%.
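The head count above can be checked with a few lines of arithmetic. This is only a minimal sketch: the figures are the ones given in the text, and the variable names are illustrative choices rather than anything from the original.

```python
# Natural-frequency version of the Bobby Smith calculation.
# All figures come from the text; the variable names are illustrative.
children_tested = 1000
future_pros = 1                 # one in a thousand turns professional
false_positive_rate = 0.05      # share of children wrongly graded A+

true_positives = future_pros    # every future professional scores A+
false_positives = round(false_positive_rate * children_tested)  # 50, as in the text

a_plus_total = true_positives + false_positives
chance_given_a_plus = true_positives / a_plus_total

print(f"{true_positives} in {a_plus_total} = {chance_given_a_plus:.2%}")
# prints: 1 in 51 = 1.96%
```

Counting children rather than multiplying probabilities keeps the base rate in view: the single future professional is outnumbered fifty to one among the A+ letters.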

This is the same idea as the medical ‘false positives’ problem.

In that problem, a thousand people go to the doctor and all are tested for flu. Only one actually has the flu, and those with the flu always test positive. We know that the test for flu is 95% accurate, so 5% of the 1,000 people will test positive and not have the flu, i.e. there will be 50 ‘false positives’. One person will test positive who does have the flu. So what is the chance that you have the flu if you test positive?

Solution: 50 people who do not have the flu test positive. One person who has the flu tests positive. Therefore, the probability you have the flu if you test positive is 1 in 51, i.e. 1.96%.

We can also solve the Bobby Smith problem using Bayes’ Theorem. The (posterior) probability that a hypothesis is true after obtaining new evidence, according to the a,b,c formula of Bayes’ Theorem, is equal to:

ab/[ab+c(1-a)]

a is the prior probability, i.e. the probability that a hypothesis is true before the new evidence. b is the probability of the new evidence if the hypothesis is true. c is the probability of the new evidence if the hypothesis is false.

In the case of the Bobby Smith problem, the hypothesis is that Bobby will develop into a professional player.

Before the new evidence (the test), this chance is 1 in 1000 (0.001).

So a = 0.001

The probability of the new evidence (the A+ score on the test) if the hypothesis is true (Bobby will become a professional player) is 100%, since all professional players score A+ on the test.

So b = 1

The probability we would see the new evidence (the A+ score on the test) if the hypothesis is false (Bobby will not become a professional player) is 5%, since the test is 95% accurate in spotting future professional footballers.

So c = 0.05

Substituting into Bayes’ equation gives:

Posterior probability = ab/[ab+c(1-a)] = 0.001 x 1 / [0.001 x 1 + 0.05 x (1 – 0.001)] = 0.0196
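The a,b,c formula translates directly into a short function. The sketch below is one way to write it; the function name `posterior` is a hypothetical choice made for illustration, while a, b and c are the quantities defined in the text.

```python
def posterior(a: float, b: float, c: float) -> float:
    """Return ab / [ab + c(1 - a)].

    a: prior probability that the hypothesis is true
    b: probability of the new evidence if the hypothesis is true
    c: probability of the new evidence if the hypothesis is false
    """
    return (a * b) / (a * b + c * (1 - a))


# Bobby Smith: prior of 1 in 1000, every future professional scores A+,
# and 5% of the children who will not turn professional also score A+.
bobby = posterior(a=0.001, b=1.0, c=0.05)
print(round(bobby, 4))  # 0.0196
```

The unrounded value is slightly above 1 in 51, because the formula applies the 5% to the 999 children who will not turn professional rather than to all 1,000.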

So, using Bayes’ Theorem, the chance that Bobby Smith, who scored A+ on the test which is 95% accurate, will actually become a top player, is not 95% as intuition might suggest, but just 1.96%, as we have shown previously by a different route.

So there is just a 1.96 per cent chance that Bobby Smith will go on to become a professional player, despite scoring A+ on that very accurate test of player potential.
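A simulation offers a third check on the same figure. The sketch below assumes each child independently has a 1 in 1000 chance of turning professional and that children who will not turn professional score A+ with probability 5%; the function name, sample size and seed are arbitrary choices for illustration.

```python
import random


def simulate(n_children: int, seed: int = 0) -> float:
    """Estimate the share of A+ children who turn professional."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    a_plus = 0
    a_plus_and_pro = 0
    for _ in range(n_children):
        will_turn_pro = rng.random() < 0.001
        # Future professionals always score A+; others do so 5% of the time.
        scores_a_plus = will_turn_pro or rng.random() < 0.05
        if scores_a_plus:
            a_plus += 1
            if will_turn_pro:
                a_plus_and_pro += 1
    return a_plus_and_pro / a_plus


estimate = simulate(500_000)
print(f"Estimated chance of turning professional given an A+: {estimate:.4f}")
```

The estimate varies with the seed and sample size, but it should settle near 0.0196 and nowhere near 0.95.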

That’s the statistics, the cold Bayesian logic. Now for the good news. Bobby Smith was the lucky one. He currently plays for Barcelona, under a different name.

**Appendix**

We can also solve the Bobby Smith problem using the traditional notation version of Bayes’ Theorem.

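P(H|E) = P(E|H) x P(H) / [P(E|H) x P(H) + P(E|H’) x P(H’)]

Here H is the hypothesis that Bobby will become a professional player, H’ is the hypothesis that he will not, and E is the new evidence (the A+ score on the test). Before the new evidence, the chance that H is true is 1 in 1000 (0.001).

So P(H) = 0.001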

The probability of the new evidence (the A+ score on the test) if the hypothesis is true (Bobby will become a professional player) is 100%, since all professional players score A+ on the test.

So P(E|H) = 1

The probability we would see the new evidence (the A+ score on the test) if the hypothesis is false (Bobby will not become a professional player) is 5%, since the test is 95% accurate in spotting future professional footballers.

So P(E|H’) = 0.05

Substituting into Bayes’ equation gives:

P(H|E) = 0.001 x 1 / [0.001 x 1 + 0.05 x (1 – 0.001)] = 0.0196

**APPENDIX TO CHAPTER 8**

In the case of the Othello problem, the hypothesis is that Desdemona is guilty of betraying Othello with Cassio. Before the new evidence (the finding of the keepsake), let’s say that Othello assigns a chance of 4% to Desdemona being unfaithful.

So P(H) = 0.04

The probability we would see the new evidence (the keepsake in Cassio’s lodgings) if the hypothesis is true (Desdemona and Cassio are conducting an affair) is, say, 50%.

So P(E|H) = 0.5

The probability we would see the new evidence (the keepsake in Cassio’s lodgings) if the hypothesis is false is, say, just 5%.

So P(E|H’) = 0.05

Substituting into Bayes’ Theorem:

P(H|E) = P(E|H) x P(H) / [P(E|H) x P(H) + P(E|H’) x P(H’)]

P(H|E) = 0.5 x 0.04 / [0.5 x 0.04 + 0.05 x 0.96]

P(H|E) = 0.02 / [0.02 + 0.048] = 0.02 / 0.068 = 0.294

Posterior probability = 0.294.

So, using Bayes’ Rule, and these estimates, the chance that Desdemona is guilty of betraying Othello is 29.4%.

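Now suppose instead that the probability of finding the keepsake in Cassio’s lodgings if the hypothesis is false is just 1%, so that P(E|H’) = 0.01.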

The new Bayesian probability of Desdemona’s guilt now becomes:

P(H|E) = 0.5 x 0.04 / [0.5 x 0.04 + 0.01 x 0.96]

P(H|E) = 0.02 / [0.02 + 0.0096] = 0.02 / 0.0296 = 0.676

Updated probability = 0.676 = 67.6%.
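The two Othello calculations differ only in the probability of the evidence if the hypothesis is false, so they can be run side by side. This is a minimal sketch using the estimates from the text; the function and parameter names are hypothetical choices for illustration.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' Theorem: P(H|E) from P(H), P(E|H) and P(E|H')."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))


# Othello's estimates from the text: P(H) = 0.04 and P(E|H) = 0.5,
# with P(E|H') first at 0.05 and then at 0.01.
for p_e_given_not_h in (0.05, 0.01):
    p = posterior(0.04, 0.5, p_e_given_not_h)
    print(f"P(E|H') = {p_e_given_not_h:.2f} -> P(H|E) = {p:.3f}")
# prints:
# P(E|H') = 0.05 -> P(H|E) = 0.294
# P(E|H') = 0.01 -> P(H|E) = 0.676
```

Cutting the chance of an innocent explanation for the keepsake from 5% to 1% more than doubles the posterior, even though the prior and P(E|H) are unchanged.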