Bobby Smith, aged 8, is a good schoolboy basketball player, but you know that only one in a thousand such 8-year-olds go on to become professional players. So you would like to get an unbiased assessment of his real chance of developing into a top player. A coach tells you there is a test, taken by all good 8-year-old players, that can measure the child’s potential. If the test were perfect, everyone who received an A+ on the test would go on to become a pro player. In fact, it is 95% accurate, in the sense that 5% of those taking the test will receive an A+ score and fail to become professional basketball players. Still, this is a very small percentage. Unfortunately, though, anyone failing to score A+ has no chance of becoming a pro player.

Bobby takes the test and is graded A+.

So what is the actual chance that Bobby will become a professional basketball player?

If you are like most people, you will think the chance is very high.

This is your reasoning: I don’t really know whether Bobby is likely to turn into a professional player or not. But he has taken this test. In fact, no professional player could have scored below A+, and the test only very rarely allocates a top grade to a child who will not become a professional basketball player. If the test is really this good, therefore, it looks like Bobby will have a bright future as a basketball star.

Is this true? Think of it this way. If there were no test, you would have asked the coach a very basic question: in your experience, what is the chance that Bobby will become a professional player? The coach would have dampened your enthusiasm: one in a thousand, he would have said. But with the test result in hand, there’s no need to ask this question. It’s irrelevant in the face of a very accurate test result, isn’t it?

In fact, this is a well-known fallacy, another example of the Inverse Fallacy, or Prosecutor’s Fallacy. The fallacy is to confuse the probability of a hypothesis being true, given some evidence, with the probability of the evidence arising given the hypothesis is true.

In our example, the hypothesis is that Bobby will become a professional player, and the evidence is the high test score. What we want to know is the probability that Bobby will become a pro player, given that the test says he will be. What we know, on the other hand, is the probability that Bobby will score A+ on the test, given that he will become a professional player. The coach told you that this probability is 100%: all future professional players will score A+ on the test. In answering your other question, the coach also told you that 5% of those taking the test will score A+ yet fail to progress to the professional game. This is a small percentage. So you take this information and conclude that Bobby is very likely to turn into a top player.

In fact, of the thousand children who took the test, only one (statistically speaking) will become a professional player. The test for an A+ is 95% accurate in identifying a future pro player, in the sense that 5% of the 1,000 children will score A+ and not become professional players, i.e. there will be 50 ‘false positives’. Anyone who will become a pro basketball player, on the other hand, will score A+ on the test.

So what is the chance that Bobby will become a professional basketball player if he scores A+ on the test?

Solution: 50 children who will not become professional basketball players score A+ (the 50 ‘false positives’). Only one of the one thousand eight-year-olds who take the test develops into a professional player, and that child will score A+. Look at it this way. A thousand 8-year-olds take the test and of these 50 of them will receive a letter telling them they have scored A+ on the test but will not develop into top players. One child will receive a letter with a score of A+ and actually will go on to become a professional player. Therefore the probability Bobby will become a top basketball player if he scores A+ is just 1 in 51, i.e. 1.96%.
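The counting argument above can be checked with a few lines of Python. This is only a minimal sketch using the figures from the text; the variable names are illustrative and, as in the text, the 5% is applied to all 1,000 children.

```python
from fractions import Fraction

# Figures taken from the text; the variable names are illustrative.
children_tested = 1000                  # good 8-year-old players taking the test
future_pros = 1                         # one in a thousand turns professional
false_positive_rate = Fraction(5, 100)  # share who score A+ but will not turn pro

# Every future pro scores A+, so all of them are true positives.
true_positives = future_pros

# As in the text, 5% of the 1,000 children score A+ without turning pro.
false_positives = false_positive_rate * children_tested

# Chance of turning pro given an A+: true positives over all A+ scores.
chance_given_a_plus = Fraction(true_positives) / (true_positives + false_positives)

print(chance_given_a_plus)                  # 1/51
print(f"{float(chance_given_a_plus):.2%}")  # 1.96%
```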

This is a similar idea to the medical ‘false positives’ problem.

In the equivalent flu version of the problem, a thousand people go to the doctor and all are tested for flu. Only one actually has the flu, and those with the flu always test positive. We know that the test for flu is 95% accurate, in the sense that 5% of the 1,000 people will test positive and not have the flu, i.e. there will be 50 ‘false positives’. So what is the chance someone has the flu if they test positive? In this case, 50 people who do not have the flu test positive, and one person who has the flu tests positive. Therefore, the probability you have the flu if you test positive is 1 in 51, i.e. 1.96%.

We can also solve the Bobby Smith problem using Bayes’ Theorem. The (posterior) probability that a hypothesis is true after obtaining new evidence, according to the a,b,c formula of Bayes’ Theorem, is equal to:

ab / [ab + c(1 - a)]

a is the prior probability, i.e. the probability that a hypothesis is true before the new evidence. b is the probability of the new evidence if the hypothesis is true. c is the probability of the new evidence if the hypothesis is false.
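The a, b, c formula can be written as a small Python function. This is a sketch rather than a definitive implementation: the function name `posterior` and the input checks are assumptions added for illustration, and the example call uses the flu figures quoted earlier.

```python
def posterior(a: float, b: float, c: float) -> float:
    """Posterior probability from the a, b, c form of Bayes' Theorem.

    a: prior probability that the hypothesis is true
    b: probability of the evidence if the hypothesis is true
    c: probability of the evidence if the hypothesis is false
    """
    for name, value in (("a", a), ("b", b), ("c", c)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be a probability between 0 and 1")
    evidence = a * b + c * (1 - a)  # total probability of seeing the evidence
    if evidence == 0:
        raise ValueError("the evidence has zero probability under these inputs")
    return a * b / evidence


# Flu example from the text: 1 in 1,000 has flu, everyone with flu tests
# positive, and 5% of those without flu also test positive.
print(round(posterior(a=0.001, b=1.0, c=0.05), 4))  # 0.0196
```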

In the case of the Bobby Smith problem, the hypothesis is that Bobby will develop into a professional player.

Before the new evidence (the test), this chance is 1 in 1000 (0.001).

So a = 0.001

The probability of the new evidence (the A+ score on the test) if the hypothesis is true (Bobby will become a professional player) is 100%, since all professional players score A+ on the test.

So b = 1

The probability we would see the new evidence (the A+ score on the test) if the hypothesis is false (Bobby will not become a professional player) is 5%, since the test is 95% accurate in spotting future professional players.

So c = 0.05

Substituting into Bayes’ equation gives:

Posterior probability = ab / [ab + c(1 - a)] = 0.001 x 1 / [0.001 x 1 + 0.05 (1 – 0.001)] = 0.0196

So, using Bayes’ Theorem, the chance that Bobby Smith, who scored A+ on the test which is 95% accurate, will actually become a top player, is not 95% as intuition might suggest, but just 1.96%, as we have shown previously by a different route.

There is, therefore, just a 1.96 per cent chance that Bobby Smith will go on to become a professional basketball player, despite scoring A+ on that very accurate test of player potential.
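To see how strongly this answer depends on the prior, the sketch below reruns the same formula with different starting chances. Only the 1-in-1,000 prior comes from the problem as stated; the other two priors are hypothetical values chosen purely for illustration, and the function name `posterior` is likewise an assumption.

```python
def posterior(a, b, c):
    """a, b, c form of Bayes' Theorem: ab / [ab + c(1 - a)]."""
    return a * b / (a * b + c * (1 - a))


b = 1.0   # every future pro scores A+ (from the text)
c = 0.05  # 5% false positives (from the text)

# Only 0.001 is the prior used in the text; 0.01 and 0.1 are hypothetical.
for prior in (0.001, 0.01, 0.1):
    print(f"prior {prior:>5}: chance after an A+ = {posterior(prior, b, c):.1%}")
```

With the same test, the chance after an A+ rises from about 2% to about 17% and then about 69% as the prior increases, which is why the coach's one-in-a-thousand answer cannot be ignored.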

That’s the statistics, the cold Bayesian logic. Now for the good news. Bobby Smith was the lucky one. He currently plays for the New York Knicks under a different name.

Appendix

We can also solve the Bobby Smith problem using the traditional notation version of Bayes’ Theorem.

P(H|E) = P(E|H) . P(H) / [P(E|H) . P(H) + P(E|H’) . P(H’)]

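Here H is the hypothesis that Bobby will become a professional player, H’ is the hypothesis that he will not, and E is the evidence (the A+ score on the test). Before the new evidence (the test), the chance that H is true is 1 in 1000 (0.001).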

So P(H) = 0.001

The probability of the new evidence (the A+ score on the test) if the hypothesis is true (Bobby will become a professional player) is 100%, since all professional players score A+ on the test.

So P(E|H) = 1

The probability we would see the new evidence (the A+ score on the test) if the hypothesis is false (Bobby will not become a professional player) is 5%, since the test is 95% accurate in spotting future professional basketball players.

So P(E|H’) = 0.05

Substituting into Bayes’ equation gives:

P(H|E) = 0.001 x 1 / [0.001 x 1 + 0.05 (1 – 0.001)] = 0.0196
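The same substitution can be carried out with exact fractions, which avoids any rounding along the way. This is a sketch under the figures given above; the function name `bayes` and its parameter names are assumptions chosen to mirror the traditional notation.

```python
from fractions import Fraction


def bayes(p_h: Fraction, p_e_given_h: Fraction, p_e_given_not_h: Fraction) -> Fraction:
    """P(H|E) = P(E|H) P(H) / [P(E|H) P(H) + P(E|H') P(H')]."""
    p_not_h = 1 - p_h                                    # P(H')
    p_e = p_e_given_h * p_h + p_e_given_not_h * p_not_h  # P(E)
    return p_e_given_h * p_h / p_e


p_h_given_e = bayes(
    p_h=Fraction(1, 1000),              # P(H): one in a thousand
    p_e_given_h=Fraction(1),            # P(E|H): every future pro scores A+
    p_e_given_not_h=Fraction(5, 100),   # P(E|H'): 5% false positives
)

print(p_h_given_e)                   # 20/1019
print(round(float(p_h_given_e), 4))  # 0.0196
```

The exact value, 20/1019, differs very slightly from the 1 in 51 (that is, 20/1020) obtained by counting, because the counting argument applies the 5% to all 1,000 children rather than to the 999 who will not turn professional. Both round to 1.96%.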

Exercise

Lucy Jones, aged 10, is a good school tennis player, but you know that only one in a thousand such 10-year-olds go on to become professional players. So you would like to get an unbiased assessment of her real chance of developing into a top player. A coach tells you there is a test, taken by all good 10-year-old tennis players, that can measure the child’s potential. The test, you learn, is 98 per cent accurate in identifying future professional tennis players, and these always receive a grade of A+.

Lucy takes the test and is graded A+.

How many of the 10-year-olds who get an A+ on the test fail to develop into top players, you ask? Now the coach imparts the good news. All professional players scored A+ on the test as 10-year-olds, and anyone who scores below that can be ruled out as a future professional player. The test is also 98 per cent accurate, so only 2 per cent of those who take the test will get the A+ grade and fail to develop into professional players. So what is the actual chance that Lucy will become a professional tennis player?
