Bayes and the Testing Problem – in a nutshell.
Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.
Let’s say a patient goes to see the doctor. The doctor performs a test for a flu virus on all his patients, estimating that only 1 per cent of the people who visit his surgery have the virus. The test he gives them, however, is 99 per cent accurate – that is, 99 per cent of people who are sick test positive and 99 per cent of healthy people test negative. Now the question is: if the patient tests positive, what chance should the doctor give that the patient has the flu virus?
The intuitive answer is 99 per cent. But is that right?
The information we are given is ‘the probability of testing positive given that you have the virus’. What we want to know, however, is ‘the probability of having the virus given that you tested positive’. Common intuition conflates these two probabilities, but they are in fact very different. A test that is 99% accurate in the sense that 99% of sick people test positive is NOT the same as a test for which 99% of people who test positive are sick. Confusing the two is known as the ‘Inverse Fallacy’ or ‘Prosecutor’s Fallacy’. It is the fallacy, to which jurors are very susceptible, of believing that the probability of a defendant being guilty of a crime in light of some piece of evidence is the same as the probability of observing that piece of evidence if the defendant were guilty. These are in fact very different things, and the two probabilities can diverge markedly.
So what is the probability of having the virus if you test positive, given that the test is 99% accurate (i.e. 99% of people who have the virus test positive and 99% of people who do not have the virus test negative)?
To answer this we can use Bayes’ Theorem.
The (posterior) probability that a hypothesis is true after obtaining new evidence, according to the a,b,c formula of Bayes’ Theorem, is equal to:
ab / [ab + c(1 - a)]
a is the prior probability, i.e. the probability that a hypothesis is true before you see the new evidence. Before the new evidence (the test), this chance is estimated at 1 in 100 (0.01), as we are told that 1 per cent of the people who visit his surgery have the virus. So, a = 0.01
b is the probability of the new evidence if the hypothesis is true. The probability of the new evidence (the positive result on the test) if the hypothesis is true (the patient is sick) is 99%, since the test is 99% accurate. So, b = 0.99
c is the probability of the new evidence if the hypothesis is false. The probability of the new evidence (the positive result on the test) if the hypothesis is false (the patient is not sick) is just 1% (because the test is 99% accurate, and we can only expect a false positive 1 time in 100). So, c = 0.01
Using Bayes’ Theorem, the updated (posterior) probability = ab / [ab + c(1 - a)] = 1/2
So there is actually a 50% chance that the test, which is 99% accurate and has come back positive, has misdiagnosed you and you are flu-free.
Basically, it is a competition between how rare the disease is and how rarely the test is wrong. In this case, there is a 1 in 100 chance that you have the flu before undertaking the test, and the test is wrong 1 time in 100. These two probabilities are equal, so the chance that you actually have the flu when testing positive is actually 1 in 2, despite the test being 99% accurate.
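For readers who like to check the arithmetic in code, here is a minimal Python sketch of the a, b, c formula (the function name bayes_posterior is my own labelling, not standard notation):

```python
def bayes_posterior(a: float, b: float, c: float) -> float:
    """Posterior probability via Bayes' Theorem: ab / [ab + c(1 - a)].

    a: prior probability the hypothesis is true
    b: probability of the evidence if the hypothesis is true
    c: probability of the evidence if the hypothesis is false
    """
    return (a * b) / (a * b + c * (1 - a))

# Flu-test example: 1% prior, 99% accurate test.
print(bayes_posterior(a=0.01, b=0.99, c=0.01))  # 0.5
```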
But what if the patient is showing symptoms of the disease before being tested?
In this case, the prior probability should be updated to something higher than the prevalence rate of the disease in the tested population as a whole, and the chance that you are actually sick when you test positive rises accordingly. To the extent that a doctor tests only where there is corroborating support, the likelihood that a positive result is correct grows. For this reason, any positive test result should be taken very seriously, whatever the raw statistics say.
More generally, the ‘False Positive’ problem can easily lead to false convictions based on forensic evidence. Let’s say that there has been a theft from a secure storage facility, and we test everyone who could potentially have had access – 100 people in all. Without any other evidence, we can assign a prior probability that the suspect currently being questioned is guilty of the crime of 1 in 100, or 0.01.
Forensic evidence now comes in the form of a partial fingerprint inside the office safe. The forensic examination is 95% accurate: a guilty suspect’s print matches the partial print with probability 0.95, while there is a 5% chance that an innocent suspect’s print produces a false match. Applying Bayes’ Theorem, we find that when the 95% accurate forensic test provides a match, the actual probability that the suspect is guilty is just 16%. This makes sense when we consider that testing all 100 suspects would (given the test’s 5% false positive rate) be expected to produce about five false matches. With larger trawls of forensic testing, the likelihood of a false match becomes commensurately higher.
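The same formula handles the forensic numbers. Here is a quick sketch of both the formula and the expected-counts view just described (variable names are mine):

```python
# Forensic example: 100 suspects, one guilty, 95% accurate match.
a = 1 / 100   # prior: one guilty suspect among 100
b = 0.95      # P(match | guilty)
c = 0.05      # P(match | not guilty), the false positive rate

posterior = (a * b) / (a * b + c * (1 - a))
print(round(posterior, 2))  # 0.16

# Same answer via expected counts: 0.95 true matches vs 4.95 false ones.
print(round(0.95 / (0.95 + 0.05 * 99), 2))  # 0.16
```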
More generally, to separate truth from scare stories we really do need to understand and employ Bayes’ Theorem. Whether at the doctor’s surgery or in the jury room, understanding it really could save a life.
Appendix
In the original setting with the test results showing positive for a flu virus, a = 0.01, b = 0.99, c = 0.01. Substituting into Bayes’ equation, ab/[ab+c(1-a)], gives:
Posterior probability = (0.01 × 0.99) / [0.01 × 0.99 + 0.01 × (1 - 0.01)] = 0.0099 / (0.0099 + 0.0099) = 1/2
Another way of visualising this problem is by constructing a simple box diagram for a population of 10,000 patients. Of these, 1%, or 100, have the flu virus and 9900 do not. These are inserted into the Total column. There is a 1% error rate, so 1% of the 9900 who do not have the flu virus test positive. Hence the remaining 9801 test negative. Of the 100 who actually have the flu virus, one tests negative (because of the error rate) and the remaining 99 correctly test positive. See below.
|  | Test positive | Test negative | Total |
| --- | --- | --- | --- |
| Has flu virus | 99 | 1 | 100 |
| No flu virus | 99 | 9801 | 9900 |
| Total | 198 | 9802 | 10,000 |
It is now easy to see that of the 198 who test positive, exactly half (99) actually have the flu virus. The other half are false positives.
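The box diagram translates directly into code. A short sketch under the same assumptions (1% prevalence, 99% accuracy, 10,000 patients):

```python
# Expected counts for 10,000 patients, as in the box diagram above.
population = 10_000
prevalence = 0.01
accuracy = 0.99

has_flu = population * prevalence            # 100
no_flu = population - has_flu                # 9900

true_positives = has_flu * accuracy          # 99
false_positives = no_flu * (1 - accuracy)    # 99

# Of all who test positive, exactly half actually have the virus.
print(round(true_positives / (true_positives + false_positives), 2))  # 0.5
```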
Let’s take another example.
The probability of a true positive (test comes back positive for virus and the patient has the virus) is 90%. The chance that it gives a false negative (test comes back negative yet the patient has the virus) is 10%. The chance of a false positive (test comes back positive yet the patient does not have the virus) is 7%. The chance of a true negative (test comes back negative and the patient does not have the virus) is 93%.
The probability that a random patient has the virus based on the prevalence of the virus in the tested population is 0.8%.
Here, a = 0.8% (0.008) – this is the prior probability
b = 90% (0.9) – probability of a true positive
c = 7% (0.07) – probability of a false positive
So, updated probability that the patient has the virus given the positive test result =
ab / [ab + c(1 - a)] = (0.008 × 0.9) / [0.008 × 0.9 + 0.07 × (1 - 0.008)]
= 0.0072 / (0.0072 + 0.06944) = 0.0072 / 0.07664 = 0.0939 = 9.39%
This can be shown using the raw figures to produce the same result. We can choose any number for total tested, and the result is the same. Let’s choose 1 million, say, as the number tested.
So total tested = 1,000,000
Total with virus = 0.008 × 1,000,000 = 8,000
Total without virus = 1,000,000 - 8,000 = 992,000
True positives = 0.9 × 8,000 = 7,200
False positives = 0.07 × 992,000 = 69,440
Total tested positive = 7,200 + 69,440 = 76,640
Updated (posterior) probability that a patient who tests positive has the virus = True positives / Total positives = 7,200 / 76,640 = 0.0939 = 9.39%
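A quick sketch checking the raw-figure calculation, with the numbers exactly as above:

```python
# Raw-figure check: 1,000,000 tested, 0.8% prevalence,
# 90% true positive rate, 7% false positive rate.
tested = 1_000_000
with_virus = 0.008 * tested           # 8,000
without_virus = tested - with_virus   # 992,000

true_pos = 0.9 * with_virus           # 7,200
false_pos = 0.07 * without_virus      # 69,440

print(true_pos / (true_pos + false_pos))  # 0.0939... = 9.39%
```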
In the forensic match example, we can construct a similar box table. Out of a population of 100 suspects, one is guilty and 99 are not guilty. These are inserted into the Total column. There is a 5% error rate in the forensic match, so there is a 0.95 chance of a match if the suspect is guilty (top left). There is a 5% chance that each of the 99 innocent suspects provides a match (0.05 × 99 = 4.95), leaving 94.05 as the number for the Not guilty/No match cell.
|  | Match | No match | Total |
| --- | --- | --- | --- |
| Guilty | 0.95 | 0.05 | 1 |
| Not guilty | 4.95 | 94.05 | 99 |
| Total | 5.9 | 94.1 | 100 |
So the chance that the suspect provides a match and is actually guilty is the proportion of those guilty and matching out of all those matching (0.95/5.9 = 0.16).
So even though the 95% accurate forensic test produces a match for the suspect, his actual probability of guilt on these figures is just 16%.
Using Bayes’ Theorem, we reach the same conclusion. Substituting into Bayes’ equation gives:
P(Guilty | Match) = (0.01 × 0.95) / [0.01 × 0.95 + 0.05 × (1 - 0.01)] = 0.0095 / (0.0095 + 0.0495) = 0.0095 / 0.059 = 0.16
So P(Guilty | Match) = 0.16
P(Not guilty | Match) = 0.84
Sensitivity and Specificity
In analysing false positives and false negatives, especially in a medical context, the concepts of sensitivity and specificity are often used.
Sensitivity (also termed the true positive rate) is the proportion of those who actually have the condition who receive a positive test result. In a medical context, it is the proportion of people with a condition who are correctly identified (test positive) as having that condition.
Specificity (also termed the true negative rate) is the proportion of those who do not have the condition who receive a negative test result. In a medical context, it is the proportion of people without a condition who are correctly identified (test negative) as not having that condition.
Thus, sensitivity quantifies the avoidance of false negatives, and specificity the avoidance of false positives. There is usually a trade-off between the two measures. For example, an airport security scanner set so sensitively that it flags low-risk items such as keys will raise many false alarms (low specificity), but it will almost certainly flag high-risk items such as guns (high sensitivity). A perfect predictor would identify all genuine cases while triggering no false alarms.
Say that TP is someone who has a disease and tests positive for it (True Positive). FN is someone who has a disease and tests negative for it (False Negative). FP is someone who does not have the disease but tests positive for it (False Positive). TN is someone who does not have the disease and tests negative for it (True Negative).
In this case, Sensitivity (True Positive Rate) = TP/(TP+FN), i.e. the probability of a positive test given that the patient has the disease. It is a function of the characteristics of the test itself. Because it is calculated only over those who actually have the disease, it is not affected at all by the prevalence of the disease.
Specificity (True Negative Rate) = TN/(TN+FP), i.e. the probability of a negative test given that the patient does not have the disease.
Sensitivity is not the same as Precision (Positive Predictive Value, PPV), which is the ratio of true positives to combined true and false positives.
PPV = TP/(TP+FP)
PPV is the proportion of those who test positive who actually have the disease, i.e. the probability that you have the disease given that you have tested positive for it.
NPV (Negative Predictive Value) = TN/(TN+FN)
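A minimal sketch computing all four measures from the cells of a 2×2 table (the function name test_metrics is my own), using the flu box diagram’s counts as input:

```python
def test_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Headline diagnostic measures from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Counts from the flu box diagram: 99 TP, 1 FN, 99 FP, 9801 TN.
print(test_metrics(tp=99, fn=1, fp=99, tn=9801))
# sensitivity 0.99, specificity 0.99, ppv 0.5, npv ~0.9999
```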
Positive and negative predictive values, by contrast, are affected by the prevalence of the disease in the community, and so are not simply a function of the characteristics of the test itself. When comparing one test with another in terms of positive and negative predictive value, you therefore need to be looking at the same population group, or at least population groups with the same incidence of disease.
Now, a Likelihood Ratio is the probability of a given test result in a person who has the condition divided by the probability of that same result in a person who does not. In medicine, Likelihood Ratios can be used to determine whether a test result usefully changes the probability that a condition exists.
Two versions of the Likelihood Ratio (Positive LR and Negative LR) exist, one for positive and one for negative test results.
The positive likelihood ratio is calculated as:
LR+ = sensitivity/(1-specificity), which is equivalent to:
LR+ = P(T+ | D+) / P(T+ | D-)
i.e. LR+ is the probability of a person who has the condition testing positive divided by the probability of a person who does not have the condition testing positive.
The negative likelihood ratio is calculated as:
LR- = (1-sensitivity)/specificity, which is equivalent to:
LR- = P(T- | D+) / P(T- | D-)
i.e. LR- is the probability of a person who has the condition testing negative divided by the probability of a person who does not have the condition testing negative.
The pre-test odds of a particular diagnosis, multiplied by the likelihood ratio, determines the post-test odds.
Post-test odds = Pre-test odds × LR (using LR+ for a positive test result and LR- for a negative one)
Odds = P (something is true) / P (something is false)
Probability = Odds / (1 + Odds)
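Here is a minimal sketch of this odds machinery (helper names are my own), applied to the original flu test, where it reproduces the 50% posterior from the start of the piece:

```python
def prob_to_odds(p: float) -> float:
    return p / (1 - p)

def odds_to_prob(odds: float) -> float:
    return odds / (1 + odds)

# Original flu test: sensitivity and specificity both 0.99, 1% prior.
sensitivity, specificity = 0.99, 0.99
lr_pos = sensitivity / (1 - specificity)   # LR+ = 99
lr_neg = (1 - sensitivity) / specificity   # LR- ~ 0.0101

post_test_odds = prob_to_odds(0.01) * lr_pos  # pre-test odds x LR+
print(round(odds_to_prob(post_test_odds), 3))  # 0.5
```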
Exercise
Question 1.
A patient goes to see the doctor. The doctor performs a test for a flu virus on all his patients, estimating that only 1 per cent of the people who visit his surgery have the flu. The test he gives them, however, is 95 per cent reliable – that is, 95 per cent of people who are sick test positive and 95 per cent of the healthy people test negative. If a patient tests positive, what is the probability that the patient actually has the flu?
Question 2.
A tennis tournament administers a test for banned drugs to all of the tournament entrants. The test is 90% accurate if a person is using the banned drugs, and 85% accurate if the person is not using them. 10 per cent of all tournament entrants are in fact using the banned drugs. Now, what is the probability that an entrant is using drugs if they test positive?
Question 3.
66 people have the flu and test positive for it. Four people have the flu and test negative for it. Three people don’t have the flu but test positive for it. 827 people don’t have the flu and test negative for it.
- What is the Sensitivity of the test?
- What is the Specificity of the test?
- What is the Positive Predictive Value?
- What is the Negative Predictive Value?
- What is the Positive Likelihood Ratio?
- What is the Negative Likelihood Ratio?
- What are the Pre-Test Odds a person has the flu?
- What are the Post-Test Odds a person has the flu?
Question 4.
1,000 people are tested for the flu. 100 people have the flu. Of these, 90 test positive and 10 test negative. 900 do not have the flu. 150 of these test positive, and 750 test negative.
- What is the Sensitivity of the test?
- What is the Specificity of the test?
Question 5.
610 people have the virus and test positive. 118 people have the virus and test negative. 13,212 people do not have the virus but test positive. 127,344 people do not have the virus and test negative.
- What is the Sensitivity of the test?
- What is the Specificity of the test?
- What is the Positive Likelihood Ratio?
- What is the Negative Likelihood Ratio?
- What are the Pre-Test Odds a person has the virus?
- What are the Post-Test Odds a person has the virus?
- Now, say that the doctor examines the person before administering the test and assigns a 30% pre-test probability that he has the virus. Assuming this estimate is accurate, what are the Pre-Test Odds that the person has the virus?
- If he tests positive, what are the Post-Test Odds that this person has the virus?
- What is the Post-Test probability that this person has the virus?
- Say the person who has been assigned a 30% pre-test probability of having the virus instead tests negative. What are the Post-Test Odds now that he has the virus?
- What is the Post-Test probability that he has the virus?