
Let us invent a little crime story in which you are a follower of Bayes and you have a friend in a spot of trouble. In this story, you receive a telephone call from your local police station. You are told that your best friend of many years is helping the police with their investigation into a case of vandalism of a shop window, in a street adjoining the one where you know she lives. It took place at noon that day, which you know is her day off work. You had heard about the incident earlier but had no good reason at the time to believe that your friend was in any way linked to it.

She next comes to the telephone and tells you she has been charged with smashing the shop window, based on the evidence of a police officer who positively identified her as the culprit. She claims mistaken identity. You must evaluate the probability that she did commit the offence before deciding how to advise her. So the condition is that she has been charged with criminal damage; the hypothesis you are interested in evaluating is the probability that she did it. Bayes’ Theorem, of course, helps to answer this type of question.

There are three things to estimate. The first is the Bayesian prior probability (which we represent as ‘a’). This is the probability you assign to the hypothesis being true before you become aware of the new information. In this case, it means the probability you would assign to your friend breaking the shop window immediately before you got the new information from her on the telephone that she had been charged on the basis of the witness evidence.

The second is the probability that the new evidence would have arisen if the hypothesis was true (which we represent as ‘b’). In this case, you need to estimate the probability of the police officer identifying your friend if your friend actually did break the window.

The third is to estimate the probability that the new evidence would have arisen if the hypothesis was false (which we represent as ‘c’). In this case, you need to estimate the probability of the police officer identifying your friend if your friend did NOT break the window.

According to Bayes’ Theorem: Posterior probability = ab/[ab + c(1-a)]
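Before working through the story, the formula can be sketched as a one-line Python function (the function and parameter names below are my own, chosen to match the a, b, c notation used here, not anything standard):

```python
def posterior(a, b, c):
    """Bayes' Theorem in the article's notation.

    a -- prior probability that the hypothesis is true, P(H)
    b -- probability of the evidence if the hypothesis is true, P(E|H)
    c -- probability of the evidence if the hypothesis is false, P(E|H')
    """
    return (a * b) / (a * b + c * (1 - a))
```

With perfectly diagnostic evidence (b = 1, c = 0), the function returns 1 for any non-zero prior, which is a quick sanity check that it behaves as the theorem requires.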

So let’s apply Bayes’ Theorem to the case of the shattered shop window. Let’s start with a. Well, you have known her for years, and it is totally out of character, although she does live just a stone’s throw from the shop, and it is her day off work, so she could in principle have done it. Let’s say 5% (0.05). Assigning the prior probability is fraught with problems, however, as awareness of the new information might easily affect the way you assess the prior information. You need to make every effort to estimate this probability as it would have been before you received the new information. You also have to be precise as to the point in the chain of evidence at which you establish the prior probability.

What about b? This is the probability of the new evidence if the hypothesis was true. What is the hypothesis? That your friend broke the window. What is the new evidence? That the police officer has identified your friend as the person who smashed the window. So b is an estimate of the probability that the police officer would have identified your friend if she was indeed guilty. If she threw the brick, it’s easy to imagine how she came to be identified by the police officer. Still, he wasn’t close enough to catch the culprit at the time, which should be borne in mind. Let’s say that the probability he would identify her, given that she is guilty, is 80% (0.8).

Let’s move on to c. This is the probability of the new evidence if the hypothesis was false. What is the hypothesis again? That your friend broke the window. What is the new evidence again? That the police officer has identified your friend as the person who did it. So c is an estimate of the probability that the police officer would have identified her if she was not the guilty party, i.e. a false identification. If your friend didn’t shatter the window, how likely is the police officer to have wrongly identified her when he saw her in the street later that day? It is possible that he would see someone of similar age and appearance, wearing similar clothes, and jump to the wrong conclusion, or he may just want to identify someone to advance his career. Let us estimate the probability as 15% (0.15).

Once we’ve assigned these values, Bayes’ theorem can now be applied to establish a posterior probability. This is the number that we’re interested in. It is the measure of how likely is it that your friend broke the window, given that she’s been identified as the culprit by the police officer and charged on the basis of this evidence.

Given these estimates, we can use Bayes’ Theorem to update our probability that our friend is guilty to 21.9%, despite assigning a reliability of 80% to the police officer’s identification.

The most interesting takeaway from this application of Bayes’ Theorem is the relatively low probability you should assign to the guilt of your friend, even though you were 80% sure that the police officer would identify her if she was guilty, and assigned only a 15% chance that he would falsely identify her. The clue to this counter-intuitive result lies in the prior probability (or ‘prior’) you would have attached to the guilt of your friend before you were confronted with the charge based on the evidence of the police officer. If a new piece of evidence now emerges (say a second witness), you should again apply Bayes’ Theorem to update to a new posterior probability, gradually converging, based on more and more pieces of evidence, ever nearer to the truth.
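This chaining of updates can be sketched in a few lines of Python: each posterior becomes the prior for the next piece of evidence. The second witness’s reliability figures (0.9 and 0.1) are invented purely for illustration; only the 0.05, 0.8 and 0.15 come from the story:

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """One application of Bayes' Theorem: ab / [ab + c(1 - a)]."""
    num = prior * p_e_given_h
    return num / (num + p_e_given_not_h * (1 - prior))

# First piece of evidence: the police officer's identification.
p = posterior(0.05, 0.8, 0.15)   # roughly 0.219, as in the story

# Hypothetical second witness (0.9 / 0.1 are invented figures):
# the posterior from the first update serves as the new prior.
p = posterior(p, 0.9, 0.1)       # roughly 0.72
```

Notice how quickly two independent, reasonably reliable identifications move the probability from an unlikely 5% prior to well over one half, which is exactly the convergence towards the truth described above.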

It is, of course, all too easy to dismiss the implications of this hypothetical case on the grounds that it was just too difficult to assign reasonable probabilities to the variables. But that is what we do implicitly when we don’t assign numbers. Bayes’ Theorem is not at fault for this in any case. It will always correctly update the probability of a hypothesis being true whenever new evidence is identified, based on the estimated probabilities. In some cases, such as the crime case illustrated here, that is not easy, though the approach you adopt to revising your estimate will always be better than using intuition to steer a path to the truth.

In many other cases, we do know with precision what the key probabilities are, and in those cases we can use Bayes’ Theorem to identify with precision the revised probability based on the new evidence, often with startlingly counter-intuitive results. In seeking to steer the path from ignorance to knowledge, the application of Bayes is always the correct method.

Appendix

The simple algebraic expression we have used in this setting is:

ab/[ab+c(1-a)]

a is the prior probability of the hypothesis (she’s guilty) being true. This is more traditionally represented by the notation P(H). In the example, a = 0.05.

b is the probability the police officer identifies her conditional on the hypothesis being true, i.e. she’s guilty. This is more traditionally represented by the notation P(E|H), i.e. the probability of E (the evidence) given that the hypothesis H is true. In the example, b = 0.8.

c is the probability the police officer identifies her conditional on the hypothesis not being true, i.e. she’s not guilty. This is more traditionally represented by the notation P(E|H’), i.e. the probability of E (the evidence) given that the hypothesis is false (H’). In the example, c = 0.15.

In our example, a = 0.05, b = 0.8, c = 0.15

Using Bayes’ Theorem, the updated (posterior) probability that the friend is guilty is:

ab/[ab + c(1-a)] = (0.05 × 0.8)/(0.05 × 0.8 + 0.15 × 0.95) = 0.04/(0.04 + 0.1425) = 0.04/0.1825

Posterior probability = 0.219 = 21.9%
