# Bayes in the Courtroom – in a nutshell.

On the 9th of November, 1999, Sally Clark, a 35-year-old solicitor and mother of a young child, was convicted of murdering two of her children. The presiding Judge, Mr. Justice Harrison, declared that “… we do not convict people in these courts on statistics. It would be a terrible day if that were so.” As it turned out, it was indeed a terrible day, for Sally Clark and for the justice system.

The background to the case is that the death of the babies was put down to natural causes, probably SIDS (‘Sudden Infant Death Syndrome’). Later the Home Office pathologist charged with the case became suspicious and Sally Clark was charged with murder and tried at Chester Crown Court. It eventually transpired that essential evidence in her favour had not been disclosed to the defence, but not before a failed appeal in 2000. At a second Appeal, in 2003, she was set free, and the case is now recognised as a huge miscarriage of justice.

So what went wrong?

A turning point in the trial was the evidence given by a key prosecution witness, who argued that the probability of a baby dying of SIDS was 1 in 8,543. So the probability of two babies dying of SIDS was that fraction squared, or 1 in about 73 million. It’s the chance, he argued, “… of backing that long odds outsider at the Grand National … let’s say it’s a 80 to 1 chance, you back the winner last year, then the next year there’s another horse at 80 to 1 and it is still 80 to 1 and you back it again and it wins. Now we’re here in a situation that, you know, to get to these odds of 73 million you’ve got to back that 1 in 80 chance four years running … So it’s the same with these deaths. You have to say two unlikely events have happened and together it’s very, very, very unlikely.”

Perhaps unsurprisingly in the face of this interpretation of the evidence, the jury convicted her and she was sentenced to life in prison.

But the evidence was flawed, as anyone with a basic understanding of probability would have been aware. Even assuming that the 1 in 8,543 figure was accurate (and there are separate reasons to doubt this), one of the basic laws of probability is that you can multiply probabilities in this way only if the events are independent of each other. That would be true here only if the cause of death of the first child was totally independent of the cause of death of the second child. There is no reason to believe this. It assumes no genetic, familial or other innocent link between these sudden deaths at all. That is a basic error of classical probability.

The other error is more insidious, in that it is harder for the layman to detect the flaw in the reasoning. It is the ‘Prosecutor’s Fallacy’, a well-known problem in the theory of conditional probability, and in particular in the application of what is known as Bayesian reasoning, which is discussed in the context of Bayes’ Theorem elsewhere.
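To see how much rests on the independence assumption, here is a minimal sketch in Python. The 1 in 8,543 figure is the one quoted at trial; the value `p_second_given_first` is purely hypothetical, chosen only to show what happens when the two deaths are linked, and is not an estimate from the case.

```python
from fractions import Fraction

# Figure quoted in court for a single SIDS death.
p_sids = Fraction(1, 8543)

# Squaring the figure is valid only if the two deaths are independent.
p_double_independent = p_sids * p_sids

# HYPOTHETICAL: suppose a shared genetic or environmental factor raises the
# chance of a second SIDS death to 1 in 100 once a first has occurred.
p_second_given_first = Fraction(1, 100)
p_double_dependent = p_sids * p_second_given_first

print(f"Assuming independence: 1 in {p_double_independent.denominator:,}")
print(f"Assuming dependence:   1 in {p_double_dependent.denominator:,}")
```

Under the illustrative dependence value, the "1 in 73 million" shrinks by a factor of about 85. The exact size of the change depends entirely on the assumed link, which is the point: the squared figure cannot be trusted without evidence of independence.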

The ‘Prosecutor’s Fallacy’ is to conflate the probability of innocence given the available evidence with the probability of the evidence arising given the fact of innocence. In particular, the following propositions are very different:

- The probability of observing some evidence (the dead children) given that a hypothesis is true (here that Sally Clark is guilty).
- The probability that a hypothesis is true (here that Sally Clark is guilty) given that we observe some evidence (the dead children).

These are totally different propositions, the probabilities of which can and do diverge widely.
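In symbols, writing $H$ for the hypothesis and $E$ for the evidence (notation introduced here for illustration), the two propositions are $P(E \mid H)$ and $P(H \mid E)$. Bayes' Theorem links them:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$

The two quantities coincide only when $P(H) = P(E)$. Whenever the prior probability $P(H)$ is far smaller than the probability of the evidence $P(E)$, treating one as the other overstates $P(H \mid E)$ enormously.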

Notably, the probability of the former proposition is much higher than of the latter. Indeed, the probability of the children dying given that Sally Clark is a child murderer is effectively 1 (100%). However, the probability that she is a child murderer given that the children have died is a whole different picture.

Critically, we need to consider the prior probability that she would kill both babies, i.e. the probability that she would kill her children, before we are given this evidence of sudden death. This is the concept of ‘prior probability’, which is central to Bayesian reasoning. This prior probability must not be viewed through the lens of the later emerging evidence. It must be established on its own merits and then merged through what is known as Bayes’ Theorem with the new evidence.

In establishing this prior probability, we need to ask whether there was any other past indication or evidence to suggest that she was a child murderer, as the number of mothers who murder their children is almost vanishingly small. Without such evidence, the prior probability of guilt should correspond to something like the proportion of mothers in the general population who serially kill their children. This prior probability of guilt is close to zero. In order to update the probability of guilt, given the evidence of the dead children, the jury needs to weigh up the relative likelihood of the two competing explanations for the deaths. Which is more likely: double infant murder by a mother, or double SIDS? In fact, double SIDS is hugely more common than double infant murder. That is not a question that the jury, unversed in Bayesian reasoning or conditional probability, seems to have asked itself. If it did, it reached the wrong conclusion.
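The comparison of the two explanations can be sketched as a small Bayesian calculation. This is a simplified illustration, not a reconstruction of the case: it treats double SIDS and double murder as the only two explanations, it uses the court's own disputed 1 in 73 million figure for double SIDS, and the figure for double murder is a hypothetical placeholder chosen only to reflect the claim that it is the rarer event. The function name `posterior_of_first` is likewise invented for this sketch.

```python
def posterior_of_first(prior_first, prior_second,
                       likelihood_first=1.0, likelihood_second=1.0):
    """Posterior probability of the first of two competing explanations."""
    weight_first = prior_first * likelihood_first
    weight_second = prior_second * likelihood_second
    return weight_first / (weight_first + weight_second)

# The court's own (disputed) figure for double SIDS in one family.
prior_double_sids = 1 / 73_000_000

# HYPOTHETICAL figure for double infant murder by a mother (illustration only).
prior_double_murder = 1 / 500_000_000

# Either explanation, if true, makes the two deaths certain, so both
# likelihoods are left at their default of 1.
p_guilt = posterior_of_first(prior_double_murder, prior_double_sids)
print(f"Posterior probability of guilt: {p_guilt:.3f}")
```

Even taking the prosecution's tiny figure for double SIDS at face value, the posterior probability of guilt under these assumed inputs is nowhere near the near-certainty that "1 in 73 million" suggests, because the competing explanation is rarer still.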

More generally, it is likely in any large enough population that one or more cases will occur of something which is improbable in any particular case. Out of the entire population, there is a very good chance that some random family will suffer a case of double SIDS. This is no ground to suspect murder, however, unless there was a particular reason why the mother in this particular family was, before the event, likely to turn into a double child killer.
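The population effect can be made concrete with a short calculation of the probability that at least one family is affected, which is 1 − (1 − p)^n for n independent families. The sketch below uses the court's 1 in 73 million figure for p; the population sizes are hypothetical round numbers, not demographic data.

```python
import math

def prob_at_least_one(p_single, n_families):
    """Probability that an event of probability p_single occurs in at least
    one of n_families independent families: 1 - (1 - p)^n."""
    # log1p and expm1 keep the result accurate when p_single is tiny.
    return -math.expm1(n_families * math.log1p(-p_single))

p_double_sids = 1 / 73_000_000  # the court's own (disputed) figure

# HYPOTHETICAL population sizes, for illustration only.
for n in (10_000_000, 100_000_000):
    chance = prob_at_least_one(p_double_sids, n)
    print(f"{n:>11,} families: {chance:.1%} chance of at least one case")
```

An event that is wildly improbable for any one named family becomes quite likely to happen to some family once enough families are counted. If the deaths are positively linked rather than independent, p is larger than the court's figure and the chance is higher still.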

To look at the problem another way, consider the wholly fictional case of Lottie Jones, who is charged with winning the National Lottery by cheating. The prosecution expert gives the following evidence. The probability of winning the Lottery jackpot without cheating, he tells the jury, is 1 in 45 million. Lottie won the Lottery. What’s the chance she could have done so without cheating in some way? So small as to be laughable. The chance is 1 in 45 million. So she must be guilty. Sounds ridiculous put like that, but it is exactly the same sort of reasoning that sent Sally Clark, and sends many other innocent people, to prison in real life.

As in the Sally Clark case, the prosecution witness in this fictional parody committed the classic ‘Prosecutor’s Fallacy’, assuming that the probability that Lottie is innocent of cheating given the evidence (she won the Lottery) was the same thing as the probability of the evidence (she won the Lottery) given that she didn’t cheat. The former is much higher than the latter, unless we have some other indication that Lottie has cheated to win the Lottery. Once again, it is an example of how it is likely that in any large enough population one or more cases will occur of something which is improbable in any particular case. The probability that needed to be established in the Lottie case was the probability that she would win the Lottery before she did. If she is innocent, that probability is 1 in tens of millions. The fact that she did, in fact, win the Lottery does not change that.

Lottie just got very, very lucky. Just as Sally Clark got very, very unlucky.
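The gap between the two probabilities in Lottie's case can be sketched numerically. The 1 in 45 million figure comes from the example above; the prior probability of cheating and the assumption that a cheat always wins are hypothetical inputs, chosen to be generous to the prosecution.

```python
# Probability of the evidence (a win) given innocence, from the example.
p_win_given_innocent = 1 / 45_000_000

# HYPOTHETICAL inputs, for illustration only.
p_cheat = 1e-9            # assumed prior probability that a player cheats
p_win_given_cheat = 1.0   # assume cheating always succeeds

p_innocent = 1 - p_cheat

# Bayes' Theorem: P(innocent | win).
p_innocent_given_win = (p_win_given_innocent * p_innocent) / (
    p_win_given_innocent * p_innocent + p_win_given_cheat * p_cheat
)

print(f"P(win | innocent) = {p_win_given_innocent:.2e}")
print(f"P(innocent | win) = {p_innocent_given_win:.3f}")
```

Under these assumed inputs the first probability is about two in a hundred million while the second is above 95%. The prosecution expert quoted the first as if it were the second.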

Sally Clark never recovered from the trauma of losing her children and spending years in prison falsely convicted of killing them. She died on 16th March, 2007, of acute alcohol intoxication.

*Exercise*

What is the Prosecutor’s Fallacy, using an equation or equations to illustrate your answer? How might this fallacy lead to false convictions?

*References and Links*

Scheurer, V. Understanding Uncertainty. Convicted on Statistics? https://understandinguncertainty.org/node/545

Joyce, H. (2002). Beyond Reasonable Doubt. +Plus Magazine. Sept. 1. https://plus.maths.org/content/beyond-reasonable-doubt

Centre for Evidence-Based Medicine. (2018). The Prosecutor’s Fallacy. July 19. https://www.cebm.net/2018/07/the-prosecutors-fallacy/

Fenton, N., Neil, M. and Berger, D. (2016). Bayes and the Law. Annual Review of Statistics and Its Application. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4934658/

McGrayne, S.B. Simple Bayesian Problems. The Sally Clark Case. http://www.mcgrayne.com/disc.htm

Using Statistical Evidence in Courts. A Case Study. http://ben-israel.rutgers.edu/711/Sally_Clark.pdf

Brown, R.J. Sally Clark – What went wrong? http://www.mathestate.com/Sally%20Clark%20-%20What%20went%20wrong.pdf

In the Dark. Archive for Sally Clark. https://telescoper.wordpress.com/tag/sally-clark/

Mike Disney. Cot Deaths, Bayes’ Theorem and Plain Thinking. http://www2.geog.ucl.ac.uk/~mdisney/teaching/GEOGG121/bayes/COT%20DEATHS.doc

Statistical Methods 2013. Sally Clark Case. http://www.mpia.de/~calj/statistical_methods_ss2013/homework/h02_sally_clark_cot_death.pdf

Coursera. The Sad Story of Sally Clark. https://www.coursera.org/lecture/introductiontoprobability/the-sad-story-of-sally-clark-bII6g

Sally Clark. Wikipedia. https://en.m.wikipedia.org/wiki/Sally_Clark