On December 30, 1967, senior detectives from Scotland Yard sent owners of gambling clubs into a proverbial spin. Anyone operating a roulette wheel that contained the number zero would be prosecuted, they warned. From now on the whirl of numbers would all be reds and blacks – starting with the number one. This warning 50 years ago followed a judgment in the House of Lords, the country’s highest court of appeal at the time, that the green zero was illegal under gaming law. According to these so-called “law lords”, this was because the chances must be equally favourable to all players in the game.

The Lords’ problem with the zero was that players betting on the ball landing on an individual number were being offered odds of 35/1 – put £1 on number 7 and if it came up you got £35 back plus your stake. But standard British roulette wheels have 37 numbers including zero, so the odds should have been 36/1. This discrepancy gave the house an edge of 2.7% – the proportion of times the ball would randomly fall into the zero slot. (Note that in the US and South America roulette wheels normally have both a zero and a double zero, giving them a house edge of just over 5%.)

The British edge on roulette wheels was a small one, such that someone staking £10 on a spin would expect statistically to lose an average of 27 pence. But it’s a vital one. Without an edge on a game the operator would expect only to break even, and that’s before accounting for running costs. The Lords’ decision also looked like the back door to banning every other game with a house edge, such as blackjack and baccarat.
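The 2.7% edge and the 27 pence expected loss both fall straight out of the arithmetic. A minimal sketch, using exact fractions:

```python
from fractions import Fraction

# A European wheel has 37 slots (0 to 36); a straight-up bet pays 35/1.
slots = 37
payout = 35  # profit on a winning £1 bet

# Expected profit per £1 staked: win £35 with probability 1/37,
# lose the £1 stake with probability 36/37.
p_win = Fraction(1, slots)
edge = -(p_win * payout - (1 - p_win) * 1)  # house edge per £1 staked

print(edge)                # 1/37, i.e. about 2.7%
print(float(edge) * 1000)  # expected loss in pence on a £10 stake, about 27p
```

With fair odds of 36/1 the same calculation gives an edge of exactly zero, which is the situation the Lords’ ruling would have forced on operators.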

It had been illegal in the UK to organise and manage the playing of games of chance since the Gaming Act of 1845. The Betting and Gaming Act 1960 was the most substantive change to gambling regulation since then. As well as permitting the likes of betting shops and pub fruit machines, it opened the door to gambling halls – though only in a very restricted way.

Designed to permit small-stakes play on bridge in members’ clubs, the act legalised gaming clubs so long as they took their money from membership fees and from charges to cover the cost of the gaming facilities. Casinos soon proliferated, however, and by the mid-1960s around a thousand had sprung up. Many introduced French-style roulette, with wheels that included a single zero, since the law had arguably not been clear as to whether the house could have an edge. The one variation thought necessary by some to comply with the legislation was that when the ball landed on zero the house and player split the stake, instead of it being kept by the house.

Not only had the law liberalised gambling more than had been envisaged by the government of the day, many casinos had apparent ties to organised crime. London gaming quickly became notorious. Film star George Raft, a man once linked to such shady characters as Las Vegas mobster Benjamin “Bugsy” Siegel, was one of the more high-profile names associated with the scene.

When the Lords drew a line in the sand in 1967 by banning zeros in roulette, gaming bodies went into overdrive. One proposal designed to save the zero was to offer odds of 36/1 on individual numbers, and instead levy a playing charge on the players. The government was soon persuaded it needed to legislate again. In 1968 a new Gaming Act introduced a Gaming Board and strict measures to regulate and police gaming in Great Britain. New licensing rules, including a “fit and proper persons” test, pushed out the shady operators.

The one concession to the industry was that gaming clubs and casinos would be permitted to play roulette with a zero. Other games with a house edge, such as baccarat, blackjack and craps, were also explicitly permitted. In an environment of regulated, licensed gaming establishments, the government was saying, a small edge was acceptable as a way of paying for costs and turning a profit.

This came on the back of another reform that was vital for developing the industry that we see today. Following the legalisation of betting shops in 1960, the government began taxing their turnover in 1966. It was the first tax on betting since the one introduced in 1926 by then Chancellor of the Exchequer, Winston Churchill, in the days before cash bookmaking was legal and above board. “I am not looking for trouble. I am looking for revenue,” Churchill declared at the time. He didn’t see much of the latter and got a lot of the former: endless enforcement difficulties and opposition from lobby groups and in parliament. The tax was gone by 1930.

Yet the 1966 tax stuck, and today the UK gambling landscape is much changed – not only because of the introduction of the National Lottery in 1994 but thanks also in large measure to two key pieces of modernising legislation. The first was the radical overhaul of betting taxation in 2001 and the other was the Gambling Act of 2005, both of which I was closely involved with as an adviser.

Instead of taxing betting turnover, now operators are taxed on their winnings (gross profits). Casinos, betting shops and online operators can advertise on radio and TV; players no longer need to be members of casinos to visit them; and online operators based overseas but active in the UK market must comply with UK licence requirements. Betting exchanges allow people to bet person-to-person, a Gambling Commission regulates betting and gaming, and electronic roulette with a zero is legally available in betting shops and casinos.

The industry as a whole has grown very significantly in size and employs a lot of people, and there is more evidence-based research and focus on the issue of gambling prevalence and problem gambling than ever before. The wheel has certainly turned a long way since that Lords decision in 1967, when the country was still trying to decide what kind of gambling system it wanted. The question that now divides opinion is how far the wheel has turned for the better.

References

Leighton Vaughan Williams, The Day Zero was banned from British roulette: How times have changed. Article in The Conversation. Link below:

Blaise Pascal was a 17th century French mathematician and philosopher, who laid some of the main foundations of modern probability theory. He is particularly celebrated for his correspondence with mathematician Pierre Fermat, forever associated with Fermat’s Last Theorem. Schoolchildren learning mathematics are more familiar with him courtesy of Pascal’s Triangle. Increasingly, though, it is Pascal’s Wager, and latterly the Pascal’s Mugging puzzle, that has entertained modern philosophers.

Simply stated, Pascal’s Wager runs thus: If God exists and you wager that He does not, your penalty relative to betting correctly is enormous. If God does not exist and you wager that He does, your penalty relative to betting correctly is inconsequential. In other words, there’s a lot to gain if it turns out He does and not much lost if He doesn’t. So, unless it can be proved that God does not exist, you should always side with Him existing, and act accordingly.

Put another way, Pascal points out that if a wager was between the equal chance of gaining two lifetimes of happiness and gaining nothing, then a person would be foolish to bet on the latter. The same would go if it was three lifetimes of happiness versus nothing. He then argues that it is simply unconscionable by comparison to bet against an eternal life of happiness for the possibility of gaining nothing. The wise decision is to wager that God exists, since “If you gain, you gain all; if you lose, you lose nothing”, meaning one can gain eternal life if God exists, but if not, one will be no worse off in death than by not believing. On the other hand, if you bet against God, win or lose, you either gain nothing or lose everything.
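Stripped to its finite form, the wager is just an expected-value comparison. The payoff numbers in this sketch are illustrative assumptions of my own (in "lifetimes of happiness"), not Pascal’s; the structure of the argument is what matters, and it survives any positive probability assigned to God’s existence:

```python
# Illustrative expected-value sketch of Pascal's finite wager.
# Payoff units are hypothetical "lifetimes of happiness".
def expected_value(payoff_if_exists, payoff_if_not, p):
    return p * payoff_if_exists + (1 - p) * payoff_if_not

p = 0.01  # even a sceptic's small probability will do

# Believe: enormous gain if God exists, roughly nothing lost if not.
ev_believe = expected_value(payoff_if_exists=1000, payoff_if_not=0, p=p)
# Disbelieve: enormous loss if God exists, a trivial gain if not.
ev_disbelieve = expected_value(payoff_if_exists=-1000, payoff_if_not=1, p=p)

print(ev_believe > ev_disbelieve)  # belief dominates at these payoffs
```

Because the gain Pascal posits is unbounded, no finite probability, however small, rescues the bet against belief; the finite numbers above merely make the asymmetry concrete.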

It seems intuitively like there’s something wrong with this argument. The problem lies in pinning down what it is. One good try is known as the ‘many gods’ objection. The argument here is that one can in principle come up with multiple different characterisations of a god, including a god that punishes people for siding with his existence. But this assumes that all representations of what God is are equally probable. In fact, some representations must be more plausible than others, if the alternatives are properly investigated. A characterisation that has hundreds of millions of followers, for example, and a strongly developed set of apologetics is more likely to be true, however unlikely anyone might believe that to be, than a theory based on an evil teapot.

Once we begin to drop the equal-probability assumption, we severely weaken the ‘many gods’ objection. Basically, if it is more likely that the God of a major established religion is possibly true (however almost vanishingly unlikely any individual might think that to be) relative to the evil teapot religion, the ‘many gods’ objection very quickly begins to crumble to dust. At that point, one needs to take seriously the stratospherically high rewards of siding with belief (at whatever long odds one might set for that) compared to the stakes.

It is true that infinities swamp decisions, but we need not even go as far as positing infinite reward for the decision problem relative to the stakes to become a relatively straightforward one. It’s also true that future rewards tend to be seriously under-weighted by most human decision-makers. In truth, pain suffered in the future will feel just as bad as pain suffered today, but most of us don’t think or behave as if that’s so. The attraction of delaying an unwelcome decision is well documented. In the immortal words of St. Augustine of Hippo in his ‘Confessions’, “Lord make me pure – but not yet!”

A second major objection is the ‘inauthentic beliefs’ criticism, that for those who cannot believe to feign belief to gain eternal reward invalidates the reward. What such critics are pointing to is the unbeliever who says to Pascal that he cannot make himself believe. Pascal’s response is that if the principle of the wager is valid, then the inability to believe is irrational. “Your inability to believe, because reason compels you to and yet you cannot, [comes] from your passions.” This inability, therefore, can be overcome by diminishing these irrational sentiments: “Learn from those who were bound like you. . . . Follow the way by which they began; by acting as if they believed.”

The writer and academic C.S. Lewis picked up on the same theme three centuries later. In ‘Mere Christianity’ he offers complementary advice. “Do not waste time bothering whether you ‘love’ your neighbour; act as if you did. As soon as we do this we find one of the great secrets. When you are behaving as if you loved someone, you will presently come to love him.” Lewis is making the point that the same applies to a belief in God. But the point has never been made so clearly as in the Gospel of St. Mark in the New Testament. In Chapter 9, we are told of the man who brings his son to be cured by Jesus. Jesus asks the boy’s father, “How long has he been like this?” “From childhood,” he answered … “But if you can do anything, take pity on us and help us.” “If you can”?’ said Jesus. “Everything is possible to someone who believes.” Immediately the boy’s father roared, “I do believe; help me overcome my unbelief!”

Even some modern atheist philosophers admit to struggling with the problem set by Blaise Pascal. One attempt to square the circle is to concede that, in a world where God as conventionally conceived exists with a non-zero probability, there is a case for pushing a hypothetical button that would make them believe, if offered just one chance, and that chance was now or never. Given the chance of delaying the decision as long as possible, however, it seems they would side with St. Augustine’s approach to the matter of his purity.

Pascal’s Wager has taken on new life in the last couple of decades as it has come to be applied to existential threats like climate change. The issue bears a clear similarity to Pascal’s Wager on the existence of God. Let’s say, for example, there is only a one per cent chance that the planet is on course for catastrophic climatic disaster and that delay means passing a point of no return, after which we would be powerless to stop it. In that case, not acting now would seem crazy. It certainly breaches the terms of Pascal’s Wager. This has fittingly been termed Noah’s Law: if an ark may be essential for survival, get building, however sunny a day it is overhead. Yes, when the cost of getting it wrong is just too high, it probably pays to hedge your bets.

Pascal’s Mugging is a new twist on the problem, which can if wrongly interpreted give comfort to the naysayers. It can be put this way. You are offered a proposal by someone who turns up on your doorstep. Give me £10, the doorstepper says, and I will return tomorrow and give you £100. I desperately need the money today, for reasons I’m not at liberty to divulge. I can easily pay you anything you like tomorrow, though. You turn down the deal because you don’t believe he will follow through on his promise. So he asks you how likely you think it is that he will honour any deal he offers. You say 100 to 1. In that case, he replies, I will bring you £1,100 tomorrow in return for the £10. You work out the expected value of this proposal to be 1/100 times £1,100, or £11, and hand over the tenner. He never comes back and you have, in a way, been intellectually mugged. But was handing over the note irrational? The mugger won the argument that for any low probability of being repaid there exists a sum large enough to make taking the bet rational. In particular, a rational person must admit there is at least some non-zero chance that such a deal would be honoured. However low the probability you assign to being paid out, there exists a potential reward, which need not be monetary, large enough to outweigh it.
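The mugger’s arithmetic from the story can be laid out in a few lines, along with his general move:

```python
# The doorstep deal: you rate the chance of being repaid at 1 in 100,
# so the mugger offers £1,100 for your £10.
p_repay = 1 / 100
offer = 1100
stake = 10

expected_return = p_repay * offer  # £11
print(expected_return > stake)     # True: naive expected value says pay up

# The general move: for ANY probability p > 0 there is a finite offer
# large enough that p * offer exceeds the stake.
def offer_needed(p, stake):
    return stake / p  # any offer above this has positive expected value

print(offer_needed(1e-9, 10))  # even an absurdly small p yields a finite offer
```

The point of the puzzle is not that the arithmetic is wrong, but that blindly maximising expected value lets an arbitrarily implausible promise extract real money.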

Pascal’s mugging has more generally been used to consider the appropriate course of action when confronted more systemically by low-probability, high-stakes events such as existential risk, or charitable interventions with a low probability of success but extremely high rewards. Common sense might seem to suggest that spending money and effort on extremely unlikely scenarios is irrational, but common sense has a patchy record, and there is no reason to believe it serves us well here either.

Blaise Pascal was a very clever guy and those who over the centuries have too quickly dismissed his ideas have paid the intellectual (and perhaps a much bigger) price. Today, in an age when global existential risk is for obvious reasons (nuclear annihilation not least) a whole lot higher up the agenda than it was in Pascal’s day, it is time that we revisit (atheists, agnostics and believers alike) the lessons to be learned from ‘The Wager’, and that we do so with renewed urgency. The future of the planet just might depend on it.

On the 9th of November, 1999, Sally Clark, a 35-year-old solicitor and mother of a young child, was convicted of murdering her first two children. The presiding Judge, Mr. Justice Harrison, declared that “… we do not convict people in these courts on statistics. It would be a terrible day if that were so.” As it turned out, it was indeed a terrible day, for Sally Clark and for the justice system.

The background to the case is that the deaths of the babies were first put down to SIDS (‘Sudden Infant Death Syndrome’). Then a Home Office pathologist expressed doubts and Sally Clark was charged with murder. It later transpired that essential evidence in her favour had not been disclosed to the defence, but not before a failed appeal in 2000. At a second appeal, in 2003, she was set free, and the case is now recognised as a classic miscarriage of justice.

So what went wrong?

A turning point in the trial was the evidence given by a key prosecution witness, who argued that the probability of a baby dying of SIDS was 1 in 8,543. So the probability of two babies dying, he said, was that fraction squared, or 1 in about 73 million. But one of the basic laws of probability is that you can only multiply probabilities in this way if those probabilities are independent of each other. Squaring assumes no genetic, environmental or other innocent link between these sudden deaths at all, even supposing the 1 in 8,543 figure was correct. The other error is arguably more sinister, and very difficult for the layman to detect. It is known as the ‘Prosecutor’s Fallacy’.

The ‘Prosecutor’s Fallacy’ is to treat the probability of innocence given the available evidence as if it were the same as the probability of the evidence arising given the fact of innocence. In fact, they are very different. In particular, the following two propositions differ markedly.

1. The probability of observing some evidence (the dead children) given that a hypothesis is true (here that Sally Clark is guilty).

2. The probability that a hypothesis is true (here that Sally Clark is guilty) given that we observe some evidence (the dead children).

Notably, the probability of the former proposition is much higher than that of the latter.

Indeed, the probability of the children dying given that Sally Clark is a child murderer is effectively 1 (100%). However, the probability that she is a child murderer given that the children have died is a whole different thing.

Critically, we need to consider what is known by statisticians as the prior probability that she would kill both babies, i.e. the probability, before we are given this evidence of sudden death, that she would kill her children. This concept is central to what is known as Bayesian reasoning. The prior probability must not be viewed through the lens of the later emerging evidence. It must be established on its own merits and then combined with the emerging evidence through what is known as Bayes’ Rule.

In establishing this probability, we need to ask whether there was any other past indication or evidence to suggest that she was a child murderer, as the number of mothers who murder their children is close to vanishingly small. Without such evidence, the prior probability of guilt is close to zero. In order to update the probability of guilt, given the evidence of the dead children, the jury needs to weigh up the relative likelihood of the two competing explanations for the deaths. Which is more likely? Double infant murder by a mother or double SIDS. In fact, double SIDS is much more common than double infant murder.
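The logic of the comparison can be sketched numerically. The rates below are purely hypothetical illustrations, not the figures from the real case; the point is only that, once both hypotheses explain the deaths equally well, the posterior odds reduce to the ratio of the prior path probabilities:

```python
# Hypothetical, purely illustrative rates (NOT the real case figures):
# suppose double SIDS strikes 1 family in 500,000 and double infant
# murder by a mother 1 family in 5,000,000.
p_double_sids = 1 / 500_000
p_double_murder = 1 / 5_000_000

# Both hypotheses fully account for the observed deaths, so the
# posterior odds of murder vs SIDS are just the ratio of the priors.
odds_murder_vs_sids = p_double_murder / p_double_sids
print(odds_murder_vs_sids)  # 0.1: SIDS ten times more likely on these numbers
```

Notice that the headline-grabbing "1 in 73 million" never appears in this calculation: what matters to the jury is not how rare double SIDS is in isolation, but how rare it is relative to the rival explanation.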

More generally, it is likely in any large enough population that one or more cases will occur of something which is highly improbable in any particular case. Out of the entire population, there is a very good chance that some random family will suffer a case of double SIDS. This is no ground to suspect murder, unless there was a particular reason why that particular family was likely to harbour a double child killer. It is like convicting someone of cheating the National Lottery simply because they happened to hit the jackpot.

Such miscarriages of justice become even more likely when individually very weak charges are bundled together with other only vaguely related weak cases, each helping the others across the line. Active trawling for complainants opens up additional, related dangers. Our growing understanding of cognitive biases in human perception and decision-making, combined with how far true conditional probability deviates from common intuition, means we should be more aware than ever of how easily a miscarriage of justice can happen. The reality is that the way the law has developed over the last 125 years, and particularly the last 25 years, notably by very severely weakening the required degree of similarity between bundled charges, has made those miscarriages more likely than ever.

The growth of electronic records, including stored texts and messages, also gives cause for alarm. Recent cases show that exculpatory evidence was contained in the form of texts; in one case a single isolated text provided the vital piece of evidence to prove an allegation false. Which raises the question: how often are false allegations admitted to electronically? Very rarely, almost certainly. So the few that do come to light surely indicate many multiples of false allegations that were never recorded and stored electronically. In a context where police are directed to automatically believe complainants, this is particularly problematic.

Add in the systematic failures of the police and prosecution to reveal exculpatory evidence to the defence. Add in the well-known failures of eyewitness identification evidence. Add in false memory syndrome. Add in cross-contamination of DNA evidence. Add in forced or false confession evidence. Add in arbitrary target conviction rates. Add in confirmation bias by police and the prosecution service. And we have a problem. A very big problem.

As for Sally Clark, she never recovered from the trauma of losing her children and spending years in prison falsely convicted of killing them. She died on 16th March, 2007, of acute alcohol intoxication.

Further Reading

Similar Fact Evidence

http://www.richardwebster.net/similarfactevidence.html

Bundling of charges

https://www.thetimes.co.uk/article/these-bundles-of-charges-pose-a-real-danger-cgzhph7nsf5

Bundling of allegations

https://www.thetimes.co.uk/article/even-rolf-harris-deserved-better-than-this-v3xcvqfxx

Confirmation bias

https://www.thetimes.co.uk/article/this-disgraceful-chief-constable-must-quit-lh5lrfwts

https://www.thetimes.co.uk/article/met-chiefs-perverted-our-system-of-justice-vwgnbl9hb

High-profile miscarriages of justice

http://news.bbc.co.uk/hi/english/static/in_depth/uk/2001/life_of_crime/miscarriages.stm

https://en.wikipedia.org/wiki/List_of_miscarriage_of_justice_cases

DNA cross-contamination

False memory syndrome

https://www.theguardian.com/science/2010/nov/24/false-memories-abuse-convict-innocent

False confession evidence

https://www.theguardian.com/uk/2011/oct/09/false-confessions-sean-hodgson-courts

False eyewitness and identification evidence.

https://www.theguardian.com/uk/2009/aug/18/eyewitness-evidence-wrongful-conviction

Memory, trawling and the misinformation effect.

Non-disclosure of evidence.

https://www.theguardian.com/law/2017/dec/19/police-non-disclosure-should-lead-to-reform

http://www.bbc.co.uk/news/uk-42431171

If you are content to point and shoot with an automatic camera, you will these days usually do just fine. But let’s say you are looking for a bit more, with some manual control over the settings. That’s where most amateur snappers tend to take fright. They think it’s all a bit technical. Actually, it’s not. It’s all essentially about ‘exposure’, which is basically the brightness or darkness of a photo, and this comes down to three settings – aperture, shutter speed and ISO.

The aperture is simply a set of blades which widen and narrow to control how much light enters the camera. Aperture sizes are measured by f-stops, with a high f-stop (say f/18 or f/22) corresponding to the aperture being quite small (less light entering the camera), and a low f-stop (say f/3.5 or f/5.6) meaning that the aperture is bigger (more light). The aperture also controls what is known as the depth of field, which is an indication of how much of the picture is sharp and how much is blurry. So if you want a figure in the foreground to be sharp and the background blurry, you would want a shallow depth of field (low f-stop, wide aperture). If you want the entire field sharp (for example, a mountain range) you are looking for a high f-stop (small aperture). In summary, a wide aperture (low f-stop, say f/5.6) gives you a brighter photo but a shallower depth-of-field, while a small aperture (high f-stop, say f/18) gives you a darker picture but more depth of field.

The shutter speed is simply a measure of how long the shutter is open, so a slow shutter speed (say 1/60 of a second) allows time for more light to enter, producing a brighter picture (but more blur if objects are in motion), and a fast shutter speed (say 1/800 of a second) produces a darker picture but less blur. In summary, a fast shutter speed (say 1/1000) produces a darker picture but is susceptible to less blur, while a slow shutter speed (say 1/80) produces a brighter picture but is susceptible to more blur.

The ISO controls the exposure a different way, through the camera boosting the signal from the sensor to make it more sensitive to light. In particular, a high ISO (say 1600) will produce a brighter picture than a low ISO (say 200). A high ISO brings more digital noise into the picture, however, which tends to make the photo look a bit more grainy. In summary, you would select a higher ISO for a brighter photo, but that opens you up to a ‘noisier’ (more grainy) picture, while a lower ISO will give you a darker (but less grainy) picture.

The art now is in combining these settings to give the best overall effect. For example, say you want to take a photo with some movement in it, so you decide to select a fast shutter speed (say 1/800). But the picture comes out darker than you’d like as a result. So you try to compensate by opening up the aperture (to say f/3.5), although this reduces the depth-of-field, blurring the background. Still, the blur doesn’t concern you too much as it’s what’s in the foreground that is the subject of the picture. It’s still a bit darker than you’d ideally like, though. Finally, you turn to the ISO setting and increase that to brighten the picture, while being careful to balance this with your desire to avoid too much digital noise.
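The trade-offs in that scenario can be sketched with a simplified model of exposure. The convention here, that brightness scales with shutter time and ISO and inversely with the square of the f-number, is a standard rule of thumb, but the function and its baseline are my own illustrative choices:

```python
import math

# Simplified relative exposure in "stops": brightness scales with
# shutter time t (seconds) and ISO, and inversely with the square
# of the f-number N. Only differences between values are meaningful.
def exposure_stops(f_number, shutter, iso):
    return math.log2((shutter * iso) / (f_number ** 2))

base = exposure_stops(5.6, 1/60, 200)

# A faster shutter alone darkens the shot (fewer stops)...
darker = exposure_stops(5.6, 1/800, 200)

# ...which we compensate for with a wider aperture and a higher ISO,
# exactly as in the scenario above.
compensated = exposure_stops(3.5, 1/800, 1600)

print(darker < base)         # True: shutter change alone loses light
print(compensated > darker)  # True: aperture and ISO claw it back
```

Each whole stop is a doubling or halving of light, which is why photographers think of the three settings as interchangeable currencies: a stop lost on the shutter can be bought back on the aperture or the ISO.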

By manually and independently setting the aperture, the shutter speed and the ISO, you now have the picture pretty much as close as possible to how you wish it to come out. The automatic mode will often do a good enough job, but using the manual settings allows you to have that bit more control over the final product.

That’s photography using manual settings in a nutshell. I hope it’s been of some help.

You win a quiz show and are offered a choice. You are presented with a transparent box containing £10,000 and an opaque box which contains either £100,000 or nothing. Now, you can open the opaque box and take what is inside, or you can open both boxes and take the contents of both. Which should you choose? Well, if that’s all the information you have, it’s obvious that you should open both boxes. You certainly will not win less than by just opening one of the boxes, but you might win a lot more. So far, so good.

But now introduce an additional factor. Before making your decision, you had to undergo a sophisticated computerised psychometric test (a Predictor) which you are now told has been unerring in its prediction of what hundreds of previous contestants would decide. Whenever they chose both boxes, there was nothing inside the opaque box. Whenever they had chosen just the opaque box, however, they found a cool £100,000 inside. When you make your decision the computer’s decision has already been made. The contents of the opaque box have already been placed there.

What is happening is that the Predictor informs the game show organisers of its prediction of whether a contestant will choose two boxes or one box. Whenever it predicts that the contestant will choose two boxes, no money is placed in the opaque box. Whenever it predicts that the contestant will choose just the opaque box, £100,000 has been deposited in the box.

This is essentially the basis of what is known as Newcomb’s Paradox or Newcomb’s Problem, a thought experiment devised by William Newcomb of the University of California and popularised by philosopher Robert Nozick in a paper published in 1969.

So what should you do? Open just the opaque box, or open both boxes?

In his paper, Nozick writes that “To almost everyone, it is perfectly clear and obvious what should be done. The difficulty is that these people seem to divide almost evenly on the problem with large numbers thinking that the opposing half is just being silly.”

The argument of those who argue for opening both boxes (the so-called ‘two-boxers’) is that the money has already been deposited by the time you are asked to make your decision. Taking two boxes can’t change that, so that’s the rational thing to do.

The argument of those who argue for opening just the opaque box (the so-called ‘one-boxers’) is that the psychometric test is either a perfect or near-perfect predictor of what you will do. It has never got it wrong before. Every single previous contestant who has opened two boxes has found the opaque box empty, and every single previous contestant who has opened just the opaque box has won the £100,000. So do what all the evidence tells you is the sensible thing to do and open just the opaque box.

One way of considering the question is to ask whether your choice in some way determines the choice of the Predictor, and thereby the decision as to whether to place the £100,000 in the box. Well, there’s no time-travelling retro-causality involved. The predictor is basically a piece of computer software which bases its prediction on a psychometric test. It just so happens that the test is uncannily accurate in knowing what people will do.

Look at it this way. Bottom line is that you have a free choice, so why not open both boxes? The problem is that if you are the type of person who is a two-boxer, the predictor will have found this out from the super-efficient psychometric test. If you are the type of person, however, who is a one-boxer, the predictor will find that out too.

So it’s not that there is any good reason in itself to open one box rather than two. After all, what you decide now can’t change what is already in the box. But there is a good reason why you should be the type of person who only opens one box. And the best way to be the sort of person who only opens one box is to only open one box. For that reason, the way to win the £100,000 is to agree to open just the opaque box and leave the other box untouched.
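The one-boxer’s case can be put in expected-value terms. Suppose the Predictor is right with probability p (the payoffs are the ones from the story; the evidential-decision-theory framing is one standard way of reading the problem, not the only one):

```python
# Expected winnings in £ for each strategy, if the Predictor is
# right with probability p.
def ev_one_box(p):
    return p * 100_000  # the £100,000 is there iff you were predicted to one-box

def ev_two_box(p):
    return 10_000 + (1 - p) * 100_000  # the big prize appears only if it erred

p = 0.99  # a near-perfect Predictor, as in the story
print(ev_one_box(p))  # 99000.0
print(ev_two_box(p))  # 11000.0

# One-boxing wins whenever p * 100000 > 10000 + (1 - p) * 100000,
# i.e. whenever the Predictor is right more than 55% of the time.
```

The two-boxer replies that this calculation smuggles in exactly the point at issue: by decision time, p no longer describes anything your choice can influence.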

But why leave behind that extra £10,000 when the £100,000 which you are about to win is already in the box?

That’s Newcomb’s Paradox. You decide! Are you a one-boxer or a two-boxer?

On the 9th of November, 1999, Sally Clark, a 35-year-old solicitor and mother of a young child, was convicted of murdering her first two children. The presiding Judge, Mr. Justice Harrison, declared that “… we do not convict people in these courts on statistics. It would be a terrible day if that were so.” As it turned out, it was indeed a terrible day, for Sally Clark and for the justice system.

The background to the case is that the deaths of the babies were put down to natural causes, probably SIDS (‘Sudden Infant Death Syndrome’). Later the Home Office pathologist charged with the case became suspicious and Sally Clark was charged with murder and tried at Chester Crown Court. It later transpired that essential evidence in her favour had not been disclosed to the defence, but not before a failed appeal in 2000. At a second appeal, in 2003, she was set free, and the case is now recognised as a huge miscarriage of justice.

So what went wrong?

A key turning point in the trial was the evidence given by a key prosecution witness, who argued that assuming Sally Clark was innocent, the probability of a baby dying of SIDS was 1 in 8,543. So the probability of two babies dying was that fraction squared, or 1 in about 73 million. It’s the chance, he argued, “… of backing that long odds outsider at the Grand National … let’s say it’s a 80 to 1 chance, you back the winner last year, then the next year there’s another horse at 80 to 1 and it is still 80 to 1 and you back it again and it wins. Now we’re here in a situation that, you know, to get to these odds of 73 million you’ve got to back that 1 in 80 chance four years running … So it’s the same with these deaths. You have to say two unlikely events have happened and together it’s very, very, very unlikely.”

Perhaps unsurprisingly in the face of this interpretation of the evidence, the jury convicted her and she was sentenced to life in prison.

But the evidence was flawed, as anyone with a basic understanding of probability would have recognised. One of the basic laws of probability is that you can only multiply probabilities together if they are independent of each other. Squaring the SIDS figure would be valid only if the cause of death of the first child was totally independent of the cause of death of the second child. There is no reason to believe this: it assumes no genetic, familial or other innocent link between the sudden deaths at all. That is a basic error of classical probability. The other error is much more sinister, in that it is harder for the layman to detect the flaw in the reasoning. It is known as the ‘Prosecutor’s Fallacy’ and is a well-known problem in the theory of conditional probability, and in particular in the application of Bayesian reasoning, which is discussed in the context of Bayes’ Theorem elsewhere.
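A tiny numerical sketch can make the independence point concrete. Suppose, purely for illustration (every number below is invented, not real epidemiology), that a small fraction of families carry a hidden risk factor that raises the SIDS risk for every child in that family. Then squaring the overall per-child probability badly understates the true chance of two deaths in the same family:

```python
# Toy mixture model: all three probabilities are assumptions for the sketch,
# not real SIDS statistics.
p_risk = 0.01    # assumed share of families with a hidden risk factor
p_high = 0.01    # assumed per-child SIDS probability in those families
p_low = 0.0001   # assumed per-child SIDS probability in all other families

# Marginal probability that one given child dies of SIDS:
p_one = p_risk * p_high + (1 - p_risk) * p_low

# True probability of two deaths in the same family: the children share
# their family's risk level, so we square *within* each branch.
p_two = p_risk * p_high**2 + (1 - p_risk) * p_low**2

# The prosecution's calculation: square the marginal probability.
p_naive = p_one**2

print(f"true P(two deaths in one family): {p_two:.3e}")
print(f"naive squared figure:             {p_naive:.3e}")
print(f"naive figure understates the truth by a factor of {p_two / p_naive:.1f}")
```

With these made-up numbers, the naive squaring understates the true double-death probability by a factor of roughly 25. The dispute in the Clark case turned on exactly this kind of possible familial correlation.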

The ‘Prosecutor’s Fallacy’ is to conflate the probability of innocence given the available evidence with the probability of the evidence arising given the fact of innocence. In particular, the following propositions are very different:

1. The probability of observing some evidence (the dead children) given that a hypothesis is true (here that Sally Clark is guilty).

2. The probability that a hypothesis is true (here that Sally Clark is guilty) given that we observe some evidence (the dead children).

These are totally different propositions, the probabilities of which can diverge widely.

Notably, the probability of the former proposition is in this case much higher than that of the latter.

Indeed, the probability of the children dying given that Sally Clark is a child murderer is effectively 1 (100%). However, the probability that she is a child murderer given that the children have died is a very different matter.

Critically, we need to consider the prior probability that she would kill both babies: that is, the probability that she would kill her children, before we are given this evidence of sudden death. This is the concept of ‘prior probability’, which is central to Bayesian reasoning. The prior probability must not be viewed through the lens of the later-emerging evidence. It must be established on its own merits and then combined with the emerging evidence through what is known as Bayes’ Rule.

In establishing this prior probability, we need to ask whether there was any other past indication or evidence to suggest that she was a child murderer, as the number of mothers who murder their children is almost vanishingly small. Without such evidence, the prior probability of guilt should correspond to something like the proportion of mothers in the general population who serially kill their children. This prior probability of guilt is close to zero. In order to update the probability of guilt, given the evidence of the dead children, the jury needs to weigh up the *relative* likelihood of the two competing explanations for the deaths. Which is more likely: double infant murder by a mother, or double SIDS? In fact, double SIDS is hugely more common than double infant murder. That is not a question that the jury, unversed in Bayesian reasoning or conditional probability, appears to have asked itself.
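To see how the numbers play out, here is a minimal Bayes’ Rule calculation. The prior and the double-SIDS rate below are purely illustrative assumptions chosen for the sketch, not established statistics:

```python
# Bayes' Rule sketch with assumed, illustrative figures.
prior_murder = 1 / 1_000_000            # assumed prior: double-murdering mothers are vanishingly rare
p_evidence_given_murder = 1.0           # two dead children are certain if she killed them
p_evidence_given_innocent = 1 / 100_000 # assumed rate of double SIDS in innocent families

# Posterior probability of guilt given the evidence, via Bayes' Rule:
# P(G|E) = P(E|G) P(G) / [ P(E|G) P(G) + P(E|not G) P(not G) ]
numer = p_evidence_given_murder * prior_murder
denom = numer + p_evidence_given_innocent * (1 - prior_murder)
posterior = numer / denom

print(f"P(guilty | two deaths) = {posterior:.3f}")
```

Even though the evidence is certain given guilt, the tiny prior drags the posterior down to under 10% with these assumed inputs: nothing remotely like proof beyond reasonable doubt.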

More generally, it is likely in any large enough population that one or more cases will occur of something which is improbable in any particular case. Out of the entire population, there is a very good chance that some random family will suffer a case of double SIDS. This is no ground to suspect murder, however, unless there was a particular reason why the mother in this particular family was, before the event, likely to turn into a double child killer.
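The point can be made with simple arithmetic. Taking the prosecution’s own figure of 1 in 73 million at face value, the chance that at least one family among n suffers a double SIDS tragedy is 1 − (1 − p)^n, which grows quickly with population size (the family counts below are arbitrary round numbers for illustration):

```python
# How likely is at least one 'one in 73 million' double-SIDS family,
# as the number of two-child families grows?
p = 1 / 73_000_000  # the prosecution's own per-family figure

for n_families in (1_000_000, 10_000_000, 73_000_000):
    p_at_least_one = 1 - (1 - p) ** n_families
    print(f"{n_families:>10,} families -> P(at least one double SIDS) = {p_at_least_one:.3f}")
```

Among 73 million families the chance is about 63%, so somewhere or other, a “1 in 73 million” event is positively to be expected.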

To put it another way, consider the wholly fictional case of Lottie Jones, who is charged with winning the National Lottery by cheating. The prosecution expert gives the following evidence. The probability of winning the Lottery jackpot without cheating, he tells the jury, is 1 in 45 million. Lottie won the Lottery. What’s the chance she could have done so without cheating in some way? So small as to be laughable. The chance is 1 in 45 million. So she must be guilty. Sounds ridiculous put like that, but it is exactly the same sort of reasoning that sent Sally Clark, and sends many other innocent people, to prison in real life.

As in the Sally Clark case, the prosecution witness in this fictional parody committed the classic ‘Prosecutor’s Fallacy’, assuming that the probability that Lottie is innocent of cheating given the evidence (she won the Lottery) was the same thing as the probability of the evidence (she won the Lottery) given that she didn’t cheat. The former is much higher than the latter, unless we have some other indication that Lottie has cheated to win the Lottery. Once again, it is an example of how it is likely that in any large enough population one or more cases will occur of something which is improbable in any particular case. The probability that needed to be established in the Lottie case was the probability that she would win the Lottery before she did. If she is innocent, that probability is 1 in tens of millions. The fact that she did, in fact, win the Lottery does not change that.

Lottie just got very, very lucky. Just as Sally Clark got very, very unlucky.

Sally Clark never recovered from the trauma of losing her children and spending years in prison falsely convicted of killing them. She died on 16th March, 2007, of acute alcohol intoxication.

Further reading

Bayes’ Theorem. The most powerful equation in the world. https://leightonvw.com/2017/03/12/bayes-theorem-the-most-powerful-equation-in-the-world/

This is probably the most important idea in probability. Truth and justice depend on us getting it right. https://leightonvw.com/2014/12/13/this-is-probably-the-most-important-idea-in-probability-truth-and-justice-depends-on-us-getting-it-right/

The Two Envelopes Problem, also known as the Exchange Paradox, is quite simple to state. You are handed two identical-looking envelopes, one of which, you are informed, contains twice as much money as the other. You are asked to select one of the envelopes. Before opening it, you are given the opportunity, if you wish, to switch it for the other envelope. Once you have decided whether to keep the original envelope or switch to the other envelope, you are allowed to open the envelope and keep the money inside. Should you switch?

Switching does seem like a no-brainer. Note that one of the envelopes (you don’t know which) contains twice as much as the other. So, if one of the envelopes, for example, contains £100, the other envelope will contain either £200 or £50. By switching, it seems, you stand to gain £100 or lose £50, with equal likelihood. So the expected gain from the switch is 1/2 (£100) + 1/2 (-£50) = £50-£25 = £25.

Looked at another way, the expected value of the money in the other envelope = 1/2 (£200) + 1/2 (£50) = £125, compared to £100 from sticking with the original envelope.

More generally, you might reason, if X is the amount of money in the selected envelope, the expected value of the money in the other envelope = 1/2 (2X) + 1/2 (X/2) = 5/4 X. Since this is greater than X, it seems like a good idea to switch.

Is this right? Should you always switch?

Solution (Spoiler Alert)

If the above logic is correct, then, having switched, denote the amount of money contained in your new envelope as Y.

By the same reasoning as before, the expected value of the money in the original envelope = 1/2 (2Y) + 1/2 (Y/2) = 5/4 Y, which is greater than Y. So you should switch back.

But following the same logic, you should switch back again, and so on, indefinitely.

This would be a perpetual money-making machine. Something is surely wrong here.

One way to consider the question is to note that the total amount in both envelopes is a constant, A = 3X, with X in one envelope and 2X in the other.

If you select the envelope containing X first, you gain 2X-X = X by switching envelopes.

If you select the envelope containing 2X first, you lose 2X-X = X by switching envelopes. So your expected gain from switching = 1/2 (X) + 1/2 (-X) = 1/2 (X-X) = 0.

Looked at another way, the expected value for the originally selected envelope = 1/2 (2X) + 1/2 (X) = 3/2 X. The expected value for the envelope you switch to = 1/2 (2X) + 1/2 (X) = 3/2 X. These amounts are identical, so there is no expected gain (or loss) from switching.

So which is right: this reasoning or the original reasoning? There does not seem to be a flaw in either. In fact, there is a flaw in the earlier reasoning, which indicated that switching was the better option. So what is the flaw?

The flaw is in the way that the switching argument is framed, and it is contained in the possible amounts that could be found in the two envelopes. As framed in the original argument for switching, the amount could be £100, £200 or £50. More generally, there could be £X, £2X or £1/2 X in the envelopes. But we know that there are only two envelopes, so there can only be two amounts in these envelopes, not three.

You can frame this as £X and £2X, or as £1/2 X and £X, but not legitimately as £X, £2X and £1/2 X. By framing it as two amounts of money, not three, in the two envelopes, you derive the answer that there is no expected gain (or loss) from switching.

If you frame it as £X and £2X, there is a 0.5 chance you will get the envelope with £X, so by switching there is a 0.5 chance you will get the envelope with £2X, i.e. a gain of £X. Similarly, there is a 0.5 chance you selected the envelope with £2X, in which case switching will lose you £X. So the expected gain from switching is 0.5 (£X) + 0.5 (-£X) = £0.

If you frame it as £X and £1/2 X, there is a 0.5 chance you will get the envelope with £X, so by switching there is a 0.5 chance you will get the envelope with £1/2 X, i.e. a loss of £1/2 X. Similarly, there is a 0.5 chance you selected the envelope with £1/2 X, in which case switching will gain you £1/2 X. So the expected gain from switching is 0.5 (-£1/2 X) + 0.5 (£1/2 X) = £0.

There is demonstrably no expected gain (or loss) from switching envelopes.
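A quick simulation confirms this. Here the smaller amount is fixed at an assumed £100, the two envelopes are shuffled each round, and the average payoff of always sticking is compared with always switching:

```python
import random

random.seed(0)

X = 100            # assumed smaller amount; the other envelope holds 2X
trials = 100_000

stick = 0
switch = 0
for _ in range(trials):
    envelopes = [X, 2 * X]
    random.shuffle(envelopes)   # 50/50 which envelope you pick first
    stick += envelopes[0]       # payoff if you keep your first envelope
    switch += envelopes[1]      # payoff if you swap to the other one

print(f"average from sticking:  £{stick / trials:.1f}")
print(f"average from switching: £{switch / trials:.1f}")
```

Both strategies average out at the same 3X/2 = £150, confirming that switching confers no edge whatsoever.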

In order to resolve the paradox, you must label the envelopes before you make your choice, not after. So envelope 1 is labelled, say, A, and envelope 2 is labelled, say, B. A corresponds in advance to, say, £100, and B corresponds in advance either to £200 or to £50, but not both. You don’t know which corresponds to which. Whichever envelope you choose, the envelope marked in advance with the other letter will contain an equal amount more or less than the one you have selected. So there is no advantage (or disadvantage) in switching in terms of expected value. In summary, the clue to resolving the paradox lies in the fact that there are only two envelopes, and these contain two amounts of money, not three.