Bayes’ theorem concerns how we formulate beliefs about the world when we encounter new data or information. The original presentation of Rev. Thomas Bayes’ work, ‘An Essay towards Solving a Problem in the Doctrine of Chances’, was given in 1763, after Bayes’ death, to the Royal Society, by Mr. Richard Price. In framing Bayes’ work, Price gave the example of a person who emerges into the world and sees the sun rise for the first time. At first, he does not know whether this is typical or unusual, or even a one-off event. However, each day that he sees the sun rise again, his confidence increases that it is a permanent feature of nature. Gradually, through a process of statistical inference, the probability he assigns to his prediction that the sun will rise again tomorrow approaches 100 per cent. The Bayesian viewpoint is that we learn about the universe and everything in it through approximation, getting closer and closer to the truth as we gather more evidence. The Bayesian view of the world thus sees rationality probabilistically.
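To see how that confidence might creep towards 100 per cent, here is a minimal sketch. It does not reproduce Price's own calculation. It assumes Laplace's later 'rule of succession', which starts from a uniform prior over the unknown chance of a sunrise and gives (n + 1) / (n + 2) after n sunrises with no failures. The function name is my own, chosen for illustration.

```python
from fractions import Fraction


def prob_sunrise_tomorrow(sunrises_seen):
    """Laplace's rule of succession: uniform prior, no failures observed.

    Assumption: this is a stand-in for the idea in Price's example,
    not the calculation Price himself gave.
    """
    return Fraction(sunrises_seen + 1, sunrises_seen + 2)


# The probability rises with every sunrise but never reaches exactly 1.
for days in (1, 10, 100, 10_000):
    p = prob_sunrise_tomorrow(days)
    print(f"after {days:>6} sunrises: {float(p):.6f}")
```

The exact fractions make the point that certainty is approached but never reached: there is always a sliver of doubt left, however long the run.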
As such, Bayes’ perspective on cause and effect can be contrasted with that of David Hume, the logic of whose argument on this issue is contained in ‘An Enquiry Concerning Human Understanding’. According to Hume, we cannot justify our assumptions about the future based on past experience unless there is a law that the future will always resemble the past. No such law exists. Therefore, we have no fundamentally rational support for believing in causation. Bayes instead formalizes the laws of probability and applies them to the science of reason, to the issue of cause and effect.
I propose that we apply the same Bayesian perspective to Immanuel Kant’s duty-based ‘Categorical Imperative.’ This can be summarised in the form: ‘Act only according to that maxim which you could simultaneously will to be a universal law.’ On this basis, to lie or to break a promise doesn’t work as a practical imperative, because if everyone lied or broke their promises, then the very concept of telling the truth or keeping one’s promises would be turned on its head. A society that worked according to the universal principle of lying or promise-breaking would be unworkable. Kant thus argues that we have a perfect duty not to lie or break our promises, or indeed do anything else that we could not justify being turned into a universal law.
The problem with this approach in many eyes is that it is too restrictive. If a crazed gunman demands that you reveal which way his potential victim has fled, you must not lie to save the victim, because lying could not be universalised as a rule of behaviour.
I propose that the application of a justification argument can solve the problem. This argument from justification is that you have no duty to respond to any request which is posed without reasonable appeal to duty. So, in this example, the gunman has no reasonable appeal to duty from you, so you can make an exception to the general rule.
Why is this consistent with the practical implications of Kant’s ‘universal law’ maxim? It’s an issue of probability. In the great majority of situations, you have no defence based on the argument from justification for lying or breaking a promise. So the universal expectation is that truth-telling and promise-keeping are overwhelmingly probable. The more often this turns out to be true in practice, the closer this approach converges on Kant’s absolute imperative by a process of simple Bayesian updating.
In a world in which ethics is indeed based on duty, it is this broader conception of duty which, I propose, should inform our actions.
The Abilene Paradox is a classic management parable. Does it sound familiar in your family or workplace? If so, it may be time to do something about it.
THE ABILENE PARADOX
On a hot afternoon visiting in Coleman, Texas, the family is comfortably playing dominoes on a porch, until the father-in-law suggests that they take a trip to Abilene [53 miles north] for dinner. The wife says, “Sounds like a great idea.” The husband, despite having reservations because the drive is long and hot, thinks that his preferences must be out-of-step with the group and says, “Sounds good to me. I just hope your mother wants to go.” The mother-in-law then says, “Of course I want to go. I haven’t been to Abilene in a long time.”
The drive is hot, dusty, and long. When they arrive at the cafeteria, the food is as bad as the drive. They arrive back home four hours later, exhausted.
One of them dishonestly says, “It was a great trip, wasn’t it?” The mother-in-law says that, actually, she would rather have stayed home, but went along since the other three were so enthusiastic. The husband says, “I wasn’t delighted to be doing what we were doing. I only went to satisfy the rest of you.” The wife says, “I just went along to keep you happy. I would have had to be crazy to want to go out in the heat like that.” The father-in-law then says that he only suggested it because he thought the others might be bored.
The group sits back, perplexed that they together decided to take a trip which none of them wanted. They each would have preferred to sit comfortably, but did not admit to it when they still had time to enjoy the afternoon.
Originally stated by George Washington University Professor, Jerry B. Harvey.
If there is more than one possible universe, impenetrable to the others, is it enough that God is God of this universe, in order to be God, or does he have to be God of all of the possible universes?
Ethical imperatives can, I suggest, be usefully classified into reason-based ethical imperatives and evidence-based ethical imperatives. Evidence-based ethical imperatives can, of course, be influenced by evidence. But reason-based ethical imperatives cannot. Both are subject to the application of reason for their formulation, however, and both can be challenged by reason.
Both types of ethical imperative are duty-based. To the extent that their justification depends on evaluating their particular consequences, they are not true imperatives.
When acting in accordance with an ethical imperative, I suggest also that a Law of Justification holds, i.e. nobody has a duty to undertake any particular action in response to others unless that person has a right to demand that action arising out of an absolute or personal ethical imperative of those being called upon to act. In other words, there is no duty to respond to any request which is posed without reasonable appeal to duty. This is, I suggest, a universal principle.
In the context of evaluating an evidence-based ethical imperative, I propose that greater weight should be attached (other things equal) to the evidence of a person whose personal incentive (including self-interest or self-regard) to offer that evidence or opinion is less. Particular weight should be attached, other things equal, to evidence or opinion offered by a person who has a personal disincentive (including harm to self-interest or self-regard) to offer it.
In evaluating subjective perception of evidence, weight should be given to a consideration of any implicit incentives, self-interest or self-regard which might affect that perception.
In assessing the value of an evidence-based ethical imperative, all evidence and opinion relevant to that imperative should reflect an objective evaluation of how consistent that evidence is with the imperative relative to its consistency with an alternative and competing ethical imperative, mediated by the degree of prior belief in these alternative personal ethical frameworks.
Simply put then, I advocate making a clear distinction between reason-based ethics and evidence-based ethics. Appeal to evidence cannot influence the first, appeal to reason can influence both. If a position is taken that some action is absolutely wrong, which no amount of evidence could contradict, I term this a reason-based ethical judgment, because it is open to influence, at least in principle, by reason, but not by evidence. If it is possible to change one’s mind based on the production of some evidence, then that is an evidence-based ethical judgement. No mind should ever be closed to reason when formulating or revising any ethical imperative.
It is very important to distinguish these, I propose, in order to help resolve or arbitrate ethical differences or to decide whether they are likely to be resolvable.
The next step is to identify actual examples of these ethical imperatives, and to probe how we might best resolve them.
If it is determined that the ethical positions under consideration contain no ethical imperatives but are instead consequence-based judgements, matters move on in a different direction, but can again be categorised into evidence-based and reason-based consequentialist ethics, and then considered from that perspective.
Truth and sound justice depend on sound ethics, among other things.
The task now is to try to establish which ethical frameworks are sound and which are not, by the application of reason and (perhaps) evidence.
People have been wrongly hanged because of it. People have been wrongly punished because of it. And people have suffered unnecessarily because of it. It’s the mistaken belief that the probability of something being true given the evidence is the same thing as the probability of seeing that evidence if it is true. This can have, has had and continues to have, devastating implications. In fact, the probabilities of these two things will often diverge enormously.
To dramatize the problem I will introduce you to the Shakespearean tragedy, Othello. In the play, Othello’s wife Desdemona is set up by the evil Iago, who plants a treasured keepsake that Othello had given her in the home of young Cassio. When Othello comes upon the keepsake, he soon leaps to the mistaken conclusion that Desdemona has been unfaithful to him, with tragic consequences.
Othello made the mistake of believing that the probability that Desdemona was unfaithful given the evidence of the treasured keepsake being found in Cassio’s home was the same as the probability that the keepsake would have been found in Cassio’s home if Desdemona had been unfaithful to him. Easy mistake. We do it all the time in everyday life, usually with less dramatic implications. More importantly, juries do it all the time, as do practitioners in other fields, like medicine.
Let’s put it another way. What is the chance that someone who has been repeatedly shot in a flat that you rent out would die? Very high. The evidence here is the dead person, the gunshot wounds and the fact that you have access to the flat. The hypothesis is that you are the murderer. Now, the probability we would see that evidence if you ARE the murderer is 100%. But the probability that you are the murderer given that we see that evidence is much lower. There are perhaps many different people who could have committed the murder, even if you are one of them. This seems obvious, and when stated this way it IS obvious, but in real life the problem is usually not stated or understood so clearly, and is often disguised.
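The gap between the two probabilities can be made concrete with a small sketch of the flat example. The figures are hypothetical: I assume 20 people could have got into the flat, that each is equally likely to be the killer before any evidence is seen, and that the same evidence would turn up whichever of them did it. The function and variable names are mine.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """P(hypothesis | evidence) from the prior and the two likelihoods."""
    numerator = p_evidence_if_true * prior
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# Hypothetical assumption: 20 people had access to the flat.
n_with_access = 20
prior_guilt = 1 / n_with_access

# P(evidence | you are the murderer) = 100%, as in the text.
p_evidence_if_guilty = 1.0
# Assumption: the same evidence would be seen if one of the others did it.
p_evidence_if_innocent = 1.0

p_guilt = posterior(prior_guilt, p_evidence_if_guilty, p_evidence_if_innocent)
print(f"P(evidence | guilty) = {p_evidence_if_guilty:.0%}")
print(f"P(guilty | evidence) = {p_guilt:.0%}")
```

Under these assumptions the evidence is certain if you are guilty, yet it leaves the probability of your guilt exactly where it started, at one in twenty.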
This is sometimes referred to as the ‘Prosecutor’s Fallacy.’ It is the fallacy of making out that someone is guilty because the evidence is consistent with their guilt. This is often enough to convict, because this measure is often confused with the probability that the accused is guilty given that the evidence exists. They are totally different things. But even when they are clearly distinguished, the probability we assign to guilt can be seriously over-estimated because of a common cognitive failing known as the prior indifference fallacy. This is the fallacy of believing that the likelihood that something is true rather than false, when we have little prior idea, starts out as 50-50. This is just not so without proper justification but the implications of this belief, which may be implicit, are potentially huge. The prior probability, in the absence of any evidence, is simply not 50-50 unless there is a very good reason to believe that to be so before we see any evidence. But without the evidence, what good reason could there be for believing that to be so? Unless we can anchor this properly, all successive evidence-based reasoning will be flawed.
Fortunately for us, there is a rule used by those conversant with the laws of probability which can in fact help determine the actual relation between the truth of a hypothesis and the evidence relating to that hypothesis. The solution it arrives at is very rarely the same as would be arrived at without it. It is called Bayes’ Rule, but not many people know it, or how to apply it. Until more people do, the relationship between truth and justice is likely to remain severely strained.
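As a rough illustration of why the starting point matters so much, the sketch below applies Bayes’ Rule to the same piece of evidence from several different priors. The likelihoods (evidence seen 90 per cent of the time if the hypothesis is true, 10 per cent of the time if it is false) and the priors are hypothetical numbers of my own, not figures from any real case.

```python
def bayes_rule(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' Rule for a hypothesis that is either true or false."""
    true_branch = p_evidence_if_true * prior
    false_branch = p_evidence_if_false * (1.0 - prior)
    return true_branch / (true_branch + false_branch)


# Hypothetical likelihoods, held fixed while the prior varies.
P_EVIDENCE_IF_TRUE = 0.9
P_EVIDENCE_IF_FALSE = 0.1

for prior in (0.5, 0.1, 0.01, 0.001):
    post = bayes_rule(prior, P_EVIDENCE_IF_TRUE, P_EVIDENCE_IF_FALSE)
    print(f"prior {prior:>6.3f} -> posterior {post:.3f}")
```

With these numbers, starting at 50-50 gives a posterior of 0.9, while starting at 1 in 100 gives about 0.08. The evidence is identical; only the anchor has changed.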
It shouldn’t be possible for us to exist. But we do. That’s the sort of puzzle I like exploring. So I will. Let’s start with the so-called ‘Cosmological Constant.’ This is an extra term added by Einstein in working out equations in general relativity that describe a non-expanding universe. The cosmological constant is needed to explain why a static universe doesn’t collapse in upon itself through the action of gravity. It’s true that the force of gravity is infinitesimally small compared to the electromagnetic force, but it has a lot more influence on the universe because all the positive and negative electrical charges in the universe somehow seem to balance out. Indeed, if there were just a 0.00001 per cent difference in the net positive and negative electrical charges within a body, it would be torn apart and cease to exist. This cosmological constant, therefore, is added to the laws of physics simply to balance the force of gravity contributed by all of the matter in the universe. What it represents is a sort of unobserved “energy” in the vacuum of space which possesses density and pressure, which prevents a static universe from collapsing in upon itself. But we now know from observation that galaxies are actually moving away from us and that the universe is expanding. In fact, the Hubble Space Telescope observations in 1998 of very distant supernovae showed that the Universe is expanding more quickly now than in the past. So the expansion of the Universe has not been slowing due to gravity, but has been accelerating. We also know how much unobserved energy there is because we know how it affects the Universe’s expansion. But how much should there be? We can calculate this using quantum mechanics. The easiest way to picture this is to visualize “empty space” as containing “virtual” particles that continually form and then disappear. This “empty space”, it turns out, “weighs” 10^93 grams per cubic centimetre.
Yet the actual figure differs from that predicted by a factor of 10 to the power of 120. The “vacuum energy density” as predicted is simply 10^120 times too big. That’s a 1 with 120 zeros after it. So there is something cancelling out all this “dark” energy, to make it 10 to the power of 120 smaller in practice than it should be in theory. Now this is very fortuitous. If the cancellation figure was one power of ten different, 10 to the power of 119, then galaxies could not form, as matter would not be able to condense, so no stars, no planets, no life. So we are faced with the mindboggling fact that the positive and negative contributions to the cosmological constant cancel to 120 digit accuracy, yet fail to cancel beginning at the 121st digit. In fact, the cosmological constant must be zero to within one part in roughly 10^120 (and yet be nonzero), or else the universe either would have dispersed too fast for stars and galaxies to have formed, or else would have collapsed upon itself long ago. How likely is this by chance? Essentially, it is the equivalent of tossing a coin and needing to get heads 400 times in a row and achieving it. Go on. Do you feel lucky? Now, that’s just one constant that needs to be just right for galaxies and stars and planets and life to exist. There are quite a few, independent of this, which have to be equally just right. We can talk about those another time, but this I think sets the stage. I’ve heard this called the fine-tuning argument. I’m now rather interested in finding out who or what is composing this very fine tune.
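The coin-toss comparison is easy to check with a few lines of arithmetic. The sketch assumes a fair coin and treats ‘one part in 10^120’ as a simple one-in-10^120 chance, which is the reading the comparison relies on. The variable names are my own.

```python
import math

heads_needed = 400

# Odds against 400 heads in a row with a fair coin are 2**400 to 1.
one_in = 2 ** heads_needed
log10_odds = heads_needed * math.log10(2)

# Fewest consecutive heads whose odds are at least as long as 1 in 10**120.
tosses_for_1e120 = math.ceil(120 / math.log10(2))

print(f"2**{heads_needed} has {len(str(one_in))} digits")
print(f"that is about 10**{log10_odds:.1f}")
print(f"heads in a row needed to match 1 in 10**120: {tosses_for_1e120}")
```

So 400 heads in a row is, if anything, slightly harder than the one-in-10^120 target, which makes the comparison a fair one.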
I saw some pink flamingos recently at Twycross zoo in Warwickshire, and wondered to myself whether all flamingos are pink. What would it take to confirm or disprove the hypothesis? The nice thing about this sort of hypothesis is that it’s testable and potentially falsifiable. All it takes is to find a flamingo that is not pink, and I can conclude that not all flamingos are pink. Just one observation can change my flamingo world view. However many pink flamingos I witness, though, no number short of every flamingo that could exist can prove the hypothesis. Still, the more I see that are pink, the more probable it becomes that all flamingos are actually pink. How probable I consider that is at any given time is related to how probable I thought it was before I saw the latest one. While I was considering this, I saw someone wearing blue tennis shoes. Instantly I realised that made it more likely that all flamingos are pink, but struggled to make intuitive sense of it. The only reason I knew it at all is that the pink flamingo thought experiment is simply one example of a broader paradox first formally identified by Carl Gustav Hempel, sometimes known as Hempel’s paradox or else the Raven Paradox. The paradox arises from asking whether observing a green apple makes it more likely that all ravens are black, assuming that you don’t know the answer. It would intuitively seem not. Why should seeing a green apple tell you anything about the colour of ravens? The way to answer this is to re-state ‘All ravens are black’ as ‘Everything that is not black is not a raven.’ In fact, these two statements are logically equivalent. To see this, assume there are just two ravens and two tennis shoes (one right-foot, one left-foot) in the whole world. Now you identify the colour of each of these objects. You observe that both tennis shoes are blue and the other two objects are black.
So you announce that everything that is not black (each of the tennis shoes) is not a raven. This is identical to saying that all ravens are black. The logic universalises to any number of objects and colours. Assume now we see just one of the tennis shoes and it turns out to be blue. You can now announce that one possible thing that is not black is not a raven. If you see the other tennis shoe and it is blue, that means that there are now two things that are not black that are not a raven. Each time you see something, it is possible that you would not be able to say this – i.e. you would say instead that you have seen something not black and it is a raven. It is like being dealt a playing card from a deck of four which contains only blue or black cards. You are dealt a black card, and it shows a raven. You know that at least one of the other cards is a raven, and it could be a black card or a blue card. You receive a blue card. Now, before you turn it over, what is the chance it is a raven? You don’t know, but whatever it is, the chance that only black cards show ravens improves if you turn the blue card over and it shows a tennis shoe. Each time you turn a blue card over it could show a raven. Each time that it doesn’t makes it more likely that none of the blue cards shows a raven. Substitute all non-ravens for tennis shoes and all colours other than black for the blue cards, and the result universalises. Every time you see an object that is not black and is not a raven, it makes it just that tiny, tiny bit more likely that everything that is not black is not a raven, i.e. that all ravens are black. How much more likely? This depends on how observable non-black ravens would be if they exist. If there is no chance that they would be seen even if they exist, because non-black ravens never emerge from the nest, say, it is much more difficult to falsify the proposition that all ravens are black. 
So in that case, observing a blue tennis shoe offers less evidence for the ‘all ravens are black’ hypothesis than it would if the blue thing you saw could have turned out to be a raven and not a tennis shoe. More generally, the more likely a non-black raven is to be observed if it exists, the more evidence observation of a non-black object offers for the hypothesis that all ravens are black. The same goes for pink flamingos. So to summarize, we have a paradox traceable to Hempel which can be generalised by appeal to the Vaughan Williams Possibility Theorem. Let’s do this using the pink flamingo thought experiment. Proposition 1: All flamingos are pink. Proposition 2 (logically equivalent to Prop. 1): Everything that is not pink is not a flamingo. Proposition 3: If something might or might not exist, but is unobservable, it is more likely to exist than something which can be observed, with any positive probability, but is not observed. If something might or might not exist, it is more likely to exist if it is less likely to be observed than something else which is more likely to be observed, and is not observed (Possibility Theorem). So when I see two blue tennis shoes, I am ever so slightly more confident that all flamingos are pink than before I saw them, and especially so if any non-pink flamingos out there are easy to spot. And I’d still be wrong, but for all the right reasons, until I saw an orange or white flamingo, and then I’d be right, and sure.
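The role of observability can be sketched as a simple Bayesian update. This is a toy model of my own, not a formal statement of the theorem. It compares ‘all ravens are black’ with a single rival hypothesis in which a share of non-black objects are ravens (raven_share), each of which is seen with some probability (visibility). The prior of 0.7 and the other numbers are hypothetical.

```python
def update_all_ravens_black(prior, raven_share, visibility, sightings=1):
    """Posterior for 'all ravens are black' after seeing non-black non-ravens.

    Toy assumptions: under the rival hypothesis a fraction raven_share of
    non-black objects are ravens, and each is observable with probability
    visibility. Non-ravens are always observable.
    """
    # Chance that an observed non-black object is a non-raven under the rival.
    p_non_raven_if_rival = (1 - raven_share) / (
        (1 - raven_share) + raven_share * visibility
    )
    likelihood_all_black = 1.0  # a non-black thing can never be a raven
    likelihood_rival = p_non_raven_if_rival ** sightings
    numerator = likelihood_all_black * prior
    return numerator / (numerator + likelihood_rival * (1 - prior))


prior = 0.7          # hypothetical starting belief
raven_share = 0.01   # hypothetical share of non-black objects that are ravens

for visibility in (0.0, 0.5, 1.0):
    for sightings in (1, 2, 100):
        post = update_all_ravens_black(prior, raven_share, visibility, sightings)
        print(f"visibility {visibility:.1f}, {sightings:>3} sightings -> {post:.4f}")
```

When non-black ravens could never be seen (visibility 0), a blue tennis shoe changes nothing. The easier they would be to spot, the more each blue shoe nudges the belief upwards, though only by a tiny amount each time.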