‘Superforecasting’ is a term popularised from insights gained as part of a fascinating idea known as the ‘Good Judgment Project’, which consists of running tournaments where entrants compete to forecast the outcome of national and international events.
The key conclusion of this project is that an identifiable subset of those taking part (so-called ‘Superforecasters’) was able to consistently and significantly out-predict their peers. To the extent that this ‘superforecasting’ is real, and it seems to be, it provides support for the belief that markets can be beaten, and beaten systematically.
So what is special about these ‘Superforecasters’? A key distinguishing feature of these wizards of prediction is that they tend to update their estimates much more frequently than regular forecasters, and they do so in smaller increments. Moreover, they tend to break big intractable problems down into smaller tractable ones.
They are also much better than regular forecasters at avoiding the trap of underweighting new information or overweighting it. In particular, they are good at evaluating probabilities dispassionately using a so-called Bayesian approach, i.e. establishing a prior (or baseline) probability that an event will occur, and then constantly updating that probability as new information emerges, incrementally updating in proportion to the weight of the new evidence.
In adopting this approach, the Superforecasters are echoing the response of legendary economist, John Maynard Keynes, to a criticism made to his face that he had changed his position on monetary policy.
“When my information changes, I alter my conclusions. What do you do, Sir?”
In this, Keynes was one of the great ‘Superforecasters.’ Keynes went on to earn a fortune betting in the currency and commodity markets.
Superforecasters in the field of sports betting can benefit in particular from betting in-running, while the event is taking place. Their evaluations are also likely to be data-driven, and are updated as frequently as possible, taking into account variables some of which may not even exist pre-match.
They will be aware of players who tend to struggle to close the deal, whether in golf, tennis, snooker, or whatever, and who may be value ‘lays’ when trading in-running at short prices. Or shaky starters, like batsmen whose average belies their likely performance once they get into double figures. This information is only valuable, however, if the market doesn’t already incorporate it. So they gain an edge by access to and dispassionate analysis of large data sets. Moreover, they are very aware that patterns spotted, and conclusions derived, from small data sets can be dangerous, and potentially very hazardous to the accumulation of wealth.
Superforecasters also tend to use ‘triage’: the process of determining the most important things from amongst a large number that require attention. Risk expert and hedge fund manager Aaron Brown offers an example of how, when he first became interested in basketball in the 1970s, there were data analysts who tried to analyse the game from scratch. He considered that a hard proposition compared to asking which team was likely to attract more betting interest. As Los Angeles was a rich and high-betting city, and the LA Lakers a glamorous team, he figured it wasn’t hard to guess that the betting public would disproportionately favour the Lakers, and that the spread would therefore be slanted against them. ‘Bet against the Lakers at home’ became his strategy, and he observes that it took a lot less effort than simulating basketball games.
Could such a simple strategy work today, tweaked or otherwise? And in what circumstances would you apply it? That’s a more nuanced issue, but Superforecasters (who are normally very keen on big data sets) would be alert to it.
Aaron Brown sees trading contracts on the future as striking the right balance between under- and over-confidence, between prudence and decisiveness. The hard part, he observes, is that confidence is negatively correlated with accuracy. Even experienced risk takers bet more when they’re wrong than when they’re right, he says, and the most confident people are generally the least reliable.
The solution, he maintains, is to keep careful, objective records, preferably by a third party.
That’s right – even experienced risk takers bet more when they’re wrong than when they’re right. If true, this is a critical insight.
So how might a Superforecaster go about constructing a sports forecasting model?
Let’s say he wants to construct a model to forecast the outcome of a football match or a golf tournament. In the former, he might focus on assessing the likely team line-up before its announcement, and draw on his hopefully extensive data set to eke out an edge from that. The football market is very liquid and likely to be quite efficient with respect to known information, so any forecasting edge in terms of estimating future information, like team shape, can be critical. The same might apply to rugby, cricket, and other team games.
In terms of golf, he could include statistics on the average length of drive of the players, their tee to green percentages, their putting performance, the weather, the type of course, and so on. But where is the edge over the market?
He could try to develop a better model than others, including using new, state-of-the-art econometric techniques. In trying to improve the model, he could also seek to identify additional explanatory variables.
He might also turn to the field of ‘prospect theory’, a body of work pioneered by Daniel Kahneman and Amos Tversky. This states that people behave and make decisions according to a frame of reference rather than just the final outcome. Humans, according to prospect theory, do not think or behave totally rationally, and this insight could be built into the model.
In particular, a key plank of prospect theory is ‘loss aversion’, the idea that people treat losses more harshly than equivalent gains, and that they view these losses and gains with regard to a sometimes artificial frame of reference.
An excellent seminal paper on this effect in golf (by Devin Pope and Maurice Schweitzer, in the American Economic Review) is a good example of the way in which study of the economic literature can improve sports modelling. The key contribution of the Pope and Schweitzer paper is that it shows how prospect theory can play a role even in the behaviour of highly experienced and well-incentivised professionals. In particular, they demonstrate, using a database of millions of putts, that professional golfers are significantly more likely to make a putt for par than a putt for birdie, even when all other factors, such as distance to the pin and break, are allowed for. But why? And how does prospect theory explain it?
They examine a number of possible explanations, rejecting them one by one until they arrive at the true one. They find it is because golfers see par as the ‘reference’ score, and so a missed par is viewed (subconsciously or otherwise) by these very human golfers as a significantly greater loss than a missed birdie. They react irrationally in consequence, and cannot help themselves from doing so even when made aware of it. The researchers show that equivalent birdie putts tend to come up slightly short relative to par putts. This is valuable information for Superforecasters, or even the casual bettor. It is also valuable information for a sports psychologist. If only someone could stand close to a professional golfer every time they stand over a birdie putt and whisper in their ear, ‘This is for par’, it would over time make a significant difference to their performance and pay.
So Superforecasters will improve their model by increments, taking into account factors which more conventional thinkers might not even consider, and will apply due weight to updating their forecasts as new information emerges.
In conclusion, how might we sum up the difference between a Superforecaster and an ordinary mortal? Watch them as they view the final holes of the Masters golf tournament. What’s the chance of Sergio Garcia sinking that 10-footer? The ordinary mortal will just see the putt, the distance to the hole and the potential break of the ball on the green. The Superforecaster is going one step further, and also asking whether the 10-footer is for par or birdie. It really does make a difference, and it’s why she is watching from the members’ area at the Augusta National Golf Club. She has earned her place there, and she knew it before anyone else.
Further Reading and Links
D.G. Pope and M.E. Schweitzer, 2011, Is Tiger Woods Loss-Averse? Persistent Bias in the Face of Experience, Competition and High Stakes, American Economic Review, 101(1), 129-157.
Philip Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction, 2016, London: Random House.
Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.
New Amsterdam has 1,000 taxis. 850 are yellow, 150 are green. One of these taxis accidentally knocks down a pedestrian and then drives away without stopping. We have no reason to believe that drivers of green taxis are any more or any less likely than drivers of yellow taxis to knock down a pedestrian and drive away. Neither do we have any reason to believe that green or yellow taxis are disproportionately represented in the area of New Amsterdam where the hit and run took place.
There is one witness, however, who did see the event. The witness says the colour of the taxi was green.
The witness is given a rigorous observation test, which recreates as closely as possible the event in question, and her judgment proves correct 80 per cent of the time. We have no reason to doubt the integrity of the witness.
So what is the probability that the taxi was green?
The intuitive answer is in the region of 80 per cent, as the only evidence is that of the witness, and the test of her powers of observation shows that she is right 80 per cent of the time. That is not the Bayesian approach, however, which is to also consider the evidence in the light of the baseline, or prior, probability that the taxi was green before the witness evidence came to light.
The prior probability can be derived from an identification of the proportion of taxis in New Amsterdam that are green. This is 15 per cent (of the 1,000 taxis, 150 are green).
Now, the (posterior) probability that a hypothesis is true after obtaining new evidence, according to the x,y,z formula of Bayes’ Theorem, is equal to:
xy/[xy+z(1-x)]
x is the prior probability, i.e. the probability that a hypothesis is true before the new evidence arises.
y is the probability the new evidence would arise if the hypothesis is true.
z is the probability the new evidence would arise if the hypothesis is false.
This is a straightforward calculation.
x = 0.15 (15 per cent of taxis are green)
y = 0.8 (the witness is correct 80 per cent of the time)
z = 0.2 (the witness is wrong 20 per cent of the time)
Inserting these numbers into the formula gives:
Posterior probability = 0.15 × 0.8 / (0.15 × 0.8 + 0.2 × 0.85) = 0.12 / (0.12 + 0.17) ≈ 41%
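The taxi calculation is easy to check in code. Here is a minimal Python sketch of the x, y, z formula (the function name `posterior` is purely illustrative):

```python
def posterior(x, y, z):
    """Bayes' Theorem in the x, y, z form: xy / [xy + z(1 - x)].

    x: prior probability the hypothesis is true
    y: probability of the evidence if the hypothesis is true
    z: probability of the evidence if the hypothesis is false
    """
    return (x * y) / (x * y + z * (1 - x))

# Taxi problem: 15% of taxis are green; the witness is right 80% of the time.
p = posterior(0.15, 0.8, 0.2)
print(round(p, 3))  # 0.414, i.e. about 41 per cent
```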
In other words, the true probability that the taxi that knocked down the pedestrian was green is not 80 per cent (despite the witness evidence) but about half that. The baseline probability is that important.
But Bayesians are not content to leave it at that. The next step is to look for further new evidence.
Say, for example, that a new witness appears, totally independent of the other, and is also given the observation test, revealing a reliability score of 90 per cent. Again, we have no reason to doubt the integrity of this witness. What a Bayesian does now is to insert that number (0.9) into the Bayes formula (y=0.9) so that z (the probability that the witness is mistaken) = 0.1.
The new baseline (or prior) probability, x, is no longer 0.15, as it was before the first witness appeared, but 0.41 (the probability incorporating the evidence of the first witness).
New posterior probability = 0.41 × 0.9 / (0.41 × 0.9 + 0.1 × 0.59) = 0.369 / (0.369 + 0.059) ≈ 86.2%
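The two-witness update can be sketched by feeding each posterior back in as the next prior. One detail worth noting: chaining the exact, unrounded intermediate value gives 86.4 per cent; the 86.2 per cent figure arises from rounding the intermediate prior to 0.41.

```python
def posterior(x, y, z):
    # Bayes' Theorem: xy / [xy + z(1 - x)]
    return (x * y) / (x * y + z * (1 - x))

p = 0.15                    # prior: 15% of taxis are green
p = posterior(p, 0.8, 0.2)  # first witness, 80% reliable -> ~0.414
p = posterior(p, 0.9, 0.1)  # second witness, 90% reliable
print(round(p, 3))          # 0.864 (86.2% above, due to the rounded prior)
```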
This is also the new baseline probability underpinning any new evidence which might arise.
There are three illustrative cases which bear highlighting.
The first is a scenario where the new witness scores 50 per cent on the observation test. Here is a case where intuition and Bayes’ formula converge. Intuition tells us that a witness who is right only half the time about the colour of the taxi is also wrong half the time, and so any evidence they give is worthless. In terms of the equation, such a witness would be accorded y = 0.5 and z = 0.5.
Putting these values of y and z into the equation leads to the following:
xy/[xy + z(1-x)] becomes 0.5x / [0.5x + 0.5(1-x)] = 0.5x / (0.5x + 0.5 - 0.5x) = 0.5x / 0.5 = x
So when y and z both equal 0.5 in regard to new evidence, this evidence has no impact on the probability that the hypothesis being tested is true: the posterior probability equals the prior probability, x. In this case, the witness’s evidence can be discounted.
The second illustrative case is where a new witness is 100 per cent reliable about the colour of the taxi. In this case, y =1 and z =0. Intuition tells us that the evidence of such a witness solves the case. If the infallible witness says the taxi was green, it was green. Bayes’ formula agrees. Inserting y = 1, z = 0 into the formula gives:
xy/[xy+z(1-x)] = x / (x + 0) = x/x = 1.
So the new (posterior) probability that the taxi is green = 1.
This leads directly to the third illustrative case. If the new witness scores 0 per cent on the observation test, this indicates that they always identify the wrong colour for the taxi. If they say it is green, it is definitely not green. So the chance (posterior probability) that the taxi is green if they say so is zero. This accords with the formula.
xy/[xy+z(1-x)] = 0 / [0 + (1-x)] = 0
Of course, this is valuable information, as it can be reversed to useful effect. A witness who always identifies a green taxi as yellow and vice-versa, and is 100 per cent consistent in doing so, yields us infallible information simply by reversing their identified colour.
So if the witness says the taxi is yellow, we can now identify the taxi as definitely green. This now converges on the second illustrative case.
Similarly, a witness who is, say, 25 per cent accurate in identifying the colour of the taxi in the observation test also yields us valuable information. By reversing the identified colour, this yields a 75 per cent accuracy score, which can be inserted accordingly into Bayes’ formula to update the probability that the taxi that knocked down the pedestrian was green.
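The reversal point can be verified numerically: a 25-per-cent-accurate witness saying ‘yellow’ is informationally equivalent to a 75-per-cent-accurate witness saying ‘green’. A short sketch:

```python
def posterior(x, y, z):
    # Bayes' Theorem: xy / [xy + z(1 - x)]
    return (x * y) / (x * y + z * (1 - x))

# Taking a 25%-accurate witness at face value when they say "green":
face_value = posterior(0.15, 0.25, 0.75)
# Reversing the identification: their "yellow" is a 75%-reliable "green":
reversed_id = posterior(0.15, 0.75, 0.25)
print(round(face_value, 3), round(reversed_id, 3))  # 0.056 0.346

# A coin-flip witness (50% accurate) leaves the prior unchanged:
print(round(posterior(0.15, 0.5, 0.5), 3))  # 0.15
```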
The only observation evidence that is worthless, therefore, is evidence that could have been produced by the flip of a fair coin.
And the conclusion to the case? CCTV evidence was later produced in court which was able to conclusively identify the taxi and the driver. The pedestrian never regained consciousness. The driver told the jury that he panicked when the pedestrian unexpectedly stepped out in front of him, and drove off because he feared he would lose his livelihood. He was completely unaware that the victim had hit his head awkwardly, and had thought at the time that it was a very minor accident.
This was rejected by the jury, who accepted the prosecution’s contention that he had acted with premeditation. They based their decision on their view that a driver who was so motivated would indeed have driven off. The taxi driver in this case did drive off, which was what someone who acted wilfully, deliberately and with premeditation would do. It was all the evidence they needed to reach their unanimous verdict.
James Parker, a 29-year-old long-time resident of New Amsterdam, of previous good character, with no previous convictions or any known motive for the crime, is currently serving a sentence of life in a maximum security prison with no possibility of parole.
A murder has been committed. There are five suspects, all of whom we consider equally likely to be guilty at the start of the investigation.
So 20 per cent is the prior probability of guilt for each suspect, before any new evidence is found. The names of the suspects are: Reverend Green, Colonel Mustard, Miss Scarlett, Professor Plum and Mrs. Peacock. The codename for the murder investigation is Operation Cluedo. The victim was Sir Caliban Mackenzie, a famed anthropologist, who was shot in the library while examining a rare first edition of Newton’s Principia.
Four hours into the investigation, evidence turns up which eliminates Reverend Green. He was leading the Holy Communion Service in the chapel at the time of the murder. There are now four remaining suspects, and so the probability that each of the remaining suspects is guilty rises to 25 per cent.
Two hours later, a new clue now arises which casts some doubt on the alibi of Colonel Mustard, whose probability of guilt we now judge to rise from 25 per cent to 40 per cent.
As a result, the probability that one of the other three suspects is guilty falls by 15 percentage points, down from a total of 75 per cent to 60 per cent. Since each of the three is equally likely to be guilty, we can now assign each a probability of guilt of 20 per cent, down from 25 per cent.
After a further 45 minutes, a third clue emerges, which eliminates Mrs. Peacock. She had been spotted by a number of reliable witnesses at the Communion service in the chapel along with Reverend Green.
So the big question is how should we now adjust the probabilities that Colonel Mustard, Miss Scarlett and Professor Plum pulled the trigger?
In other words, now that Mrs. Peacock has been eliminated, and taking account of the evidence which doubled the original likelihood that Colonel Mustard wielded the murder weapon (to 40 per cent), what is the best estimate of the revised probability that each of Mustard, Scarlett and Plum committed the murder?
Spoiler Alert: The Solution
One possibility would be to take the 20 per cent probability of guilt we had previously attached to Mrs. Peacock, and divide this equally between the three remaining suspects.
But to do so would be wrong, and notably at variance with the toolkit of a Bayesian detective, i.e. a detective who conducts investigations using the Bayesian approach to evidence and probability.
The Bayesian approach to detective work tells us always to consider the prior probability that each suspect is guilty before deducing the probability after some new evidence is brought to bear on it. Applying this method, the correct way to adjust the probabilities attached to the remaining suspects is to do so in a way that is proportional to their prior probability of guilt before Mrs. Peacock was eliminated from the enquiry.
Since Colonel Mustard was the prime suspect, with a probability of guilt of 40 per cent before Peacock’s elimination (compared to 20 per cent for Miss Scarlett and Professor Plum), a good Bayesian needs to increase the probability we assign to his guilt by twice as much as we increase theirs. So we should now raise the estimate of the probability that Colonel Mustard shot Sir Caliban from 40 per cent to 50 per cent, while we should increase the probability we assign to Miss Scarlett and Professor Plum from 20 per cent to 25 per cent.
This is all derived from Bayes’ Theorem, which tells us that in order to calculate the probability of a hypothesis being true given new evidence, we must multiply by the prior probability of the hypothesis being true before we are aware of the new evidence (Mrs. Peacock’s elimination from the enquiry). This prior probability is twice as big for Colonel Mustard as for either of the other remaining suspects.
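The renormalisation step described above can be sketched in a few lines of Python. Eliminating a suspect means dropping them and rescaling the remaining probabilities in proportion to their priors:

```python
# Probabilities of guilt before Mrs. Peacock is eliminated.
priors = {'Mustard': 0.40, 'Scarlett': 0.20, 'Plum': 0.20, 'Peacock': 0.20}

# Eliminate Mrs. Peacock, then renormalise the remaining suspects
# in proportion to their prior probabilities of guilt.
del priors['Peacock']
total = sum(priors.values())
updated = {name: p / total for name, p in priors.items()}
print(updated)  # {'Mustard': 0.5, 'Scarlett': 0.25, 'Plum': 0.25}
```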
Epilogue:
The estimated 50 per cent probability of guilt was more than sufficient to persuade the Crown Prosecution Service to haul the Colonel before a jury of his peers. In the event they convicted him, falling victim to the classic Prosecutor’s Fallacy, which is to confuse the probability that someone is guilty given the evidence with the probability of the evidence arising if someone is guilty. The likelihood of Sir Caliban being shot in the library if the Colonel was guilty of murder was quite high, and this led to his conviction. Unfortunately for the Colonel, the relevant probability (that he was guilty of murder given that Sir Caliban was shot in the library) was rather smaller but bypassed in the jury’s deliberations.
Meanwhile, the actual killer, Miss Scarlett, got away scot-free. She had concealed an incriminating letter in the Principia, thinking it would be safe there, until Sir Caliban unhappily chanced upon it. This left her no option, in her mind, but to use the pistol hidden in the Georgian chest of drawers gracing the back wall of the library.
The Colonel’s appeal was unanimously rejected. He is serving a life sentence. Miss Scarlett is living as a tax exile in Belize.
If there is a set of ‘game’ strategies with the property that no ‘player’ can benefit by changing their strategy while the other players keep their strategies unchanged, then that set of strategies and the corresponding payoffs constitute what is known as the ‘Nash equilibrium’.
Assume each Player can adopt a ‘Friendly’ or a ‘Hostile’ approach to the game. For example, a Friendly strategy might be to put down the weapon you are carrying in your hand. The Hostile strategy is to keep hold of it.
Now, depending on their respective actions let’s say the game organiser awards monetary payoffs to each player. These are shown below and are known to each player.
|  | Player B ‘Friendly’ | Player B ‘Hostile’ |
| --- | --- | --- |
| Player A ‘Friendly’ | 750 to A; 1000 to B | 25 to A; 2000 to B |
| Player A ‘Hostile’ | 1000 to A; 50 to B | 30 to A; 51 to B |
What is Player A’s best response to each of Player B’s actions?
If Player B acts ‘Friendly’, player A’s best payoff is if he acts ‘Hostile.’ This yields a payoff of 1000. If he had acted ‘Friendly’ he would have earned a payoff of only 750.
If Player B acts ‘Hostile’, player A’s best response is again to act ‘Hostile’. He earns 30 instead of the payoff of 25 he would receive if he acted ‘Friendly.’
In both cases his best response is to act ‘Hostile’.
What is Player B’s best response to each of Player A’s actions?
If Player A acts ‘Friendly’, player B’s best payoff is if he acts ‘Hostile.’ This yields a payoff of 2000. If he had acted ‘Friendly’ he would have earned a payoff of only 1000.
If Player A acts ‘Hostile’, player B’s best response is if he acts ‘Hostile’. He earns 51 instead of a payoff of 50 if he acted ‘Friendly.’
In both cases his best response is to act ‘Hostile.’
Now, a Nash equilibrium exists when each player’s action is a best response to the other’s, so that neither has an incentive to deviate unilaterally. In this case, both Player A and Player B have the same best response to either action of their opponent. Both should act ‘Hostile’, in which case Player A wins 30 and Player B wins 51.
But if both had been able to communicate and reach a joint, enforceable decision, they would both presumably have acted ‘Friendly.’
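The best-response logic above can be automated. Here is a minimal Python sketch that searches the 2×2 payoff table for pure-strategy Nash equilibria, i.e. cells where neither player can gain by deviating alone (the helper name `pure_nash` is illustrative, not a library function):

```python
# Payoffs (to A, to B) for each pair of actions in the Friendly/Hostile game.
payoffs = {
    ('Friendly', 'Friendly'): (750, 1000),
    ('Friendly', 'Hostile'):  (25, 2000),
    ('Hostile',  'Friendly'): (1000, 50),
    ('Hostile',  'Hostile'):  (30, 51),
}
actions = ['Friendly', 'Hostile']

def pure_nash(payoffs, actions):
    """Return the (a, b) pairs where each action is a best response to the other."""
    equilibria = []
    for a in actions:
        for b in actions:
            a_best = all(payoffs[(a, b)][0] >= payoffs[(a2, b)][0] for a2 in actions)
            b_best = all(payoffs[(a, b)][1] >= payoffs[(a, b2)][1] for b2 in actions)
            if a_best and b_best:
                equilibria.append((a, b))
    return equilibria

print(pure_nash(payoffs, actions))  # [('Hostile', 'Hostile')]
```

The same search applied to the code-matching and emblem games below confirms two equilibria and none, respectively.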
Let’s now turn to the world of espionage in seeking out a Nash equilibrium. Let’s assume that there are two possible codes, and Agent A can select either of them and so can Agent B. The payoff to selecting non-matching codes is zero.
|  | Agent B ‘Uses Code A’ | Agent B ‘Uses Code B’ |
| --- | --- | --- |
| Agent A ‘Uses Code A’ | 1000 to A; 500 to B | 0 to A; 0 to B |
| Agent A ‘Uses Code B’ | 0 to A; 0 to B | 500 to A; 1000 to B |
So where is the Nash equilibrium?
Top left: Neither Agent can increase their payoff by choosing a different action to the current one. In other words, there is no incentive for either Agent to switch given the strategy of the other Agent. So this is a Nash equilibrium.
Bottom right: As above, neither Agent can gain by switching given the other’s choice, so this is also a Nash equilibrium.
Top right: By choosing to switch to Code B instead of code A, Agent A obtains a payoff of 500, given Agent B’s actions. Similarly for Agent B, who would gain by switching to code A, given Agent A’s strategy. So top right (Agent A uses code A and Agent B uses Code B) is NOT a Nash equilibrium, as both Agents have an incentive to switch given what the other Agent is doing.
Bottom left is the same as Top right. As above, there are incentives to switch. So it is NOT a Nash equilibrium.
In conclusion, this game has two Nash equilibria, top left (Agent A and Agent B both use Code A) and bottom right (Agent A and Agent B both use Code B).
Now consider the classic Safe/Crash problem. In this problem, if both drivers drive on the left of the road they will be safe, while they will crash if they choose opposite sides of the road. This is shown in the box diagram below.
|  | Driver B ‘Drives on left’ | Driver B ‘Drives on right’ |
| --- | --- | --- |
| Driver A ‘Drives on left’ | Safe; Safe | Crash; Crash |
| Driver A ‘Drives on right’ | Crash; Crash | Safe; Safe |
So there are again two Nash equilibria here. Top left and Bottom right. In both these scenarios, there is no incentive for either Driver to switch to the other side of the road given the driving strategy of the other driver.
Now let’s consider the case of two companies who each have the option of using one of two emblems. We shall call the first the Blue Badger emblem and the other the Black Bull emblem.
|  | Firm B uses Black Bull emblem | Firm B uses Blue Badger emblem |
| --- | --- | --- |
| Firm A uses Black Bull emblem | 1000 to A, 500 to B | 500 to A, 1000 to B |
| Firm A uses Blue Badger emblem | 500 to A, 1000 to B | 1000 to A, 500 to B |
If we consider each section in turn, we arrive at the following result.
Top left: Firm B gains by switching from the Black Bull to the Blue Badger emblem.
Top right: Firm A gains by switching from the Black Bull to the Blue Badger emblem.
Bottom left: Firm A gains by switching from the Blue Badger to the Black Bull emblem.
Bottom right: Firm B gains by switching from the Blue Badger to the Black Bull emblem.
So this game has no Nash equilibrium in pure strategies (though, like all finite games, it has one in mixed strategies, in which each firm randomises over its emblems).
So we have highlighted examples of games with one, two, and no pure-strategy Nash equilibria.
This leads us to the classic ‘Prisoner’s Dilemma’ problem. Are there any Nash equilibria here, and if so how many? In this scenario, two prisoners, linked to the same crime, are offered a discount on their prison terms for confessing if the other prisoner continues to deny it, in which case the other prisoner will receive a much stiffer sentence. However, they will both be better off if both deny the crime than if both confess to it. The problem each faces is that they can’t communicate and strike an enforceable deal. The box diagram below shows an example of the Prisoner’s Dilemma in action.
|  | Prisoner 2 Confesses | Prisoner 2 Denies |
| --- | --- | --- |
| Prisoner 1 Confesses | 2 years each | Freedom for P1; 8 years for P2 |
| Prisoner 1 Denies | 8 years for P1; Freedom for P2 | 1 year each |
The Nash Equilibrium is for both to confess, in which case they will both receive 2 years. But this is not the outcome they would have chosen if they could have agreed in advance to a mutually enforceable deal. In that case they would have chosen a scenario where both denied the crime and received 1 year each.
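The same best-response search works for the Prisoner’s Dilemma, remembering that prison years are a cost, so a best response here minimises rather than maximises:

```python
# Sentences in years (Prisoner 1, Prisoner 2); lower is better.
years = {
    ('Confess', 'Confess'): (2, 2),
    ('Confess', 'Deny'):    (0, 8),
    ('Deny',    'Confess'): (8, 0),
    ('Deny',    'Deny'):    (1, 1),
}
actions = ['Confess', 'Deny']

# A cell is a Nash equilibrium if neither prisoner can shorten
# their own sentence by unilaterally changing action.
equilibria = [
    (a, b) for a in actions for b in actions
    if all(years[(a, b)][0] <= years[(a2, b)][0] for a2 in actions)
    and all(years[(a, b)][1] <= years[(a, b2)][1] for b2 in actions)
]
print(equilibria)  # [('Confess', 'Confess')]
```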
So, to summarise, a Nash equilibrium is a stable state involving interacting participants in which none can gain by a change of strategy as long as the other participants’ strategies remain unchanged. It is not necessarily the best outcome for the parties involved, but it is the outcome we would predict in a non-cooperative game of rational, self-interested actors.
The Question: Choose an integer between 0 and 100. You win a prize if your number is equal to, or closest to, 2/3 of the average of the numbers chosen by all participants. What number should you choose?
If you think that the other participants will choose a random number within the range, the average will be 50. Hence you choose 33.
But hang on. Just as you chose 33, so presumably will other participants, at least on average, based on your same line of reasoning. So if the average number chosen by all participants is 33, then the smart thing to do is to choose 22.
But do you really think you are smarter than the others? Just as you figured out that 22 is the smart choice, so will others, at least on average. So the super smart thing to do is to choose 15.
But … We are heading towards 0 (you get there after 12 iterations). Zero is the only rational choice to make if you don’t think you are smarter than the other participants.
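The iteration above is easy to verify: repeatedly taking 2/3 of the previous answer, starting from 50, reaches zero (to the nearest integer) after 12 steps.

```python
choice = 50.0  # Level 0: the average of random guesses
level = 0
while round(choice) > 0:
    choice *= 2 / 3  # each level of reasoning takes 2/3 of the last answer
    level += 1
print(level)  # 12
```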
You start to get the strong feeling that if you choose 0 you are not going to win the prize. This is because, although you don’t think you are smarter than most, it is reasonable to assume that at least some of the players are not as smart or rational as you.
For example, if 10 per cent of players are totally naïve and choose a random number – 50 on average – then the overall average will be 5 and the right answer will be 3.
However, if the rest of the players share your thoughts and assumptions, they will also choose 3, thereby increasing the average to 8 and the right answer to 5. Then you answer 5, but so will the rest, thus increasing the right answer to 6.
The process converges to 8.
Well, 8 is the right answer if 90 per cent of players are as smart as you are and 10 per cent are totally naïve.
If 20 per cent are naïve, the process converges to 14; with 30 per cent it converges to 18, and so on.
But then it may also be the case that the less rational players are not totally naïve (Level 0 rationality) but, for example, exhibit Level 1 rationality, where the average answer is 33.
In this case, with 10 per cent Level 1 players the process converges to 5; with 20 per cent to 9; with 30 per cent to 12, and so on. Of course, there are plenty more combinations, with varying proportions of players at Level 0, Level 1, Level 2 and so on.
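These fixed points can be checked numerically. Here is a sketch, assuming a single less-rational group who all answer the same number while everyone else reasons their way to the fixed point (the function name `winning_number` is an illustrative choice):

```python
def winning_number(naive_share, naive_choice=50, iterations=100):
    # Iterate x -> 2/3 * (weighted average of naive answers and x),
    # assuming the rational players all converge on the same fixed point.
    x = 0.0
    for _ in range(iterations):
        x = (2 / 3) * (naive_share * naive_choice + (1 - naive_share) * x)
    return round(x)

print(winning_number(0.1))      # 8: 10% Level 0 players answering 50
print(winning_number(0.2))      # 14: 20% Level 0 players
print(winning_number(0.2, 33))  # 9: 20% Level 1 players, who answer 33
```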
The higher the winning number, the larger is the percentage of less rational players in the game.
In an experiment that economist Richard Thaler conducted with Financial Times readers, with 1,476 entrants, the winning number was in fact 13.
This is roughly consistent with:
1. All players exhibit Level 3 rationality
OR 2. 80% are fully rational and 20% are totally naïve.
OR 3. 70% are fully rational and 30% exhibit Level 1 rationality.
Etc.
John Maynard Keynes, in Chapter 12 of his ‘General Theory of Employment, Interest and Money’, frames the paradox in terms of the financial markets, in a more prosaic way:
“Professional investment may be likened to those newspaper competitions in which the competitors have to pick out the six prettiest faces from a hundred photographs, the prize being awarded to the competitor whose choice most nearly corresponds to the average preferences of the competitors as a whole; so that each competitor has to pick, not those faces which he himself finds prettiest, but those which he thinks likeliest to catch the fancy of the other competitors, all of whom are looking at the problem from the same point of view. It is not a case of choosing those which, to the best of one’s judgment, are really the prettiest, not even those which average opinion genuinely thinks the prettiest. We have reached the third degree where we devote our intelligences to anticipating what average opinion expects the average opinion to be. And there are some, I believe, who practise the fourth, fifth and higher degrees.”
In other words, it is those who are able to best out-guess the best guesses of the rest of the crowd, who stand to win the prize.
Or put another way, the ten pound note you spot lying on the floor might well be real after all. Nobody has picked it up yet because they have all assumed that someone else would have picked it up if it were real. You realise that everyone else is thinking like this, and you win yourself a tenner. Let’s call that super-rationality. Ultimately, it’s the kind of rationality that can make you a fortune.
Further Reading and Links
Bosch-Domènech, A., Montalvo, J.G., Nagel, R. and Satorra, A. (2002), ‘One, Two, (Three), Infinity, …: Newspaper and Lab Beauty-Contest Experiments’, American Economic Review, 92(5), 1687-1701, December.
This is a true story about New York gambling-house operator Fat the Butch, who made his fortune booking dice games. In 1952 he was famously challenged by a big-time gambler known as The Brain to a simple wager: an even-money proposition that the Butch could throw a double-six in 21 rolls of the dice.
On the face of it, the edge seems to be with the Butch. After all, there are 36 possible combinations that could come up when throwing two dice, from 1-1, 1-2, 1-3, through to 6-4, 6-5, 6-6. Intuition would suggest, therefore, that 18 throws should give you a 50-50 chance of throwing any one of these combinations, including a double-six. In 21 throws, the chance of a double-six should, therefore, be better than 50-50. On this basis, the Butch accepted the even-money bet at $1,000 a roll.
After twelve hours of rolling, The Brain was $49,000 up, at which point the Butch called it a day, sensing that something was wrong with his strategy.
The Brain had in fact profited from a classic probability puzzle known as the Chevalier’s Dice problem, which can be traced to the seventeenth-century French gambler and bon vivant, Antoine Gombaud, better known as the Chevalier de Méré.
The Chevalier would agree even money odds that in four rolls of a single die he would get at least one six. His logic seemed impeccable. The Chevalier reasoned that since the chance that a 6 will come up in any one roll of the die is 1 in 6, then the chance of getting a 6 in four rolls is 4/6, or 2/3, which is a good bet at even money.
If the probability were a half, he would break even at even money. In 300 games at 1 French franc a game, for example, he would stake 300 francs and expect to win 150 times; each win returns his 1-franc stake plus 1 franc in winnings, a total of 300 francs. With a probability of 2/3, he would expect to win 200 times, yielding a good profit.
It is easy to show intuitively that this reasoning is faulty, for if it were correct, then the chance of a 6 in five rolls of the die would be 5/6, the chance of a 6 in six rolls would be 6/6 = 100%, and in seven rolls, 7/6 – an impossibility. Something is therefore clearly wrong here.
Still, even though his reasoning was faulty, he continued to make a profit by playing the game at even money. To see why, we need to calculate the true probability of getting a 6 in four rolls of the die. The key idea here is that the number that comes up on each roll is independent of any other rolls, i.e. dice have no memory. Since each event is independent, we can (according to the laws of probability) multiply the probabilities.
So the probability of a 6 followed by a 6, followed by a 6, followed by a 6, is: 1/6 x 1/6 x 1/6 x 1/6 = 1/1296.
So what is the chance of getting at least one six in four rolls of the die?
Since the probability of getting a 6 in any one roll of the die = 1/6, the probability of NOT getting a 6 in any one roll of the die = 5/6.
So the chance of NOT getting a 6 in four rolls of the die is:
5/6 x 5/6 x 5/6 x 5/6 = 625/1296
So the chance of getting at least one 6 is 1 minus this, i.e. 1 – (625/1296) = 671/1296 = 0.5177, which is greater than 0.5.
So, the odds are still in favour of the Chevalier, since he is agreeing even money odds on an event with a probability of 51.77%.
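The arithmetic above can be confirmed directly, for instance with Python’s exact fractions:

```python
from fractions import Fraction

# Chance of at least one 6 in four rolls of a fair die,
# via the complement (no 6 in any of the four rolls).
p_no_six = Fraction(5, 6) ** 4
p_at_least_one_six = 1 - p_no_six

print(p_at_least_one_six)         # 671/1296
print(float(p_at_least_one_six))  # ~0.5177
```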
This was all very well as long as it lasted, but eventually the Chevalier decided to branch out and invent a new, slightly modified game. In the new game, he asked for even money odds that a pair of dice, when rolled 24 times, will come up with a double-6 at least once. His reasoning was the same as before, and quite similar to the reasoning employed by the Butch.
If the chance of a 6 on one roll of the die is 1/6, then the chance of a double-6 when two dice are thrown = 1/6 x 1/6 (as they are independent events) = 1/36.
So, reasoned the Chevalier, the chance of at least one double-6 in 24 throws is: 24/36 = 2/3.
So this is a very profitable game for the Chevalier. Or is it?
No it isn’t, and this time Monsieur Gombaud paid for his faulty reasoning. He started losing. In desperation, he consulted the mathematician and philosopher, Blaise Pascal.
Pascal derived the correct probabilities as follows:
The probability of a double-6 in one throw of a pair of dice = 1/6 x 1/6 = 1/36.
So the probability of NO double-6 in one throw of a pair of dice = 35/36.
So, the probability of no double-6 in 24 throws of a pair of dice = 35/36 x 35/36 … (24 times), i.e. (35/36)^24 = 0.5086.
So the probability of at least one double-6 is 1 minus this, i.e. 1 – 0.5086 = 0.4914, which is less than 0.5.
Under the terms of the new game, the Chevalier was betting at even money on a game which he lost more often than he won.
It was an error that the Butch was to repeat almost 300 years later!
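Both losing games can be checked with the same complement trick. The function below (a sketch) gives the chance of at least one double-six in a given number of rolls of a pair of dice:

```python
from fractions import Fraction

def p_double_six(rolls):
    """Chance of at least one double-6 in `rolls` throws of a pair of dice."""
    return 1 - Fraction(35, 36) ** rolls

print(float(p_double_six(24)))  # the Chevalier's game: ~0.4914
print(float(p_double_six(21)))  # Fat the Butch's bet: ~0.4466
print(float(p_double_six(25)))  # 25 rolls would tip the odds back: ~0.5055
```

Note that one extra roll, 25 rather than 24, would have turned the Chevalier’s losing game into a winning one.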
Meantime, the Chevalier de Méré’s query to Blaise Pascal was to lead to a historic correspondence between Pascal and Pierre de Fermat (of ‘Fermat’s Last Theorem’ fame), which was to lay the groundwork of modern probability theory. All from a dice game!
Further Reading and Links
Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.
You are presented with four cards, with the face-up side on display, showing either a letter or a number. You are promised that each has a letter on one side and a number on the other.
Red Card displays the letter D
Orange Card displays the letter N
Blue Card displays the number 21
Yellow Card displays the number 16
You are now presented with the following statement: Every card with D on one side has 21 on the other side.
The Question is: What is the minimum number of cards needed to determine whether this statement is true? What are the colours of the cards you need to turn over to determine this?
Think about it: Do you need to turn over the Red Card? Do you need to turn over the Orange Card? Do you need to turn over the Blue Card? Do you need to turn over the Yellow Card?
Spoiler Alert (Solution).
When given this puzzle (a version of the famous Wason selection task), the great majority get it wrong, most commonly choosing the card showing D together with the card showing 21.
You must turn over the Red Card to see if it has 21 on the other side. If it does not, the statement is false.
You must also turn over the Yellow Card to see if it has D on the other side. If it does, the statement is false, since that would be a card with D on one side but 16, not 21, on the other.
Turning over the Orange Card does not help you verify or falsify the statement, as the statement says nothing about cards showing N.
Turning over the Blue Card does not help either. Whether it has D or N on the other side, nothing is violated: the statement does not claim that only cards with D have 21 on the other side.
So the minimum number of cards needed to determine whether the statement is true is two, and they are the Red Card and the Yellow Card.
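The logic can be checked mechanically: for each card, ask whether any possible hidden face could falsify the statement. Only such cards are worth turning over. A minimal sketch:

```python
# Rule under test: every card with D on one side has 21 on the other.
# A card is worth turning over only if some hidden face could violate the rule.
cards = {'Red': 'D', 'Orange': 'N', 'Blue': '21', 'Yellow': '16'}

def worth_turning(visible):
    if visible.isalpha():
        # A letter card can violate the rule only if it shows D
        # and hides a number other than 21.
        return visible == 'D'
    # A number card can violate the rule only if it hides a D
    # and the visible number is not 21.
    return visible != '21'

print([colour for colour, face in cards.items() if worth_turning(face)])
# ['Red', 'Yellow']
```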
Bonus Question (The Tyre Problem)
Two employees turn up late to an important meeting. They claim that one of the tyres on their car had a puncture, but it is a lie.
Their suspicious boss sends them to separate rooms and asks each of them to write down which tyre was punctured.
The Question: Assuming they have not colluded beforehand, and have no particular reason to think that one tyre is more likely to have been punctured, what is the likelihood that they will randomly name the same tyre?
Is it (1/4) x (1/4) = 1 in 16?
Think about it: There are four tyres. Each employee is choosing a tyre randomly and independently of one another. Maybe it is easier to think of a two-wheeled vehicle. In the same scenario, what is the likelihood they will randomly name the same tyre if they arrived on a motor bike? Is it (1/2) x (1/2) = 1 in 4?
Spoiler Alert (Solution)
Once the first employee randomly chooses a tyre on the car, there is a 1 in 4 chance that the other employee will choose the same one, e.g. if employee 1 chooses front left tyre, employee 2 has a 1 in 4 chance of randomly selecting the same one. Similarly, if the first employee randomly chose the back tyre on the motor bike, the chance that the second employee would come up with the same tyre is 1 in 2.
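A quick simulation (a sketch, with illustrative tyre names) confirms the 1 in 4 figure for the car:

```python
import random

random.seed(42)  # reproducible illustration
TYRES = ['front-left', 'front-right', 'back-left', 'back-right']

# Each employee independently names a random tyre; count the matches.
trials = 100_000
matches = sum(
    random.choice(TYRES) == random.choice(TYRES) for _ in range(trials)
)
print(matches / trials)  # close to 0.25
```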
Two suspected witches of Salem are subjected to a test by the Witchfinder General.
To ascertain whether they have magical powers of telepathy (They haven’t, by the way) they will be separated and seated at a table in the blue room (Suspect 1) and the yellow room (Suspect 2). They will be unable to see each other or communicate in any way.
Before being separated they are allowed a few private moments together.
After being separated, they are given a deck of cards each and asked to extract one card from the deck.
They are allowed to look at their chosen card if they wish, but what they must actually do is to name the colour of the card that the other suspect has drawn.
It is a standard deck of cards, so there is a 1 in 2 chance the chosen card is black, and the same that it is red.
The game will be repeated ten times, to reduce the chance that they will survive by simple good fortune.
If in any round they both correctly identify the colour of the other person’s card, then they will both die.
If in every round at least one of them names the wrong colour, then both are free to go.
There are two questions:
1. What is the probability they will survive by chance?
2. Is there a co-operative strategy they could agree on before being separated to guarantee they both survive?
Think about it: In any round, what is the chance that each suspect will correctly name the colour of the other suspect’s card? A half? A quarter? What about over ten successive rounds?
To survive, they must avoid this over ten rounds. Is there a way they can take chance out of it, and make sure that at least one of them names the wrong colour for the other suspect’s card, for ten rounds in a row?
If so, that is the door to freedom. Remember that they can secretly hatch a joint strategy and they either both survive or both die, so they can trust each other to stick to the plan, if there is one.
Spoiler Alert (The Solution)
In the first round, the chance that the suspect in the blue room will correctly name the colour of the other suspect’s card is ½. Similarly for the suspect in the yellow room.
These are independent events, so the probability of being condemned after first hands are dealt (i.e. both name the colour of the other suspect’s card correctly) = ½ x ½ = ¼.
So probability of surviving first hand = ¾
Probability of surviving 10 hands = (3/4)^10 = 0.0563, i.e. 5.63%
But there is a strategy to ensure survival, if they can agree on it before.
Can you work it out?
The solution is for player 1 to guess the same colour as his own card, and player 2 to guess a different colour to his card. This way they will always survive.
Thus:
Red Red gives Red Black – they survive.
Black Black gives Black Red – they survive.
Black Red gives Black Black – they survive.
Red Black gives Red Red – they survive.
To better conceal the strategy, they could also decide to alternate roles.
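The four cases above can be checked exhaustively, alongside the survival chance when no strategy is used; a minimal sketch:

```python
from itertools import product

def round_survived(card1, card2):
    """Suspect 1 names her own card's colour as her guess for the other's card;
    suspect 2 names the opposite of his own card's colour."""
    guess_about_card2 = card1
    guess_about_card1 = 'red' if card2 == 'black' else 'black'
    both_right = guess_about_card2 == card2 and guess_about_card1 == card1
    return not both_right

# The strategy survives every possible pair of cards.
print(all(round_survived(c1, c2)
          for c1, c2 in product(['red', 'black'], repeat=2)))  # True

# Without a strategy, surviving ten rounds by luck alone:
print(round((3 / 4) ** 10, 4))  # 0.0563
```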
This is the optimal outcome in a game where the two players are able to co-ordinate a strategy in advance, and where trust is guaranteed because they both stand to gain by sticking to the strategy.
There are other scenarios in which the superior strategy, from the point of view of one or both players, is to defect from the strategy they would adopt if they were free to strike an enforceable deal. One such scenario is known as the Prisoner’s Dilemma. In that problem, the individually optimal strategy for each player, when a deal cannot be enforced, leads both to an outcome worse than they could have reached co-operatively.
Further Reading and Links
Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.
Benford’s Law is one of those laws of statistics that defies common intuition. It states that if we randomly select a number from a table of real-life data, the first digit is far more likely to be small than large. The probability that the first digit will be a ‘1’ is about 30 per cent, rather than the roughly 11 per cent we would expect if all digits from 1 to 9 were equally likely.
In particular, Benford’s Law applies to the distribution of leading digits in naturally occurring phenomena, such as the populations of different countries or the heights of mountains. Take a newspaper with a lot of numbers and circle the numbers that occur naturally, such as stock prices or the lengths of rivers and lakes, but not artificial numbers like telephone numbers. About 30 per cent of these numbers will start with a 1, and it doesn’t matter what units they are in. So the lengths of rivers could be denominated in kilometres, miles, feet or centimetres without it making a difference to the frequency distribution of the leading digits.
Empirical support for this distribution can be traced to the man after whom the Law is named, physicist Frank Benford, in a paper he published in 1938 called ‘The Law of Anomalous Numbers.’ In that paper he examined 20,229 sets of numbers, as diverse as baseball statistics, the areas of rivers and numbers in magazine articles, confirming the 30 per cent rule for the digit 1. For information, the chance of a ‘2’ as first digit is 17.6 per cent, and of a ‘9’ just 4.6 per cent.
This has clear implications for fraud detection. In particular, if declared returns or receipts deviate significantly from the Benford distribution, we have an automatic red flag which those tackling fraud are, or should be, aware of.
To explain the basis of Benford’s Law, take £1 as a base. Assume this now grows at 10 per cent per day.
£1.10, £1.21, £1.33, £1.46, £1.61, £1.77, £1.94, £2.14, £2.35, £2.59, £2.85, £3.13, £3.45, £3.80, £4.18, £4.59, £5.05, £5.56, £6.11, £6.72, £7.40, £8.14, £8.95, £9.84, £10.83, £11.92, £13.11, £14.42, £15.86, £17.45, £19.19, £21.11, £23.22, £25.50, £28.10, £30.91, £34.00, £37.40, £41.14, £45.26, £49.79, £54.74, £60.24, £66.26, £72.89, £80.18, £88.20, £97.02 …
So we see that the leading digits stay a long time in the teens, less time in the 20s, and so on through the 90s, and this pattern continues through three digits and beyond. Benford noticed that the probability that a number starts with the digit n = log10(n+1) – log10(n), so that:
NB log10(1) = 0; log10(2) = 0.301; log10(3) = 0.4771; … log10(10) = 1.
Leading digit Probability
• 1 30.1%
• 2 17.6%
• 3 12.5%
• 4 9.7%
• 5 7.9%
• 6 6.7%
• 7 5.8%
• 8 5.1%
• 9 4.6%
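The table above follows directly from the logarithmic formula:

```python
import math

# Benford's Law: P(leading digit d) = log10(d + 1) - log10(d)
probs = {d: math.log10(d + 1) - math.log10(d) for d in range(1, 10)}

for d, p in probs.items():
    print(f"{d}: {100 * p:.1f}%")

# The nine probabilities telescope to log10(10) - log10(1) = 1.
print(round(sum(probs.values()), 10))
```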
Further Reading and Links:
http://www.rexswain.com/benford.html
One of the classic problems of Mathemagistics, or Mathematical Magic, is the Bus Problem. It goes like this:
Question:
Every day, Fred gets the solitary 8 am bus to work. There is no other bus that will get him to his destination.
10 per cent of the time the bus is early and leaves before he arrives at 8 am.
10 per cent of the time the bus is late and leaves after 8.10 am.
The rest of the time the bus departs between 8 am and 8.10 am.
One morning Fred arrives at the bus stop at 8 am, sees no bus, and waits for 10 minutes without the bus arriving.
Now, what is the probability that Fred’s bus will still arrive?
Think about it:
Fred’s bus could yet arrive or he might have missed it. So there are two possibilities. So is it correct to assume that in the absence of further evidence the chance of each must be equal, so the probability at 8.10am that his bus will still arrive is 50 per cent?
But if that is the answer at 8.10am, was it also the correct answer at 8 am?
Or was 50 per cent the correct answer at 8am but not at 8.10am?
Or is it the wrong answer at both times, but was correct at 8.05am?
The solution is posted below.
Spoiler Alert (Solution):
Solution
When Fred arrives at 8am, there is a 10 per cent chance that his bus will have already left. After Fred has waited for 10 minutes, he can eliminate the 80 per cent chance of the bus arriving in the period between 8 am and 8.10 am. So only two possibilities remain.
Either the bus has arrived ahead of schedule or it will arrive more than ten minutes late.
Both outcomes are unusual, but the two are mutually exclusive and equally likely (a 10 per cent chance of each), and there are no other possibilities. So we should update the probability that the bus will still arrive from 10 per cent (the prior probability when Fred woke up) to 50 per cent: once the 80 per cent possibility is eliminated, the remaining 20 per cent splits equally between the bus still turning up and Fred having missed it. So there is a 1 in 2 chance that he will still catch his bus if he has the patience to wait further, and a 1 in 2 chance that he will wait in vain. The follow-up question is how long he should wait. That’s for another day.
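The update can be written as a one-line Bayesian renormalisation; a sketch using exact fractions:

```python
from fractions import Fraction

# Prior probabilities for the bus's departure time.
p_early = Fraction(1, 10)   # left before 8 am
p_window = Fraction(8, 10)  # leaves between 8.00 and 8.10
p_late = Fraction(1, 10)    # leaves after 8.10

# At 8.10 no bus has appeared: the 8.00-8.10 window is ruled out,
# and the surviving possibilities are renormalised.
p_still_coming = p_late / (p_early + p_late)
print(p_still_coming)  # 1/2
```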
Puzzle Extra:
Every day, Fred gets the solitary 8 am bus to work. There is no other bus that will get him to his destination.
10 per cent of the time the bus is early and leaves before he arrives at 8 am.
30 per cent of the time the bus is late and leaves after 8.10 am.
The rest of the time the bus departs between 8 am and 8.10 am.
One morning Fred arrives at the bus stop at 8 am, sees no bus, and waits for 10 minutes without the bus arriving.
Now, what is the probability that Fred’s bus will still arrive?
Spoiler Alert (Solution):
Solution
When Fred arrives at 8 am, there is a 10 per cent chance that his bus will have already left. After Fred has waited for 10 minutes, he can eliminate the 60 per cent chance of the bus arriving in the period between 8 am and 8.10 am. So only two possibilities remain: the bus has already left early, or it will still arrive, more than ten minutes late.
Both outcomes are less likely than an arrival between 8 and 8.10 am, but the two are mutually exclusive (10 per cent and 30 per cent respectively) and there are no other possibilities. Once the 60 per cent possibility is eliminated, that probability should be distributed (using Bayesian principles) in the ratio of the prior probabilities of the two remaining options, which is 3 to 1 in favour of the bus arriving late. So 45 of the eliminated 60 percentage points are added to the 30 per cent prior that the bus is still to arrive, and 15 are added to the 10 per cent prior that it arrived before 8 am. So, at 8.10 am there remains a 75 per cent chance that the bus will still arrive and a 25 per cent chance that it has already arrived and left.
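The same renormalisation handles this asymmetric version:

```python
from fractions import Fraction

# Puzzle Extra priors: 10% early, 60% in the 8.00-8.10 window, 30% late.
p_early = Fraction(1, 10)
p_late = Fraction(3, 10)

# Seeing no bus by 8.10 eliminates the 60% window;
# the rest is shared in the ratio of the priors (3 to 1).
p_still_coming = p_late / (p_early + p_late)
p_missed = p_early / (p_early + p_late)
print(p_still_coming, p_missed)  # 3/4 1/4
```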
Further Reading and Links
