Professor Leighton Vaughan Williams – Written evidence (PPD0024)
1. In this evidence, I consider the relationship between political betting and political opinion polls, and highlight peer-reviewed research I have undertaken into this. I also reference some other published work of mine on opinion polling and political forecasting more generally. Research I have undertaken into the impact of the dissemination of information via social media is also highlighted.
2. The recorded history of election betting markets can be traced as far back as 1868 for US presidential elections (Rhode and Strumpf, 2013) and 1503 for papal conclaves. Between 1868 and 2012, no clear favourite for the White House had lost the presidential election other than in 1948, when longshot Harry Truman defeated his Republican rival, Thomas Dewey. 2016 can be added to that list, following the defeat of strong favourite Hillary Clinton in the Electoral College.
3. The record of the betting markets in predicting the outcome of papal conclaves is somewhat more chequered and is considered in Vaughan Williams and Paton (2015), in which I examine, with my co-author Professor David Paton, the success of papal betting markets historically.
4. The potential of the betting markets and prediction markets (markets created specifically to provide forecasts) to assimilate collective knowledge and wisdom has increased in recent years as the volume of money wagered and the number of market participants have soared. Betting exchanges alone now see tens of millions of pounds trading on a single election.
5. An argument made for the value of betting markets in predicting the probable outcome of elections is that the collective wisdom of many people is greater than that of the few. We might also expect that those who know more, and are better able to process the available information, would on average tend to bet more.
6. The lower the transaction costs (the betting public have not paid tax on their bets in the UK since 2001, and margins have fallen since the advent of betting exchanges) and the lower the costs of accessing and processing information (through the development of the Internet and search engines), the more efficient we might expect betting markets to become in translating information into forecasts. Modern betting markets might be expected for these reasons to provide better forecasts than ever.
7. There is plenty of anecdotal evidence about the accuracy of political betting markets, especially compared to the polls. The 1985 by-election in Brecon and Radnor is a classic example. On Election Day, July 4th, an opinion poll undertaken by the Mori polling organisation was published which gave Labour a commanding lead of 18 percent over the Liberal Alliance candidate. Ladbrokes simultaneously made the Liberal the 4/7 favourite. The Liberal won.
8. Forward 20 years to a BBC World Service live radio debate in 2005, in the run-up to the UK general election, when forecasts were swapped between the Mori representative and myself on the likely outcome of the election. I predicted a Labour majority of about 60, as I had done a few days earlier in the Economist magazine (Economist, April 14th, 2005) and on BBC Radio 4 Today (April 18th, 2005), based on the betting at the time. The Mori representative predicted a Labour majority of over 100 based on their polling. The actual majority was 66.
9. More recent anecdotal evidence comes from the 2012 US presidential election. Barack Obama was the heavy favourite to win, while the average of the pollsters had the popular vote within 0.7%, and two leading polling organisations, Gallup and Rasmussen, had Mitt Romney ahead in final polls. Obama won by 3.9%.
10. During the later stages of the 2014 Scottish referendum campaign, the polling average had it relatively close (especially compared with the actual result), with more than one poll calling it for independence (one by 7%). The betting odds were always very strongly in favour of Scotland staying in the UK. The result echoed the 1995 Quebec separation referendum in Canada. There the final polling showed ‘Yes to separation’ with a six point lead. In the event, ‘No to separation’ won by one point. Some credit the expectation of a similar late swing to the ‘status quo’ for the confidence of the betting markets in a ‘No’ outcome in Scotland.
11. In the 2015 general election in Israel, final polls showed Netanyahu’s Likud party trailing the main opposition party by 4% (Channel 2, Channel 10, Jerusalem Post), by 3% (Channel 1) and by 2% (Teleseker/Walla). Meanwhile, Israel’s Channel 2 television news on Election Day featured the odds on the online prediction market site, Predictwise. This gave Netanyahu an 80% chance of winning. The next day, Netanyahu declared that he had won “against the odds.” He actually won against the polls.
12. Polling averages during the 2015 UK general election campaign often showed Conservatives and Labour very close in terms of vote share. Meanwhile, the betting odds always had Conservative most seats as short odds-on. On the Monday before polling day, for example, the polling average had it essentially tied in terms of vote share, while Conservatives to win most seats was trading on the markets as short as 1/6.
13. For the 2015 Irish same-sex marriage referendum, the spread betting markets were offering a mid-point of 60% for YES to same-sex marriage, and 40% for NO. The average of the final opinion polls had YES on 71% and NO on 29%. The final result was 62%-38% for YES, much closer to the projection from the markets.
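The fractional odds quoted in these examples (4/7 in Brecon and Radnor, 1/6 in 2015) can be read as implied probabilities. The sketch below is illustrative only: the function name `implied_probability` is hypothetical, and it assumes the odds contain no bookmaker's margin, so the figures slightly overstate the market's true estimate.

```python
from fractions import Fraction

def implied_probability(numerator: int, denominator: int) -> Fraction:
    """Implied win probability for fractional odds of numerator/denominator.

    Odds of a/b mean that a stake of b returns a profit of a, so the
    break-even probability is b / (a + b). Any bookmaker margin is ignored.
    """
    return Fraction(denominator, numerator + denominator)

# Odds quoted in the text: 4/7 (Brecon and Radnor) and 1/6 (2015 general election).
for a, b in [(4, 7), (1, 6)]:
    p = implied_probability(a, b)
    print(f"{a}/{b} -> {p} = {float(p):.1%}")
```

On this reading, 4/7 corresponds to a probability of 7/11 and 1/6 to a probability of 6/7.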
14. If this anecdotal evidence is correct, it is natural to ask why the betting markets outperform the opinion polls in terms of forecast accuracy. One obvious reason is that there is an asymmetry. People who bet in significant sums on an election outcome will usually have access to the polling evidence, while opinion polls do not take account of information contained in the betting odds (though the opinions expressed might). Sophisticated political bettors also take into account the past experience of how good different pollsters are, what tends to happen to those who are undecided when they actually vote, differential turnout of voters, what might drive the agenda between the dates of the polling surveys and election day itself, and so on. All of this can in principle be captured in the markets.
15. Pollsters, except perhaps with their final polls, tend to claim that they are not producing a forecast, but a snapshot of opinion. In contrast, the betting markets are generating odds about the final result. Moreover, the polls are used by those trading the markets to improve their forecasts, so they are a valuable input. But they are only one input. Those betting in the markets have access to much other information as well including, for example, informed political analysis, statistical modelling, focus groups and on-the-ground information including local canvass returns.
16. To test the reliability of the anecdotal evidence pointing to the superior forecasting performance of the betting markets over the polls, I collected vast data sets of every matched contract placed on two leading betting exchanges and from a dedicated prediction market for US elections since 2000. This was collected over 900 days before the 2008 election alone, and to indicate the size, a single data set was made up of 411,858 observations from one exchange alone for that year. Data was derived notably from presidential elections at national and state level, Senate elections, House elections and elections for Governor and Mayor. Democrat and Republican selection primaries were also included. Information was collected on the polling company, the length of time over which the poll was conducted, and the type of poll.
17. My co-author, Dr. James Reade, and I compared the betting over the entire period with the opinion polls published over that period, and also with expert opinion and a statistical model.
18. In a paper titled ‘Forecasting Elections’ (Vaughan Williams and Reade, 2016b), published in the ‘Journal of Forecasting’ (see also Vaughan Williams and Reade, 2017, 2015), we specifically assessed opinion polls, prediction and betting markets, expert opinion and statistical modelling over this vast data set of elections in order to determine which performed better in terms of forecasting outcomes. We considered accuracy, bias and precision over different time horizons before an election.
19. A very simple measure of accuracy is the percentage of correct forecasts, i.e. how often a forecast correctly predicts the election outcome.
20. A related but distinctly different concept to accuracy is unbiasedness. An unbiased vote share forecast is, on average, equal to the true vote share outcome. An unbiased probability forecast is also, on average, equal to the true probability that the candidate wins the election. Forecasts that are accurate can also be biased, provided the bias is in the correct direction. If polls are consistently upward biased for candidates that eventually win, then despite being biased they will be very accurate in predicting the outcome, whereas polls that are consistently downward biased for candidates that eventually win will be very inaccurate as well as biased.
21. We also identified the precision of the forecasts, which relates to the spread of the forecasts.
22. We considered accuracy, bias and precision over different time horizons before an election. We found that the betting/prediction markets provided the most accurate and precise forecasts and were similar in terms of bias to opinion polls. We found that betting/prediction market forecasts also tended to improve as the elections approached, while we found evidence of opinion polls tending to perform worse.
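The three measures described in paragraphs 19 to 21 can be illustrated with a small sketch. The data below are invented for illustration and are not drawn from the study; the function names, and the choice of the standard deviation of forecast errors as the measure of spread, are assumptions rather than the definitions used in the published paper.

```python
from statistics import mean, pstdev

# Hypothetical data: forecast win probabilities for one candidate in each of
# five races, and whether that candidate actually won (1) or lost (0).
forecasts = [0.80, 0.65, 0.30, 0.55, 0.90]
outcomes = [1, 1, 0, 0, 1]

def accuracy(forecasts, outcomes):
    """Share of races where the forecast favourite (p > 0.5) matched the result."""
    hits = [(p > 0.5) == bool(y) for p, y in zip(forecasts, outcomes)]
    return sum(hits) / len(hits)

def bias(forecasts, outcomes):
    """Mean signed forecast error; zero for an unbiased forecast."""
    return mean(p - y for p, y in zip(forecasts, outcomes))

def spread(forecasts, outcomes):
    """Standard deviation of the forecast errors, one possible gauge of precision."""
    return pstdev([p - y for p, y in zip(forecasts, outcomes)])

print(accuracy(forecasts, outcomes))
print(round(bias(forecasts, outcomes), 3))
print(round(spread(forecasts, outcomes), 3))
```

In this invented sample the forecasts call four of five races correctly while carrying a small positive bias, which shows how the measures can diverge.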
23. In Brown, Reade and Vaughan Williams (2017), we examine the precise impact of the release of information from a leading opinion polling company on the political betting markets. To do this, we use an extensive data set of over 25 million contracts that records (anonymised) individual trader IDs for the buyers and sellers of the contracts and align this to the exact time of release of this information. We find that polling releases by this prominent opinion pollster quickly influence trading volumes and market prices, but that experienced and more aggressive liquidity-taking traders bide their time before entering the market after such news events. We find that the market prices are not at their most informative in the immediate aftermath of a poll release.
24. We also conducted research into the impact of breaking news on the markets, notably via social media and live blogging. In Vaughan Williams and Paton (2015) we use an extensive data set of contracts matched on a leading betting exchange specifically regarding the outcome of the 2013 papal election. We found that genuine information released on Twitter was not reflected in the betting markets, and was only very partially incorporated when published later on the live blog of a major British newspaper. One possible explanation is that the information was not believed as it related to a closed-door conclave (Vaughan Williams, 2015a, considers closed-door forecasting in another context). However, this finding was consistent in some respects with evidence in Vaughan Williams and Reade (2016a) about the limited impact on a leading betting exchange of major breaking news in a UK general election when released on Twitter, at least until the news was validated by traditional media.
25. In summary, the overwhelming consensus of evidence prior to the 2015 UK General Election pointed to the success of political betting markets in predicting the outcome of elections. In contrast, the 2015 UK General Election, the 2016 EU referendum in the UK, the 2016 US presidential election and the 2017 UK election all produced results that were a shock to the great majority of pollsters as well as to the betting markets. In each case, the longshot outcome (Conservative overall majority, Brexit, Trump, No overall majority) prevailed.
26. There are various theories as to why the polls and markets broke down in these recent big votes. One theory is based on the simple laws of probability. An 80% favourite can be expected to lose one time in five, if the odds are correct. In the long run, according to this explanation, things should balance out.
27. A second theory to explain recent surprise results is that something fundamental has changed in the way that information contained in political betting markets is perceived and processed. One interpretation is that the widespread success of the betting markets in forecasting election outcomes, and the publicity that was given to this, turned them into an accepted measure of the state of a race, creating a perception which was difficult to shift in response to new information. In this way, the market prices to some extent led opinion rather than simply reflecting it. From this perspective, the prices in the markets became somewhat sticky.
28. A third theory is that conventional patterns of voting broke down in 2015 and subsequently, primarily due to unprecedented differential voter turnout patterns across key demographics, which were not correctly modelled in most of the polling and which were not picked up by those trading the betting markets.
29. There are other theories, which may be linked to the above, including the impact of social media, and manipulation of this, on voter perceptions and voting patterns.
30. I explore how well the pollsters, ‘expert opinion’, modellers, prediction and betting markets performed in the 2017 UK general election in Vaughan Williams (2017a) – “Report card: how well did UK election forecasters perform this time?” and explore the polling failure in the 2015 UK general election in Vaughan Williams (2015b) – “Why the polls got it so wrong in the British election”, and some implications in a follow-up article (Vaughan Williams, 2015c).
31. I explore how well the pollsters, ‘expert opinion’, modellers, prediction and betting markets performed in the 2016 US presidential election in Vaughan Williams (2016) – “The madness of crowds, polls and experts confirmed by Trump victory”, and the implications of turnout projections for opinion polling in Vaughan Williams, 2017b – “Election pollsters put their methods to the test – and turnout is the key.”
References
BBC Radio 4 Today, Are betting markets a better guide to election results than opinion polls? April 18th, 2005, 0740. http://www.bbc.co.uk/radio4/today/listenagain/listenagain_20050418.shtml
Brown, A., Reade, J.J. and Vaughan Williams, L. (2017), ‘When are Prediction Market Prices Most Informative?’ Working Paper.
Economist, Punters v pollsters. Are betting markets a better guide to election results than opinion polls? April 14th, 2005. http://www.economist.com/node/3868824
Rhode, P.W. and Strumpf, K. (2013), ‘The Long History of Political Betting Markets: An International Perspective’, in: The Oxford Handbook of the Economics of Gambling, ed. L. Vaughan Williams and D. Siegel, 560-588.
Vaughan Williams, L. (2017a), ‘Report card: how well did UK election forecasters perform this time?’ The Conversation, June 10. http://theconversation.com/report-card-how-well-did-uk-election-forecasters-perform-this-time-79237
Vaughan Williams, L. (2017b), ‘Election pollsters put their methods to the test – and turnout is the key’, The Conversation, June 2. http://theconversation.com/election-pollsters-put-their-methods-to-the-test-and-turnout-is-the-key-78778
Vaughan Williams, L. (2016), ‘The madness of crowds, polls and experts confirmed by Trump victory’, The Conversation, November 9. http://theconversation.com/the-madness-of-crowds-polls-and-experts-confirmed-by-trump-victory-68547
Vaughan Williams, L. (2015a), ‘Forecasting the decisions of the US Supreme Court: lessons from the ‘affordable care act’ judgment’, The Journal of Prediction Markets, 9 (2), 64-78.
Vaughan Williams, L. (2015b), ‘Why the polls got it so wrong in the British election’, The Conversation, May 8. http://theconversation.com/why-the-polls-got-it-so-wrong-in-the-british-election-41530
Vaughan Williams, L. (2015c), ‘How looking at bad polls can show Labour how to win the next election’, The Conversation, May 20. http://theconversation.com/how-looking-at-bad-polls-can-show-labour-how-to-win-the-next-election-42065
Vaughan Williams, L. and Paton, D. (2015), ‘Forecasting the Outcome of Closed-Door Decisions: Evidence from 500 Years of Betting on Papal Conclaves’, Journal of Forecasting, 34 (5), 391-404.
Vaughan Williams, L. and Reade, J.J. (2016a), ‘Prediction Markets, Social Media and Information Efficiency’, Kyklos, 69 (3), 518-556.
Vaughan Williams, L. and Reade, J.J. (2016b), ‘Forecasting Elections’, Journal of Forecasting, 35 (4), 308-328.
Vaughan Williams, L. and Reade, J.J. (2017), ‘Polls to Probabilities: Prediction Markets and Opinion Polls’, Working Paper.
Vaughan Williams, L. and Reade, J.J. (2015), ‘Prediction Markets and Polls as Election Forecasts’, Working Paper.
31 October 2017
When looking at many variables, it is easy to overlook how many possible correlations are being tested. Multiple comparisons arise when a statistical analysis involves multiple simultaneous statistical tests, each of which has the potential to produce a “discovery.” For example, with a thousand variables, there are almost half a million (1,000×999/2) potential pairs of variables that might appear correlated by chance alone. While each pair is extremely unlikely in itself to show dependence, from the half a million pairs, it is very possible that a large number will appear to be dependent.
Say, for example, that 20 or more comparisons are made, each tested at a 95% confidence level. Each comparison then has a 5% chance of appearing significant by chance alone, so across 20 independent comparisons the chance of at least one such false positive is about 64% (1 − 0.95^20). This becomes a fallacy when that false positive is treated as a genuine finding rather than as an expected product of chance. The fallacy can be addressed by using statistical tests that adjust for the number of comparisons made.
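The arithmetic behind these figures can be checked with a short sketch. It assumes the comparisons are independent, which real data sets rarely guarantee, and the function name is illustrative. The last line shows the Bonferroni correction, one standard adjustment that is offered here as an example rather than as the method the text has in mind.

```python
from math import comb

def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of one or more spurious 'discoveries' across independent tests,
    each run at significance level alpha (0.05 for a 95% confidence level)."""
    return 1 - (1 - alpha) ** n_tests

# Number of distinct pairs among 1,000 variables.
print(comb(1000, 2))

# Chance of at least one false positive across 20 independent comparisons.
print(round(prob_at_least_one_false_positive(20), 3))

# Bonferroni correction: divide the threshold by the number of tests.
print(0.05 / 20)
```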
A classic example of the multiple comparisons fallacy is the Birthday Paradox. In a group of 23 people (assuming each of their birthdays is an independently chosen day of the year with all days equally likely), there is in fact greater than a 50 per cent chance that at least two of the group share the same birthday. This seems counterintuitive, since it is rare to meet someone who shares your birthday. Indeed, if you select two random people, the chance that they share a birthday is about 1 in 365. With 23 people, however, there are 253 (23×22/2) pairs of people who might have a common birthday. So by looking across the whole group, we are checking whether any one of these 253 pairings, each of which has only a tiny chance of coinciding, does indeed match. Because there are so many possible pairs, a coincidental match somewhere in the group becomes more likely than not. For a group of as few as 40 people, say, it is roughly eight times as likely that at least two share a birthday than that none do.
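The birthday figures can be reproduced with a few lines of code. This is a minimal sketch under the same assumptions as the text (365 equally likely, independent birthdays); the function name is illustrative.

```python
def prob_shared_birthday(group_size: int, days: int = 365) -> float:
    """Probability that at least two people in the group share a birthday."""
    prob_all_distinct = 1.0
    for k in range(group_size):
        # The (k+1)-th person must avoid the k birthdays already taken.
        prob_all_distinct *= (days - k) / days
    return 1 - prob_all_distinct

for n in (2, 23, 40):
    p = prob_shared_birthday(n)
    # Print the probability and the odds of a match against no match.
    print(n, round(p, 3), round(p / (1 - p), 1))
```

The calculation works through the complement: it is easier to compute the chance that all birthdays differ and subtract that from one than to count the matching pairs directly.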
Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.
The Will Rogers Phenomenon occurs when transferring something from one group into another group raises the average of both groups, even though there has been no change in actual values. The name of the phenomenon is derived from a comment made by comedian Will Rogers that “when the Okies left Oklahoma and moved to California, they raised the average intelligence in both states”.
In moving a data point from one group into another, the Will Rogers phenomenon occurs if the point is below the average of the group it is leaving, but above the average of the one it is joining. In this case, the average of both groups will increase.
To take an example, consider six individuals, the life expectancy of whom is assessed in turn as 5, 15, 25, 35, 45 and 55.
The individuals with an assessed life expectancy of 5 and 15 years respectively have been diagnosed with a particular medical condition. Those with the assessed life expectancies of 25, 35, 45 and 55 have not. So the mean life expectancy of those with the diagnosed condition is 10 years and those without is 40 years.
If diagnostic medical science now improves such that the individual with the 25-year life expectancy is now identified as suffering from the medical condition (previously this diagnosis was missed), then the mean life expectancy within the group diagnosed with the condition increases from 10 years to 15 years (5+15+25, divided by three). Simultaneously, the mean life expectancy of those not diagnosed with the condition rises by 5 years, from 40 years to 45 years (35+45+55, divided by three).
So, by moving a data point from one group into the other (undiagnosed into diagnosed), the average of both groups has increased, despite there being no change in actual values. This is because the point is below the average of the group it is leaving (25, compared to a group average of 40), but above the average of the one it is joining (25, compared to a group average of 10).
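The life expectancy example can be verified with a short sketch. The helper name `move_point` is illustrative; the data are the six values used in the text.

```python
from statistics import mean

def move_point(source, target, value):
    """Return copies of both groups after moving `value` from source to target."""
    new_source = list(source)
    new_source.remove(value)
    return new_source, list(target) + [value]

diagnosed = [5, 15]
undiagnosed = [25, 35, 45, 55]
print(mean(diagnosed), mean(undiagnosed))

# Improved diagnosis reclassifies the individual with a life expectancy of 25.
new_undiagnosed, new_diagnosed = move_point(undiagnosed, diagnosed, 25)
print(mean(new_diagnosed), mean(new_undiagnosed))
```

The same helper can be used to check answers to the exercise that follows.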
Exercise
Take the following groups of data, A and B.
 A={10, 20, 30, 40}
 B={50, 60, 70, 80, 90}
The arithmetic mean of A is 25, and the arithmetic mean of B is 70.
Show how transferring one data point from B to A can increase the mean of both.
Now take the following example:
 A={10, 30, 50, 70, 90, 110, 130}
 B={60, 80, 100, 120, 140, 160, 180}
By moving the data point 100 from B to A, what happens to the arithmetic mean of A and of B?
To demonstrate the Will Rogers Phenomenon, does the element which is moved have to be the very lowest of its set or does it simply have to lie between the arithmetic means of the two sets?
There are five influential articles that have been published since 1982 on the key source of home advantage. All are agreed.
Jack Dowie’s article in New Scientist was a seminal piece. Dowie distinguishes the three Fs – fatigue, familiarity and fans, each of which might have contributed to home advantage.
Fatigue: In a sample of 40 years of data, Dowie looked for evidence that away teams’ performances drop off relative to home teams as the game progresses, as measured by the likelihood of scoring a goal at any given point during the course of the match. Away teams did score fewer goals, on average, than home teams, but this disparity got no worse as the game developed.
Familiarity: Is familiarity with the pitch a bonus for the home team? If this is a key factor, teams who are travelling from a similar pitch to the home team should be less disadvantaged than those who are travelling to a very different sort of pitch. One obvious way to test this is to ask whether teams who play on relatively big pitches have a particular statistical advantage when playing host to visitors whose own home ground boasts a small pitch, and vice versa. In fact, home advantage seemed to remain constant whatever the relative pitch sizes of hosts and visitors.
Fans: Is it the absolute number of fans, or is it the relative number of home and away fans? The data showed that the advantage conferred by playing at home was significantly greater for games played in the lower divisions than in the top division, even though the absolute number of supporters was much smaller in these games. Moreover, the advantage was much less in ‘local derbies.’ The conclusion is that the balance of support is what matters at the ground.
Nevill, Balmer and Williams looked into this further in 2002, showing 40 qualified referees video footage of 47 tackles from a Premiership match. The referees were divided into two groups: half watched the footage with the original soundtrack, while the other half watched a silent version. Neither group had access to the original referee’s decision. In actual matches, about 60% of booking points (10 for a yellow, 25 for a red) are awarded to the visiting team. Those referees who heard the original soundtrack were reluctant to penalise the home team, judging 15% fewer of the tackles by home players to be fouls as compared to those referees who watched the silent footage. So in the absence of crowd noise the officials were more even-handed between the home and away sides. The original referee’s decisions, however, more accurately mirrored the behaviour of those armchair referees who had access to sound. It is as if, to get the crowd off their back, they wave play on.
In ‘Scorecasting’, Moskowitz and Wertheim (2011) compile further data to test a variety of popular theories explaining home advantage. They argue that when athletes play at home, they don’t seem to hit or pitch better in baseball … or pass better in football. The crowd doesn’t appear to be helping the home team or harming the visitors. They also checked scheduling bias against the away team, concluding that while this explains some of the home-field advantage, particularly in college sports, it’s irrelevant in many sports.
Thomas Dohmen looked at home advantage in the Bundesliga, the premier football league in Germany. Dohmen found that home advantage was smaller in stadiums that happened to have a running track surrounding the soccer pitch, and larger in stadiums without a track. Why? Apparently, when the crowd sits closer to the field, the officials are more susceptible to getting caught up in the home-crowd emotion. The social atmosphere in the stadium, he argues, leads referees into favouritism despite the fact that being impartial is optimal for them in career terms.
Here is the take of Steven Levitt and Stephen Dubner. “It’s worth noting that a soccer referee has more latitude to influence a game’s outcome than officials in other sports, which helps explain why the home-field advantage is greater in soccer, around the world, than in any other pro sport … officials don’t consciously decide to give the home team an advantage – but rather, being social creatures (and human beings) like the rest of us, they assimilate the emotion of the home crowd and, once in a while, make a call that makes a whole lot of close-by, noisy people very happy.”
References and Links
Dohmen, T.J. (2008). The Influence of Social Forces: Evidence from the Behavior of Soccer Referees. Economic Inquiry, 46 (3), 411-424. https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1465-7295.2007.00112.x
Dowie, J. Why Spain Should Win the World Cup, New Scientist, 1982, 94 (10), 693-695. https://books.google.co.uk/books?id=OFCXnqlSFKwC&pg=PA693&lpg=PA693&dq=why+spain+should+win+the+world+cup+dowie&source=bl&ots=YLnc7jJr9L&sig=ACfU3U0PEmuQAsgtRjXyo7J1IDfmJ1VOg&hl=en&sa=X&ved=2ahUKEwjTqIjg28rhAhWBtXEKHRiXCZAQ6AEwDHoECAYQAQ#v=onepage&q=why%20spain%20should%20win%20the%20world%20cup%20dowie&f=false
Nevill, A.M., Balmer, N.J. and Williams, A.M. (2002), The influence of crowd noise and experience upon refereeing decisions in football, Psychology of Sport and Exercise, 3 (4), 261-272. https://www.sciencedirect.com/science/article/pii/S1469029201000334
Moskowitz, T.J. and Wertheim, L.J. (2011), Scorecasting. Random House.
Levitt, S.D. and Dubner, S.J. (2015), ‘When to Rob a Bank’, Penguin Books, pp. 211-12.
The martingale betting system is a strategy in which the gambler doubles the stake on an even-money bet, such as a coin toss, after every loss, so that the first win recovers all previous losses plus a profit equal to the original stake. The martingale strategy has been applied to roulette in particular, where the probability of hitting either red or black is close to 50 per cent.
Take the case of a gambler who wagers £2 on Heads, at even money, so profits by £2 if the coin lands Heads and loses £2 if it lands Tails. If he loses, he doubles the stake on the next bet, to £4, and wins £4 if it lands Heads, minus £2 lost on the first bet, securing a net profit over both bets of £2 (£4 – £2). If it lands Tails again, however, he is £6 down, so he doubles the stake in the next bet to £8. If it lands Heads he wins £8, minus £6 lost on the first two bets, securing a net profit over the three bets of £2 (£8 – £6). This can be generalized for any number of bets. Whenever he wins, the gambler secures a net profit over all bets of £2.
The strategy is essentially, therefore, one of chasing losses. In the above example, the loss after n losing rounds is equal to 2 + 2^{2} + 2^{3} + … + 2^{n}.
So the strategy is to bet in the next round (2 + 2^{2} + 2^{3} + … + 2^{n}) + 2.
In this way, the profit whenever the coin lands Heads is 2.
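The doubling rule and the constant profit of 2 can be checked with a short sketch. The function names are illustrative, and the bets are assumed to be at even money with a first stake of 2, as in the example above.

```python
def martingale_stakes(first_stake: int, losses_before_win: int) -> list:
    """Stakes placed when the first win follows the given number of losses."""
    return [first_stake * 2 ** k for k in range(losses_before_win + 1)]

def net_profit(first_stake: int, losses_before_win: int) -> int:
    """Even-money winnings on the final stake minus all earlier losing stakes."""
    *lost, won = martingale_stakes(first_stake, losses_before_win)
    return won - sum(lost)

for losses in range(5):
    print(losses, martingale_stakes(2, losses), net_profit(2, losses))
```

However long the losing run, the net profit at the first win equals the first stake, while the stake needed to secure it grows exponentially.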
For a gambler with infinite wealth, and hence an infinite number of coin tosses to eventually generate heads, the martingale betting strategy has been interpreted as a sure win.
However, the gambler’s expected value remains zero (or less than zero) because the small probability of a very large loss exactly balances out the expected gain. In a casino, the expected value is in fact negative, due to the house edge. There is also conventionally a house limit on bet size.
The martingale strategy fails, therefore, whenever there is a limit on earnings or on bets or bet size, as is the case in the real world. It is only with infinite or boundless wealth, bet size and time that it could be argued that the martingale becomes a winning strategy.
Appendix
Probability of losing three fair coin tosses = 1/8
Probability of losing n times = 1/2^{n}
Total loss with starting stake of 2, with 3 losses of coin toss = 2 + 4 + 8 = 14.
So martingale strategy suggests a bet of 14 + 2 = 16.
Loss after n losing rounds = 2 + 2^{2} + … + 2^{n}
So martingale bet = (2 + 2^{2} + … + 2^{n}) + 2 = 2^{n+1}
This strategy always wins a net 2.
This strategy, of always betting to win more than has been lost so far, works in principle regardless of the odds, or whether they are fair. If each bet has a 1 in 10 chance of success, for example, the probability of 12 successive losses is about 28% (0.9^{12}), but the martingale strategy is to bet to win more on the 13th bet than the sum of losses to that point.
This holds so long as there is no finite stopping point at which the next martingale bet is not available (such as a maximum bet limit) or can’t be afforded.
So, let us assume that everyone has some number of losses such that they don’t have enough money to pay a stake large enough for the next round that it would cover the sum of the losses to that point. Call this run of losses n.
n differs across people and could be very high or very low.
Probability of losing n times = 1/2^{n}
Using a martingale +2 strategy, the player wins 2 if able to play on, and then wins.
So, the player wins 2 with a probability of (1 – 1/2^{n})
Total losses after n losing bets = (2 + 2^{2} + … + 2^{n}) = (2^{n+1} – 2)
Expected gain is equal to the probability of not folding times the gain plus the probability of folding times the loss.
Expectation = (1 – 1/2^{n}) . 2 – 1/2^{n} (2^{n+1} – 2)
= 2 – 2/2^{n} – 2 + 2/2^{n} = 0.
So the expected gain in a fair game for any finite number of bets is zero using the martingale system, but it is positive if the system can be played to infinity. The increment per round need not be 2, but could be any number, x. The net gain to a winning bet is this number, x.
The intuitive explanation for the zero expectation is that the player (take the simplest case of an increment per round of 2) wins a modest gain (2) with a very good probability (1 – 1/2^{n}) but with a small probability (1/2^{n}) makes a disastrous loss (2^{n+1} – 2).
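The zero expectation can also be checked numerically. The following is a minimal sketch in Python, not part of the original derivation: the function name and its parameters are hypothetical, and it assumes a fair coin paying even money, with the stake doubling after every loss as in the +2 strategy above.

```python
from fractions import Fraction


def martingale_expectation(max_losses, first_stake=2):
    """Exact expected gain when play must stop after `max_losses` straight losses.

    Assumes a fair coin at even money and stakes of first_stake,
    2 * first_stake, 4 * first_stake, ... (the doubling scheme in the text).
    """
    # Probability that every one of the affordable tosses is lost.
    p_ruin = Fraction(1, 2) ** max_losses
    # Sum of the lost stakes: 2 + 4 + ... + 2^n when first_stake is 2.
    total_loss = first_stake * (2 ** max_losses - 1)
    # Win `first_stake` unless ruined; lose everything staked if ruined.
    return (1 - p_ruin) * first_stake - p_ruin * total_loss


for n in (3, 10, 30):
    print(n, martingale_expectation(n))
```

Exact fractions are used rather than floating point so that the cancellation between the frequent small gain and the rare large loss is visible without rounding error.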
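More generally, for an increment of x (stakes of x, 2x, 4x, … on a fair coin), the probability of n straight losses is still 1/2^{n} and the total loss is x(2^{n} – 1):
Expectation = (1 – 1/2^{n}) . x – 1/2^{n} . x(2^{n} – 1)
= x – x/2^{n} – x + x/2^{n} = 0.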
The mathematical paradox remains. In the case where on the nth round, the bet is 2^{n}, the martingale expectation = ½ × 2 + ¼ × 2^{2} + 1/8 × 2^{3} + … = 1 + 1 + 1 + 1 + … = ∞
Yet the actual expectation, when the odds are fair, in all realistic cases = 0.
If the odds are tilted against the bettor, so that for example the bettor wins less if a fair coin lands Heads than he loses if it lands Tails, the expected gain in a finite series of coin tosses is less than zero, but the same principle applies.
Exercise
Show that the expected value of the martingale strategy in a fair game of heads/tails is zero. Show how this can be reconciled with the fact that whenever the player wins, the net overall profit to the player is positive.
References and Links
Martingale (betting system). Wikipedia. https://en.m.wikipedia.org/wiki/Martingale_(betting_system)
Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.
It is said that on returning from a day at the races, a certain Lord Falmouth was asked by a friend how he had fared. “I’m quits on the day”, came the triumphant reply. “You mean by that,” asked the friend, “that you are glad when you are quits?” When Falmouth replied that indeed he was, his companion suggested that there was a far easier way of breaking even, and without the trouble or annoyance. “By not betting at all!” The noble lord said that he had never looked at it like that and, according to legend, gave up betting from that very moment.
While this may well serve as a very instructive tale for many, Ed Thorp, writing in 1962, took a rather different view. He had devised a strategy, based on probability theory, for consistently beating the house at Blackjack (or ‘21’). In his book, ‘Beat the Dealer: A Winning Strategy for the Game of Twenty-One’, Thorp presents the system. On the inside cover of the dust jacket he claims that “the player can gain and keep a decided advantage over the house by relying on the strategy”.
The basic rules of blackjack are simple. To win a round, the player has to draw cards to beat the dealer’s total and not exceed a total of 21. Because players have choices to make, most obviously as to whether to take another card or not, there is an optimal strategy for playing the game. The precise strategy depends on the house rules, but generally speaking it pays, for example, to hit (take another card) when the total of your cards is 14 and the dealer’s face-up card is 7 or higher. If the dealer’s face-up card is a 6 or lower, on the other hand, you should stand (decline another card). This is known as ‘basic strategy.’
While basic strategy will reduce the house edge, it is not enough to turn the edge in the player’s favour. That requires exploitation of the additional factor inherent in the tradition that the used cards are put to one side and not shuffled back into the deck. This means that by counting which cards have been removed from the deck, we can re-evaluate the probabilities of particular cards or card sizes being dealt moving forward. For example, a disproportionate number of high cards in the deck is good for the player, not least because in those situations where the rules dictate that the house is obliged to take a card, a plethora of remaining high cards increases the dealer’s probability of going bust (exceeding a total of 21).
Thorp’s genius was in devising a method of reducing this strategy to a few simple rules which could be understood, memorized and made operational by the average player in real time. As the book blurb puts it, “The presentation of the system lends itself readily to the rapid play normally encountered in the casinos.” Essentially, all that is needed is to attach a tag to specific cards (such as +1 or –1) and then add or subtract the tags as the cards are dealt. Depending on the net score in relation to the cards dealt, it is easy to see whether the edge is with the house or the player. This system is called keeping a ‘running count.’
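As an illustration of a running count, here is a minimal sketch in Python. The tags used are an assumption on my part: they follow the widely described High-Low scheme (low cards +1, middle cards 0, tens, court cards and aces –1), which is not necessarily the exact system set out in Thorp’s book. The function and variable names are hypothetical.

```python
# Assumed High-Low tags: low cards leaving the deck help the player (+1),
# high cards leaving the deck hurt the player (-1), 7 to 9 are neutral.
HI_LO_TAGS = {
    **{rank: +1 for rank in ("2", "3", "4", "5", "6")},
    **{rank: 0 for rank in ("7", "8", "9")},
    **{rank: -1 for rank in ("10", "J", "Q", "K", "A")},
}


def running_count(cards_seen):
    """Add up the tags of every card dealt so far."""
    return sum(HI_LO_TAGS[rank] for rank in cards_seen)


dealt = ["2", "5", "K", "9", "6", "A", "3"]
print(running_count(dealt))  # +1 +1 -1 0 +1 -1 +1 = 2
```

A positive count means that more low cards than high cards have left the deck, so the remaining cards are relatively rich in high cards, which is the situation described above as favourable to the player.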
There are variations on this theme, but the core strategy and original insights hold. The problem simply changed to one familiar to many successful horse players, i.e. how to get your money on before being closed down.
References and Links
Card counting. Wikipedia. https://en.wikipedia.org/wiki/Card_counting
https://wizardofodds.com/games/blackjack/cardcounting/introduction/
https://wizardofodds.com/games/blackjack/cardcounting/highlow/
4-Deck to 8-Deck Blackjack Strategy. https://wizardofodds.com/games/blackjack/strategy/4decks/
The Ace-Five Count. https://wizardofodds.com/games/blackjack/appendix/17/
‘Superforecasting’ is a term popularised from insights gained as part of a fascinating idea known as the ‘Good Judgment Project’, which consists of running tournaments where entrants compete to forecast the outcome of national and international events.
The key conclusion of this project is that an identifiable element of those taking part (so-called ‘Superforecasters’) were able to consistently and significantly out-predict their peers. To the extent that this ‘superforecasting’ is real, and it seems to be, it provides support for the belief that markets can not only be beaten but systematically so.
So what is special about these ‘Superforecasters’? A key distinguishing feature of these wizards of prediction is that they tend to update their estimates much more frequently than regular forecasters, and they do so in smaller increments. Moreover, they tend to break big intractable problems down into smaller tractable ones.
They are also much better than regular forecasters at avoiding the trap of underweighting new information or overweighting it. In particular, they are good at evaluating probabilities dispassionately using a so-called Bayesian approach, i.e. establishing a prior (or baseline) probability that an event will occur, and then constantly updating that probability as new information emerges, incrementally updating in proportion to the weight of the new evidence.
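The updating step just described can be sketched with Bayes’ rule. The Python below is a minimal illustration only: the function name is hypothetical and the probabilities in the example are invented for the purpose, not taken from any forecasting tournament.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the updated probability of an event after one piece of evidence.

    prior: probability of the event before the evidence arrives.
    p_evidence_if_true / p_evidence_if_false: how likely this evidence is
    if the event will / will not occur.
    """
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)


# Invented example: a 30% baseline, then three modest pieces of news,
# each somewhat more likely if the event is going to happen.
estimate = 0.30
for likelihood_if_true, likelihood_if_false in [(0.6, 0.4), (0.55, 0.45), (0.6, 0.4)]:
    estimate = bayes_update(estimate, likelihood_if_true, likelihood_if_false)
    print(round(estimate, 3))
```

Evidence that is only slightly more likely under one outcome than the other moves the estimate only slightly, which matches the pattern of frequent, small revisions.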
In adopting this approach, the Superforecasters are echoing the response of legendary economist, John Maynard Keynes, to a criticism made to his face that he had changed his position on monetary policy.
“When my information changes, I alter my conclusions. What do you do, Sir?”
In this, Keynes was one of the great ‘Superforecasters.’ Keynes went on to earn a fortune betting in the currency and commodity markets.
Superforecasters in the field of sports betting can benefit in particular from betting in-running, while the event is taking place. Their evaluations are also likely to be data-driven, and are updated as frequently as possible, taking into account variables some of which may not even exist pre-match.
They will be aware of players who tend to struggle to close the deal, whether in golf, tennis, snooker, or whatever, and who may be value ‘lays’ when trading in-running at short prices. Or shaky starters, like batsmen whose average belies their likely performance once they get into double figures. This information is only valuable, however, if the market doesn’t already incorporate it. So they gain an edge by access to and dispassionate analysis of large data sets. Moreover, they are very aware that patterns spotted, and conclusions derived, from small data sets can be dangerous, and potentially very hazardous to the accumulation of wealth.
Superforecasters also tend to use ‘Triage’. This is the process of determining the most important things from amongst a large number that require attention. Risk expert and hedge fund manager Aaron Brown offers an example: when he first got interested in basketball in the 1970s, there were data analysts who tried to analyse the game from scratch. He considered that a hard proposition compared to asking which team was likely to attract more betting interest. As Los Angeles was a rich and high-betting city, and the LA Lakers a glamorous team, he figured it wasn’t hard to guess that the betting public would disproportionately favour the Lakers and that therefore the spread would be slanted against them. ‘Bet against the Lakers at home’ became his strategy, and he observes that it took a lot less effort than simulating basketball games.
Could such a simple strategy work today, tweaked or otherwise? And in what circumstances would you apply it? That’s a more nuanced issue, but Superforecasters (who are normally very keen on big data sets) would be alert to it.
Aaron Brown sees trading contracts on the future as striking the right balance between under and overconfidence, between prudence and decisiveness. The hard part about this, he observes, is that confidence is negatively correlated to accuracy. Even experienced risk takers bet more when they’re wrong than when they’re right, he says, and the most confident people are generally the least reliable.
The solution, he maintains, is to keep careful, objective records, preferably by a third party.
That’s right – even experienced risk takers bet more when they’re wrong than when they’re right. If true, this is a critical insight.
So how might a Superforecaster go about constructing a sports forecasting model?
Let’s say he wants to construct a model to forecast the outcome of a football match or a golf tournament. In the former, he might focus on assessing the likely team lineup before its announcement, and draw on his hopefully extensive data set to eke out an edge from that. The football market is very liquid and likely to be quite efficient to known information, so any forecasting edge in terms of estimating future information, like team shape, can be critical. The same might apply to rugby, cricket, and other team games.
In terms of golf, he could include statistics on the average length of drive of the players, their tee to green percentages, their putting performance, the weather, the type of course, and so on. But where is the edge over the market?
He could try to develop a better model than others, including using new, state-of-the-art econometric techniques. In trying to improve the model, he could also seek to identify additional explanatory variables.
He might also turn to the field of ‘prospect theory’, a body of work pioneered by Daniel Kahneman and Amos Tversky. This states that people behave and make decisions according to the frame of reference rather than just the final outcome. Humans, according to prospect theory, do not think or behave totally rationally, and this could be built into the model.
In particular, a key plank of prospect theory is ‘loss aversion’, the idea that people treat losses more harshly than equivalent gains, and that they view these losses and gains with regard to a sometimes artificial frame of reference.
An excellent seminal paper on this effect in golf (by Devin Pope and Maurice Schweitzer, in the American Economic Review) is a good example of the way in which study of the economic literature can improve sports modelling. The key contribution of the Pope and Schweitzer paper is that it shows how prospect theory can play a role even in the behaviour of highly experienced and well-incentivised professionals. In particular, they demonstrate, using a database of millions of putts, that professional golfers are significantly more likely to make a putt for par than a putt for birdie, even when all other factors, such as distance to the pin and break, are allowed for. But why? And how does prospect theory explain it?
To find the explanation, they examine a number of possible explanations, and reject them one by one until they determine the true explanation. They find it is because golfers see par as the ‘reference’ score, and so a missed par is viewed (subconsciously or otherwise) by these very human golfers as a significantly greater loss than a missed birdie. They react irrationally in consequence, and cannot help themselves from doing so even when made aware of it. The researchers show that equivalent birdie putts tend to come up slightly too short relative to par putts. This is valuable information for Superforecasters, or even the casual bettor. It is also valuable information for a sports psychologist. If only someone could stand close to a professional golfer every time they stand over a birdie putt and whisper in their ear ‘This is for Par’, it would over time make a significant difference to their performance and pay.
So Superforecasters will improve their model by increments, taking into account factors which more conventional thinkers might not even consider, and will apply due weight to updating their forecasts as new information emerges.
In conclusion, how might we sum up the difference between a Superforecaster and an ordinary mortal? Watch them as they view the final holes of the Masters golf tournament. What’s the chance of Sergio Garcia sinking that 10-footer? The ordinary mortal will just see the putt, the distance to the hole and the potential break of the ball on the green. The Superforecaster is going one step further, and also asking whether the 10-footer is for par or birdie. It really does make a difference, and it’s why she is watching from the members’ area at the Augusta National Golf Club. She has earned her place there, and she knew it before anyone else.
Further Reading and Links
D.G. Pope and M.E. Schweitzer, 2011, Is Tiger Woods Loss-Averse? Persistent Bias in the Face of Experience, Competition and High Stakes, American Economic Review, 101(1), 129-157. https://repository.upenn.edu/cgi/viewcontent.cgi?article=1215&context=mgmt_papers
Philip Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction, 2016, London: Random House.
Superforecasting: The Art and Science of Prediction. Review and Summary. Stringfellow, W. Jan 24, 2017. https://medium.com/weststringfellow/superforecastingtheartandscienceofpredictionreviewandsummarye075be35a936
Superforecasting. Wikipedia. https://en.wikipedia.org/wiki/Superforecasting
John needs £216 to pay off an urgent debt, but has only £108 available. This is unacceptable to the lender and as good as nothing. He decides to try to win the money at the roulette wheel.
So what is his best strategy? The answer might be a little surprising. He should, in fact, put the whole lot on one spin of the wheel. Yes, that’s right. In unfavourable games (house edge against you) bold play is best, timid play is worst. Always place the fewest bets you need to reach your target.
Take the case, for example, of a single-zero roulette wheel. There are 36 numbered slots plus the zero, and the payout to a winning single-number bet is 35/1, while the chance of winning is 1 in 37 (so a fair payout would be at odds of 36/1). The way to look at it is that the house edge is equal to the proportion of times the ball lands in the zero slot, which is 1/37 or 2.7 per cent. This edge in favour of the house is the same whatever individual bet we make.
So let’s see what happens when John goes for the ‘bold’ play and stakes the entire £108 on Red. In this case, 18 times out of 37 (statistically speaking), or 48.6 per cent of the time, John can cash his chips immediately for £216. Of course, he is only doing this once, so this 48.6 per cent should be interpreted as the probability that he will win the £216.
An alternative ‘timid’ strategy is to divide his money into 18 equal piles of £6, and be prepared to make successive bets on a single number until he either runs out of cash or one bet (at 35 to 1) yields him £210 plus his stake = £216.
To calculate the odds of success using this timid strategy, first calculate the chance that all the bets lose. Any single bet loses with a probability of 36 in 37. So the chance that all 18 bets lose = (36/37)^{18} = 0.61. Therefore, the probability that at least one bet wins = 1 – 0.61 = 0.39. The chance that he will achieve his target has been reduced, therefore, from 48.6 per cent to 39 per cent by substituting the timid strategy for the bold play.
There are many alternative staking strategies that might put John over the top, but none of them can make it more probable that he will achieve his target than the boldest play of them all – the full amount on one spin of the wheel.
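The two probabilities in John’s example can be reproduced with a few lines of Python. This is a minimal sketch under the assumptions stated in the text (a single-zero wheel, one even-money bet on Red for the bold play, up to 18 single-number bets of £6 for the timid play); the function names are hypothetical.

```python
from fractions import Fraction


def bold_probability():
    """Chance of doubling the bank with one bet on Red (18 red slots out of 37)."""
    return Fraction(18, 37)


def timid_probability(number_of_bets=18):
    """Chance that at least one single-number bet wins in `number_of_bets` spins."""
    all_lose = Fraction(36, 37) ** number_of_bets
    return 1 - all_lose


print(f"bold:  {float(bold_probability()):.3f}")
print(f"timid: {float(timid_probability()):.3f}")
```

The fractions are kept exact until they are printed, so the only rounding is in the final display.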
Exercise
You need £432 to pay off an urgent debt, but have only a bank of £216 available. This is unacceptable to the lender and as good as nothing. You decide to try to win the money at the roulette wheel.
What is the probability that you will win the target sum if you place all your bank on one spin of the wheel?
What is the probability that you will win the target sum if you divide up your bank and place £36 each on six spins of the wheel?
References and Links
StackExchange. How to win at roulette? https://math.stackexchange.com/questions/98981/howtowinatroulette
Dubins, L.E. and Savage, L.J. (1960). Optimal Gambling Systems. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC223086/pdf/pnas002110067.pdf
What is Game Theory? Game theory is the study of models of conflict, cooperation and interaction between rational decision-makers. A key idea in the study of Game Theory is the Nash Equilibrium (named after John Nash), which is a solution to a game involving two or more players who want the best outcome for themselves and must take account of the actions of others.
Specifically, if there is a set of ‘game’ strategies with the property that no ‘player’ can benefit by changing their strategy while the other players keep their strategies unchanged, then that set of strategies and the corresponding payoffs constitute the Nash equilibrium. Assume, for example, there is a simple two-player game in which each Player (Bill and Ben) can adopt a ‘Friendly’ (smiles) or a ‘Hostile’ (scowls) approach. Now, depending on their respective actions, let’s say the game organiser awards monetary payoffs to each player.
An example of a payoff structure is shown in the next table and is known to each player.
| | Ben ‘Friendly’ | Ben ‘Hostile’ |
| Bill ‘Friendly’ | 750 to Bill; 1000 to Ben | 25 to Bill; 2000 to Ben |
| Bill ‘Hostile’ | 1000 to Bill; 50 to Ben | 30 to Bill; 51 to Ben |
Now, what is Bill’s best response to each of Ben’s actions?
If Ben acts ‘Friendly’, Bill’s best payoff is to act ‘Hostile.’ This yields a payoff of 1000. If he had acted ‘Friendly’ he would have earned a payoff of only 750.
If Ben acts ‘Hostile’, Bill’s best response is if he acts ‘Hostile’. He earns 30 instead of a payoff of 25 if he acted ‘Friendly.’
In both cases his best response is to act ‘Hostile’.
Now, what is Ben’s best response to each of Bill’s actions?
If Bill acts ‘Friendly’, Ben’s best payoff is if he acts ‘Hostile.’ This yields a payoff of 2000. If he had acted ‘Friendly’ he would have earned a payoff of only 1000.
If Bill acts ‘Hostile’, Ben’s best response is if he acts ‘Hostile’. He earns 51 instead of a payoff of 50 if he acted ‘Friendly.’
In both cases his best response is to act ‘Hostile.’
A Nash Equilibrium exists where each player’s action is a best response to the action of the other.
Here, Bill and Ben each have the same best response whichever action the opponent takes. Both should act ‘Hostile’, in which case Bill wins 30 and Ben wins 51.
But if both had been able to communicate and reach a joint, enforceable decision, they would both presumably have acted ‘Friendly.’
So, in conclusion, they would have been better off by smiling. Instead, they both scowled, which was the rational thing for them both to do, even though it was the less satisfactory outcome for both. A case of the best strategy being the worst strategy.
Let’s turn now to the world of espionage in seeking out a Nash equilibrium. Let’s assume that there are two possible codes, and Agent Anna can select either of them and so can Agent Barbara. The payoff to selecting non-matching codes is zero. An example of a payoff structure is shown in the next table and is known to each Agent.
| | Barbara uses Code ‘A’ | Barbara uses Code ‘B’ |
| Anna uses Code ‘A’ | 1000 to Anna; 500 to Barbara | 0 to Anna; 0 to Barbara |
| Anna uses Code ‘B’ | 0 to Anna; 0 to Barbara | 500 to Anna; 1000 to Barbara |
So where is the Nash equilibrium?
Let’s look at the Top Left box. Here neither Agent Anna nor Agent Barbara can increase their payoff by choosing a different action to the current one. So there is no incentive for either Agent to switch given the strategy of the other Agent. So this is a Nash equilibrium.
How about Bottom right? This is the same. Again, neither Agent Anna nor Agent Barbara can increase their payoff by choosing a different action to the current one. So there is no incentive for either Agent to switch given the strategy of the other Agent. So this is also a Nash equilibrium.
How about Top right? By choosing to use Code B instead of code A, Agent Anna obtains a payoff of 500, given Agent Barbara’s actions. Similarly for Agent Barbara, who would gain by switching to code A, given Agent Anna’s strategy. So this box (Agent Anna uses code A and Agent Barbara uses code B) is NOT a Nash equilibrium, as both Agents have an incentive to switch given what the other Agent is doing.
How about Bottom left? This is the same as Top right. There are again incentives to switch given what the other Agent is doing. So it is NOT a Nash equilibrium.
In conclusion, this game has two Nash equilibria – top left (both Agents use code A) and bottom right (both Agents use code B).
Let’s turn now to the classic ‘Live or Die’ problem. In this problem, there are two drivers, Peter and Paul. If both drive on the same side of the road (both on the left or both on the right), they will be safe, whilst they will crash if one keeps to one side of the road and the other to the opposite side.
| | Paul drives on the left | Paul drives on the right |
| Peter drives on the left | Safe, Safe | Crash, Crash |
| Peter drives on the right | Crash, Crash | Safe, Safe |
At Top left and at Bottom right, there is no incentive for either Driver to switch to the other side of the road given the driving strategy of the other driver. They will both be safe if they adopt this strategy. So both Top left and Bottom right are Nash equilibria.
In both other scenarios (Top right and Bottom left), there is a very strong incentive to switch to the other side given the driving strategy of the other Driver. So neither Top right nor Bottom left is a Nash equilibrium.
In summary, there are two Nash equilibria in the ‘Live or Die’ problem.
Now let’s consider the case of two companies, Alligator PLC and Crocodile PLC, who each have the option of using one of two emblems. Let’s call the first the Blue Badger Emblem and the other the Black Bull emblem.
| | Crocodile uses Black Bull emblem | Crocodile uses Blue Badger emblem |
| Alligator uses Black Bull emblem | 1000 to Alligator; 500 to Crocodile | 500 to Alligator; 1000 to Crocodile |
| Alligator uses Blue Badger emblem | 500 to Alligator; 1000 to Crocodile | 1000 to Alligator; 500 to Crocodile |
Top left: Crocodile gains by switching from Black Bull to Blue Badger.
Top right: Alligator gains by switching from Black Bull to Blue Badger.
Bottom left: Alligator gains by switching from Blue Badger to Black Bull.
Bottom right: Crocodile gains by switching from Blue Badger to Black Bull.
So this game has no Nash equilibrium in which each company simply picks one emblem. There is always an incentive to switch.
So how many Nash equilibria can there be in these sorts of game? Let us recall that if there is a set of ‘game’ strategies with the property that no ‘player’ can benefit by changing their strategy while the other players keep their strategies unchanged, then that set of strategies and the corresponding payoffs constitute what is known as the ‘Nash equilibrium’.
There may be one (e.g. the Friendly/Hostile game). There may be more than one (e.g. Spy problem, ‘Live or Die’ problem). There may be none, at least when each player must commit to a single action (e.g. company emblems problem).
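The box-by-box check used in these examples can be automated. The following Python is a minimal sketch rather than a general game-theory tool: the function name is hypothetical, it handles only two-player games written as a table of payoffs, and it looks only for equilibria in which each player picks a single action.

```python
def pure_nash_equilibria(payoffs):
    """List the boxes from which neither player gains by switching alone.

    payoffs maps (row_action, column_action) -> (row_payoff, column_payoff).
    """
    row_actions = list(dict.fromkeys(row for row, _ in payoffs))
    col_actions = list(dict.fromkeys(col for _, col in payoffs))
    equilibria = []
    for row, col in payoffs:
        row_pay, col_pay = payoffs[(row, col)]
        # The row player cannot do better by changing row, column held fixed.
        row_ok = all(payoffs[(alt, col)][0] <= row_pay for alt in row_actions)
        # The column player cannot do better by changing column, row held fixed.
        col_ok = all(payoffs[(row, alt)][1] <= col_pay for alt in col_actions)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


friendly_hostile = {  # (Bill, Ben)
    ("Friendly", "Friendly"): (750, 1000), ("Friendly", "Hostile"): (25, 2000),
    ("Hostile", "Friendly"): (1000, 50), ("Hostile", "Hostile"): (30, 51),
}
spy_codes = {  # (Anna, Barbara)
    ("A", "A"): (1000, 500), ("A", "B"): (0, 0),
    ("B", "A"): (0, 0), ("B", "B"): (500, 1000),
}
emblems = {  # (Alligator, Crocodile)
    ("Black Bull", "Black Bull"): (1000, 500), ("Black Bull", "Blue Badger"): (500, 1000),
    ("Blue Badger", "Black Bull"): (500, 1000), ("Blue Badger", "Blue Badger"): (1000, 500),
}

for name, game in [("Friendly/Hostile", friendly_hostile),
                   ("Spy codes", spy_codes), ("Emblems", emblems)]:
    print(name, pure_nash_equilibria(game))
```

The payoff tables are copied from the three numerical examples above, with the row player listed first in each pair.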
This leads us to the classic ‘Prisoner’s Dilemma’ problem. In this scenario, two prisoners, linked to the same crime, are offered a discount on their prison terms for confessing if the other prisoner continues to deny it, in which case the other prisoner will receive a much stiffer sentence. However, they will both be better off if both deny the crime than if both confess to it. The problem each faces is that they can’t communicate and strike an enforceable deal. The box diagram below shows an example of the Prisoner’s Dilemma in action.
| | Prisoner 2 Confesses | Prisoner 2 Denies |
| Prisoner 1 Confesses | 2 years each | Freedom for P1; 8 years for P2 |
| Prisoner 1 Denies | 8 years for P1; Freedom for P2 | 1 year each |
The Nash Equilibrium is for both to confess, in which case they will both receive 2 years. But this is not the outcome they would have chosen if they could have agreed in advance to a mutually enforceable deal. In that case they would have chosen a scenario where both denied the crime and received 1 year each.
Note that the action that gave each of the prisoners the least jail time did not depend on what the other prisoner did. There was what is called a ‘dominant strategy’ for each player, and hence a single dominant strategy equilibrium. That’s the definition of a dominant strategy. It is the strategy that will give the highest payoff whatever the other person does.
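The dominance argument can be checked directly from the box diagram. The Python below is a minimal sketch with hypothetical names; it records jail terms in years (so lower is better, with freedom counted as 0) and asks whether one action gives a prisoner strictly less jail time whatever the other prisoner does.

```python
ACTIONS = ("Confess", "Deny")

# (Prisoner 1 action, Prisoner 2 action) -> (years for P1, years for P2)
YEARS = {
    ("Confess", "Confess"): (2, 2),
    ("Confess", "Deny"): (0, 8),
    ("Deny", "Confess"): (8, 0),
    ("Deny", "Deny"): (1, 1),
}


def jail_time(player, own_action, other_action):
    """Years served by `player` (0 for Prisoner 1, 1 for Prisoner 2)."""
    key = (own_action, other_action) if player == 0 else (other_action, own_action)
    return YEARS[key][player]


def dominant_strategy(player):
    """Return the action that is strictly best whatever the other does, or None."""
    for own in ACTIONS:
        alternatives = [a for a in ACTIONS if a != own]
        if all(jail_time(player, own, other) < jail_time(player, alt, other)
               for alt in alternatives for other in ACTIONS):
            return own
    return None


print(dominant_strategy(0), dominant_strategy(1))
```

With these sentences the check returns Confess for both prisoners, which is why the pair of dominant strategies is also the Nash equilibrium of the game.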
Often there is no dominant strategy. We have already looked at such a situation. Driving on the right or on the left. If others drive on the right, your best response is to drive on the right too. If they drive on the left, your best response is to drive on the left. In the US, everyone driving on the right is an equilibrium, in the sense that no one would want to change their strategy given what others are doing. In game theory, if everyone is playing their best response to the strategies of everyone else, these strategies are, as we know, termed a Nash equilibrium. In Japan, though, Drive on the Left is a Nash equilibrium. So the Live or Die ‘game’ has two Nash equilibria but no dominant strategy equilibrium.
Many interactions do not have dominant strategy equilibria, but if we can find a Nash equilibrium, it gives us a prediction of what we should observe. So a Nash equilibrium is a stable state that involves interacting participants in which none can gain by a change of strategy as long as the other participants remain unchanged. It is not necessarily the best outcome for the parties involved, but it is the outcome we would most likely predict. Once again, we find that the best strategy in a world of rational self-interested people is not the one that is actually in their self-interest.
Perhaps the best example of an attempted real-life resolution to the Prisoner’s Dilemma was demonstrated in the TV ‘Golden Balls’ quiz show. In the game, two players must select a ball which, unknown to the other player, is either a ‘Split’ or ‘Steal’ Ball. If both choose ‘Split’, they share the prize money. If both choose ‘Steal’ they each go away with nothing. If one chooses ‘Steal’ and one chooses ‘Split’, the contestant who chose ‘Steal’ wins all the money, and the contestant who chose ‘Split’ gets nothing. In this game, Steal-Steal is a Nash equilibrium among self-interested players, because Steal weakly dominates Split: against an opponent who chooses ‘Split’ it wins all the money rather than half of it, and against an opponent who chooses ‘Steal’ it does no worse than ‘Split’ (the player wins nothing either way). Steal in the Golden Balls game is thus equivalent to Confess in the traditional Prisoner’s Dilemma game.
A YouTube video of the show offers a classic demonstration of an attempt to resolve the dilemma.
Exercise
Is every Nash Equilibrium a Dominant Strategy Equilibrium? Is every Dominant Strategy Equilibrium a Nash Equilibrium? Illustrate your answer, using an example.
In the Golden Balls game, with no communication allowed outside the game format, is there a dominant strategy for each player? Is there a dominant strategy equilibrium? Is there a Nash equilibrium? If so, what is it?
References and Links
Social Interaction: Game Theory. CORE. https://coreecon.org/theeconomy/book/text/04.html#41socialinteractionsgametheory
Equilibrium in the Invisible Hand Game. CORE. https://coreecon.org/theeconomy/book/text/04.html#42equilibriumintheinvisiblehandgame
The Prisoners’ Dilemma. CORE. https://coreecon.org/theeconomy/book/text/04.html#43theprisonersdilemma
Social Preferences: Altruism. CORE. https://coreecon.org/theeconomy/book/text/04.html#44socialpreferencesaltruism
Altruistic Preferences in the Prisoners’ Dilemma. CORE. https://coreecon.org/theeconomy/book/text/04.html#45altruisticpreferencesintheprisonersdilemma
Social interactions: Conflicts in the choice among Nash equilibria. CORE. https://coreecon.org/theeconomy/book/text/04.html#413socialinteractionsconflictsinthechoiceamongnashequilibria
Social Interactions: Conclusion. CORE. https://coreecon.org/theeconomy/book/text/04.html#414conclusion
If there is a set of ‘game’ strategies with the property that no ‘player’ can benefit by changing their strategy while the other players keep their strategies unchanged, then that set of strategies and the corresponding payoffs constitute what is known as the ‘Nash equilibrium’.
This leads us to the classic ‘Prisoner’s Dilemma’ problem. In this scenario, two prisoners, linked to the same crime, are offered a discount on their prison terms for confessing if the other prisoner continues to deny it, in which case the other prisoner will receive a much stiffer sentence. However, they will both be better off if both deny the crime than if both confess to it. The problem each faces is that they can’t communicate and strike an enforceable deal. The box diagram below shows an example of the Prisoner’s Dilemma in action.
| | Prisoner 2 Confesses | Prisoner 2 Denies |
| Prisoner 1 Confesses | 2 years each | Freedom for P1; 8 years for P2 |
| Prisoner 1 Denies | 8 years for P1; Freedom for P2 | 1 year each |
The Nash Equilibrium is for both to confess, in which case they will both receive 2 years. But this is not the outcome they would have chosen if they could have agreed in advance to a mutually enforceable deal. In that case they would have chosen a scenario where both denied the crime and received 1 year each.
So a Nash equilibrium is a stable state that involves interacting participants in which none can gain by a change of strategy as long as the other participants remain unchanged. It is not necessarily the best outcome for the parties involved, but it is the outcome we would most likely predict.
The Prisoner’s Dilemma is a one-stage game, however. What happens in games with more than one round, where players can learn from the previous moves of the other players?
Take the case of a 2-round game. The payoff from the game will equal the sum of payoffs from both moves.
The game starts with two players, each of whom is given £100 to place into a pot. They can then secretly choose to honour the deal or to cheat on the deal, by means of giving an envelope to the host containing the card ‘Honour’ or ‘Cheat’. If they both choose to ‘Honour’ the deal, an additional £100 is added to the pot, yielding each an additional £50. So they end up with £150 each. But if one honours the deal and the other cheats on the deal, the ‘Cheat’ wins the original pot (£200) and the ‘Honour’ player loses all the money in that round. A third outcome is that both players choose to ‘Cheat’, in which case each keeps the original £100. So in this round, the dominant strategy for each player (assuming no further rounds) is to ‘Cheat’, as this yields a higher payoff if the opponent ‘Honours’ the deal (£200 instead of £150) and a higher payoff if the opponent ‘Cheats’ (£100 instead of zero). The negotiated, mutually enforceable outcome, on the other hand, would be to agree to both ‘Honour’ the deal and go away with £150.
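The claim that ‘Cheat’ is dominant in a single round can be verified directly from the payoffs just described. This is a small illustrative sketch; PAYOFF and dominant_move are hypothetical names, and because the game is symmetric only player 1 is checked.

```python
# Payoffs in pounds for (player 1, player 2) in a single round,
# as described in the text.
MOVES = ("Honour", "Cheat")
PAYOFF = {
    ("Honour", "Honour"): (150, 150),
    ("Honour", "Cheat"): (0, 200),
    ("Cheat", "Honour"): (200, 0),
    ("Cheat", "Cheat"): (100, 100),
}


def dominant_move():
    """Return player 1's move that pays strictly more whatever player 2 does, or None."""
    for mine in MOVES:
        alternatives = [m for m in MOVES if m != mine]
        if all(
            PAYOFF[(mine, theirs)][0] > PAYOFF[(alt, theirs)][0]
            for alt in alternatives
            for theirs in MOVES
        ):
            return mine
    return None


print(dominant_move())  # Cheat
```

The same table shows why the negotiated outcome differs: mutual ‘Honour’ pays £150 each, against £100 each when both follow the dominant move.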
But how does this change in a 2-round game?
Actually, it makes no difference. In this scenario, the second round is the final round, in which you may as well ‘Cheat’, as there are no future rounds in which to realise the benefit of any goodwill earned from honouring the deal. Your opponent knows this, so you can assume that your opponent, who wishes to maximise his total payoff, will be hostile on the second move. He will assume the same about you.
Since you will both ‘Cheat’ on the second and final move, why be friendly on the first move?
So the dominant strategy is to ‘Cheat’ on the first round.
What if there are three rounds? The same applies. You know that your opponent will ‘Cheat’ on the final round and therefore the penultimate round as well. So your dominant strategy is to ‘Cheat’ on the first round, the second round and the final round. The same goes for your opponent. And so on. In any finite, predetermined number of rounds, the dominant strategy in any round is to ‘Cheat.’
But what if the game involves an indeterminate number of moves? Suppose that after each move, you roll two dice. If you get a double-six, the game ends. With any other combination of numbers, you play another round. Keep playing until you get a double-six. Your score for the game is the sum of your payoffs.
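To give a sense of how long such a game tends to last, the sketch below assumes two fair six-sided dice, so that each roll ends the game with probability 1/36. The function names are illustrative only. The length of the game then follows a geometric distribution, whose mean is the reciprocal of that probability.

```python
import random
from fractions import Fraction

P_END = Fraction(1, 36)  # chance of a double-six on one roll of two fair dice


def expected_rounds():
    """Mean of a geometric distribution with success probability P_END."""
    return 1 / P_END


def prob_at_least(n):
    """Probability that the game lasts at least n rounds."""
    return (1 - P_END) ** (n - 1)


def simulate_length(rng):
    """Play rounds until both dice show six; return the number of rounds played."""
    rounds = 0
    while True:
        rounds += 1
        die1, die2 = rng.randint(1, 6), rng.randint(1, 6)
        if die1 == 6 and die2 == 6:
            return rounds


print(expected_rounds())  # 36
print(simulate_length(random.Random(0)))  # one simulated game length
```

On average the game runs for 36 rounds, but no player can know in advance which round will be the last, which is what undermines the backward reasoning used in the fixed-length case.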
This sort of game in fact mirrors many real-world situations. In real life, you often don’t know when the game will end.
What is the best strategy in repeated play? For the game outlined above, we shall denote ‘Honour the deal’ as a ‘Friendly’ move and ‘Cheat’ as a ‘Hostile’ move. But the notion of a Friendly or Hostile approach can adopt other guises in different games.
There are seven proposed strategies here.
 Always Friendly. Be friendly every time
 Always Hostile. Be hostile every time
 Retaliate. Be Friendly as long as your opponent is Friendly but if your opponent is ever Hostile, you be Hostile from that point on.
 Tit for tat. Be Friendly on the first move. Thereafter, do whatever your opponent did on the previous move.
 Random. On each move, toss a coin. If Heads, be Friendly. If tails, be Hostile.
 Alternate. Be Friendly on even-numbered moves, and Hostile on odd-numbered moves, or vice versa.
 Fraction. Be Friendly on the first move. Thereafter, be Friendly if the fraction of times your opponent has been Friendly until that point is more than a half. Be Hostile if it is less than or equal to a half.
Which of these is the dominant strategy in this game of iterated play? Actually, there is no dominant strategy in an iterated game, but which strategy actually wins if every strategy plays every other strategy?
‘Always Hostile’ does best against ‘Always Hostile’, because every time you are Friendly against an ‘Always Hostile’ opponent, you are punished with the ‘sucker’ payoff.
‘Always Friendly’ does best against ‘Retaliate’, because the extra payoff you get from a Hostile move is eventually negated by the retaliation.
Thus even the choice of whether to be Friendly or Hostile on the first move depends on the opponent’s strategy.
For every two distinct strategies, A and B, there is a strategy C against which A does better than B, and a strategy D against which B does better than A.
So which strategy wins when every strategy plays every other strategy in a tournament? This has been computer simulated many times. And the winner is Tit for Tat.
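A tournament of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of any published simulation: it uses the per-round payoffs of the ‘Honour’/‘Cheat’ game, ends each match on a double-six, reads ‘Fraction’ as Friendly when the opponent has been Friendly more than half the time, starts ‘Alternate’ with a Friendly move, and does not have strategies play themselves. All function names are hypothetical.

```python
import random
from itertools import combinations

# Per-round payoffs in pounds, (mine, theirs); F = Friendly, H = Hostile.
PAYOFF = {("F", "F"): (150, 150), ("F", "H"): (0, 200),
          ("H", "F"): (200, 0), ("H", "H"): (100, 100)}


# Each strategy sees its own past moves, the opponent's past moves and an RNG.
def always_friendly(mine, theirs, rng):
    return "F"

def always_hostile(mine, theirs, rng):
    return "H"

def retaliate(mine, theirs, rng):
    return "H" if "H" in theirs else "F"

def tit_for_tat(mine, theirs, rng):
    return theirs[-1] if theirs else "F"

def random_move(mine, theirs, rng):
    return rng.choice("FH")

def alternate(mine, theirs, rng):
    # Assumption: Friendly on the 1st, 3rd, 5th... move, Hostile otherwise.
    return "F" if len(mine) % 2 == 0 else "H"

def fraction(mine, theirs, rng):
    if not theirs:
        return "F"
    return "F" if theirs.count("F") / len(theirs) > 0.5 else "H"


STRATEGIES = {
    "Always Friendly": always_friendly, "Always Hostile": always_hostile,
    "Retaliate": retaliate, "Tit for tat": tit_for_tat,
    "Random": random_move, "Alternate": alternate, "Fraction": fraction,
}


def play_match(strat_a, strat_b, rng):
    """Play rounds until two dice show a double-six; return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    while True:
        move_a = strat_a(hist_a, hist_b, rng)
        move_b = strat_b(hist_b, hist_a, rng)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
        die1, die2 = rng.randint(1, 6), rng.randint(1, 6)
        if die1 == 6 and die2 == 6:
            return score_a, score_b


def tournament(seed=0, matches_per_pair=200):
    """Every strategy meets every other; return (name, total) sorted by total."""
    rng = random.Random(seed)
    totals = {name: 0 for name in STRATEGIES}
    for (name_a, a), (name_b, b) in combinations(STRATEGIES.items(), 2):
        for _ in range(matches_per_pair):
            score_a, score_b = play_match(a, b, rng)
            totals[name_a] += score_a
            totals[name_b] += score_b
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for name, total in tournament():
    print(f"{name:16s} {total}")
```

The ranking this sketch produces depends on the field of strategies, the payoffs and the random seed, so it should not be expected to match any particular published tournament exactly.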
It’s true that Tit for Tat can never get a higher score than a particular opponent, but it wins tournaments where each strategy plays every other strategy. In particular, it does well against Friendly strategies, while it is not exploited by Hostile strategies. So you can trust Tit for Tat. It won’t take advantage of another strategy. Tit for Tat and its opponents both do best when both are Friendly. Look at it this way. There are two reasons for a player to be unilaterally hostile, namely to take advantage of an opponent or to avoid being taken advantage of by an opponent. Tit for Tat eliminates the reasons for being Hostile.
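The point that Tit for Tat never outscores the opponent in front of it can be seen in a short head-to-head sketch. It assumes the per-round payoffs of the ‘Honour’/‘Cheat’ game and a fixed ten rounds for the sake of a repeatable illustration; the names used are hypothetical.

```python
# Per-round payoffs in pounds, (Tit for Tat, opponent); F = Friendly, H = Hostile.
PAYOFF = {("F", "F"): (150, 150), ("F", "H"): (0, 200),
          ("H", "F"): (200, 0), ("H", "H"): (100, 100)}


def tit_for_tat(opponent_moves):
    """Friendly first, then copy the opponent's previous move."""
    return opponent_moves[-1] if opponent_moves else "F"


# Each opponent sees Tit for Tat's past moves; their count gives the round index.
OPPONENTS = {
    "Always Friendly": lambda tft_moves: "F",
    "Always Hostile": lambda tft_moves: "H",
    "Alternate": lambda tft_moves: "H" if len(tft_moves) % 2 == 0 else "F",
}


def head_to_head(opponent, rounds=10):
    """Return (Tit for Tat total, opponent total) over a fixed number of rounds."""
    tft_moves, opp_moves = [], []
    tft_score = opp_score = 0
    for _ in range(rounds):
        tft_move = tit_for_tat(opp_moves)
        opp_move = opponent(tft_moves)
        pay_tft, pay_opp = PAYOFF[(tft_move, opp_move)]
        tft_score += pay_tft
        opp_score += pay_opp
        tft_moves.append(tft_move)
        opp_moves.append(opp_move)
    return tft_score, opp_score


for name, opponent in OPPONENTS.items():
    print(name, head_to_head(opponent))
```

Over ten rounds Tit for Tat ties with ‘Always Friendly’ at £1,500 each and finishes on £900 against £1,100 for ‘Always Hostile’. It is caught out once at most and never gets ahead, yet it gives up very little to a hostile opponent.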
What accounts for Tit for Tat’s success, therefore, is its combination of being nice, retaliatory, forgiving and clear.
In other words, success in an evolutionary ‘game’ is correlated with the following characteristics:
Be willing to be nice: cooperate, never be the first to defect.
Don’t be played for a sucker: return defection for defection, cooperation for cooperation.
Don’t be envious: focus on how well you are doing, as opposed to ensuring you are doing better than everyone else.
Be forgiving if someone is willing to change their ways and cooperate with you. Don’t bear grudges for old actions.
Don’t be too clever or too tricky. Clarity is essential for others to cooperate with you.
As Robert Axelrod, who pioneered this area of game theory, puts it in his book, ‘The Evolution of Cooperation’: Tit for Tat’s “niceness prevents it from getting into unnecessary trouble. Its retaliation discourages the other side from persisting whenever defection is tried. Its forgiveness helps restore mutual cooperation. And its clarity makes it intelligible to the other player, thereby eliciting long-term cooperation.”
How about the bigger picture? Can Tit for Tat perhaps teach us a lesson in how to play the game of life? Yes, in my view it probably can.
Further Reading and Links
Axelrod, Robert (1984), The Evolution of Cooperation, Basic Books
Axelrod, Robert (2006), The Evolution of Cooperation (Revised ed.), Perseus Books Group
Axelrod, R. and Hamilton, W.D. (1981), The Evolution of Cooperation, Science, 211, 1390-96. http://www-personal.umich.edu/~axe/research/Axelrod%20and%20Hamilton%20EC%201981.pdf
https://en.wikipedia.org/wiki/The_Evolution_of_Cooperation