*Solution to Exercise*

**Question 1.** You should switch to the red box.

There was a 1 in 3 chance at the outset that your original choice, the blue box, contained the prize. This does not change when I open the box which I know to be empty. There was a 2 in 3 chance that it was either the red box or the yellow box before I opened the box and by opening the yellow box, which I know to be empty, that can be eliminated. So the chance it is the red box is now 2 in 3, compared to 1 in 3 for your original choice, the blue box.

**Question 2.** It makes no difference whether you switch or not.

There was a 1 in 3 chance at the outset that your original choice, the green box, contained the prize. There was a 2 in 3 chance that it was either the pink box or the violet box before I opened the box. By randomly opening a box (I don’t know which box contains the prize), I am giving you no new information. It is the same as asking you to choose a box to open. If you randomly opened the pink box, which might have contained the prize, and it turned out to be empty, there are now two boxes left (green and violet). Each of these started with a 1 in 3 chance of containing the prize. I have not deliberately avoided opening the box containing the prize, so I have given you no new information to indicate which of the two remaining boxes contains it. So the chance for each of the remaining boxes rises to ½. It therefore makes no difference whether you switch or not.
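The two answers can be checked by enumerating every case exactly. The following is a minimal sketch, not part of the original exercise: the box labels and the function name `switch_win_probability` are hypothetical, and it assumes the prize is equally likely to be in any box and that the opener picks uniformly among the boxes they are allowed to open.

```python
from fractions import Fraction

# Hypothetical labels: "chosen" is your original box, the others are unchosen.
BOXES = ("chosen", "other_a", "other_b")


def switch_win_probability(host_knows: bool) -> Fraction:
    """Exact P(switching wins | the opened box turned out to be empty)."""
    switch_wins = Fraction(0)
    empty_reveals = Fraction(0)
    others = [b for b in BOXES if b != "chosen"]
    for prize in BOXES:
        p_prize = Fraction(1, 3)
        if host_knows:
            # A host who knows only ever opens an empty unchosen box.
            candidates = [b for b in others if b != prize]
        else:
            # A random opener may open either unchosen box.
            candidates = others
        for opened in candidates:
            p = p_prize / len(candidates)
            if opened == prize:
                continue  # prize revealed: not the situation in the question
            empty_reveals += p
            remaining = next(b for b in others if b != opened)
            if remaining == prize:
                switch_wins += p
    return switch_wins / empty_reveals


print(switch_win_probability(host_knows=True))   # Question 1 setting
print(switch_win_probability(host_knows=False))  # Question 2 setting
```

Under these assumptions the first call corresponds to Question 1 and the second to Question 2; the only difference between them is whether the opener's knowledge restricts which box can be opened.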

The Gambler’s Fallacy, also known as the Monte Carlo Fallacy, is the proposition that people, instead of accepting an actual independence of successive outcomes, are influenced in their perceptions of the next possible outcome by the results of the preceding sequence of outcomes – e.g. throws of a die, spins of a wheel. Put another way, the fallacy is the mistaken belief that the probability of an event is decreased when the event has occurred recently, even though the probability of the event is objectively known to be independent across trials.

This can be illustrated by considering the repeated toss of a fair coin. The outcomes of each coin toss are in fact independent of each other, and the probability of getting heads on a single toss is 1/2. The probability of getting two heads in two tosses is 1/4, of three heads in three tosses is 1/8, and of four heads in a row is 1/16. Since the probability of a run of five successive heads is only 1/32, the fallacy is to believe that, after four heads, the next toss is more likely to come up tails than heads again. In fact, “5 heads in a row” and “4 heads, then tails” both have a probability of 1/32. Given that the first four tosses turned up heads, the probability that the next toss is a head is 1/2, and similarly for tails.

While a run of five heads in a row has a probability of 1/32, this applies only before the first coin is tossed. After the first four tosses, the next coin toss has a probability of 1/2 Heads and 1/2 Tails.
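The arithmetic can be verified by listing all 32 equally likely sequences of five tosses. This is an illustrative sketch only; the variable names are my own, and it assumes a fair coin with independent tosses.

```python
from fractions import Fraction
from itertools import product

# All 2**5 = 32 equally likely sequences of five fair coin tosses.
sequences = list(product("HT", repeat=5))
p_each = Fraction(1, len(sequences))

# Probabilities assessed before any coin is tossed.
p_five_heads = sum(p_each for s in sequences if s == ("H",) * 5)
p_four_heads_then_tail = sum(
    p_each for s in sequences if s == ("H", "H", "H", "H", "T")
)

# Condition on the first four tosses having come up heads.
first_four_heads = [s for s in sequences if s[:4] == ("H",) * 4]
p_fifth_head_given_four = Fraction(
    sum(1 for s in first_four_heads if s[4] == "H"),
    len(first_four_heads),
)

print(p_five_heads, p_four_heads_then_tail, p_fifth_head_given_four)
```

The first two quantities are unconditional, while the third conditions on what has already happened, which is the distinction the fallacy overlooks.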

The so-called Inverse Gambler’s Fallacy is where someone entering a room sees an individual rolling a double six with a pair of fair dice and concludes (with flawed logic) that the person must have been rolling the dice for some time, as it is unlikely that they would roll a double six on a first or early attempt.

The existence of a ‘gambler’s fallacy’ can be traced to laboratory studies and lottery-type games (Clotfelter and Cook, 1993; Terrell, 1994). Clotfelter and Cook found (in a study of a Maryland numbers game) a significant fall in the amount of money wagered on winning numbers in the days following the win, an effect which did not disappear entirely until after about sixty days. This particular game was, however, characterized by a fixed-odds payout to a unit bet, and so the gambler’s fallacy had no effect on expected returns. In pari-mutuel games, on the other hand, the return to a winning number is linked to the amount of money bet on that number, and so the operation of a systematic bias against certain numbers will tend to increase the expected return on those numbers.

Terrell (1994) investigated one such pari-mutuel system, the New Jersey State Lottery. In a sample of 1,785 drawings from 1988 to 1993, he constructed a subsample of 97 winners which repeated as a winner within the 60 day cut-off point suggested by Clotfelter and Cook. He found that these numbers had a higher payout than when they previously won on 80 of the 97 occasions. To determine the relationship, he regressed the payout to winning numbers on the number of days since the last win by that number. The expected payout increased by 28% one day after winning, and decreased from this level by c. 0.5% each day after the number won, returning to its original level 60 days later. The size of the gambler’s fallacy, while significant, was less than that found by Clotfelter and Cook in their fixed-odds numbers game.

It is as if irrational behaviour exists, but reduces as the cost of the anomalous behaviour increases.

An opposite effect is where people tend to predict the same outcome as the previous event, resulting in a belief that there are streaks in performance. This is known as the ‘hot hand effect’, and normally applies in the context of human performance, as in basketball shots, whereas the Gambler’s Fallacy is applied to inanimate games such as coin tosses or spins of a roulette wheel. This is because human performance may not be perceived as random in the same way as, say, a coin flip.

**Exercise**

Distinguish between the Gambler’s Fallacy, the Inverse Gambler’s Fallacy and the Hot Hand Effect. Can these three phenomena be logically reconciled?

**References and Links**

Gambler’s Fallacy. Wikipedia. https://en.wikipedia.org/wiki/Gambler%27s_fallacy

Gambler’s Fallacy. Logically Fallacious. https://www.logicallyfallacious.com/tools/lp/Bo/LogicalFallacies/98/Gambler-s-Fallacy

Gambler’s Fallacy. RationalWiki. https://rationalwiki.org/wiki/Gambler%27s_fallacy

Inverse Gambler’s Fallacy. Wikipedia. https://en.wikipedia.org/wiki/Inverse_gambler%27s_fallacy

Inverse Gambler’s Fallacy. RationalWiki. https://rationalwiki.org/wiki/Gambler%27s_fallacy

Hot Hand. Wikipedia. https://en.wikipedia.org/wiki/Hot_hand

Clotfelter, C.T. and Cook, P.J. (1993). Notes: The “Gambler’s Fallacy” in Lottery Play. Management Science, 39 (12), 1521-1525. https://pubsonline.informs.org/doi/abs/10.1287/mnsc.39.12.1521

https://www.nber.org/papers/w3769.pdf

Terrell, D. (1994). A Test of the Gambler’s Fallacy: Evidence from Pari-Mutuel Games. Journal of Risk and Uncertainty. 8,3, 309-317. https://link.springer.com/article/10.1007/BF01064047

The Base Rate Fallacy occurs when we disregard or undervalue prior information when making a judgment on how likely something is. In particular, if presented with related base rate information (i.e. generic, general information) and specific information (information pertaining only to a certain case), the fallacy arises from a tendency to focus on the latter at the expense of the former.

For example, if we are informed that someone is an avid book-lover, we might think it more likely that they are a librarian than a nurse. There are, however, many more nurses than librarians. In this example, we have not taken sufficient account of the base rate for the number of nurses relative to librarians.

Now consider testing for a medical condition which affects 2% of the population. Assume there’s a test for this condition which correctly identifies someone who has the condition 95% of the time. If someone does not have the condition, the test will correctly identify them as being clear of it 80% of the time.

Now consider testing a random group of people. Of the 2% of people who are suffering from the condition, 95% will be correctly diagnosed with the condition, whereas of the 98% who do not have the condition, 20% will be incorrectly diagnosed as having it (almost 20% of the population).

What this means is that 21.5% of the population (0.95 x 2% + 0.2 x 98%) are diagnosed with the condition, but only 1.9% of the population (0.95 x 2%) both test positive and actually have it. So the probability that someone who tests positive is actually suffering from the condition is 1.9% / 21.5%, i.e. about 8.8%.
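The calculation above is an application of Bayes’ Theorem and can be sketched in a few lines. The function name and the parameter names (`prevalence`, `sensitivity`, `specificity`) are labels I have introduced for illustration; they correspond to the 2%, 95% and 80% figures in the example.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """P(has the condition | tests positive), by Bayes' Theorem."""
    # Share of the population who have the condition and test positive.
    true_positives = sensitivity * prevalence
    # Share of the population who are clear but test positive anyway.
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Figures from the example: 2% affected, 95% and 80% correct identification.
ppv = positive_predictive_value(prevalence=0.02,
                                sensitivity=0.95,
                                specificity=0.80)
print(f"{ppv:.1%}")  # 8.8%
```

The same function can be reused with other base rates and test accuracies to see how strongly the answer depends on the base rate.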

**Exercise**

Consider testing for a medical condition which affects 4% of the population. Assume there’s a test for this condition which correctly identifies someone who has the condition 90% of the time. If someone does not have the condition, the test will correctly identify them as being clear of it 90% of the time.

If someone tests positive for the condition, what is the probability that they have the condition?

**Reading and Links**

Base Rate Fallacy. In: Paradoxes of probability and other statistical strangeness. UTS, 5 April, 2017. S. Woodcock. http://newsroom.uts.edu.au/news/2017/04/paradoxes-probability-and-other-statistical-strangeness

Base Rate Fallacy. Wikipedia. https://en.wikipedia.org/wiki/Base_rate_fallacy

*Professor Leighton Vaughan Williams – Written evidence (PPD0024)*

1. In this evidence, I consider the relationship between political betting and political opinion polls, and highlight peer-reviewed research I have undertaken into this. I also reference some other published work of mine on opinion polling and political forecasting more generally. Research I have undertaken into the impact of the dissemination of information via social media is also highlighted.

2. The recorded history of election betting markets can be traced as far back as 1868 for US presidential elections (Rhode and Strumpf, 2013) and 1503 for papal conclaves. Between 1868 and 2012, no clear favourite for the White House had lost the presidential election other than in 1948, when longshot Harry Truman defeated his Republican rival, Thomas Dewey. 2016 can be added to that list, following the defeat of strong favourite Hillary Clinton in the Electoral College.

3. The record of the betting markets in predicting the outcome of papal conclaves is somewhat more chequered and is considered in Vaughan Williams and Paton (2015) in which I examine, with my co-author Professor David Paton, the success of papal betting markets historically.

4. The potential of the betting markets and prediction markets (markets created specifically to provide forecasts) to assimilate collective knowledge and wisdom has increased in recent years as the volume of money wagered and number of market participants has soared. Betting exchanges alone now see tens of millions of pounds trading on a single election.

5. An argument made for the value of betting markets in predicting the probable outcome of elections is that the collective wisdom of many people is greater than that of the few. We might also expect that those who know more, and are better able to process the available information, would on average tend to bet more.

6. The lower the transaction costs (the betting public have not paid tax on their bets in the UK since 2001, and margins have fallen since the advent of betting exchanges) and the lower the costs of accessing and processing information (through the development of the Internet and search engines), the more efficient we might expect betting markets to become in translating information into forecasts. Modern betting markets might be expected for these reasons to provide better forecasts than ever.

7. There is plenty of anecdotal evidence about the accuracy of political betting markets, especially compared to the polls. The 1985 by-election in Brecon and Radnor is a classic example. On Election Day, July 4th, an opinion poll undertaken by the Mori polling organisation was published which gave Labour a commanding lead of 18 percent over the Liberal Alliance candidate. Ladbrokes simultaneously made the Liberal the 4/7 favourite. The Liberal won.

8. Fast forward 20 years to a BBC World Service live radio debate in 2005, in the run-up to the UK general election, when forecasts were swapped between the Mori representative and myself on the likely outcome of the election. I predicted a Labour majority of about 60, as I had done a few days earlier in the Economist magazine (Economist, April 14th, 2005) and on BBC Radio 4 Today (April 18th, 2005), based on the betting at the time. The Mori representative predicted a Labour majority of over 100 based on their polling. The actual majority was 66.

9. More recent anecdotal evidence comes from the 2012 US presidential election. Barack Obama was the heavy favourite to win, while the average of the pollsters had the popular vote within 0.7%, and two leading polling organisations, Gallup and Rasmussen, had Mitt Romney ahead in final polls. Obama won by 3.9%.

10. During the later stages of the 2014 Scottish referendum campaign, the polling average had it relatively close (especially compared with the actual result), with more than one poll calling it for independence (one by 7%). The betting odds were always very strongly in favour of Scotland staying in the UK. The result echoed the 1995 Quebec separation referendum in Canada. There the final polling showed ‘Yes to separation’ with a six point lead. In the event, ‘No to separation’ won by one point. This late swing to the ‘status quo’ is credited by some with the confidence in the betting markets about a ‘NO’ outcome in Scotland.

11. In the 2015 general election in Israel, final polls showed Netanyahu’s Likud party trailing the main opposition party by 4% (Channel 2, Channel 10, Jerusalem Post), by 3% (Channel 1) and by 2% (Teleseker/Walla). Meanwhile, Israel’s Channel 2 television news on Election Day featured the odds on the online prediction market site, Predictwise. This gave Netanyahu an 80% chance of winning. The next day, Netanyahu declared that he had won “against the odds.” He actually won against the polls.

12. Polling averages during the 2015 UK general election campaign often showed Conservatives and Labour very close in terms of vote share. Meanwhile, the betting odds always had Conservative most seats as short odds-on. On the Monday before polling day, for example, the polling average had it essentially tied in terms of vote share, while Conservatives to win most seats was trading on the markets as short as 1/6.

13. For the 2015 Irish same-sex marriage referendum, the spread betting markets were offering a mid-point of 60% for YES to same-sex marriage, and 40% for NO. The average of the final opinion polls had YES on 71% and NO on 29%. The final result was 62%-38% for YES, much closer to the projection from the markets.

14. If this anecdotal evidence is correct, it is natural to ask why the betting markets outperform the opinion polls in terms of forecast accuracy. One obvious reason is that there is an asymmetry. People who bet in significant sums on an election outcome will usually have access to the polling evidence, while opinion polls do not take account of information contained in the betting odds (though the opinions expressed might). Sophisticated political bettors also take into account the past experience of how good different pollsters are, what tends to happen to those who are undecided when they actually vote, differential turnout of voters, what might drive the agenda between the dates of the polling surveys and election day itself, and so on. All of this can in principle be captured in the markets.

15. Pollsters, except perhaps with their final polls, tend to claim that they are not producing a forecast, but a snapshot of opinion. In contrast, the betting markets are generating odds about the final result. Moreover, the polls are used by those trading the markets to improve their forecasts, so they are a valuable input. But they are only one input. Those betting in the markets have access to much other information as well including, for example, informed political analysis, statistical modelling, focus groups and on-the-ground information including local canvass returns.

16. To test the reliability of the anecdotal evidence pointing to the superior forecasting performance of the betting markets over the polls, I collected vast data sets of every matched contract placed on two leading betting exchanges and from a dedicated prediction market for US elections since 2000. This was collected over 900 days before the 2008 election alone, and to indicate the size, a single data set was made up of 411,858 observations from one exchange alone for that year. Data was derived notably from presidential elections at national and state level, Senate elections, House elections and elections for Governor and Mayor. Democrat and Republican selection primaries were also included. Information was collected on the polling company, the length of time over which the poll was conducted, and the type of poll.

17. My co-author, Dr. James Reade, and I compared the betting over the entire period with the opinion polls published over that period, and also with expert opinion and a statistical model.

18. In a paper titled ‘Forecasting Elections’ (Vaughan Williams and Reade, 2016b), published in the ‘Journal of Forecasting’ (see also Vaughan Williams and Reade, 2017, 2015), we specifically assessed opinion polls, prediction and betting markets, expert opinion and statistical modelling over this vast data set of elections in order to determine which performed better in terms of forecasting outcomes. We considered accuracy, bias and precision over different time horizons before an election.

19. A very simple measure of accuracy is the percentage of correct forecasts, i.e. how often a forecast correctly predicts the election outcome.

20. A related but distinctly different concept to accuracy is unbiasedness. An unbiased vote share forecast is, on average, equal to the true vote share outcome. An unbiased probability forecast is also, on average, equal to the true probability that the candidate wins the election. Forecasts that are accurate can also be biased, provided the bias is in the correct direction. If polls are consistently upward biased for candidates that eventually win, then despite being biased they will be very accurate in predicting the outcome, whereas polls that are consistently downward biased for candidates that eventually win will be very inaccurate as well as biased.

21. We also identified the precision of the forecasts, which relates to the spread of the forecasts.

22. We considered accuracy, bias and precision over different time horizons before an election. We found that the betting/prediction markets provided the most accurate and precise forecasts and were similar in terms of bias to opinion polls. We found that betting/prediction market forecasts also tended to improve as the elections approached, while we found evidence of opinion polls tending to perform worse.

23. In Brown, Reade and Vaughan Williams (2017), we examine the precise impact of the release of information from a leading opinion polling company on the political betting markets. To do this, we use an extensive data set of over 25 million contracts that records (anonymised) individual trader IDs for the buyers and sellers of the contracts and align this to the exact time of release of this information. We find that polling releases by this prominent opinion pollster quickly influence trading volumes and market prices, but that experienced and more aggressive liquidity-taking traders bide their time before entering the market after such news events. We find that the market prices are not at their most informative in the immediate aftermath of a poll release.

24. We also conducted research into the impact of breaking news on the markets, notably via social media and live blogging. In Vaughan Williams and Paton (2015) we use an extensive data set of contracts matched on a leading betting exchange specifically regarding the outcome of the 2013 papal election. We found that genuine information released on Twitter was not reflected in the betting markets, and was only very partially incorporated when published later on the live blog of a major British newspaper. One possible explanation is that the information was not believed as it related to a closed-door conclave (Vaughan Williams, 2015a, considers closed-door forecasting in another context). However, this finding was consistent in some respects with evidence in Vaughan Williams and Reade (2016a) about the limited impact on a leading betting exchange of major breaking news in a UK general election when released on Twitter, at least until the news was validated by traditional media.

25. In summary, the overwhelming consensus of evidence prior to the 2015 UK General Election pointed to the success of political betting markets in predicting the outcome of elections. In contrast, the 2015 UK General Election, the 2016 EU referendum in the UK, the 2016 US presidential election and the 2017 UK election, all produced results that were a shock to the great majority of pollsters as well as to the betting markets. In each case, the longshot outcome (Conservative overall majority, Brexit, Trump, No overall majority) prevailed.

26. There are various theories as to why the polls and markets broke down in these recent big votes. One theory is based on the simple laws of probability. An 80% favourite can be expected to lose one time in five, if the odds are correct. In the long run, according to this explanation, things should balance out.

27. A second theory to explain recent surprise results is that something fundamental has changed in the way that information contained in political betting markets is perceived and processed. One interpretation is that the widespread success of the betting markets in forecasting election outcomes, and the publicity that was given to this, turned them into an accepted measure of the state of a race, creating a perception which was difficult to shift in response to new information. To this extent, the market prices led opinion rather than simply reflecting it. From this perspective, the prices in the markets became somewhat sticky.

28. A third theory is that conventional patterns of voting broke down in 2015 and subsequently, primarily due to unprecedented differential voter turnout patterns across key demographics, which were not correctly modelled in most of the polling and which were not picked up by those trading the betting markets.

29. There are other theories, which may be linked to the above, including the impact of social media, and manipulation of this, on voter perceptions and voting patterns.

30. I explore how well the pollsters, ‘expert opinion’, modellers, prediction and betting markets performed in the 2017 UK general election in Vaughan Williams (2017a) – “Report card: how well did UK election forecasters perform this time?” and explore the polling failure in the 2015 UK general election in Vaughan Williams (2015b) – “Why the polls got it so wrong in the British election”, and some implications in a follow-up article (Vaughan Williams, 2015c).

31. I explore how well the pollsters, ‘expert opinion’, modellers, prediction and betting markets performed in the 2016 US presidential election in Vaughan Williams (2016) – “The madness of crowds, polls and experts confirmed by Trump victory”, and the implications of turnout projections for opinion polling in Vaughan Williams, 2017b – “Election pollsters put their methods to the test – and turnout is the key.”

References

BBC Radio 4 Today, Are betting markets a better guide to election results than opinion polls? April 18th, 2005, 0740. http://www.bbc.co.uk/radio4/today/listenagain/listenagain_20050418.shtml

Brown, A., Reade, J.J. and Vaughan Williams, L. (2017), ‘When are Prediction Market Prices Most Informative?’ Working Paper.

Economist, Punters v pollsters. Are betting markets a better guide to election results than opinion polls? April 14th, 2005. http://www.economist.com/node/3868824

Rhode, P.W. and Strumpf, K. (2013), ‘The Long History of Political Betting Markets: An International Perspective’, in: The Oxford Handbook of the Economics of Gambling, ed. L. Vaughan Williams and D. Siegel, 560-588.

Vaughan Williams, L. (2017a), ‘Report card: how well did UK election forecasters perform this time?’ The Conversation, June 10. http://theconversation.com/report-card-how-well-did-uk-election-forecasters-perform-this-time-79237

Vaughan Williams, L. (2017b), ‘Election pollsters put their methods to the test – and turnout is the key’, The Conversation, June 2. http://theconversation.com/election-pollsters-put-their-methods-to-the-test-and-turnout-is-the-key-78778

Vaughan Williams, L. (2016), ‘The madness of crowds, polls and experts confirmed by Trump victory’, The Conversation, November 9. http://theconversation.com/the-madness-of-crowds-polls-and-experts-confirmed-by-trump-victory-68547

Vaughan Williams, L. (2015a), ‘Forecasting the decisions of the US Supreme Court: lessons from the ‘affordable care act’ judgment,’ The Journal of Prediction Markets, 9 (2), 64-78.

Vaughan Williams, L. (2015b), ‘Why the polls got it so wrong in the British election’, The Conversation, May 8. http://theconversation.com/why-the-polls-got-it-so-wrong-in-the-british-election-41530

Vaughan Williams, L. (2015c), ‘How looking at bad polls can show Labour how to win the next election’, The Conversation, May 20. http://theconversation.com/how-looking-at-bad-polls-can-show-labour-how-to-win-the-next-election-42065

Vaughan Williams, L. and Paton, D. (2015), ‘Forecasting the Outcome of Closed-Door Decisions: Evidence from 500 Years of Betting on Papal Conclaves’, Journal of Forecasting, 34 (5), 391-404.

Vaughan Williams, L. and Reade, J.J. (2016a), ‘Prediction Markets, Social Media and Information Efficiency’, Kyklos, 69 (3), 518-556.

Vaughan Williams, L. and Reade, J.J. (2016b), ‘Forecasting Elections’, Journal of Forecasting, 35 (4), 308-328.

Vaughan Williams, L. and Reade, J.J. (2017), ‘Polls to Probabilities: Prediction Markets and Opinion Polls’, Working Paper.

Vaughan Williams, L. and Reade, J.J. (2015), ‘Prediction Markets and Polls as Election Forecasts’, Working Paper.

31 October 2017

When looking at many variables, it is easy to overlook how many possible correlations are being tested. Multiple comparisons arise when a statistical analysis involves multiple simultaneous statistical tests, each of which has a potential to produce a “discovery.” For example, with a thousand variables, there are almost half a million (1,000×999/2) potential pairs of variables that might appear correlated by chance alone. While each pair is extremely unlikely in itself to show dependence, from the half a million pairs it is very possible that a large number will appear to be dependent.

Say, for example, that 20 or more comparisons are made, each at a 95% confidence level. In this case, you may well get at least one spurious ‘significant’ result by chance alone. This becomes a fallacy when that spurious result is treated as a genuine finding rather than as something to be expected statistically. The problem can be addressed by using statistical tests that correct for multiple comparisons.
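To put rough numbers on this, here is a minimal sketch. It assumes, purely for illustration, that the tests are independent of one another and that no real effects exist, so each test has a 5% chance of a false positive; the function name is my own.

```python
from math import comb

# Number of distinct pairs among 1,000 variables: 1,000 x 999 / 2.
n_pairs = comb(1000, 2)
print(n_pairs)  # 499500


def prob_at_least_one_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """P(at least one false positive) across n independent tests.

    Assumes independence and no true effects: each test wrongly
    'discovers' something with probability alpha.
    """
    return 1 - (1 - alpha) ** n_tests


p_20 = prob_at_least_one_false_positive(20)
print(f"{p_20:.2f}")  # 0.64
```

With 20 such tests, a spurious result is more likely than not, and the probability climbs quickly towards certainty as the number of comparisons grows.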

A classic example of the multiple comparisons fallacy is the Birthday Paradox. In a group of 23 people (assuming each of their birthdays is an independently chosen day of the year with all days equally likely), there is in fact greater than a 50 per cent chance that at least two of the group share the same birthday. This seems counter-intuitive, since it is rare to meet someone who shares your birthday. Indeed, if you select two random people, the chance that they share a birthday is about 1 in 365. With 23 people, however, there are 253 (23×22/2) pairs of people who might have a common birthday. So by looking across the whole group, we are checking whether any one of these 253 pairings, each of which has a tiny chance of coinciding, does indeed match. Because there are so many possible pairs, it becomes more likely than not that a coincidental match will arise. For a group of 40 people, say, it is about eight times as likely that at least two share a birthday as that none do.
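The birthday figures can be reproduced directly. This sketch uses the same assumptions as the text (365 equally likely, independent birthdays); the function name is hypothetical.

```python
from math import comb


def prob_shared_birthday(n_people: int, days: int = 365) -> float:
    """P(at least two of n_people share a birthday), all days equally likely."""
    p_all_distinct = 1.0
    for k in range(n_people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_distinct *= (days - k) / days
    return 1 - p_all_distinct


print(comb(23, 2))                         # 253 pairs among 23 people
print(round(prob_shared_birthday(23), 3))  # 0.507
print(round(prob_shared_birthday(40), 3))  # 0.891

# Odds of a shared birthday versus no shared birthday in a group of 40.
p_40 = prob_shared_birthday(40)
odds_40 = p_40 / (1 - p_40)
```

Working with the probability that all birthdays are distinct avoids having to treat the pairs as independent, which they are only approximately.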

*References and Links*

Multiple Comparisons Fallacy. In: Paradoxes of Probability and other statistical strangeness. The Conversation. Woodcock, S. April 4, 2017. https://theconversation.com/paradoxes-of-probability-and-other-statistical-strangeness-74440

Multiple Comparisons Fallacy. Logically Fallacious. https://www.logicallyfallacious.com/tools/lp/Bo/LogicalFallacies/130/Multiple-Comparisons-Fallacy

The Multiple Comparisons Fallacy. Fallacy Files. http://www.fallacyfiles.org/multcomp.html

The Misleading Effect of Noise: The Multiple Comparisons Problem. Koehrsen, W. Feb. 7, 2018. https://towardsdatascience.com/the-multiple-comparisons-problem-e5573e8b9578

Birthday Problem. Wikipedia. https://en.wikipedia.org/wiki/Birthday_problem

The Will Rogers Phenomenon occurs when transferring something from one group into another group raises the average of both groups, even though there has been no change in actual values. The name of the phenomenon is derived from a comment made by comedian Will Rogers that “when the Okies left Oklahoma and moved to California, they raised the average intelligence in both states”.

In moving a data point from one group into another, the Will Rogers phenomenon occurs if the point is below the average of the group it is leaving, but above the average of the one it is joining. In this case, the average of both groups will increase.

To take an example, consider six individuals, the life expectancy of whom is assessed in turn as 5, 15, 25, 35, 45 and 55.

The individuals with an assessed life expectancy of 5 and 15 years respectively have been diagnosed with a particular medical condition. Those with the assessed life expectancies of 25, 35, 45 and 55 have not. So the mean life expectancy of those with the diagnosed condition is 10 years and those without is 40 years.

If diagnostic medical science now improves such that the individual with the 25 year life expectancy is now identified as suffering from the medical condition (previously this diagnosis was missed), then the mean life expectancy within the group diagnosed with the condition increases from 10 years to 15 years (5+15+25, divided by three). Simultaneously, the mean life expectancy of those not diagnosed with the condition rises by 5 years, from 40 years to 45 years (35+45+55, divided by three).

So, by moving a data point from one group into the other (undiagnosed into diagnosed), the average of both groups has increased, despite there being no change in actual values. This is because the point is below the average of the group it is leaving (25, compared to a group average of 40), but above the average of the one it is joining (25, compared to a group average of 10).
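The life expectancy example can be checked with a short sketch. The variable names are my own; the figures are those used in the example.

```python
from statistics import mean

# Assessed life expectancies (years) before diagnostics improve.
diagnosed = [5, 15]
undiagnosed = [25, 35, 45, 55]
means_before = (mean(diagnosed), mean(undiagnosed))

# The individual with a 25 year life expectancy is reclassified.
moved = 25
diagnosed_after = diagnosed + [moved]
undiagnosed_after = [x for x in undiagnosed if x != moved]
means_after = (mean(diagnosed_after), mean(undiagnosed_after))

print(means_before)  # (10, 40)
print(means_after)   # (15, 45)

# The moved value lies strictly between the two original group means.
assert means_before[0] < moved < means_before[1]
```

No individual value has changed, yet both group means are higher after the reclassification.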

**Exercise**

Take the following groups of data, A and B.

- A={10, 20, 30, 40}
- B={50, 60, 70, 80, 90}

The arithmetic mean of A is 25, and the arithmetic mean of B is 70.

Show how transferring one data point from B to A can increase the mean of both.

Now take the following example:

- A={10, 30, 50, 70, 90, 110, 130}
- B={60, 80, 100, 120, 140, 160, 180}

By moving the data point 100 from B to A, what happens to the arithmetic mean of A and of B?

To demonstrate the Will Rogers Phenomenon, does the element which is moved have to be the very lowest of its set or does it simply have to lie between the arithmetic means of the two sets?

*References and Links*

The Will Rogers Phenomenon. Simple City. Dec. 1, 2012. https://richardelwes.co.uk/2012/12/01/the-will-rogers-phenomenon/

Will Rogers Phenomenon. Stats Mini Blog. Nov. 21, 2014. https://blogs.bmj.com/adc/2014/11/21/statsminiblog-will-rogers-phenomenon/

The “Will Rogers Phenomenon” lets you save lives by doing nothing. https://io9.gizmodo.com/the-will-rogers-phenomenon-lets-you-save-lives-by-doi-1443177486

Will Rogers Phenomenon. In: Paradoxes of Probability and Other Statistical Strangeness. Stephen Woodcock. May 26, 2017. https://quillette.com/2017/05/26/paradoxes-probability-statistical-strangeness/

Will Rogers Phenomenon. Wikipedia. https://en.m.wikipedia.org/wiki/Will_Rogers_phenomenon

Five influential pieces of work published since 1982 examine the key source of home advantage. All are agreed.

Jack Dowie’s article in New Scientist was a seminal piece. Dowie distinguishes three Fs (fatigue, familiarity and fans), each of which might contribute to home advantage.

Fatigue: In a sample of 40 years of data, Dowie looked for evidence that away teams’ performances drop off relative to home teams as the game progresses, as measured by the likelihood of scoring a goal at any given point during the course of the match. Away teams did score fewer goals, on average, than home teams, but this disparity got no worse as the game developed.

Familiarity: Is familiarity with the pitch a bonus for the home team? If this is a key factor, teams who are travelling from a similar pitch to that of the home team should be less disadvantaged than those who are travelling to a very different sort of pitch. One obvious way to test this is to ask whether teams who play on relatively big pitches have a particular statistical advantage when playing host to visitors whose own home ground boasts a small pitch, and vice versa. In fact, home advantage seemed to remain constant whatever the relative pitch sizes of hosts and visitors.

Fans: Is it the absolute number of fans, or is it the relative number of home and away fans? The data showed that the advantage conferred by playing at home was significantly greater for games played in the lower divisions than in the top division, even though the absolute number of supporters was much smaller in these games. Moreover, the advantage was much less in ‘local derbies.’ The conclusion is that the balance of support is what matters at the ground.

Nevill, Balmer and Williams looked into this further in 2002, showing 40 qualified referees video footage of 47 tackles from a Premiership match. The referees were divided into two groups: half watched the footage with the original soundtrack, while the other half watched a silent version. Neither group had access to the original referee’s decisions. In actual matches, about 60% of booking points (10 for a yellow card, 25 for a red) are awarded to the visiting team. Those referees who watched with the original soundtrack were reluctant to penalise the home team, judging 15% fewer of the tackles by home players to be fouls than did the referees who watched the silent footage. So in the absence of crowd noise the officials were more even-handed between the home and away sides. The original referee’s decisions, however, more accurately mirrored the behaviour of those armchair referees who had access to sound. It is as if, to get the crowd off their backs, referees wave play on.

In ‘Scorecasting’, Moskowitz and Wertheim (2011) compile further data to test a variety of popular theories explaining home advantage. They argue that when athletes play at home, they don’t seem to hit or pitch better in baseball … or pass better in football. The crowd doesn’t appear to be helping the home team or harming the visitors. They also checked scheduling bias against the away team, concluding that while this explains some of the home-field advantage, particularly in college sports, it’s irrelevant in many sports.

Thomas Dohmen looked at home advantage in the Bundesliga, the premier football league in Germany. Dohmen found that home advantage was smaller in stadiums that happened to have a running track surrounding the soccer pitch, and larger in stadiums without a track. Why? Apparently, when the crowd sits closer to the field, the officials are more susceptible to getting caught up in the home-crowd emotion. The social atmosphere in the stadium, he argues, leads referees into favouritism despite the fact that being impartial is optimal for them in career terms.

Here is the take of Steven Levitt and Stephen Dubner. “It’s worth noting that a soccer referee has more latitude to influence a game’s outcome than officials in other sports, which helps explain why the home-field advantage is greater in soccer, around the world, than in any other pro sport … officials don’t consciously decide to give the home team an advantage – but rather, being social creatures (and human beings) like the rest of us, they assimilate the emotion of the home crowd and, once in a while, make a call that makes a whole lot of close-by, noisy people very happy.”

**References and Links**

Dohmen, T.J. (2008). The Influence of Social Forces: Evidence from the Behavior of Soccer Referees. Economic Inquiry, 46, 3, 411-424. https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1465-7295.2007.00112.x

Dowie, J. (1982), Why Spain Should Win the World Cup, New Scientist, 94 (10), 693-695. https://books.google.co.uk/books?id=OFCXnqlSFKwC&pg=PA693&lpg=PA693&dq=why+spain+should+win+the+world+cup+dowie&source=bl&ots=YLnc7jJr9L&sig=ACfU3U0PEmuQAsgtRjXyo7J-1IDfmJ1VOg&hl=en&sa=X&ved=2ahUKEwjTqIjg28rhAhWBtXEKHRiXCZAQ6AEwDHoECAYQAQ#v=onepage&q=why%20spain%20should%20win%20the%20world%20cup%20dowie&f=false

Nevill, A.M., Balmer, N.J. and Williams, A.M. (2002), The influence of crowd noise and experience upon refereeing decisions in football, Psychology of Sport and Exercise, 3 (4), 261-272. https://www.sciencedirect.com/science/article/pii/S1469029201000334

Moskowitz, T.J. and Wertheim, L.J. (2011), Scorecasting. Random House.

Levitt, S.D. and Dubner, S.J. (2015), When to Rob a Bank. Penguin Books, pp. 211-12.