Professor Leighton Vaughan Williams

April 18, 2017

Managing and beating the line in betting markets: a primer

The ‘over-round’

In a two-horse race, if both horses have an equal chance of winning (objectively), and both are offered at evens, then the expected profit of the market-maker (and of the bettor) is zero, ignoring operating, information and transactions costs.

In a two-horse race, if both are offered at evens (regardless of the respective probabilities of victory of the two horses), then it would require a stake of £x (split equally between the two horses) to be sure of being returned that £x (a net profit of zero) whichever horse wins. In this circumstance, the over-round of the bookmaker is said to be 100%, i.e. a notional profit margin of zero.

In practice, even if the notional profit margin is zero, the bookmaker is at a disadvantage if the horses are not equally matched, as a sophisticated bettor can take advantage by staking more than half on the horse with the greater chance of winning.

More generally, the over-round does not yield an accurate indicator of the bookmaker’s profit margin if bettors do not stake across all options in such a way as to ensure that their total stake of £x yields a certain return of £x, factored by the over-round.

For example, if the over-round is 120%, the notional margin to the bookmaker is 20%: put simply, bettors would have to stake £120 to ensure a return of £100. Say, for instance, that both horses in a 2-horse race are being offered at 4 to 6. Then the bettor would need to stake £60 on each (£120 in total) to be guaranteed a return of £100 (£40 winnings plus the £60 stake returned) whichever horse won. In such circumstances, the bookmaker is guaranteed a profit of £20 for every £100 paid out, i.e. the notional 20% margin, regardless of the outcome.

If one horse is offered at 4 to 6 and the other at 6 to 4, the bettor can guarantee a zero profit (and loss) by staking £60 at 4 to 6 and £40 at 6 to 4. That way, a £100 return is guaranteed for a total stake of £100, regardless of the outcome. Again, if the horse offered at 4 to 6 is actually a 4 to 7 chance, and bettors stake exclusively on this horse, their expected return is positive (although there is now a risk of losing the entire stake), and the expected return of the bookmaker is negative (though the actual return may be positive).

To summarize, the notional margin, as implied in the over-round, formally equates to the actual margin only if bettors stake proportionately more on the outcome offered at shorter odds.
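The arithmetic of the over-round can be checked with a short script. What follows is a minimal Python sketch, not anything taken from the original text: the function names are illustrative, odds are entered as fractional odds against (4 to 6 as 4/6), and exact fractions are used to avoid rounding noise.

```python
from fractions import Fraction


def implied_probability(odds_against):
    """Probability implied by fractional odds against (e.g. 4 to 6 -> 4/6)."""
    return 1 / (1 + odds_against)


def over_round(odds_list):
    """Sum of implied probabilities over all runners (1 corresponds to 100%)."""
    return sum(implied_probability(o) for o in odds_list)


def stakes_for_return(odds_list, target_return):
    """Stake needed on each runner to get target_return back whichever wins."""
    return [target_return * implied_probability(o) for o in odds_list]


def expected_return_per_pound(offered_odds, true_odds):
    """Expected return on a £1 bet at offered_odds when the fair odds are true_odds."""
    return implied_probability(true_odds) * (1 + offered_odds)


both_at_4_to_6 = [Fraction(4, 6), Fraction(4, 6)]
mixed_book = [Fraction(4, 6), Fraction(6, 4)]

print(float(over_round(both_at_4_to_6)))                        # over-round of the 4/6, 4/6 book
print([float(s) for s in stakes_for_return(both_at_4_to_6, 100)])  # stakes for a certain £100 return
print(float(over_round(mixed_book)))                            # over-round of the 4/6, 6/4 book
print([float(s) for s in stakes_for_return(mixed_book, 100)])

# Backing only the 4/6 horse when it is really a 4/7 chance:
# a value above 1 means a positive expected return for the bettor.
edge = expected_return_per_pound(Fraction(4, 6), Fraction(4, 7))
print(float(edge))
```

The last figure illustrates the closing point of the section: a book with a 100% over-round can still lose money in expectation if bettors concentrate their stakes on a mispriced runner.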

Creating an over-round

Take as an example the following odds offered about a binary proposition to players, where the odds-maker believes that the objective probability of X winning is 1 in 5 (0.2) and of Y winning is 4 in 5 (0.8).

Assuming an over-round of 100% (i.e. margin of zero), the odds-setter (taken here to be a bookmaker) would set the following odds:

Odds about X = 5.0 (4 to 1): Odds about Y = 1.25 (1 to 4).

Assume now that the odds-maker wishes to create an over-round of 108%.

The odds offered on each outcome should then be cut by 8 per cent. So 8% of 5.0 = 0.4, and deducting 0.4 from 5.0 gives 4.6. Similarly, 8% of 1.25 = 0.1, and deducting 0.1 from 1.25 gives 1.15.

So in the particular example, the odds offered would be as follows:

Odds about X = 4.6; Odds about Y = 1.15.

Assuming an equal amount (say £1,000) is bet on each side of the proposition (i.e. a total of £2,000, consisting of perhaps 200 people betting £10 each), the profit (or loss) to the bookmaker would vary depending on the outcome.

If horse X wins, the bookmaker will pay out:

4.6 x £1,000 = £4,600

Total amount staked (on X and Y) = £2,000.

Net profit to bookmaker if horse X wins = £2,000 – £4,600 = – £2,600

So if horse X wins, bookmaker loses £2,600.

If horse Y wins, the bookmaker will pay out:

1.15 x £1,000 = £1,150

Total amount staked (on X and Y) = £2,000

Net profit to bookmaker if horse Y wins = £2,000 – £1,150 = £850

Expected value of profit = expected value of profit from X + expected value of profit from Y = (-£2,600) x 0.2 + (£850) x 0.8 = -£520 + £680 = £160.

This is assuming that the implied probabilities in the odds are the correct probabilities, i.e. odds of 4/1 = probability of 1/5 (0.2); odds of 1/4 = probability of 4/5 (0.8).

Note also that £160 = 8% of total stake on X and Y (£2,000).
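The worked example above can be reproduced in a few lines. This is a hedged sketch in Python with illustrative names (none of them come from the original text); it uses exact fractions and assumes, as the example does, equal stakes of £1,000 on each side. It also prints the over-round implied by the cut odds, which works out at 25/23, or about 108.7%, close to the 108% target.

```python
from fractions import Fraction

# Decimal odds at a 100% over-round, then each price cut by 8%
fair_odds = [Fraction(5), Fraction("1.25")]
offered = [o * (1 - Fraction("0.08")) for o in fair_odds]

stakes = [1000, 1000]                            # assumed equal stakes on X and Y
true_probs = [Fraction(1, 5), Fraction(4, 5)]    # assumed objective probabilities


def bookmaker_profits(decimal_odds, stakes):
    """Bookmaker's net profit for each possible winner: total staked minus payout."""
    total_staked = sum(stakes)
    return [total_staked - o * s for o, s in zip(decimal_odds, stakes)]


def expected_profit(decimal_odds, stakes, probs):
    """Probability-weighted average of the bookmaker's profit over the outcomes."""
    profits = bookmaker_profits(decimal_odds, stakes)
    return sum(p * x for p, x in zip(probs, profits))


implied_over_round = sum(1 / o for o in offered)

print([float(o) for o in offered])                        # the cut prices
print([float(x) for x in bookmaker_profits(offered, stakes)])
print(float(expected_profit(offered, stakes, true_probs)))
print(float(implied_over_round))
```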

This all assumes, as observed, that the objective probabilities are correctly estimated and that the amounts staked on the two sides of the proposition are equal.

Even if we assume that the objective probabilities are correctly observed then there is still substantial volatility of outcome (i.e. risk) for the bookmaker. If the objective probability is incorrectly observed, however, the outcome for the bookmaker may be worse, i.e. a systematic loss.

For example, assume the probability of horse X winning is actually 25%; assume probability of horse Y winning is 75%.

At the given odds levels, and assuming equal stakes across both propositions, we derive the following.

As above, if horse X wins, the bookmaker will pay out, as before:

4.6 x £1,000 = £4,600

Total amount staked (on X and Y) = £2,000.

Net profit to bookmaker if horse X wins = £2,000 – £4,600 = – £2,600

So if horse X wins, bookmaker loses £2,600.

If horse Y wins, the bookmaker will pay out, as before:

1.15 x £1,000 = £1,150

Total amount staked (on X and Y) = £2,000

Net profit to bookmaker if horse Y wins = £2,000 – £1,150 = £850

Expected value of profit = expected value of profit from X + expected value of profit from Y = (-£2,600) x 0.25 + (£850) x 0.75 = -£650 + £637.50 = -£12.50, i.e. a loss of £12.50.

Insofar as the objective probability of horse X winning is greater than 20%, the expected profit to the bookmaker will decline. At 24.65%, the profit (rounded to the nearest pound) can be shown to be equal to zero, and above that to turn negative.

Assume objective probability of horse X winning = 0.2465; objective probability of horse Y winning = 0.7535.

Then, expected value of profit = expected value of profit from X + expected value of profit from Y = (-£2,600) x 0.2465 + (£850) x 0.7535 = -£640.90 + £640.48 = -£0.42, i.e. zero to the nearest pound.

To the extent that the objective probabilities are inaccurately estimated, therefore, there is significant potential from the bookmaker’s point of view for a negative expected (as well as actual) profit.
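The break-even point can also be found exactly rather than by trial. The sketch below is illustrative Python (the names are assumptions, not part of the original text). It takes the two outcome profits from the example, a loss of £2,600 if X wins and a gain of £850 if Y wins, and solves p x (-2,600) + (1 - p) x 850 = 0 for p.

```python
from fractions import Fraction

PROFIT_IF_X_WINS = Fraction(-2600)   # bookmaker's profit if the longshot X wins
PROFIT_IF_Y_WINS = Fraction(850)     # bookmaker's profit if the favourite Y wins


def expected_profit(p_x):
    """Bookmaker's expected profit when X's true win probability is p_x."""
    return p_x * PROFIT_IF_X_WINS + (1 - p_x) * PROFIT_IF_Y_WINS


def break_even_probability():
    """Value of p_x at which the expected profit is exactly zero."""
    return PROFIT_IF_Y_WINS / (PROFIT_IF_Y_WINS - PROFIT_IF_X_WINS)


p_star = break_even_probability()
for p in (Fraction(1, 5), Fraction(1, 4), p_star):
    print(float(p), float(expected_profit(p)))
```

The exact break-even probability is 17/69, a little under 24.64%, which is consistent with the rounded figure of 24.65% used above.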

Using the probabilities from the original example, the staking pattern from the bettor’s point of view that will lead to the same certain loss (8% in this case) whichever horse wins is to bet more on the favourite and less on the longshot, in this case £1,600 and £400 respectively.

This leads to the following outcomes:

Return to a £400 bet on horse X (if it wins) at 4.60 = £1,840

Return to a £1,600 bet on horse Y (if it wins) at 1.15 = £1,840

Guaranteed profit from staking these sums on each horse, from the bettor’s point of view = £1,840 – £2,000 = – £160, i.e. a net loss of 8% of total stake.

Insofar as bettors can be induced to bet in these proportions, the operator is guaranteed a profit regardless of the outcome. If the average bet size is the same for bets made on either side, then we need four times as many bettors on the favourite as the longshot to achieve this. Otherwise, the same outcome can be achieved if those who are backing the favourite bet four times as much in total as those backing the longshot.
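The balanced staking pattern follows from splitting the total stake in proportion to the implied probabilities (1 divided by the decimal odds). A minimal Python sketch, with illustrative function and variable names that do not appear in the original text:

```python
from fractions import Fraction


def balanced_stakes(decimal_odds, total_stake):
    """Split total_stake so that the return is the same whichever runner wins."""
    weights = [1 / o for o in decimal_odds]      # implied probabilities
    scale = total_stake / sum(weights)
    return [w * scale for w in weights]


odds = [Fraction("4.6"), Fraction("1.15")]       # X (longshot), Y (favourite)
stakes = balanced_stakes(odds, 2000)
returns = [o * s for o, s in zip(odds, stakes)]
loss_fraction = 1 - returns[0] / 2000            # bettor's certain loss as a share of stake

print([float(s) for s in stakes])
print([float(r) for r in returns])
print(float(loss_fraction))
```

With these prices the favourite attracts four times the stake of the longshot, which matches the four-to-one ratio of bettors (or of money) described above.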

Another way to manage risk in the face of unbalanced staking patterns is to move the odds so as to limit the maximum loss.

In order to reduce the maximum downside (i.e. when X wins), the bookmaker may move the odds in such a way as to attract money on one horse and away from the other. To do this, the odds about one horse may be lengthened and those about the other shortened before a loss is incurred on either outcome. While such a strategy may reduce the exposure of the operator, the price may be paid in reduced profits.

Ultimately, line management from the operator’s point of view is about balancing risk and return, while maintaining an edge in favour of the ‘house’. From the bettor’s point of view, it is about exploiting opportunities which might arise where one (or more) of the odds making up that over-round are mispriced in the bettor’s favour, a possibility which can arise even when the over-round favours the ‘house.’

April 18, 2017

Three strikes of the clock: Betting on the man to be Pope since 1503.

The history of forecasting election outcomes for betting purposes is well documented for open elections, such as presidential elections in the US, and goes back further, though in less detail, for the closed elections of the Pope.

In the former, it has been traced, according to contemporaries, to the election of George Washington and has existed in organized markets since the 1860s.

The first recorded example of betting on a papal election, however, can be traced much further back, to the papal conclave of September, 1503, at which time it was considered already an old practice.

The brokers in the Roman banking houses (sensali), who made books and offered odds on who would be elected, made Cardinal Francesco Piccolomini the 100/30 favourite, ahead of Cardinals Giuliano della Rovere (100/15) and Georges d’Amboise (the favourite if judged by the vocal support of the street crowds) at 100/13.

Although Piccolomini is thought to have trailed in the first round of voting with 4 votes to 13 for d’Amboise and 15 for della Rovere, Piccolomini apparently benefited from a switch of votes from d’Amboise to himself in subsequent voting, and duly became Pope Pius III.

The bookmakers were proved right.

The next conclave for which we have the betting odds is that of December, 1521, in which odds were offered on no fewer than twenty cardinals.

Giulio de’Medici, the cousin of Leo X, was the betting favourite, at 100 to 25 (4/1), followed closely by Cardinal Alessandro Farnese at 100/20 (5/1), whose odds shortened to 100 to 40 (5/2) after a Roman mob plundered his house.

Though Farnese at one point came close to being elected Pope, he could not reach the required two-thirds of the vote, and ultimately the cardinals looked outside of the conclave, electing Adrian of Utrecht as Pope Adrian VI.

In the papal election of 1549-50, Cardinal Gianmaria del Monte (who was eventually elected Julius III) had opened in the betting as the 5/1 (against) favourite, but within three days Cardinal Reginald Pole had been established at odds of 4/1. On December 5, as balloting began, Pole was clear favourite at 100/95.

On that day, he received 26 of the 28 votes that would have given him the two-thirds majority required to elect him Pontiff. Although on the point of being made Pope by acclamation, Pole insisted on waiting until he won the formal two-thirds majority.

By the time that four additional French cardinals, opposed to Pole, arrived on December 11, however, he was trading at 5/2, and a month later he was being offered at odds of 100/16. His chance had gone.

In the papal conclave of April, 1555, Gian Pietro Carafa stood a good chance of being elected pope, ranking among the top three papabile in the first ballot of the conclave. It is reported that brokers intentionally “spread the rumour that Naples [i.e. Carafa] had died”, in order to attract money on the other candidates. Carafa went on to be elected Pope.

The first 1590 conclave, in September, is the earliest in which reports of insider trading emerged, when two of the key influencers of votes in the conclave, Cardinals Montalto and Sforza secretly agreed to join forces in support of Niccolo Sfondrato.

It is reported that both made fortunes betting on him, at odds of 10/1 the day before he was elected as Pope Gregory XIV.

As the conclave opened, he had been trading at 100/11, compared to Giambattista Castagna, who was offered at 100/22.

During the second conclave of 1590, Cardinal Gabriele Paleotti at one point increased to an implied probability of 70 per cent in the betting: “Wednesday at the twenty-second hour rumour began to hold Paleotti as pope, and it went on increasing so that at the end of the morning, he had risen to 70 in the wagering.” The odds were not reflected in the outcome. Giovanni Battista Castagna was elected Pope Urban VII.

In 1605, despite the papal bull ‘Cogit Nos’, issued by Pope Gregory XIV on March 21, 1591, which imposed a penalty of excommunication for wagering on papal or cardinal elections, or on the length of a papal reign, the bookmakers quoted odds on 21 cardinals.

The favourite was Cesare Baronius, at 10/1. The closest he came to election, however, was gaining the support of 32 cardinal electors, nine short of the required tally. Ultimately, Alessandro de’Medici became Pope Leo XI.

This ban on papal betting was abrogated in 1918 by Pope Benedict XV’s reforms.

In relation to the papal conclave of 1878, a New York Times correspondent wrote that: “The death and advents of the Popes has always given rise to an excessive amount of gambling in the lottery, and today the people of Italy are in a state of excitement that is indescribable.” There is no available known record, however, of the odds offered on that election. Similarly, the papal conclaves of 1903 and 1922 also attracted a great deal of wagering interest, which was reported widely in the international press, though no known record remains of the odds offered.

Bookmaker odds in Milan are available, however, for the 1958 conclave, which show Cardinal Angelo Roncalli the 2/1 favourite, followed by Cardinals Agagianian and Ottaviani at 3/1, then Stefan Wyszynski and Giuseppe Siri at 4/1. The odds were justified when Cardinal Roncalli was elected Pope John XXIII.

For the first conclave of 1978, bookmakers in London were offering odds of 5/2 about Cardinal Sergio Pignedoli, 7/2 about Sebastiano Baggio and Ugo Poletti and 4/1 about Giovanni Benelli. The best odds about a non-Italian were 8/1 about Johannes Willebrands. Of these only Pignedoli showed any strength in the voting, unconfirmed reports of the voting indicating that he obtained about 18 votes in the first ballot, compared to about 23 for Albino Luciani and 25 for Giuseppe Siri. Ultimately, Cardinal Luciani was elected Pope John Paul I.

For the second conclave of 1978, following the death of Pope John Paul I, the Associated Press noted that:

“Once again, there is no odds-on favourite to be elected as the new pope of the Roman Catholic Church … mentioned most often are Corradi Ursi, Salvatore Pappalardo, Ugo Poletti, Giuseppe Siri, Giovanni Colombo, Giovanni Benelli and Antonio Poma… Non-Italian front-runners include Argentinian Eduardo Pironio, 57, and Dutchman Johannes Willebrands, 68.”

Cardinal Karol Wojtyla, archbishop of Krakow, was elected Pope John Paul II, after the eighth ballot.

In 2005, Cardinal Joseph Ratzinger opened in the betting at 12/1 with one major bookmaker.

At that point, another leading bookmaker made Cardinal Arinze favourite, with Archbishop Tettamanzi, Cardinal Ratzinger and Cardinal Hummes as the next in the betting.

After three ballots, Ratzinger was favourite on two out of the three online betting boards monitored by CNN, his shortest odds being 5/2. He was at that point in the conclave being offered at between 9/2 favourite and 11/2 second favourite.

By the last day of the conclave, Cardinal Ratzinger had shortened to a clear 3/1 favourite, closely followed by Carlo Martini at 100/30 and Jean-Marie Lustiger at 7/2.

By that point, Francis Arinze had dropped back to 8/1, the same price as Claudio Hummes (who was now in the top six in all three lists). He had opened at 12/1. At the same time, Jorge Bergoglio was trading at 12/1 and Angelo Scola at 25/1.

According to a newspaper report, “among those speculating about who the next pope will be, the big money – literally is on Joseph Ratzinger, who delivered a stirring homily at the late Pope’s funeral … As of yesterday, most gambling sites gave Ratzinger … the best odds, with a host of second-tier candidates not far behind.”

Side bets were available on the name of the next pope.

Benedict was the 3 to 1 favourite. John Paul was offered at 7 to 2. Pius at 6 to 1. Peter at 8 to 1. John at 10 to 1.

Joseph Ratzinger was elected Benedict XVI.

The first show of odds following the 2005 conclave for the successor to Benedict was: Angelo Scola 6-1; Christoph Schonborn 7-1; Oscar Maradiaga 7-1; Jorge Bergoglio 9-1; Francis Arinze 10-1; Dionigi Tettamanzi 25-1.

In 2013, a survey of the so-called experts made Angelo Scola favourite, although the expert assessment and the betting odds diverged to some degree after that. A survey of Vatican watchers by YouTrend.It listed Timothy Dolan of the United States as the second most likely pope, followed by Cardinals Marc Ouellet, Odilo Scherer and Sean O’Malley. Luis Tagle of the Philippines was ranked sixth. Some of the bookmakers’ favourites, notably Cardinals Turkson and Bertone, did not appear on this experts’ list.

The implied win probabilities in the Oddschecker display of best bookmaker odds on March 3rd were as follows: Scola, 23%; Turkson, 22%; Bertone, 16%; Ouellet, 12%; Bagnasco, 10%; Ravasi, 8%; Sandri, 7%; Erdo, 7%; Scherer, 6%; Schonborn, 6%; Maradiaga, 5%; Arinze, 5%; O’Malley, 4%; Tagle, 4%; Bergoglio, 4%; Dolan, 3%; Hummes, 3%; Grocholewski, 3%; Dziwisz, 3%; Carrera, 2%; Piacenza, 2%; Marini, 2%; Rylko, 2%; Sarah, 2%; Martino, 2%. Note that the probabilities add up to more than 100 due to rounding and the in-built margin in the bookmakers’ odds.
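To see how far the quoted figures exceed 100%, they can simply be summed and rescaled. The Python sketch below uses only the percentages listed above. The proportional rescaling is one simple assumption for stripping out the margin; it is not a method described in the text or attributed to Oddschecker, and it ignores any candidates not listed.

```python
# Implied win probabilities (in %) as quoted for March 3rd
quoted_pct = {
    "Scola": 23, "Turkson": 22, "Bertone": 16, "Ouellet": 12, "Bagnasco": 10,
    "Ravasi": 8, "Sandri": 7, "Erdo": 7, "Scherer": 6, "Schonborn": 6,
    "Maradiaga": 5, "Arinze": 5, "O'Malley": 4, "Tagle": 4, "Bergoglio": 4,
    "Dolan": 3, "Hummes": 3, "Grocholewski": 3, "Dziwisz": 3, "Carrera": 2,
    "Piacenza": 2, "Marini": 2, "Rylko": 2, "Sarah": 2, "Martino": 2,
}

total = sum(quoted_pct.values())   # how far the listed figures exceed 100

# Assumed adjustment: scale every figure by the same factor so they sum to 100
normalised = {name: 100 * pct / total for name, pct in quoted_pct.items()}

print(total)
for name in ("Scola", "Turkson", "Bergoglio"):
    print(name, round(normalised[name], 1))
```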

A Washington Post analysis, published on March 11th, calculated the implied probabilities of the ‘frontrunners’ based on betting sites including the betting exchange, Betfair.

The results were: Scola, 19.9%; Scherer, 11.9%; Turkson, 9.7%; Bertone, 8.3%; Ouellet, 5%; Erdo, 4.9%; O’Malley, 3.8%; Schonborn, 3.7%; Ravasi, 3.4%; Tagle, 2.6%; Sandri, 2.5%; Dolan, 2.3%; Bagnasco, 2.3%.

On the morning of the final ballot, on March 13th, 2013, the Guardian newspaper Liveblog reported that: “Ladbrokes has Scola at 9/4, Scherer at 3/1 and Turkson at 6/1. Paddy Power has Scola at 11/4, Scherer at 7/2 and Turkson at 9/2.”

A post by Vatican Insider journalist Andrea Tornielli was also published ahead of the final ballot, stating that “The first casting of ballots, which will serve as a primary, will see votes merge towards the Archbishop of Milan, Angelo Scola, as well as the Canadian Marc Ouellet and the Brazilian Odilo Pedro Scherer. Some votes might also go to the Argentinian Jorge Mario Bergoglio and to other cardinals mentioned during the past few hours, such as the Sinhalese Malcolm Ranjith, the American Timothy Dolan and others. It remains to be seen if, among these nominations, there will be one able to garner at least two-thirds of the votes.”

Despite this level of detail, the same article declared that “From the moment cardinal electors entered the Santa Marta residence, they have not had any contact with the outside world and have to use protected paths that are constantly under surveillance, to get about. Every space they enter is monitored and blocked off from all forms of communication… All those who have to access the Holy See during the Conclave are bound to the strictest confidentiality.”

Then came the three strikes of the clock.

The first strike of the clock was a post by Vatican Insider journalist Giacomo Galeazzi, time-stamped on Vatican Insider Twitter at 8.24am that morning. It noted that there were only five candidates left in the running: Scola, Scherer, Bergoglio, Ouellet, Dolan.

The second strike of the clock was a link to a post by Vatican Insider journalist Giacomo Galeazzi, time-stamped on Vatican Insider Twitter at 11.12am: “After the first negative scrutinies, lunch breaks and dinners in Santa Marta House, the cardinals’ residence during the conclave, become opportunities for informal discussions on disregarding candidates with weaker consensuses, to the advantage of the papabile who have obtained more votes so far (Scola, Bergoglio, Ouellet).”

So, by 11.12 am, according to Galeazzi, it was effectively down to three – Cardinals Scola, Bergoglio and Ouellet.

The third strike of the clock came at 11.57am, when the Guardian Liveblog reported that: “La Stampa’s Vatican Insider claims that most of the votes have been going to Cardinals Scola, Bergoglio and Ouellet. This morning it was claiming most of them were going to Scola, Scherer, Bergoglio, Ouellet and Dolan. But it’s hard to know where they can be getting this information from.”

So what was actually going on while the clock was striking once, twice, thrice? A post-election report, published in La Repubblica, claims that Scola received approximately 35 votes in the first vote, to 20 for Bergoglio and 15 for Ouellet. National Catholic Reporter also reports that there was some support for Scherer: “After two rounds of voting Wednesday morning, it had become clear that neither Scola nor Scherer were likely to cross the finish line and gain the 77 votes needed for election … The fourth ballot, the first of Wednesday afternoon, saw Bergoglio separate himself from the pack.”

So it appears that Galeazzi’s tweeted reports conformed broadly to what we now understand to have been the case. Somehow it seems he knew!!!

But the markets failed to respond except for a flicker towards Bergoglio on the exchanges after the Guardian Liveblog posted the niche Galeazzi tweets to their wider audience.

So, either the new information was not (for good or bad reason) sufficiently believed. Or it was for the most part overlooked by those trading on the exchanges. Or the market was not sufficiently liquid to make it possible to earn a significant return, so most sophisticated traders did not bother to participate.

Whatever the reason, the betting markets did not perform as well as might have been expected in responding to new public information, which subsequently turned out to be accurate, unless the reports were accurate by sheer chance and deserved to be disbelieved. After all, it was ‘Vatican Insider’ itself that declared how “All those who have to access the Holy See during the Conclave are bound to the strictest confidentiality.”

This cannot be explained either in terms of the fog of conflicting signals as there were no other credible sources issuing conflicting information.

So the ‘Galeazzi anomaly’, as I term it, turns into a mystery, partly because he seemed to know what he shouldn’t have known, but also because hardly anyone seemed to believe him. Giacomo Galeazzi shouted wolf, and there was a wolf! It is a lesson that some, in an efficient market, will now have learned.

Can the Henery Hypothesis Explain the Favourite-Longshot Bias?

The Favourite-Longshot Bias is the well-established tendency in most betting markets for bettors to over-bet ‘longshots’ (events with long odds, i.e. low probability events) and to relatively under-bet ‘favourites’ (events with short odds, i.e. high probability events).

Assume, for example, that Mr. Miller and Mr. Stiller both start with £1,000.

Now Mr. Miller places a level £10 stake on 100 horses quoted at 2 to 1.

Mr. Stiller places a level £10 stake on 100 horses quoted at 20 to 1.

Who is likely to end up with more money at the end?

My Ladbrokes Flat Season Pocket Companion for 1990 provides a nicely laid out piece of evidence here for British flat horse racing between 1985 and 1989. The table conveniently presented in the Companion shows that not one out of 35 favourites sent off at 1/8 or shorter (as short as 1/25) lost between 1985 and 1989. This means a return of between 4% and 12.5% in a couple of minutes, which is an astronomical rate of interest. The point being made is that broadly speaking the shorter the odds, the better the return. The group of ‘white hot’ favourites (odds between 1/5 and 1/25) won 88 out of 96 races for a 6.5% profit. The following table looks at other odds groupings.

Odds          Wins    Runs     Profit      Profit %
1/5-1/2       249     344      +£1.80      +0.52
4/7-5/4       881     1780     -£82.60     -4.64
6/4-3/1       2187    7774     -£629       -8.09
7/2-6/1       3464    21681    -£2237      -10.32
8/1-20/1      2566    53741    -£19823     -36.89
25/1-100/1    441     43426    -£29424     -67.76
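The table gives a rough answer to the Miller and Stiller question. The Python sketch below rests on a loudly stated assumption that is not in the original text: that horses at 2 to 1 earn the average return of the 6/4-3/1 band, and horses at 20 to 1 the average return of the 8/1-20/1 band. The function and variable names are illustrative.

```python
from decimal import Decimal

# Average profit per unit staked (in %), taken from the table above
RETURN_PCT_BY_BAND = {
    "6/4-3/1": Decimal("-8.09"),     # assumed to apply to Mr. Miller's 2/1 shots
    "8/1-20/1": Decimal("-36.89"),   # assumed to apply to Mr. Stiller's 20/1 shots
}


def expected_bank(start, n_bets, stake, band):
    """Expected closing bank after n_bets level stakes at the band's average return."""
    total_staked = n_bets * stake
    return start + total_staked * RETURN_PCT_BY_BAND[band] / 100


miller = expected_bank(Decimal(1000), 100, Decimal(10), "6/4-3/1")
stiller = expected_bank(Decimal(1000), 100, Decimal(10), "8/1-20/1")

print("Miller:", miller)
print("Stiller:", stiller)
```

On these assumptions both bettors expect to lose, but Mr. Stiller, backing the longshots, expects to lose far more. Actual results would of course vary around these averages, and more widely for the longshot backer.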

An interesting argument advanced by the Strathclyde-based statistician Dr. Robert Henery in 1985 is that the favourite-longshot bias is a consequence of bettors discounting a fixed fraction of their losses, i.e. they underweight their losses compared to their gains.

This argument also explains an observed link between the sum of bookmakers’ prices and the number of runners in a race. The prices being summed here are simply the probabilities implied by the odds. If, for example, odds of 3/1 (against) are offered about each of the five horses in a race, the implied probability of winning for each horse is ¼ and the sum of prices is 5/4.

In this context, an ‘over-round’ is defined as the excess of the sum of prices over 1, in this case ¼.

The rationale behind Henery’s hypothesis is that bettors will tend to explain away and therefore discount losses as atypical, or unrelated to the judgment of the bettor.

This is consistent with contemporaneous work on the psychology of gambling, such as Gilovich in 1983 and Gilovich and Douglas in 1986.

These studies demonstrate how gamblers tend to discount their losses, often as ‘near wins’ or the outcome of ‘fluke’ events, while bolstering their wins.

Let’s look more closely at how the Henery odds transformation works.

If the true probability of a horse losing a race is q, then the true odds against winning are q/(1-q).

For example, if the true probability of a horse losing a race (q) is ¾, the chance that it will win the race is ¼, i.e. 1- ¾. The odds against it winning are: q/(1-q) = 3/4/(1-3/4) = 3/4/(1/4) = 3/1.

Henery now applies a transformation whereby the bettor will assess the chance of losing not as q, but as Q which is equal to fq, where f is the fixed fraction of losses undiscounted by the bettor.

If, for example, f = ¾, and the true chance of a horse losing is ½ (q=1/2), then the bettor will rate subjectively the chance of the horse losing as Q = fq.

So Q = ¾ x ½ = 3/8, i.e. a subjective chance of winning of 5/8.

So the perceived (subjective) odds against winning associated with a true (objective) probability of losing of 50% (Evens, i.e. q=1/2) are 3/5, i.e. odds-on, implying a subjective win probability of 5/8 (62.5%).

This is derived as follows:

Q/(1-Q) = fq/(1-fq) = 3/8/(1-3/8) = 3/8/(5/8) = 3/5

If the true probability of a horse losing a race is 80%, so that the true odds against winning are 4/1 (q = 0.8), then the bettor will assess the chance of losing not as q, but as Q which is equal to fq, where f is the fixed fraction of losses undiscounted by the bettor.

If, for example, f = ¾, and the true chance of a horse losing is 4/5 (q=0.8), then the bettor will rate subjectively the chance of the horse losing as Q = fq.

So Q = 3/4 x 4/5 = 12/20, i.e. a subjective chance of winning of 8/20 (2/5).

So the perceived (subjective) odds against winning associated with a true (objective) probability of losing of 80% (4 to 1, i.e. q=0.8) are 6/4, implying a subjective win probability of 40%.

This is derived as follows:

Q/(1-Q) = fq/(1-fq) = 12/20 / (1-12/20) = 12/8 = 6/4

To take this to the limit, if the true probability of a horse losing a race is 100%, so that the true odds against winning are ∞ to 1 against (q = 1), then the bettor will again assess the chance of losing not as q, but as Q which is equal to fq, where f is the fixed fraction of losses undiscounted by the bettor.

If, for example, f = ¾, and the true chance of a horse losing is 100% (q=1), then the bettor will rate subjectively the chance of the horse losing as Q = fq.

So Q = 3/4 x 1 = 3/4, i.e. a subjective chance of winning of 1/4.

So the perceived (subjective) odds against winning associated with a true (objective) probability of losing of 100% (∞ to 1, i.e. q=1) are 3/1, implying a subjective win probability of 25%.

This is derived as follows:

Q/(1-Q) = fq/(1-fq) = 3/4 / (1/4) = 3/1

Similarly, if the true probability of a horse losing a race is 0%, so that the true odds against winning are 0 to 1 against (q = 0), then the bettor will assess the chance of losing not as q, but as Q which is equal to fq, where f is the fixed fraction of losses undiscounted by the bettor.

If, for example, f = ¾, and the true chance of a horse losing is 0% (q=0), then the bettor will rate subjectively the chance of the horse losing as Q = fq.

So Q = 3/4 x 0 = 0, i.e. a subjective chance of winning of 1.

So the perceived (subjective) odds against winning associated with a true (objective) probability of losing of 0% (0 to 1, i.e. q=0) are also 0/1.

This is derived as follows:

Q/(1-Q) = fq/(1-fq) = 0 / 1 = 0/1

This can all be summarised in a table.

Objective odds (against) Subjective odds (against)
Evens 3/5
4/1 6/4
Infinity to 1 3/1
0/1 0/1
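The table can be regenerated directly from the transformation Q = fq. The following is a minimal Python sketch, with a function name chosen here for illustration and f fixed at ¾ as in the examples above.

```python
from fractions import Fraction


def subjective_odds_against(q, f=Fraction(3, 4)):
    """Odds against winning as perceived by a bettor who counts only a
    fraction f of the true losing probability q, i.e. Q = f*q."""
    Q = f * q
    return Q / (1 - Q)


cases = [
    ("Evens", Fraction(1, 2)),
    ("4/1", Fraction(4, 5)),
    ("Infinity to 1", Fraction(1)),
    ("0/1", Fraction(0)),
]
for label, q in cases:
    print(label, "->", subjective_odds_against(q))
```

Python reduces fractions to lowest terms, so the 4/1 case prints as 3/2, which is the same price as 6/4.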

We can now use these stylised examples to establish the bias.

In particular, the implication of the Henery odds transformation is that, for a given f of ¾, 3/5 is perceived as fair odds for a horse with a 1 in 2 chance of winning.

In fact, £100 wagered at 3/5 yields £160 (3/5 x £100, plus stake returned) half of the time (true odds = evens), i.e. an expected return of £80.

£100 wagered at 6/4 yields £250 (6/4 x £100, plus the stake back) one fifth of the time (true odds = 4/1), i.e. an expected return of £50.

£100 wagered at 3/1 yields £400 (3/1 x £100, plus the stake back) none of the time (true odds = Infinity to 1), i.e. an expected return of £0.

It can be shown that the longer the odds, the lower is the expected rate of return on the stake, although the ratio of the subjective to the objective probability of losing remains at a fixed fraction, f, throughout.
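The decline in expected return as the odds lengthen can be checked numerically. In this hedged Python sketch (the names are illustrative, not from the original text), a bet is struck at the subjective odds Q/(1-Q) but wins with the true probability 1-q, so the expected return on a stake works out at stake x (1-q)/(1-fq).

```python
from fractions import Fraction


def expected_return(q, stake=100, f=Fraction(3, 4)):
    """Expected return on a stake placed at the subjective (Henery) odds
    when the true probability of losing is q."""
    Q = f * q
    subjective_odds = Q / (1 - Q)
    payout_if_wins = stake * (1 + subjective_odds)   # winnings plus stake back
    return (1 - q) * payout_if_wins


for q in (Fraction(1, 2), Fraction(4, 5), Fraction(1)):
    print(float(q), float(expected_return(q)))

# Expected return over a grid of losing probabilities from 0 to 1
grid = [expected_return(Fraction(k, 10)) for k in range(11)]
print([round(float(r), 1) for r in grid])
```

For any f below 1 the expected return falls steadily as q rises, which is the favourite-longshot bias in this stylised setting.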

Now on to the over-round.

The same simple assumption about bettors’ behaviour can explain the observed relationship between the over-round (sum of win probabilities minus 1) and the number of runners in a race, n.

If each horse is priced according to its true win probability, then over-round = 0. So in a six horse race, where each has a 1 in 6 chance, each would be priced at 5 to 1, so none of the lose probability is shaded by the bookmaker. Here the over-round = (6 x 1/6) – 1 = 0.

If only a fixed fraction of losses, f, is counted by bettors, the subjective probability of losing on any horse is f(qi), where qi is the objective probability of losing for horse i, and the odds will reflect this bias, i.e. they will be shorter than the true probabilities would imply. The subjective win probabilities in this case are now 1-f(qi), and the sum of these minus 1 gives the over-round.

Where there is no discounting of losses, the over-round (OR) = 0, i.e. n times the true win probability minus 1 (with n equally matched runners, n x 1/n – 1). Assume now that f = ¾, i.e. ¾ of losses are counted by the bettor.

If there is discounting, then the odds will reflect this, and the more runners the bigger will be the over-round.

So in a race with 5 runners, q is 4/5, but fq = 3/4 x 4/5 = 12/20, so subjective win probability = 1 – fq = 8/20, not 1/5. So OR = (5 x 8/20) – 1 = 2 – 1 = 1.

With 6 runners, fq = ¾ x 5/6 = 15/24, so subjective win probability = 1 – fq = 9/24. OR = (6 x 9/24) – 1 = (54/24) – 1 = 1¼.

With 7 runners, fq = ¾ x 6/7 = 18/28, so subjective win probability = 1 – fq = 10/28. OR = (7 x 10/28) – 1 = (70/28) – 1 = 42/28 = 1½.

If there is no discounting, then the subjective win probability equals the actual win probability, so in a 5-horse race, for example, each horse has a win probability of 1/5. Here, OR = (5 x 1/5) – 1 = 0. In a 6-horse race, with no discounting, subjective probability = 1/6. OR = (6 x 1/6) – 1 = 0.

Hence, the over-round is linearly related to the number of runners, assuming that bettors discount a fixed fraction of losses (the ‘Henery Hypothesis’).
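The linear relationship can be made explicit. With n equally matched runners, q = (n-1)/n, so OR = n(1 - f(n-1)/n) - 1 = (n-1)(1-f). The Python sketch below, with illustrative names and the same equal-chance assumption as the examples above, computes the over-round both ways and confirms that they agree.

```python
from fractions import Fraction


def over_round(n, f=Fraction(3, 4)):
    """Over-round in a race of n equally matched runners when bettors
    count only a fraction f of the true losing probability."""
    q = Fraction(n - 1, n)               # true probability that a given horse loses
    subjective_win_prob = 1 - f * q
    return n * subjective_win_prob - 1


def closed_form(n, f=Fraction(3, 4)):
    """Algebraic simplification of over_round: (n - 1) * (1 - f)."""
    return (n - 1) * (1 - f)


for n in range(2, 9):
    assert over_round(n) == closed_form(n)
    print(n, over_round(n))
```

With f = ¾ each extra runner adds ¼ to the over-round, and with f = 1 (no discounting) the over-round is zero whatever the field size.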

If the Henery Hypothesis is correct as a way of explaining the favourite-longshot bias, the bias can be explained as the natural outcome of bettors’ pre-existing perceptions and preferences.

This is quite consistent with a market efficiently processing the information available to it. Moreover, there is little evidence that the market offers opportunities for market players to earn abnormal returns or positive profits. Thus although possibilities clearly exist for earning above-average returns on the basis of weak form information, there is no convincing evidence that this contradicts a wider conceptualisation of this type of information efficiency.

Are there other explanations for the favourite-longshot bias, and the observed link between over-round and runners, which do not rely on the Henery Hypothesis?

One explanation is based on consumer preference for risk. A seminal article by Richard Emeric Quandt in the Quarterly Journal of Economics in 1986 explains the existence of the bias as a natural and necessary consequence of equilibrium in a market characterised by risk-loving bettors with homogeneous beliefs. As such, this idea that bettors are risk-loving runs contrary to conventional explanations of financial behaviour which tend to assume risk-aversion. It is possible however, that bettors should be classified differently to participants in other types of financial market, not least because of consumption benefits from racetrack and other types of betting which may not be replicated elsewhere.

Joe Golec and Maurry Tamarkin (1998, Journal of Political Economy) seek to arbitrate between the hypothesis of risk-loving bettors and a hypothesis that bettors are in fact skewness-lovers, arguing in favour of the latter explanation for the existence of a favourite-longshot bias in betting markets.

William Hurley and Lawrence McDonough (1995, American Economic Review) propose a quite different theoretical model of the favourite-longshot bias, which requires neither a hypothesis of risk-loving nor skewness-loving behaviour. Instead, the bias can arise in a risk-neutral environment, populated by at least some uninformed or unsophisticated bettors, as a consequence of positive transactions and/or information costs. Michael Smith, David Paton and Leighton Vaughan Williams (2006, Economica) compare the size of the bias in person-to-person betting exchanges (characterised by lower margins/transactions costs) and bookmaker markets (higher margins/costs). They find the bias to be lower in the former, a finding which is at least consistent with this explanation.

So far, it should be noted that these are all demand-side explanations.

A major challenge to demand-side explanations of the bias was proposed by Hyun Song Shin (1991, Economic Journal), based on the idea that odds-setters respond to the adverse selection problem posed by insiders (bettors with superior information to bookmakers) by artificially squeezing odds at the longer end of the market. The consequence of this price-setting behaviour is for the betting odds to relatively understate the winning chances of favourites and to overstate the winning chances of longshots. This is the traditional favourite-longshot bias. Another implication of this modelling of odds-setting is that the over-round (the sum of implied probabilities in the odds) will tend to be greater as the number of runners increases, because more runners implies higher odds.

While Shin’s modelling can explain a favourite-longshot bias in betting markets characterised by odds-setters, and also a link between the number of runners and the bookmakers’ over-round, it can be shown (Vaughan Williams and Paton, 1997, Economic Journal) that identical results may result from demand-side explanations. To help arbitrate between these competing hypotheses, Vaughan Williams and Paton employ a large data set to distinguish between two types of race, on the basis of their relative potential for insider trading. It is shown that the correlation between the number of runners and the sum of prices is restricted to those races in which there are clear possibilities for the use of inside information. This lends empirical support to Shin’s supply-side explanation of the phenomenon. Even so, the favourite-longshot bias continues to exist in pari-mutuel markets, in which there are no odds-setters, but instead a pool of all bets which is paid out (minus fixed operator deductions) to winning bets.

To the extent that the favourite-longshot bias cannot be fully explained by the adverse selection problem facing odds-setters (certainly the case in pari-mutuel betting markets), most explanations can be classified as either preference-based or perception-based. Risk love or skewness love are examples of preference-based explanations.

Discounting of losses or other explanations based on a miscalibration of probabilities can be categorized as perception-based explanations. Marco Ottaviani and Peter Sorensen (2009, American Economic Journal), for example, show that information asymmetries between bettors may lead to misperceptions of the true probabilities of horses winning.

Behavioural theories suggest that cognitive errors and misperceptions of probabilities play a role in market mispricing. These theories incorporate laboratory studies by cognitive psychologists which show that people are systematically poor at discerning between small and tiny probabilities, and hence price both similarly. Further, people express a strong preference for certainty over extremely likely outcomes, leading highly probable gambles to be under-priced. These results form an important foundation of Prospect Theory (Daniel Kahneman and Amos Tversky, 1979).

A number of papers seek to arbitrate between preference and perceptions based explanations of the favourite-longshot bias. An example is Erik Snowberg and Justin Wolfers (2010, Journal of Political Economy), who use a novel data set comparing behaviour in simple win pools and more complex compound bets (e.g. exactas, involving identification of first and second place) to seek to discriminate between these explanations. Their results, they argue, are more consistent with misperceptions rather than risk-love. The bias persists in equilibrium because misperceptions are not large enough to generate profit opportunities for unbiased bettors. That said, the cost of the bias is still large, and de-biasing an individual bettor could reduce their costs of betting substantially.

A more recent paper seeks to extend this analysis into the world of online poker. In ‘Towards an understanding of the origins of the favourite-longshot bias: Evidence from online poker markets, a real-money natural laboratory’, first published online in Economica in 2016, Leighton Vaughan Williams and others find a favourite-longshot bias in online poker play, especially in lower stakes games. “We find that misperception rather than risk-love offers the best explanation for the behaviour that we identify.”

In conclusion, the favourite-longshot bias is a well-established market anomaly in sports betting markets, which can be traced in the published academic literature as far back as Richard Griffith (1949, American Journal of Psychology). Explanations can broadly be divided into demand-based and supply-based, preference-based and perceptions-based. A significant amount of modern research has been focused on seeking to arbitrate between these competing explanations of the bias by formulating predictions as to how data derived from these markets would behave if one or other explanation was correct. A compromise position, which may or may not be correct, is that all of these explanations have some merit, the relative merit of each depending on the market context.

April 17, 2017

The Birthday Problem

How large should a randomly chosen group of people be, to make it more likely than not that at least two of them share a birthday?

For convenience, assume that all dates in the calendar are equally likely as birthdays, and ignore the Leap Year special of February 29th.

The first thing to look at is the likelihood that two randomly chosen people would share the same birthday.

Let’s call them Fred and Felicity. Say Felicity’s birthday is May 1st. What is the chance that Fred shares this birthday with Felicity? Well, there are 365 days in the year, only one of these is May 1st, and we are assuming that all dates in the calendar are equally likely as birthdays.

So, the probability that Fred’s birthday is May 1st is 1/365, and the chance he shares a birthday with Felicity is 1/365.

So what is the probability that Fred’s birthday is not May 1st? It is 364/365. This is the probability that Fred doesn’t share a birthday with Felicity.

More generally, for any randomly chosen group of two people, the probability that the second person has a different birthday to the first is 364/365.

With 3 people, the chance that all three are different is the chance that the first two are different (364/365) multiplied by the chance that the third birthday is different (363/365).

So, the probability that 3 people have different birthdays = 364/365 x 363/365

This can be written as (364)₂/ 365²

Similarly, probability that 5 people have different birthdays = (364)₄ / 365⁴

= 364x363x362x361/365⁴

So far, the chance of no matches is very high. But by the tenth person the probability of no matches is:

(364/365) x (363/365) x (362/365) x (361/365) x (360/365) x (359/365) x (358/365) x (357/365) x (356/365) = 0.8831

More generally, for n people, probability they all have different birthdays =

(364)ₙ₋₁ / 365ⁿ⁻¹

For 23 people, probability of all different birthdays = (364)₂₂ / 365²² = 0.4927

For 22 people, probability of all different birthdays = (364)₂₁ / 365²¹ = 0.5243

So, in a group of 23 people, there is a (1 – 0.4927) = 0.5073 chance that at least two of the group share a birthday.

So how large should a randomly chosen group of people be, to make it more likely than not that at least two of them share a birthday? The answer is 23.
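The threshold of 23 can be confirmed by multiplying out the product directly. The sketch below assumes 365 equally likely birthdays, as in the text; the function names are illustrative only.

```python
def prob_all_different(n, days=365):
    """Probability that n people all have different birthdays."""
    p = 1.0
    for k in range(n):
        p *= (days - k) / days   # the (k+1)th person must avoid the k birthdays already taken
    return p

def smallest_group(threshold=0.5, days=365):
    """Smallest group size whose chance of a shared birthday exceeds the threshold."""
    n = 1
    while 1 - prob_all_different(n, days) <= threshold:
        n += 1
    return n

print(round(prob_all_different(22), 4))   # 0.5243
print(round(prob_all_different(23), 4))   # 0.4927
print(smallest_group())                   # 23
```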

The intuition behind this is quite straightforward if we recognise just how many pairs of people there are in a group of 23 people, any pair of which could share a birthday.

In a group of 23 people, there are, according to the standard formula, ²³C₂ (called 23 Choose 2) pairs of people.

Generally, the number of ways k things can be chosen from n is:

ⁿCₖ = n! / ((n-k)! k!)

Thus, ²³C₂ = 23! / (21! 2!) = (23 x 22) / 2 = 253

So, in a group of 23 people, there are 253 pairs of people to choose from.

Therefore, a group of 23 people generates 253 chances, each of size 1/365, of having at least two people in the group sharing the same birthday.

These chances have some overlap: if A and B have a common birthday, and A and C have a common birthday, then inevitably so do B and C. So the probability of at least two people sharing a birthday in a group of 23 is less than 253/365 (69.3%). It is, as shown previously, 50.73%.
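As a quick check on the pair-counting argument, the following sketch compares the number of pairs, the crude upper bound of 253/365, and the exact probability. It uses Python's standard `math.comb`; the variable names are illustrative.

```python
from math import comb

pairs = comb(23, 2)            # number of distinct pairs in a group of 23
upper_bound = pairs / 365      # adding the 253 chances as if they never overlapped

# Exact probability of at least one shared birthday among 23 people.
p_all_different = 1.0
for k in range(23):
    p_all_different *= (365 - k) / 365
exact = 1 - p_all_different

print(pairs, round(upper_bound, 3), round(exact, 4))   # 253 0.693 0.5073
```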

To conclude, the next time you see two football teams line up, include the referee. It is now more likely than not that two of those on the pitch share the same birthday. Strange, but true!

Appendix

Using experiments, events and sample spaces to solve the Birthday Problem.

Another way to look at the Birthday problem is by use of experiments and sample spaces. A sample space lists the possible outcomes of an experiment.

Take a coin-tossing experiment. In this case, a coin is tossed and it can land heads or tails.

Experiment: Toss a coin. Sample space = Heads; Tails.

Experiment: Toss a coin until it lands Heads. Identify the number of tosses needed. Sample space = 1; 2; 3; 4; 5; … (every positive whole number).

Experiment: Measure the time between two successive lightning strikes. Sample space = the set of positive numbers.

In many common examples, each outcome in the sample space is assigned an equal probability. An example is tossing a coin twice.

Here, the sample space = HH, HT, TH, TT.

Assign an equal probability to each of these outcomes. So, probability of each outcome = 1/4.

An ‘event’ is the name for a collection of outcomes.

The probability of an event = number of outcomes in the event / number of outcomes in the sample space.

Event of zero heads (TT) has probability = 1/4

Event of exactly one heads (HT, TH) has probability = 2/4 = 1/2

Event of two heads (HH) has probability = 1/4

Examples from dice (plural); die (singular). Sample space from one die = 1, 2, 3, 4, 5, 6.

Possible events:

a. Outcome is number 5

b. Outcome is an even number.

c. Outcome is even but is less than 6.

In a., probability = 1/6

In b., probability = 3/6

In c., probability = 2/6

Now, apply these concepts to the Birthday Problem.

Suppose that a room contains four people. What is the probability that at least two of these people share the same birthday?

The easiest way to solve this is to count the complementary event that none of the four share the same birthday and find that probability. We can then subtract this probability from 1 to establish the probability that at least two of the four share a birthday.

Size of the sample space = 365 x 365 x 365 x 365

Size of event that none of the four share the same birthday = 365 x 364 x 363 x 362

Probability that none of the four people share the same birthday =

(365 x 364 x 363 x 362) / (365 x 365 x 365 x 365) = 0.984

Probability that at least two of them share the same birthday = 1 – 0.984 = 0.016

Similarly, it can be calculated that the probability of at least two sharing a birthday increases as n, the number in the room, increases, as below:

n = 16; probability = 0.284

n= 23; probability = 0.507

n = 32; probability = 0.753

n = 40; probability = 0.891

n= 56; probability = 0.988

n = 100; probability = 0.9999997

So, the probability that two share a birthday exceeds 0.5 in a room of 23 or more people.
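The sample-space counting above translates directly into code. This sketch counts outcomes with exact integers (using `math.perm`, available from Python 3.8) and only converts to a decimal at the end; the function name is an illustrative assumption.

```python
from math import perm

def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    sample_space = days ** n           # every possible assignment of birthdays
    none_shared = perm(days, n)        # assignments in which all n birthdays differ
    return 1 - none_shared / sample_space

for n in (4, 16, 23, 32, 40, 56):
    print(n, round(prob_shared_birthday(n), 3))
```

For n = 4 this gives 0.016, and for n = 23 it gives 0.507, matching the figures above.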

April 16, 2017

What is independence? A primer in probability.

Let’s suppose Bill and Ben each toss separate coins. Let A represent the variable “Bill’s coin toss outcome”, and B represent the variable “Ben’s coin toss outcome”. Both A and B have two possible values (Heads and Tails). It would be uncontroversial to assume that A and B are independent. Evidence about B will not change our belief in A. In other words, the fact that Ben’s coin lands heads does not affect the likelihood that Bill will throw heads. What happens to Bill’s coin and Ben’s coin are unrelated. They are independent.

Now suppose both Bill and Ben toss the same coin. Again let A represent the variable “Bill’s coin toss outcome”, and B represent the variable “Ben’s coin toss outcome”. Assume also that there is a possibility that the coin is biased towards heads but we do not know this for certain. In this case A and B are not independent. Observing that Ben’s coin has landed heads might cause us to increase our belief that Bill will throw a Heads.

In the second example, the variables A and B are both dependent on a separate variable C, “the coin is biased towards Heads” (which has the values True or False). Although in this case A and B are not independent, it turns out that once we know for certain the value of C then any evidence about B cannot change our belief about A.

In such a case we say that A and B are conditionally independent given C.

In many real life situations variables which are believed to be independent are actually only independent conditional on some other variable. Let’s take an example. Suppose that Ted and Ned live on opposite sides of the city and come to work by completely different means. Let’s say Ted arrives by train while Ned drives to work. Let A represent the variable “Ted late” (which has values true or false) and similarly let B represent the variable “Ned late”. At first glance, it might seem that A and B are independent. However, even if Ted and Ned lived and worked in different countries there may be factors (such as an international fuel shortage) which could affect both Ted and Ned. In that case, A and B are not independent. Again, it doesn’t seem reasonable to exclude the possibility that both Ted and Ned may be affected by a rail strike (C). Clearly the likelihood that Ted will arrive late to work will increase if the rail strike takes place; but the likelihood that Ned will arrive late to work might also increase, indirectly, because of the additional traffic on the roads caused by the rail strike. ‘Ted to be late’ and ‘Ned to be late’ are in this case conditionally independent GIVEN the rail strike.

Two events, A and B, are defined to be conditionally independent, given some other event, C, if the probability of both A occurring and B occurring, given some other event, C, is equal to the probability of A occurring given C multiplied by the probability of B occurring given C, i.e.

The notation used for this is: P(A ∩ B | C) = P(A | C) . P(B | C)

In the example we have just considered, the probability that Ted and Ned are late to work given the train strike equals the probability that Ted is late given the strike multiplied by the probability that Ned is late given the strike.

This takes us to a new question.

Does conditional independence, given C, imply unconditional independence?

Say, for example, Jack is playing Jill at snooker. Jack and Jill know nothing about each other’s ability at snooker.

Now suppose Jill wins her first 5 games. This provides evidence for her to assess the strength of her opponent, Jack, and vice-versa.

But the games may be conditionally independent (Jill is equally likely to win the fifth game as the second given Jack and Jill’s relative skill at snooker).

Even so, they are not independent (that would mean that winning the first five games tells you nothing about the likelihood of winning the sixth).

So the answer to the latest question is No. Conditional independence does not imply unconditional independence.
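The shared-coin example from earlier gives a simple numerical check of this answer. The sketch below is built on assumed numbers that do not appear in the original: a 50% prior that the coin is biased, and a biased coin that lands Heads 90% of the time. Given C (biased or not), the two tosses are independent by construction, yet unconditionally they are not.

```python
from fractions import Fraction

# Assumed figures for illustration only.
p_c = {True: Fraction(1, 2), False: Fraction(1, 2)}          # P(coin biased), P(coin fair)
p_heads = {True: Fraction(9, 10), False: Fraction(1, 2)}     # P(heads | biased), P(heads | fair)

# Given C, the tosses are independent: P(A and B | C) = P(A | C) * P(B | C).
p_a = sum(p_c[c] * p_heads[c] for c in (True, False))             # P(Bill throws heads)
p_a_and_b = sum(p_c[c] * p_heads[c] ** 2 for c in (True, False))  # P(both throw heads)
p_a_given_b = p_a_and_b / p_a                                     # P(B) = P(A) by symmetry

print(p_a)           # 7/10
print(p_a * p_a)     # 49/100, what unconditional independence would require
print(p_a_and_b)     # 53/100, the actual joint probability
print(p_a_given_b)   # 53/70, so seeing Ben's heads raises our belief in Bill's
```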

Finally, does unconditional independence imply conditional independence?

To answer this, let’s imagine an event with multiple causes.

Let A be the event that the fire alarm goes off.

Now suppose this could be caused by a genuine fire (F) or someone making popcorn (P), which sets off a false alarm.

Now let’s suppose that the probability of a fire is completely independent of the probability of someone making popcorn. But also that the probability the alarm is indicating a real fire is 100 per cent if nobody is making popcorn.

So the probability of a fire and the probability of making popcorn are independent of each other, yet the probability it’s a genuine fire if the alarm goes off is conditionally dependent on whether someone is making popcorn (you can be sure it’s a genuine fire if nobody is making popcorn).

So, does unconditional independence imply conditional independence? The answer is No.
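The fire alarm example can also be worked through with numbers. The sketch below assumes figures that are not in the original (a 1% chance of fire, a 10% chance of popcorn) and assumes the alarm sounds exactly when there is a fire or popcorn. Fire and popcorn are independent, but once we know the alarm has gone off, learning about the popcorn changes the probability of fire.

```python
from fractions import Fraction
from itertools import product

# Assumed figures for illustration only.
p_fire = Fraction(1, 100)
p_popcorn = Fraction(1, 10)

# Joint distribution of (fire, popcorn), built as a product because they are independent.
joint = {}
for fire, popcorn in product((True, False), repeat=2):
    pf = p_fire if fire else 1 - p_fire
    pp = p_popcorn if popcorn else 1 - p_popcorn
    joint[(fire, popcorn)] = pf * pp

def prob(event):
    """Total probability of the outcomes (fire, popcorn) for which event is true."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

def alarm(fire, popcorn):
    return fire or popcorn   # assumed: the alarm sounds if and only if either occurs

p_fire_given_alarm = prob(lambda f, p: f and alarm(f, p)) / prob(alarm)
p_fire_given_alarm_no_popcorn = (prob(lambda f, p: f and not p)
                                 / prob(lambda f, p: alarm(f, p) and not p))
p_fire_given_alarm_popcorn = (prob(lambda f, p: f and p)
                              / prob(lambda f, p: alarm(f, p) and p))

print(p_fire_given_alarm)              # 10/109
print(p_fire_given_alarm_no_popcorn)   # 1
print(p_fire_given_alarm_popcorn)      # 1/100
```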

So, in summary, events may be independent or they may be conditionally independent. Conditional independence does not, however, imply unconditional independence, and unconditional independence does not imply conditional independence.

Further Reading and Links

https://selectnetworks.net

April 15, 2017

The Newton-Pepys Problem. When the diarist wrote to the scientist.

One of the most celebrated pieces of correspondence in the history of probability and gambling, and one of which I am particularly fond, involves an exchange of letters between the greatest diarist of all time, Samuel Pepys, and the greatest scientist of all time, Sir Isaac Newton.

The six letters exchanged between Pepys in London and Newton in Cambridge related to a problem posed to Newton by Pepys about gambling odds. The interchange took place between November 22 and December 23, 1693. The ostensible reason for Mr. Pepys’ interest was to encourage the thirst for truth of his young friend, Mr. Smith. Whether Sir Isaac believed that tale or not we shall never know. The real reason, however, was later revealed in a letter written to a confidante by Pepys indicating that he himself was about to stake 10 pounds, a considerable sum in 1693, on such a bet. Now we’re talking!

The first letter to Newton introduced Mr. Smith as a fellow with a “general reputation…in this towne (inferiour to none, but superiour to most) for his maistery [of]…Arithmetick”.

What emerged has come down to us as the aptly named Newton-Pepys problem.

Essentially, the question came down to this:

Which of the following three propositions has the greatest chance of success?

A. Six fair dice are tossed independently and at least one ‘6’ appears.

B. 12 fair dice are tossed independently and at least two ‘6’s appear.

C. 18 fair dice are tossed independently and at least three ‘6’s appear.

Pepys was convinced that C. had the highest probability and asked Newton to confirm this.

Newton chose A as the highest probability, then B, then C, and produced his calculations for Pepys, who wouldn’t accept them.

So who was right? Newton or Pepys?

Well, let’s see.

The first problem is the easiest to solve.

What is the probability of A?

Probability that one toss of a die produces a ‘6’ = 1/6

So probability that one toss of a die does not produce a ‘6’ = 5/6

So probability that six independent tosses of a die produce no ‘6’ = (5/6)⁶

So probability of AT LEAST one ‘6’ in 6 tosses = 1 – (5/6)⁶ = 0.6651

So far, so good.

The probability of problem B and probability of problem C are more difficult to calculate and involve use of the binomial distribution, though Newton derived the answers from first principles, by his method of ‘Progressions’.

Both methods give the same answer, but using the more modern binomial distribution is easier.

So let’s do it, introducing along the way the idea of so-called ‘Bernoulli trials’.

The nice thing about a Bernoulli trial is that it has only two possible outcomes.

Each outcome can be framed as a ‘yes’ or ‘no’ question (success or failure).

Let probability of success = p.

Let probability of failure = 1-p.

Each trial is independent of the others and the probability of the two outcomes remains constant for every trial.

An example is tossing a coin. Will it land heads?

Another example is rolling a die. Will it come up ‘6’?

Yes = success (S); No = failure (F).

Let probability of success, P (S) = p; probability of failure, P (F) = 1-p.

So the question: How many Bernoulli trials are needed to get to the first success?

This is straightforward, as the only way to need exactly five trials, for example, is to begin with four failures, i.e. FFFFS.

Probability of this = (1-p) (1-p) (1-p) (1-p) p = (1-p)⁴p

Similarly, the only way to need exactly six trials is to begin with five failures, i.e. FFFFFS.

Probability of this = (1-p) (1-p) (1-p) (1-p) (1-p) p = (1-p)⁵ p

More generally, the probability that success starts on trial number n =

(1-p)ⁿ⁻¹ p

This is a geometric distribution. This distribution deals with the number of trials required for a single success.

But what is the chance that the first success takes AT LEAST some number of trials, say 12 trials?

One method is to add the probability of 12 trials to the probability of 13 trials, of 14 trials, of 15 trials, and so on without end.

Easier method: The only time you will need at least 12 trials is when the first 11 trials are all failures, which has probability (1-p)¹¹

In a sequence of Bernoulli trials, the probability that the first success takes at least n trials is (1-p)ⁿ⁻¹

Let’s take a couple of examples.

Probability that the first success (heads on coin toss) takes at least three trials (tosses of the coin)= (1-0.5)² = 0.25

Probability that the first success (heads on coin toss) takes at least four trials (tosses of the coin)= (1-0.5)³ = 0.125
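The two geometric formulas can be written as a pair of small functions. This is a sketch using exact fractions; the function names are illustrative. The last check confirms that the "easier method" agrees with adding up the individual probabilities: the chance of succeeding within 11 trials plus the chance of needing at least 12 is exactly 1.

```python
from fractions import Fraction

def prob_first_success_on(n, p):
    """P(first success occurs on trial n): n-1 failures followed by a success."""
    return (1 - p) ** (n - 1) * p

def prob_first_success_at_least(n, p):
    """P(first success takes at least n trials): the first n-1 trials all fail."""
    return (1 - p) ** (n - 1)

half = Fraction(1, 2)
print(prob_first_success_at_least(3, half))   # 1/4
print(prob_first_success_at_least(4, half))   # 1/8

p = Fraction(1, 6)
within_11 = sum(prob_first_success_on(n, p) for n in range(1, 12))
print(within_11 + prob_first_success_at_least(12, p) == 1)   # True
```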

But so far we have only learned how to calculate the probability of one success in so many trials.

What if we want to know the probability of two, or three, or however many successes?

To take an example, what is the probability of exactly two ‘6’s in five throws of the die?

To determine this, we need to calculate the number of ways two ‘6’s can occur in five throws of the die, and multiply that by the probability of each of these ways occurring.

So, probability = number of ways something can occur multiplied by probability of each way occurring.

How many ways can we throw two ‘6’s in five throws of the die?

Where S = Success in throwing a ‘6’, F = Fail in throwing a ‘6’, we have:

SSFFF; SFSFF; SFFSF; SFFFS; FSSFF; FSFSF; FSFFS; FFSSF; FFSFS; FFFSS

So there are 10 ways of throwing two ‘6’s in five throws of the die.

More formally, we are seeking to calculate how many ways 2 things can be chosen from 5. This is known as ‘5 Choose 2’, written as:

⁵C₂ = 10

More generally, the number of ways k things can be chosen from n is:

ⁿCₖ = n! / ((n-k)! k!)

n! (known as n factorial) = n (n-1) (n-2) … 1

k! (known as k factorial) = k (k-1) (k-2) … 1

Thus, ⁵C₂ = 5! / (3! 2!) = (5x4x3x2x1) / (3x2x1x2x1) = (5x4) / (2x1) = 20/2 = 10

So what is the probability of throwing exactly two ‘6’s in five throws of the die, in each of these ten cases? p is the probability of success. 1-p is the probability of failure.

In each case, the probability = p.p.(1-p).(1-p).(1-p)

= p² (1-p)³

Since there are ⁵C₂ such sequences, the probability of exactly 2 ‘6’s =

10 p²(1-p)³

Generally, in a fixed sequence of n Bernoulli trials, the probability of exactly r successes is:

ⁿCᵣ x pʳ (1-p)ⁿ⁻ʳ

This is the binomial distribution. Note that it requires that the probability of success on each trial be constant. It also requires only two possible outcomes.

So, for example, what is the chance of exactly 3 heads when a fair coin is tossed 5 times?

⁵C₃ x (1/2)³ x (1/2)² = 10/32 = 5/16

And what is the chance of exactly 2 sixes when a fair die is rolled five times?

⁵C₂ x (1/6)² x (5/6)³ = 10 x 1/36 x 125/216 = 1250/7776 = 0.1608
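Both worked examples can be reproduced with a one-line version of the binomial formula. This is a sketch using Python's standard `math.comb` and exact fractions; the name `binomial_pmf` is an illustrative choice.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials,
    each with success probability p."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

three_heads = binomial_pmf(3, 5, Fraction(1, 2))
two_sixes = binomial_pmf(2, 5, Fraction(1, 6))

print(three_heads)                  # 5/16
print(round(float(two_sixes), 4))   # 0.1608
```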

So let’s now use the binomial distribution to solve the Newton-Pepys problem.

What is the probability of obtaining at least one six with 6 dice?
What is the probability of obtaining at least two sixes with 12 dice?
What is the probability of obtaining at least three sixes with 18 dice?

First, what is the probability of no sixes with 6 dice?

In general, the binomial distribution with p = 1/6 gives:

P (x sixes with n dice) = ⁿCₓ . (1/6)^x . (5/6)^(n-x), x = 0, 1, 2, …, n

Where x is the number of successes.

So, probability of no successes (no sixes) with 6 dice =

6! / ((6-0)! 0!) x (1/6)⁰ x (5/6)^(6-0) = (6!/6!) x 1 x (5/6)⁶ = (5/6)⁶

Note that: 0! = 1

Here’s the proof: n! = n. (n-1)!

At n=1, 1! = 1. (1-1)!

So 1 = 0!

So, where x is the number of sixes, probability of at least one six is equal to ‘1’ minus the probability of no sixes, which can be written as:

P (x≥ 1) = 1 – P(x=0) = 1 – (5/6)⁶= 0.665 (to three decimal places).

i.e. probability of at least one six = 1 minus the probability of no sixes.

That is a formal solution to Part 1 of the Newton-Pepys Problem.

Now on to Part 2.

Probability of at least two sixes with 12 dice is equal to ‘1’ minus the probability of no sixes minus the probability of exactly one six.

This can be written as:

P (x≥2) = 1 – P(x=0) – P(x=1)

P(x=0) in 12 throws of the dice = (5/6)¹²

P (x=1) in 12 throws of the dice = ¹²C₁ . (1/6)¹ . (5/6)¹¹

ⁿCₖ = n! / ((n-k)! k!)

So ¹²C₁ = 12! / ((12-1)! 1!) = 12! / (11! 1!) = 12

So, P (x≥2) = 1 – (5/6)¹² – 12 . (1/6) . (5/6)¹¹

= 1 – 0.112156654 – 2 . (0.134587985) = 0.887843346 – 0.26917597

= 0.618667376 = 0.619 (to 3 decimal places)

This is a formal solution to Part 2 of the Newton-Pepys Problem.

Now on to Part 3.

Probability of at least three sixes with 18 dice is equal to ‘1’ minus the probability of no sixes minus the probability of exactly one six minus the probability of exactly two sixes.

This can be written as:

P (x≥3) = 1 – P(x=0) – P(x=1) – P(x=2)

P(x=0) in 18 throws of the dice = (5/6)¹⁸

P (x=1) in 18 throws of the dice = ¹⁸C₁ . (1/6)¹ . (5/6)¹⁷

ⁿCₖ = n! / ((n-k)! k!)

So ¹⁸C₁ = 18! / ((18-1)! 1!) = 18

So P (x=1) = 18 . (1/6)¹ . (5/6)¹⁷

P (x=2) = ¹⁸C₂ . (1/6)² . (5/6)¹⁶

¹⁸C₂ = 18! / ((18-2)! 2!) = 18! / (16! 2!) = 18 . (17/2)

So P (x=2) = 18 . (17/2) . (1/6)² . (5/6)¹⁶

So P (x≥3) = 1 – P (x=0) – P (x=1) – P (x=2)

P (x=0) = (5/6)¹⁸

= 0.0375610365

P (x=1) = 18. 1/6. (0.0450732438) = 0.135219731

P (x=2) = 18. (17/2) (1/36) (0.0540878926) = 0.229873544

So P (x≥3) = 1 – 0.0375610365 – 0.135219731 – 0.229873544

= 0.597345689 = 0.597 (to 3 decimal places)

This is a formal solution to Part 3 of the Newton-Pepys Problem.
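All three parts can be checked together with a few lines of code. This is a sketch of the same complement-and-subtract approach used above, with an illustrative function name; it is not Newton's own method of ‘Progressions’.

```python
from math import comb

def prob_at_least(k, n, p=1/6):
    """Probability of at least k sixes when n fair dice are thrown:
    1 minus the probability of 0, 1, ..., k-1 sixes."""
    return 1 - sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k))

for k in (1, 2, 3):
    print(6 * k, "dice, at least", k, "sixes:", round(prob_at_least(k, 6 * k), 3))
```

The three results come out as 0.665, 0.619 and 0.597, in that order.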

So, to re-state the Newton-Pepys problem.

Which of the following three propositions has the greatest chance of success?

A. Six fair dice are tossed independently and at least one ‘6’ appears.

B. 12 fair dice are tossed independently and at least two ‘6’s appear.

C. 18 fair dice are tossed independently and at least three ‘6’s appear.

Pepys was convinced that C. had the highest probability and asked Newton to confirm this.

Newton chose A, then B, then C, and produced his calculations for Pepys, who wouldn’t accept them.

So who was right? Newton or Pepys?

According to our calculations, what is the probability of A? 0.665

What is the probability of B? 0.619

What is the probability of C? 0.597

So Sir Isaac’s solution was right. Samuel Pepys was wrong, a wrong compounded by refusing to accept Newton’s solution. How much he lost gambling on his misjudgement is mired in the mists of history. The Newton-Pepys Problem is not, and continues to tease our brains to this very day.

Further Reading and Links

http://datagenetics.com/blog/february12014/index.html

April 15, 2017

Can we solve Zeno’s and other chocolate paradoxes?

Zeno of Elea was a Greek philosopher of the 5th century BC, best known for his paradoxes of motion, described by Aristotle in his ‘Physics’. Of these perhaps the best known is his paradox of the tortoise and Achilles, in its various forms. In a modern version, the antelope starts 100 metres ahead of the cheetah and moves at half the speed of the cheetah. Will the cheetah ever catch the antelope, assuming they don’t slow down?

Zeno’s paradox relies on the fact that when the cheetah reaches the starting position of the antelope, the antelope will have travelled 50 metres further. When the cheetah arrives at that point, the antelope will have travelled a further 25 metres, and so on. Zeno argued that this was an infinite process, and so does not have a final, finite step. So how can the cheetah ever catch the antelope?

There is a mathematical solution to the paradox, which goes like this:

Let S be the distance the cheetah runs, measured in units of 100 metres (so that 1 stands for the antelope’s 100-metre head start).

So S = 1 + ½ + ¼ + 1/8 + 1/16 + 1/32 + …

½ S = ½ + ¼ + 1/8 + 1/16 + 1/32 + …

Therefore, S – ½ S = 1

Therefore, S = 2

So the cheetah catches the antelope in 200 metres.

So an infinite process, with no final step, has a finite conclusion.
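The convergence can be watched numerically by adding up the stages one at a time. The sketch below uses exact fractions and an illustrative function name; it shows the distance covered after a given number of stages and the gap still remaining to the 200-metre mark, which halves at every stage but never reaches zero after finitely many stages.

```python
from fractions import Fraction

def distance_after(stages, head_start=100):
    """Distance the cheetah has run after the given number of Zeno stages:
    100 + 50 + 25 + ... metres."""
    return sum(Fraction(head_start, 2 ** k) for k in range(stages))

for stages in (1, 2, 3, 10, 20):
    covered = distance_after(stages)
    print(stages, float(covered), float(200 - covered))
```

After 3 stages the cheetah has covered 175 metres, and after n stages the remaining gap is exactly 200/2ⁿ metres.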

That’s the mathematical solution, but does that solve the intuitive paradox? How can an infinite process, with no final step, come to an end? I understand the mathematical solution, but somehow it is as unsatisfying as the wrapper of a chocolate bar. To me, the real chocolate remains untouched. Such paradoxes I refer to as ‘chocolate paradoxes.’ What they have in common is that they can be solved mathematically without really being solved at all.

For those who might differ with me, the Thomson’s Lamp thought experiment offers a related challenge. Devised by philosopher James F. Thomson in 1954, it goes like this. Think of a lamp with a switch. You flick the switch to turn the light on. At the end of one minute exactly you flick it off. At the end of a further half minute, you turn it on again. At the end of a further quarter minute you turn it off. And so on. The time between each turning on and off the lamp is always half the duration of the time before. Assume you have the superpower to do each turning on and turning off instantaneously.

Adding these up gives: 1 minute plus half a minute plus a quarter of a minute ….

1 + ½ + ¼ + 1/8 + 1/16 + 1/32 + … = 2.

In other words, all of these infinitely many time intervals add up to exactly two minutes.

So here’s the question. At the end of two minutes, is the lamp on or off?

And here’s a second question. Say the lamp starts out being off and you turn it on after one minute, then off after a further half minute and so on. Does this make any difference to your answer?

Thomson claimed there was no solution, and that the problem led to a contradiction.

“It seems impossible to answer this question. It cannot be on, because I did not ever turn it on without at once turning it off. It cannot be off, because I did in the first place turn it on, and thereafter I never turned it off without at once turning it on. But the lamp must be either on or off. This is a contradiction.”

While considering the relationship between the infinite and the finite, consider in conclusion the following.

Can a number of infinite length be represented by a line of finite length? Solution below.

Spoiler Alert (Solution)

The square root of 2 is an irrational number, with no finite decimal representation. In other words, its digits go on for ever: 1.4142135623730950488… and so on without end.

So can a line with a finite length exactly equal to this infinitely long number be drawn?

Draw a right-angled triangle, of vertical length (a) and horizontal length (b) equal to 1.

[Figure: a right-angled triangle]

Then, the length of the hypotenuse of the triangle, c, can be derived from the lengths of the other two sides, a and b, using Pythagoras’ Theorem.

a² + b² = c²

So, 1² + 1² = c²

So c² = 2

c = √2

This is a line of finite length, representing a number of infinite length. So the answer to the question is yes. Strange? Indeed. Another of those tantalising ‘chocolate paradoxes.’

Big balance beats big edge. The Gambler’s Ruin Problem.

The famed correspondence between two titans of 17th-century French intellectual thought, Blaise Pascal (Pascal’s Wager) and Pierre de Fermat (Fermat’s Last Theorem), was to mark the foundation of modern probability theory. But it was sparked off by a question posed to Pascal by a legendary French gambler of the time, Antoine Gombaud, better known as the Chevalier de Méré.

The question related to a new dice game the Chevalier had invented. According to the rules of the game, he asked for even money odds that a pair of dice, when rolled 24 times, will come up with a double-6 at least once. His reasoning seemed impeccable. If the chance of a 6 on one roll of the die = 1/6, then the chance of a double-6 when two dice are thrown = 1/6 x 1/6 (as they are independent events) = 1/36.

So, he reasoned, the chance of at least one double-6 in 24 throws is: 24/36 = 2/3. So this should be a profitable game for the Chevalier. When it didn’t turn out that way, he asked the great philosopher and mathematician, Blaise Pascal to look into it, as you do.

Pascal derived the correct probabilities as follows:

Probability of a double-6 in one throw of a pair of dice = 1/6 x 1/6 = 1/36.

So probability of NO double-6 in one throw of a pair of dice = 35/36.

So, probability of no double-6 in 24 throws of a pair of dice = 35/36 x 35/36 … 24 times = 35/36 to the power of 24, i.e. (35/36)²⁴= 0.5086.

So, probability of at least one double-6 = 1 – 0.5086 = 0.4914

So the Chevalier was betting at even money on a game which he lost (albeit marginally) more often than he won, which is why he was losing over time.

What if he changed the game to give himself 25 throws?

Now, the probability of throwing at least one double-6 in 25 throws of a pair of dice is:

1 – (35/36)²⁵ = 0.5055.
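The two calculations above follow the same pattern: one minus the chance of failing on every throw. A minimal sketch, assuming independent throws of fair dice (the function name `p_at_least_one` is purely illustrative):

```python
def p_at_least_one(p_single, n_throws):
    """Chance of at least one success in n_throws independent throws."""
    return 1 - (1 - p_single) ** n_throws

p_double_six = 1 / 36  # chance of a double-6 on one throw of two fair dice

print(round(p_at_least_one(p_double_six, 24), 4))  # 0.4914
print(round(p_at_least_one(p_double_six, 25), 4))  # 0.5055
```

The break-even point therefore lies between 24 and 25 throws.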

These odds, at even money, are in favour of the Chevalier, but this probability is still lower than the probability of obtaining at least one ‘6’ in four throws of a single die, which is 1 – (5/6)⁴ = 0.5177.

In the single-die game, the Chevalier has a house edge of 51.77% – 48.23% = 3.54%.

In the ‘pair of dice’ game (24 throws), the Chevalier’s edge =

49.14% – 50.86% = -1.72%

In the ‘pair of dice’ game (25 throws), the Chevalier’s edge =

50.55% – 49.45% = 1.1%
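The three edges can be reproduced in a few lines. This is a sketch only, assuming even-money bets and fair dice; the helper name `edge` is my own. Working from unrounded probabilities, the last digit can differ slightly from the figures above, which subtract percentages that have already been rounded.

```python
def edge(p_win):
    """Expected profit per unit staked at even money: win 1 with
    probability p_win, lose 1 otherwise."""
    return p_win - (1 - p_win)

games = {
    "at least one 6 in 4 throws of one die": 1 - (5 / 6) ** 4,
    "at least one double-6 in 24 throws": 1 - (35 / 36) ** 24,
    "at least one double-6 in 25 throws": 1 - (35 / 36) ** 25,
}

for name, p_win in games.items():
    print(f"{name}: p = {p_win:.4f}, edge = {edge(p_win):+.2%}")
```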

A better game for the Chevalier would have been to offer even money that he could get at least one run of ten heads in a row in 1024 attempts, where each attempt consists of ten tosses of a coin. The derivation of this probability is similar in method to the dice problem.

First, we need to determine the probability of 10 heads in 10 tosses of a fair coin.

The probability is: ½ x ½ x ½ x ½ x ½ x ½ x ½ x ½ x ½ x ½

Probability = (1/2)¹⁰ = 1/1024, i.e. odds of 1023 to 1 against.

Based on this, what is the probability of at least one run of 10 heads in 1024 such attempts? Is it 0.5? No, because although you can expect ONE run of 10 heads on average, you could obtain zero, 2, 3, 4, etc.

So what is the probability of NO RUN of 10 heads in 1024 attempts?

This is: (1 – 1/1024)¹⁰²⁴

The probability of NO RUNS OF TEN HEADS = (1023/1024)¹⁰²⁴ = 37%

So probability of AT LEAST one run of 10 heads = 63%.

Now assume you have already made 234 of the 1024 attempts without a run of 10 heads. What is your chance now of getting 10 heads?

Probability of NO RUNS OF TEN HEADS in the remaining 790 attempts = (1023/1024)⁷⁹⁰ = 46%

So probability of at least one success = 54%.
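The figures above can be checked with the same one-minus-failure calculation used for the dice. The sketch below assumes, as the formula does, that each of the 1,024 trials is an independent attempt at ten heads in a row; the function name is illustrative.

```python
def p_at_least_one_success(p_success, attempts):
    """Chance of at least one success in a number of independent attempts."""
    return 1 - (1 - p_success) ** attempts

p_ten_heads = 0.5 ** 10  # 1/1024: ten heads from ten tosses of a fair coin

print(round(p_at_least_one_success(p_ten_heads, 1024), 2))  # 0.63
print(round(p_at_least_one_success(p_ten_heads, 790), 2))   # 0.54
```

Earlier failures do not change the chance of success on the attempts that remain, which is why only the 790 remaining attempts enter the second calculation.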

The Chevalier could have played either of these games and expected to come out ahead. But the game would have taken a long time. He preferred the shorter game, which produced the longer loss.

Until he was put right by Monsieur Pascal.

Most importantly, though, the Chevalier’s question led to a correspondence, most of which has survived, which led to the foundations of modern probability theory.

I will examine just one of the conclusions of this correspondence today, and it relates to the infamous ‘Gambler’s Ruin’ problem.

This is an idea set in the form of a problem by Pascal for Fermat, subsequently published by Christiaan Huygens (‘On reasoning in games of chance’, 1657) and formally solved by Jacobus Bernoulli (‘Ars Conjectandi’, 1713).

One way of stating the problem is as follows. If you play any gambling game long enough, will you eventually go bankrupt, even if the odds are in your favour, if your opponent has unlimited funds?

Example: You and your opponent toss a coin, where the loser pays the winner £1. The game continues until either you or your opponent has all the money. Suppose you have £10 to start and your opponent has £20. What are the probabilities that a) you and b) your opponent, will end up with all the money?

The answer is that the player who starts with more money has more chance of ending up with all of it. The formula is:

P₁ = n₁ / (n₁ + n₂)

P₂ = n₂/ (n₁ + n₂)

Where n₁ is the amount of money that player 1 starts with, n₂ is the amount of money that player 2 starts with, and P₁ and P₂ are the probabilities that player 1 (you) or player 2 (your opponent) wins.

In this case, you start with £10 of the £30 total, and so have a 10/(10+20) = 10/30 = 1/3 chance of winning the £30; your opponent has a 2/3 chance of winning the £30. But even if you do win this game, and you play the game again and again, against different opponents, or the same one who has borrowed more money, eventually you will lose your entire bankroll. This is true even if the odds are in your favour. Eventually you will meet a long-enough bad streak to bankrupt you.
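The formula for the single fair-coin game can be checked exactly. The sketch below is illustrative only (the name `win_probability` is mine) and covers just the fair, £1-per-toss case from the example. It also confirms that the formula is consistent with the coin toss itself: your chance of winning from £k is the average of your chances from £k-1 and £k+1.

```python
from fractions import Fraction

def win_probability(own_stake, opponent_stake):
    """Chance of ending up with all the money in a fair £1-per-toss game."""
    return Fraction(own_stake, own_stake + opponent_stake)

you = win_probability(10, 20)
opponent = win_probability(20, 10)
print(you, opponent)  # 1/3 2/3

# Consistency check: one fair toss moves you from £k to £k-1 or £k+1,
# so the chance from £k must be the average of those two chances.
total = 30
for k in range(1, total):
    down = win_probability(k - 1, total - k + 1)
    up = win_probability(k + 1, total - k - 1)
    assert win_probability(k, total - k) == (down + up) / 2
```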

In other words, infinite capital will overcome any finite odds against it. This is one version of the ‘Gambler’s Ruin’ problem, and many gamblers over the years have been ruined because of their unawareness of it.

So how can we avoid falling victim to the problem of ‘Gambler’s Ruin?’ Formally, we might turn to the Kelly formula, more of which I shall examine elsewhere. Informally, though, I shall reduce it to two simple bits of advice.

‘Never bet more than you can afford to lose’.

‘When the Fun Stops, Stop!’

Now that’s a start.

Further Reading and Links

https://selectnetworks.net/

Letters between Fermat and Pascal on Probability: https://www.york.ac.uk/depts/maths/histstat/pascal.pdf

April 12, 2017

Superforecasting: The Science of Making Money

‘Superforecasting’ is a term popularised from insights gained as part of a fascinating idea known as the ‘Good Judgment Project’, which consists of running tournaments where entrants compete to forecast the outcome of national and international events.

The key conclusion of this project is that an identifiable element of those taking part (so-called ‘Superforecasters’) were able to consistently and significantly out-predict their peers. To the extent that this ‘superforecasting’ is real, and it seems to be, it provides support for the belief that markets can not only be beaten but systematically so.

So what is special about these ‘Superforecasters’? A key distinguishing feature of these wizards of prediction is that they tend to update their estimates much more frequently than regular forecasters, and they do so in smaller increments. Moreover, they tend to break big intractable problems down into smaller tractable ones.

They are also much better than regular forecasters at avoiding the trap of underweighting new information or overweighting it. In particular, they are good at evaluating probabilities dispassionately using a so-called Bayesian approach, i.e. establishing a prior (or baseline) probability that an event will occur, and then constantly updating that probability as new information emerges, incrementally updating in proportion to the weight of the new evidence.

In adopting this approach, the Superforecasters are echoing the response of legendary economist, John Maynard Keynes, to a criticism made to his face that he had changed his position on monetary policy.

“When my information changes, I alter my conclusions. What do you do, Sir?”

In this, Keynes was one of the great ‘Superforecasters.’ Keynes went on to earn a fortune betting in the currency and commodity markets.

Superforecasters in the field of sports betting can benefit in particular from betting in-running, while the event is taking place. Their evaluations are also likely to be data-driven, and are updated as frequently as possible, taking into account variables some of which may not even exist pre-match.

They will be aware of players who tend to struggle to close the deal, whether in golf, tennis, snooker, or whatever, and who may be value ‘lays’ when trading in-running at short prices. Or shaky starters, like batsmen whose average belies their likely performance once they get into double figures. This information is only valuable, however, if the market doesn’t already incorporate it. So they gain an edge by access to and dispassionate analysis of large data sets. Moreover, they are very aware that patterns spotted, and conclusions derived, from small data sets can be dangerous, and potentially very hazardous to the accumulation of wealth.

Superforecasters also tend to use ‘Triage’. This is the process of determining the most important things from amongst a large number that require attention. Risk expert and hedge fund manager Aaron Brown offers an example: when he first got interested in basketball in the 1970s, there were data analysts who tried to analyse the game from scratch. He considered that a hard proposition compared to asking which team was likely to attract more betting interest. As Los Angeles was a rich and high-betting city, and the LA Lakers a glamorous team, he figured it wasn’t hard to guess that the betting public would disproportionately favour the Lakers and that therefore the spread would be slanted against them. ‘Bet against the Lakers at home’ became his strategy, and he observes that it took a lot less effort than simulating basketball games.

Could such a simple strategy work today, tweaked or otherwise? And in what circumstances would you apply it? That’s a more nuanced issue, but Superforecasters (who are normally very keen on big data sets) would be alert to it.

Aaron Brown sees trading contracts on the future as striking the right balance between under- and over-confidence, between prudence and decisiveness. The hard part about this, he observes, is that confidence is negatively correlated to accuracy. Even experienced risk takers bet more when they’re wrong than when they’re right, he says, and the most confident people are generally the least reliable.

The solution, he maintains, is to keep careful, objective records, preferably by a third party.

That’s right – even experienced risk takers bet more when they’re wrong than when they’re right. If true, this is a critical insight.

So how might a Superforecaster go about constructing a sports forecasting model?

Let’s say he wants to construct a model to forecast the outcome of a football match or a golf tournament. In the former, he might focus on assessing the likely team line-up before its announcement, and draw on his hopefully extensive data set to eke out an edge from that. The football market is very liquid and likely to be quite efficient to known information, so any forecasting edge in terms of estimating future information, like team shape, can be critical. The same might apply to rugby, cricket, and other team games.

In terms of golf, he could include statistics on the average length of drive of the players, their tee to green percentages, their putting performance, the weather, the type of course, and so on. But where is the edge over the market?

He could try to develop a better model than others, including using new, state-of-the-art econometric techniques. In trying to improve the model, he could also seek to identify additional explanatory variables.

He might also turn to the field of ‘prospect theory’, a body of work pioneered by Daniel Kahneman and Amos Tversky. This states that people behave and make decisions according to the frame of reference rather than just the final outcome. Humans, according to prospect theory, do not think or behave totally rationally, and this could be built into the model.

In particular, a key plank of prospect theory is ‘loss aversion’, the idea that people treat losses more harshly than equivalent gains, and that they view these losses and gains with regard to a sometimes artificial frame of reference.

An excellent seminal paper on this effect in golf (by Devin Pope and Maurice Schweitzer, in the American Economic Review) is a good example of the way in which study of the economic literature can improve sports modelling. The key contribution of the Pope and Schweitzer paper is that it shows how prospect theory can play a role even in the behaviour of highly experienced and well-incentivised professionals. In particular, they demonstrate, using a database of millions of putts, that professional golfers are significantly more likely to make a putt for par than a putt for birdie, even when all other factors, such as distance to the pin and break, are allowed for. But why? And how does prospect theory explain it?

To find the explanation, they examine a number of possible explanations, and reject them one by one until they determine the true explanation. They find it is because golfers see par as the ‘reference’ score, and so a missed par is viewed (subconsciously or otherwise) by these very human golfers as a significantly greater loss than a missed birdie. They react irrationally in consequence, and cannot help themselves from doing so even when made aware of it. The researchers show that equivalent birdie putts tend to come up slightly too short relative to par putts. This is valuable information for Superforecasters, or even the casual bettor. It is also valuable information for a sports psychologist. If only someone could stand close to a professional golfer every time they stand over a birdie putt and whisper in their ear ‘This is for Par’, it would over time make a significant difference to their performance and pay.

So Superforecasters will improve their model by increments, taking into account factors which more conventional thinkers might not even consider, and will apply due weight to updating their forecasts as new information emerges.

In conclusion, how might we sum up the difference between a Superforecaster and an ordinary mortal? Watch them as they view the final holes of the Masters golf tournament. What’s the chance of Sergio Garcia sinking that 10-footer? The ordinary mortal will just see the putt, the distance to the hole and the potential break of the ball on the green. The Superforecaster is going one step further, and also asking whether the 10-footer is for par or birdie. It really does make a difference, and it’s why she is watching from the members’ area at the Augusta National Golf Club. She has earned her place there, and she knew it before anyone else.

Further Reading and Links

https://selectnetworks.net/

D.G. Pope and M.E. Schweitzer, 2011, Is Tiger Woods Loss-Averse? Persistent Bias in the Face of Experience, Competition and High Stakes, American Economic Review, 101(1), 129-157.

Philip Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction, 2016, London: Random House.

April 11, 2017

How important is witness evidence? A Bayesian perspective.

Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.

New Amsterdam has 1,000 taxis. 850 are yellow, 150 are green. One of these taxis accidentally knocks down a pedestrian and then drives away without stopping. We have no reason to believe that drivers of green taxis are any more or any less likely than drivers of yellow taxis to knock down a pedestrian and drive away. Neither do we have any reason to believe that green or yellow taxis are disproportionately represented in the area of New Amsterdam where the hit and run took place.

There is one witness, however, who did see the event. The witness says the colour of the taxi was green.

The witness is given a rigorous observation test, which recreates as closely as possible the event in question, and her judgment proves correct 80 per cent of the time. We have no reason to doubt the integrity of the witness.

So what is the probability that the taxi was green?

The intuitive answer is in the region of 80 per cent, as the only evidence is that of the witness, and the test of her powers of observation shows that she is right 80 per cent of the time. That is not the Bayesian approach, however, which is to also consider the evidence in the light of the baseline, or prior, probability that the taxi was green before the witness evidence came to light.

The prior probability can be derived from an identification of the proportion of taxis in New Amsterdam that are green. This is 15 per cent (of the 1,000 taxis, 150 are green).

Now, the (posterior) probability that a hypothesis is true after obtaining new evidence, according to the x,y,z formula of Bayes’ Theorem, is equal to:

xy/[xy+z(1-x)]

x is the prior probability, i.e. the probability that a hypothesis is true before the new evidence arises.

y is the probability the new evidence would arise if the hypothesis is true.

z is the probability the new evidence would arise if the hypothesis is false.

This is a straightforward calculation.

x = 0.15 (15 per cent of taxis are green)

y = 0.8 (the witness is correct 80 per cent of the time)

z = 0.2 (the witness is wrong 20 per cent of the time)

Inserting these numbers into the formula gives:

Posterior probability = (0.15 × 0.8) / (0.15 × 0.8 + 0.2 × 0.85) = 0.12 / (0.12 + 0.17) = 0.12 / 0.29 = 41%

In other words, the true probability that the taxi that knocked down the pedestrian was green is not 80 per cent (despite the witness evidence) but about half that. The baseline probability is that important.
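The x, y, z calculation is short enough to write as a single function. This is a minimal sketch of the formula as stated in the text, and the function name `posterior` is my own.

```python
def posterior(x, y, z):
    """Bayes' Theorem in the x, y, z form: xy / [xy + z(1 - x)].

    x: prior probability that the hypothesis is true
    y: probability of the evidence if the hypothesis is true
    z: probability of the evidence if the hypothesis is false
    """
    return (x * y) / (x * y + z * (1 - x))

# Prior of 15% green taxis; witness right 80% and wrong 20% of the time.
p_green = posterior(x=0.15, y=0.8, z=0.2)
print(round(p_green, 3))  # 0.414
```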

But Bayesians are not content to leave it at that. The next step is to look for further new evidence.

Say, for example, that a new witness appears, totally independent of the other, and is also given the observation test, revealing a reliability score of 90 per cent. Again, we have no reason to doubt the integrity of this witness. What a Bayesian does now is to insert that number (0.9) into the Bayes formula (y=0.9) so that z (the probability that the witness is mistaken) = 0.1.

The new baseline (or prior) probability, x, is no longer 0.15, as it was before the first witness appeared, but 0.41 (the probability incorporating the evidence of the first witness).

New posterior probability = (0.41 × 0.9) / (0.41 × 0.9 + 0.1 × 0.59) = 0.369 / (0.369 + 0.059) = 0.369 / 0.428 = 86.2%

This is also the new baseline probability underpinning any new evidence which might arise.
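The step-by-step updating can be sketched as a loop in which each posterior becomes the prior for the next piece of evidence. The code below assumes independent witnesses who each say the taxi was green, with z taken as one minus their reliability; the function names are illustrative. Because it carries the unrounded prior (about 0.4138 rather than 0.41) into the second step, it gives roughly 86.4 per cent rather than the 86.2 per cent obtained above from the rounded figure.

```python
def posterior(x, y, z):
    """Bayes' Theorem in the x, y, z form: xy / [xy + z(1 - x)]."""
    return (x * y) / (x * y + z * (1 - x))

def update_sequentially(prior, reliabilities):
    """Apply one Bayesian update per witness, each saying 'green'."""
    for reliability in reliabilities:
        prior = posterior(prior, reliability, 1 - reliability)
    return prior

print(round(update_sequentially(0.15, [0.8]), 3))       # 0.414
print(round(update_sequentially(0.15, [0.8, 0.9]), 3))  # 0.864
```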

There are three illustrative cases which bear highlighting.

The first is a scenario where the new witness scores 50 per cent on the observation test. Here is a case where intuition and Bayes’ formula converge. Intuition tells us that a witness who is right only half the time about the colour of the taxi is also wrong half the time, and so any evidence they give is worthless. In terms of the equation, such a witness would be accorded y = 0.5 and z = 0.5.

Putting these values of y and z into the equation leads to the following:

xy/[xy+z(1-x)] BECOMES 0.5x / [0.5x + 0.5 (1-x)]

0.5x / [0.5x + 0.5 (1-x)] = 0.5x / (0.5 + 0.5x – 0.5x) = 0.5x / 0.5 = x

So when y and z both equal 0.5, the new evidence has no impact on the probability that the hypothesis being tested is true: the posterior probability equals the prior probability (x). In this case, the witness’s evidence can be discounted.

The second illustrative case is where a new witness is 100 per cent reliable about the colour of the taxi. In this case, y =1 and z =0. Intuition tells us that the evidence of such a witness solves the case. If the infallible witness says the taxi was green, it was green. Bayes’ formula agrees. Inserting y = 1, z = 0 into the formula gives:

xy/[xy+z(1-x)] = x / (x + 0) = x/x = 1.

So the new (posterior) probability that the taxi is green = 1.

This leads directly to the third illustrative case. If the new witness scores 0 per cent on the observation test, this indicates that they always identify the wrong colour for the taxi. If they say it is green, it is definitely not green. So the chance (posterior probability) that the taxi is green if they say so is zero. This accords with the formula.

With y = 0 and z = 1: xy/[xy+z(1-x)] = 0 / [0 + (1-x)] = 0

Of course, this is valuable information, as it can be reversed to useful effect. A witness who always identifies a green taxi as yellow and vice-versa, and is 100 per cent consistent in doing so, yields us infallible information simply by reversing their identified colour.

So if the witness says the taxi is yellow, we can now identify the taxi as definitely green. This now converges on the second illustrative case.

Similarly, a witness who is, say, 25 per cent accurate in identifying the colour of the taxi in the observation test also yields us valuable information. By reversing the identified colour, this yields a 75 per cent accuracy score, which can be inserted accordingly into Bayes’ formula to update the probability that the taxi that knocked down the pedestrian was green.

The only observation evidence that is worthless, therefore, is evidence that could have been produced by the flip of a fair coin.
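The illustrative cases can be run through the same formula side by side. The sketch below assumes the original prior of 0.15 and a witness who says the taxi was green, with z set to one minus the witness's accuracy; the function name is my own.

```python
def posterior(x, y, z):
    """Bayes' Theorem in the x, y, z form: xy / [xy + z(1 - x)]."""
    return (x * y) / (x * y + z * (1 - x))

prior = 0.15  # proportion of green taxis

# Accuracy of a witness who reports 'green' in each case.
for accuracy in (0.5, 1.0, 0.0, 0.25):
    p_green = posterior(prior, accuracy, 1 - accuracy)
    print(f"accuracy {accuracy:.2f}: P(green) = {p_green:.3f}")
```

A coin-flip witness leaves the prior unchanged, while a witness who is reliably wrong pushes the probability of green well below the prior, which is information in its own right.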

And the conclusion to the case? CCTV evidence was later produced in court which was able to conclusively identify the taxi and the driver. The pedestrian never regained consciousness. The driver told the jury that he panicked when the pedestrian unexpectedly stepped out in front of him, and drove off because he feared he would lose his livelihood. He was completely unaware that the victim had hit his head awkwardly, and had thought at the time that it was a very minor accident.

This was rejected by the jury, who accepted the prosecution’s contention that he had acted with premeditation. They based their decision on their view that a driver who was so motivated would indeed have driven off. The taxi driver in this case did drive off, which was what someone who acted wilfully, deliberately and with premeditation would do. It was all the evidence they needed to reach their unanimous verdict.

James Parker, a 29-year-old long-time resident of New Amsterdam, of previous good character, with no previous convictions or any known motive for the crime, is currently serving a sentence of life in a maximum security prison with no possibility of parole.

Further Reading and Links

https://selectnetworks.net/

Prof. Leighton Vaughan Williams
