
Does doubling up after a loss really work?


Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.

The basis of the martingale betting system is a strategy in which the gambler doubles the bet after every loss on an even-money wager, such as a coin toss, so that the first win recovers all previous losses plus a profit equal to the original stake. The martingale strategy has been applied to roulette in particular, where the probability of hitting either red or black is close to 50 per cent.

Take the case of a gambler who wagers £2 on Heads, at even money, so profits by £2 if the coin lands Heads and loses £2 if it lands Tails. If he loses, he doubles the stake on the next bet, to £4, and wins £4 if it lands Heads, minus £2 lost on the first bet, securing a net profit over both bets of £2 (£4 – £2). If it lands Tails again, however, he is £6 down, so he doubles the stake in the next bet to £8. If it lands Heads he wins £8, minus £6 lost on the first two bets, securing a net profit over the three bets of £2 (£8 – £6). This can be generalized for any number of bets. Whenever he wins, the gambler secures a net profit over all bets of £2.

The strategy is essentially, therefore, one of chasing losses. In the above example, the loss after n losing rounds is equal to 2 + 2² + 2³ + … + 2ⁿ

So the strategy is to bet in the next round 2 + 2² + 2³ + … + 2ⁿ + 2

In this way, the profit whenever the coin lands Heads is 2.
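The doubling logic can be sketched in a few lines of Python. The cap on the number of doublings is my own assumption, added so the simulation terminates; the text's idealised gambler has no such cap.

```python
import random

def martingale_round(stake=2, max_doublings=50):
    """Play one martingale cycle on a fair coin: double after every loss,
    stop at the first win. Returns the net profit over the cycle."""
    total_lost = 0
    bet = stake
    for _ in range(max_doublings):
        if random.random() < 0.5:    # Heads: this bet wins
            return bet - total_lost  # always equals the original stake
        total_lost += bet            # Tails: record the loss and double
        bet *= 2
    return -total_lost               # cap reached (astronomically rare)

random.seed(1)
print({martingale_round() for _ in range(1000)})  # every cycle nets 2
```

Whatever round the win arrives in, the winning bet exceeds the accumulated losses by exactly the original stake.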

For a gambler with infinite wealth, and hence the ability to go on doubling through any run of losses until the coin eventually lands Heads, the martingale betting strategy has been interpreted as a sure win.

However, the gambler’s expected value remains zero (or less than zero) because the small probability of a very large loss exactly balances out the expected gain. In a casino, the expected value is in fact negative, due to the house edge. There is also conventionally a house limit on bet size.

The martingale strategy fails, therefore, whenever there is a limit on earnings or on bets or bet size, as is the case in the real world. It is only with infinite or boundless wealth, bet size and time that it could be argued that the martingale becomes a winning strategy.

Appendix

Probability of losing three fair coin tosses = 1/8

Probability of losing n times = 1/2ⁿ

Total loss with starting stake of 2, after 3 losing coin tosses = 2 + 4 + 8 = 14.

So martingale strategy suggests a bet of 14 + 2 = 16.

Loss after n losing rounds = 2 + 2² + … + 2ⁿ

So martingale bet = (2 + 2² + … + 2ⁿ) + 2 = 2ⁿ⁺¹

This strategy always wins a net 2.

This strategy, of always betting to win more than lost so far, works in principle, regardless of the odds, or whether they are fair. If each bet has a 1 in 10 chance of success, for example, the probability of 12 successive losses is about 28 per cent, but the martingale strategy is to bet to win more on the 13th bet than the sum of losses to that point.

This holds so long as there is no finite stopping point at which the next martingale bet is not available (such as a maximum bet limit) or can’t be afforded.
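The run-of-losses probability in the 1 in 10 example is quick to verify (a sketch):

```python
# Chance that a bet with a 1 in 10 probability of success loses 12 times running.
p_12_losses = (9 / 10) ** 12
print(round(p_12_losses, 3))  # 0.282, i.e. roughly a 28% chance
```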

So, let us assume that everyone has some number of losses beyond which they do not have enough money to place a stake large enough for the next round to cover the sum of the losses to that point. Call this run of losses n.

n differs across people and could be very high or very low.

Probability of losing n times = 1/2ⁿ

Using a martingale strategy with an increment of 2, the player wins 2 provided he is able to keep playing until a bet wins.

So the player wins 2 with a probability of (1 − 1/2ⁿ)

Total losses after n losing bets = (2 + 2² + … + 2ⁿ) = (2ⁿ⁺¹ − 2)

Expected gain is equal to the probability of not folding times the gain, minus the probability of folding times the loss.

Expectation = (1 − 1/2ⁿ) × 2 − (1/2ⁿ)(2ⁿ⁺¹ − 2)

= 2 − 2/2ⁿ − 2 + 2/2ⁿ = 0.

So the expected gain in a fair game for any finite number of bets is zero using the martingale system, but it is positive if the system can be played to infinity. The increment per round need not be 2, but could be any number, x. The net gain to a winning bet is this number, x.
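The zero expectation can be checked numerically for any stopping point n (a minimal sketch; the symbols follow the derivation above):

```python
def martingale_expectation(n, x=2):
    """Expected gain when the player is forced to stop after n losses,
    staking x, 2x, 4x, ... on a fair coin."""
    p_ruin = 0.5 ** n              # probability of n straight losses
    total_loss = x * (2 ** n - 1)  # x + 2x + ... = x(2^n - 1)
    return (1 - p_ruin) * x - p_ruin * total_loss

for n in (1, 5, 10, 20):
    print(n, martingale_expectation(n))  # 0.0 in every case
```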

The intuitive explanation for the zero expectation is that the player (take the simplest case of an increment per round of 2) wins a modest gain (2) with a very good probability (1 − 1/2ⁿ) but with a small probability (1/2ⁿ) makes a disastrous loss (2ⁿ⁺¹ − 2).

More generally, for an increment of x (stakes of x, 2x, 4x, …), the probability of n straight losses is still 1/2ⁿ, and the total loss after n losing bets is x(2ⁿ − 1):

Expectation = (1 − 1/2ⁿ) × x − (1/2ⁿ) × x(2ⁿ − 1)

= x − x/2ⁿ − x + x/2ⁿ = 0.

The mathematical paradox remains. In the case where on the nth round the bet is 2ⁿ, the martingale expectation = ½ × 2 + ¼ × 2² + ⅛ × 2³ + … = 1 + 1 + 1 + … = ∞

Yet the actual expectation, when the odds are fair, in all realistic cases = 0.

If the odds are tilted against the bettor, so that for example the bettor wins less if a fair coin lands Heads than he loses if it lands Tails, the expected gain in a finite series of coin tosses is less than zero, but the same principle applies.

Exercise

Show that the expected value of a martingale strategy in a fair game of heads/tails is zero. Show how this can be reconciled with the fact that whenever the player wins, the net overall profit to the player is positive.

References and Links

Martingale (betting system). Wikipedia. https://en.m.wikipedia.org/wiki/Martingale_(betting_system)

Is it possible to beat the dealer? In a nutshell.



It is said that on returning from a day at the races, a certain Lord Falmouth was asked by a friend how he had fared.  “I’m quits on the day”, came the triumphant reply.  “You mean by that,” asked the friend, “that you are glad when you are quits?”   When Falmouth replied that indeed he was, his companion suggested that there was a far easier way of breaking even, and without the trouble or annoyance. “By not betting at all!”  The noble lord said that he had never looked at it like that and, according to legend, gave up betting from that very moment.

While this may well serve as a very instructive tale for many, Ed Thorp, writing in 1962, took a rather different view. He had devised a strategy, based on probability theory, for consistently beating the house at Blackjack (or ‘21’). In his book, ‘Beat the Dealer: A Winning Strategy for the Game of Twenty-One’, Thorp presents the system. On the inside cover of the dust jacket he claims that “the player can gain and keep a decided advantage over the house by relying on the strategy”.

The basic rules of blackjack are simple. To win a round, the player has to draw cards to beat the dealer’s total and not exceed a total of 21. Because players have choices to make, most obviously as to whether to take another card or not, there is an optimal strategy for playing the game. The precise strategy depends on the house rules, but generally speaking it pays, for example, to hit (take another card) when the total of your cards is 14 and the dealer’s face-up card is 7 or higher. If the dealer’s face-up card is a 6 or lower, on the other hand, you should stand (decline another card). This is known as ‘basic strategy.’

While basic strategy will reduce the house edge, it is not enough to turn the edge in the player’s favour. That requires exploitation of the additional factor inherent in the tradition that the used cards are put to one side and not shuffled back into the deck. This means that by counting which cards have been removed from the deck, we can re-evaluate the probabilities of particular cards or card sizes being dealt moving forward. For example, a disproportionate number of high cards in the deck is good for the player, not least because in those situations where the rules dictate that the house is obliged to take a card, a plethora of remaining high cards increases the dealer’s probability of going bust (exceeding a total of 21).

Thorp’s genius was in devising a method of reducing this strategy to a few simple rules which could be understood, memorized and made operational by the average player in real time. As the book blurb puts it, “The presentation of the system lends itself readily to the rapid play normally encountered in the casinos.” Essentially, all that is needed is to attach a tag to specific cards (such as +1 or -1) and then add or subtract the tags as the cards are dealt. Depending on the net score in relation to the cards dealt, it is easy to see whether the edge is with the house or the player. This system is called keeping a ‘running count.’
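As an illustration, here is a minimal running count using the Hi-Lo tags, one common tagging scheme of the kind Thorp's work inspired (the tags come from the Hi-Lo references linked below, not from the book itself):

```python
# Hi-Lo tags: +1 for low cards, 0 for middle cards, -1 for tens and aces.
HI_LO = {'2': 1, '3': 1, '4': 1, '5': 1, '6': 1,
         '7': 0, '8': 0, '9': 0,
         '10': -1, 'J': -1, 'Q': -1, 'K': -1, 'A': -1}

def running_count(cards_seen):
    """Sum the tags of the cards dealt so far; a high positive count
    means the remaining deck is rich in tens and aces (good for the player)."""
    return sum(HI_LO[card] for card in cards_seen)

print(running_count(['2', '5', 'K', '9', 'A', '4']))  # 3 lows, 2 highs -> +1
```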

There are variations on this theme, but the core strategy and original insights hold. The problem simply changed to one familiar to many successful horse players, i.e. how to get your money on before being closed down.

References and Links

Card counting. Wikipedia. https://en.wikipedia.org/wiki/Card_counting

https://wizardofodds.com/games/blackjack/card-counting/introduction/

https://wizardofodds.com/games/blackjack/card-counting/high-low/

4-Deck to 8-Deck Blackjack Strategy. https://wizardofodds.com/games/blackjack/strategy/4-decks/

The Ace-Five Count. https://wizardofodds.com/games/blackjack/appendix/17/

Expert forecasting – Guide Notes.

‘Superforecasting’ is a term popularised from insights gained as part of a fascinating idea known as the ‘Good Judgment Project’, which consists of running tournaments where entrants compete to forecast the outcome of national and international events.

The key conclusion of this project is that an identifiable subset of those taking part (so-called ‘Superforecasters’) was able to consistently and significantly out-predict their peers. To the extent that this ‘superforecasting’ is real, and it seems to be, it provides support for the belief that markets can not only be beaten but systematically so.

So what is special about these ‘Superforecasters’? A key distinguishing feature of these wizards of prediction is that they tend to update their estimates much more frequently than regular forecasters, and they do so in smaller increments. Moreover, they tend to break big intractable problems down into smaller tractable ones.

They are also much better than regular forecasters at avoiding the trap of underweighting new information or overweighting it. In particular, they are good at evaluating probabilities dispassionately using a so-called Bayesian approach, i.e. establishing a prior (or baseline) probability that an event will occur, and then constantly updating that probability as new information emerges, incrementally updating in proportion to the weight of the new evidence.
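The updating rule described here is just Bayes' theorem applied repeatedly. A sketch with hypothetical numbers (the 30 per cent prior and the likelihoods below are invented for illustration):

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior probability of the event after seeing one piece of evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)

# A 30% baseline, then evidence three times as likely if the event is coming.
posterior = bayes_update(0.30, 0.6, 0.2)
print(round(posterior, 4))  # 0.5625
```

The increment is proportional to the weight of the evidence: weak evidence (likelihoods close together) nudges the probability; strong evidence moves it sharply.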

In adopting this approach, the Superforecasters are echoing the response of legendary economist, John Maynard Keynes, to a criticism made to his face that he had changed his position on monetary policy.

“When my information changes, I alter my conclusions. What do you do, Sir?”

In this, Keynes was one of the great ‘Superforecasters.’ Keynes went on to earn a fortune betting in the currency and commodity markets.

Superforecasters in the field of sports betting can benefit in particular from betting in-running, while the event is taking place. Their evaluations are also likely to be data-driven, and are updated as frequently as possible, taking into account variables some of which may not even exist pre-match.

They will be aware of players who tend to struggle to close the deal, whether in golf, tennis, snooker, or whatever, and who may be value ‘lays’ when trading in-running at short prices. Or shaky starters, like batsmen whose average belies their likely performance once they get into double figures. This information is only valuable, however, if the market doesn’t already incorporate it. So they gain an edge by access to and dispassionate analysis of large data sets. Moreover, they are very aware that patterns spotted, and conclusions derived, from small data sets can be dangerous, and potentially very hazardous to the accumulation of wealth.

Superforecasters also tend to use ‘triage’. This is the process of determining the most important things from amongst a large number that require attention. Risk expert and hedge fund manager Aaron Brown offers an example of how, when he first got interested in basketball in the 1970s, there were data analysts who tried to analyse the game from scratch. He considered that a hard proposition compared to asking which team was likely to attract more betting interest. As Los Angeles was a rich and high-betting city, and the LA Lakers a glamorous team, he figured it wasn’t hard to guess that the betting public would disproportionately favour the Lakers and that therefore the spread would be slanted against them. ‘Bet against the Lakers at home’ became his strategy, and he observes that it took a lot less effort than simulating basketball games.

Could such a simple strategy work today, tweaked or otherwise? And in what circumstances would you apply it?  That’s a more nuanced issue, but Superforecasters (who are normally very keen on big data sets) would be alert to it.

Aaron Brown sees trading contracts on the future as striking the right balance between under- and over-confidence, between prudence and decisiveness. The hard part about this, he observes, is that confidence is negatively correlated to accuracy. Even experienced risk takers bet more when they’re wrong than when they’re right, he says, and the most confident people are generally the least reliable.

The solution, he maintains, is to keep careful, objective records, preferably by a third party.

That’s right – even experienced risk takers bet more when they’re wrong than when they’re right. If true, this is a critical insight.

So how might a Superforecaster go about constructing a sports forecasting model?

Let’s say he wants to construct a model to forecast the outcome of a football match or a golf tournament. In the former, he might focus on assessing the likely team line-up before its announcement, and draw on his hopefully extensive data set to eke out an edge from that. The football market is very liquid and likely to be quite efficient to known information, so any forecasting edge in terms of estimating future information, like team shape, can be critical. The same might apply to rugby, cricket, and other team games.

In terms of golf, he could include statistics on the average length of drive of the players, their tee to green percentages, their putting performance, the weather, the type of course, and so on. But where is the edge over the market?

He could try to develop a better model than others, including using new, state-of-the-art econometric techniques. In trying to improve the model, he could also seek to identify additional explanatory variables.

He might also turn to the field of ‘prospect theory’, a body of work pioneered by Daniel Kahneman and Amos Tversky. This states that people behave and make decisions according to a frame of reference rather than just the final outcome. Humans, according to prospect theory, do not think or behave totally rationally, and this can be built into the model.

In particular, a key plank of prospect theory is ‘loss aversion’, the idea that people treat losses more harshly than equivalent gains, and that they view these losses and gains with regard to a sometimes artificial frame of reference.

An excellent seminal paper on this effect in golf (by Devin Pope and Maurice Schweitzer, in the American Economic Review) is a good example of the way in which study of the economic literature can improve sports modelling. The key contribution of the Pope and Schweitzer paper is that it shows how prospect theory can play a role even in the behaviour of highly experienced and well-incentivised professionals. In particular, they demonstrate, using a database of millions of putts, that professional golfers are significantly more likely to make a putt for par than a putt for birdie, even when all other factors, such as distance to the pin and break, are allowed for. But why? And how does prospect theory explain it?

To find the explanation, they examine a number of possible explanations, rejecting them one by one until they determine the true one. They find it is because golfers see par as the ‘reference’ score, and so a missed par is viewed (subconsciously or otherwise) by these very human golfers as a significantly greater loss than a missed birdie. They react irrationally in consequence, and cannot help themselves from doing so even when made aware of it. The researchers show that equivalent birdie putts tend to come up slightly short relative to par putts. This is valuable information for Superforecasters, or even the casual bettor. It is also valuable information for a sports psychologist. If only someone could stand close to a professional golfer every time they stood over a birdie putt and whisper in their ear ‘This is for par’, it would over time make a significant difference to their performance and pay.

So Superforecasters will improve their model by increments, taking into account factors which more conventional thinkers might not even consider, and will apply due weight to updating their forecasts as new information emerges.

In conclusion, how might we sum up the difference between a Superforecaster and an ordinary mortal? Watch them as they view the final holes of the Masters golf tournament. What’s the chance of Sergio Garcia sinking that 10-footer? The ordinary mortal will just see the putt, the distance to the hole and the potential break of the ball on the green. The Superforecaster is going one step further, and also asking whether the 10-footer is for par or birdie. It really does make a difference, and it’s why she is watching from the members’ area at the Augusta National Golf Club. She has earned her place there, and she knew it before anyone else.

Further Reading and Links

D.G. Pope and M.E. Schweitzer, 2011, Is Tiger Woods Loss-Averse? Persistent Bias in the Face of Experience, Competition and High Stakes, American Economic Review, 101(1), 129-157. https://repository.upenn.edu/cgi/viewcontent.cgi?article=1215&context=mgmt_papers

Philip Tetlock and Dan Gardner, Superforecasting: The Art and Science of Prediction, 2016, London: Random House.

Superforecasting: The Art and Science of Prediction. Review and Summary. Stringfellow, W. Jan 24, 2017. https://medium.com/west-stringfellow/superforecasting-the-art-and-science-of-prediction-review-and-summary-e075be35a936

Superforecasting. Wikipedia. https://en.wikipedia.org/wiki/Superforecasting


What’s the best way to double your money? In a nutshell.


John needs £216 to pay off an urgent debt, but has only £108 available. This is unacceptable to the lender and as good as nothing. He decides to try to win the money at the roulette wheel.

So what is his best strategy? The answer might be a little surprising. He should, in fact, put the whole lot on one spin of the wheel. Yes, that’s right. In unfavourable games (house edge against you) bold play is best, timid play is worst. Always place the fewest bets you need to reach your target.

Take the case, for example, of a single-zero roulette wheel. There are 37 slots: the numbers 1 to 36 plus the zero. The payout to a winning bet on a number is 35/1, while the chance of winning is 1 in 37 (so a fair payout would be at odds of 36/1). The way to look at it is that the house edge is equal to the proportion of times the ball lands in the zero slot, which is 1/37, or 2.7 per cent. This edge in favour of the house is the same whatever individual bet we make.

So let’s see what happens when John goes for the ‘bold’ play and stakes the entire £108 on Red. In this case, 18 times out of 37 (statistically speaking), or 48.6 per cent of the time, John can cash his chips immediately for £216. Of course, he is only doing this once, so this 48.6 per cent should be interpreted as the probability that he will win the £216.

An alternative ‘timid’ strategy is to divide his money into 18 equal piles of £6, and be prepared to make successive bets on a single number until he either runs out of cash or one bet (at 35 to 1) yields him £210 plus his stake = £216.

To calculate the odds of success using this timid strategy, first calculate the chance that all the bets lose. Any single bet loses with a probability of 36/37, so the chance that all 18 bets lose = (36/37)¹⁸ ≈ 0.61. Therefore, the probability that at least one bet wins = 1 − 0.61 = 0.39. The chance that he will achieve his target has been reduced, therefore, from 48.6 per cent to 39 per cent by substituting the timid strategy for the bold play.
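Both figures are easy to confirm (a sketch of the arithmetic above):

```python
# Bold play: the whole £108 on Red, one spin of a single-zero wheel.
p_bold = 18 / 37
# Timid play: 18 separate £6 bets on a single number at 35/1,
# which succeeds if at least one of the 18 bets wins.
p_timid = 1 - (36 / 37) ** 18

print(f"bold:  {p_bold:.3f}")   # 0.486
print(f"timid: {p_timid:.3f}")  # 0.389
```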

There are many alternative staking strategies that might put John over the top, but none of them can make it more probable that he will achieve his target than the boldest play of them all – the full amount on one spin of the wheel.

Exercise

You need £432 to pay off an urgent debt, but have only a bank of £216 available. This is unacceptable to the lender and as good as nothing. You decide to try to win the money at the roulette wheel.

What is the probability that you will win the target sum if you place all your bank on one spin of the wheel?

What is the probability that you will win the target sum if you divide up your bank and place £36 each on six spins of the wheel?

References and Links

StackExchange. How to win at roulette? https://math.stackexchange.com/questions/98981/how-to-win-at-roulette

Dubins, L.E. and Savage, L.J. (1960). Optimal Gambling Systems. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC223086/pdf/pnas00211-0067.pdf

The Prisoner’s Dilemma – in a nutshell.


What is Game Theory? Game theory is the study of models of conflict, cooperation and interaction between rational decision-makers. A key idea in the study of Game Theory is the Nash Equilibrium (named after John Nash), which is a solution to a game involving two or more players who want the best outcome for themselves and must take account of the actions of others.  

Specifically, if there is a set of ‘game’ strategies with the property that no ‘player’ can benefit by changing their strategy while the other players keep their strategies unchanged, then that set of strategies and the corresponding payoffs constitute the Nash equilibrium. Assume, for example, there is a simple two-player game in which each Player (Bill and Ben) can adopt a ‘Friendly’ (smiles) or a ‘Hostile’ (scowls) approach. Now, depending on their respective actions, let’s say the game organiser awards monetary payoffs to each player.

An example of a payoff structure is shown in the next table and is known to each player.

                   Ben ‘Friendly’              Ben ‘Hostile’
Bill ‘Friendly’    750 to Bill; 1000 to Ben    25 to Bill; 2000 to Ben
Bill ‘Hostile’     1000 to Bill; 50 to Ben     30 to Bill; 51 to Ben

Now, what is Bill’s best response to each of Ben’s actions?

If Ben acts ‘Friendly’, Bill’s best response is to act ‘Hostile.’ This yields a payoff of 1000. If he had acted ‘Friendly’ he would have earned a payoff of only 750.

If Ben acts ‘Hostile’, Bill’s best response is again to act ‘Hostile’. He earns 30 instead of the payoff of 25 he would have earned by acting ‘Friendly.’

In both cases his best response is to act ‘Hostile’.

Now, what is Ben’s best response to each of Bill’s actions?

If Bill acts ‘Friendly’, Ben’s best response is to act ‘Hostile.’ This yields a payoff of 2000. If he had acted ‘Friendly’ he would have earned a payoff of only 1000.

If Bill acts ‘Hostile’, Ben’s best response is again to act ‘Hostile’. He earns 51 instead of the payoff of 50 he would have earned by acting ‘Friendly.’

In both cases his best response is to act ‘Hostile.’

A Nash equilibrium exists where each player’s strategy is a best response to the other’s.

Here, ‘Hostile’ is each player’s best response to either action of his opponent. Both therefore act ‘Hostile’, in which case Bill wins 30 and Ben wins 51.
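The best-response reasoning can be mechanised for any two-player game of this kind. A sketch, with the payoff numbers taken from the Bill and Ben table above:

```python
def nash_equilibria(payoffs, rows, cols):
    """Pure-strategy Nash equilibria of a two-player game.
    payoffs[(r, c)] = (payoff to row player, payoff to column player)."""
    equilibria = []
    for r in rows:
        for c in cols:
            # r must be a best response to c, and c a best response to r.
            row_best = all(payoffs[(r, c)][0] >= payoffs[(r2, c)][0]
                           for r2 in rows)
            col_best = all(payoffs[(r, c)][1] >= payoffs[(r, c2)][1]
                           for c2 in cols)
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria

# Bill is the row player, Ben the column player.
game = {('Friendly', 'Friendly'): (750, 1000),
        ('Friendly', 'Hostile'):  (25, 2000),
        ('Hostile',  'Friendly'): (1000, 50),
        ('Hostile',  'Hostile'):  (30, 51)}
print(nash_equilibria(game, ['Friendly', 'Hostile'], ['Friendly', 'Hostile']))
# [('Hostile', 'Hostile')]
```

The same function finds both equilibria of the spy game below, and none in the company emblems game.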

But if both had been able to communicate and reach a joint, enforceable decision, they would both presumably have acted ‘Friendly.’

So, in conclusion, they would have been better off by smiling. Instead, they both scowled, which was the rational thing for them both to do, even though it was the less satisfactory outcome for both. A case of the best strategy being the worst strategy.

Let’s turn now to the world of espionage in seeking out a Nash equilibrium. Let’s assume that there are two possible codes, and Agent Anna can select either of them, and so can Agent Barbara. The payoff to selecting non-matching codes is zero. An example of a payoff structure is shown in the next table and is known to each Agent.

                      Barbara uses Code ‘A’          Barbara uses Code ‘B’
Anna uses Code ‘A’    1000 to Anna; 500 to Barbara   0 to Anna; 0 to Barbara
Anna uses Code ‘B’    0 to Anna; 0 to Barbara        500 to Anna; 1000 to Barbara

So where is the Nash equilibrium?

Let’s look at the Top Left box. Here neither Agent Anna nor Agent Barbara can increase their payoff by choosing a different action to the current one. So there is no incentive for either Agent to switch given the strategy of the other Agent. So this is a Nash equilibrium.

How about Bottom right? This is the same. Again, neither Agent Anna nor Agent Barbara can increase their payoff by choosing a different action to the current one. So there is no incentive for either Agent to switch given the strategy of the other Agent. So this is also a Nash equilibrium.

How about Top right? By choosing to use Code B instead of Code A, Agent Anna obtains a payoff of 500, given Agent Barbara’s actions. Similarly for Agent Barbara, who would gain by switching to Code A, given Agent Anna’s strategy. So this box (Agent Anna uses Code A and Agent Barbara uses Code B) is NOT a Nash equilibrium, as both Agents have an incentive to switch given what the other Agent is doing.

How about Bottom left? This is the same as Top right. There are again incentives to switch given what the other Agent is doing. So it is NOT a Nash equilibrium.

In conclusion, this game has two Nash equilibria top left (both Agents use code A) and bottom right (both Agents use code B).

Let’s turn now to the classic ‘Live or Die’ problem. In this problem, there are two drivers, Peter and Paul. If both drive on the same side of the road, they will be safe; they will crash if one adheres to one side of the road and the other to the opposite.

                            Paul drives on the left   Paul drives on the right
Peter drives on the left    Safe, Safe                Crash, Crash
Peter drives on the right   Crash, Crash              Safe, Safe

At Top left and at Bottom right, there is no incentive for either Driver to switch to the other side of the road given the driving strategy of the other driver. They will both be safe if they adopt this strategy. So both Top left and Bottom right are Nash equilibria.

In both other scenarios (Top right and Bottom left), there is a very strong incentive to switch to the other side given the driving strategy of the other Driver. So neither Top right nor Bottom left is a Nash equilibrium.

In summary, there are two Nash equilibria in the ‘Live or Die’ problem.  

Now let’s consider the case of two companies, Alligator PLC and Crocodile PLC, who each have the option of using one of two emblems. Let’s call the first the Blue Badger Emblem and the other the Black Bull emblem.

                                    Crocodile uses Black Bull emblem      Crocodile uses Blue Badger emblem
Alligator uses Black Bull emblem    1000 to Alligator; 500 to Crocodile   500 to Alligator; 1000 to Crocodile
Alligator uses Blue Badger emblem   500 to Alligator; 1000 to Crocodile   1000 to Alligator; 500 to Crocodile

Top left: Crocodile gains by switching from Black Bull to Blue Badger.

Top right: Alligator gains by switching from Black Bull to Blue Badger.

Bottom left: Alligator gains by switching from Blue Badger to Black Bull.

Bottom right: Crocodile gains by switching from Blue Badger to Black Bull.

So this game has no Nash equilibrium (in pure strategies). There is always an incentive to switch.

So how many Nash equilibria can there be in these sorts of game? Let us recall that if there is a set of ‘game’ strategies with the property that no ‘player’ can benefit by changing their strategy while the other players keep their strategies unchanged, then that set of strategies and the corresponding payoffs constitute what is known as the ‘Nash equilibrium’.

There may be one (e.g. the Friendly/Hostile game). There may be more than one (e.g. Spy problem, ‘Live or Die’ problem). There may be none (e.g. company emblems problem).

This leads us to the classic ‘Prisoner’s Dilemma’ problem. In this scenario, two prisoners, linked to the same crime, are offered a discount on their prison terms for confessing if the other prisoner continues to deny it, in which case the other prisoner will receive a much stiffer sentence. However, they will both be better off if both deny the crime than if both confess to it. The problem each faces is that they can’t communicate and strike an enforceable deal. The box diagram below shows an example of the Prisoner’s Dilemma in action.

                        Prisoner 2 Confesses             Prisoner 2 Denies
Prisoner 1 Confesses    2 years each                     Freedom for P1; 8 years for P2
Prisoner 1 Denies       8 years for P1; Freedom for P2   1 year each

The Nash Equilibrium is for both to confess, in which case they will both receive 2 years. But this is not the outcome they would have chosen if they could have agreed in advance to a mutually enforceable deal. In that case they would have chosen a scenario where both denied the crime and received 1 year each.

Note that the action that gave each of the prisoners the least jail time did not depend on what the other prisoner did. There was what is called a ‘dominant strategy’ for each player, and hence a single dominant strategy equilibrium. That’s the definition of a dominant strategy. It is the strategy that will give the highest payoff whatever the other person does.

Often there is no dominant strategy. We have already looked at such a situation. Driving on the right or on the left. If others drive on the right, your best response is to drive on the right too. If they drive on the left, your best response is to drive on the left. In the US, everyone driving on the right is an equilibrium, in the sense that no one would want to change their strategy given what others are doing. In game theory, if everyone is playing their best response to the strategies of everyone else, these strategies are, as we know, termed a Nash equilibrium. In Japan, though, Drive on the Left is a Nash equilibrium. So the Live or Die ‘game’ has two Nash equilibria but no dominant strategy equilibrium.

Many interactions do not have dominant strategy equilibria, but if we can find a Nash equilibrium, it gives us a prediction of what we should observe. So a Nash equilibrium is a stable state that involves interacting participants in which none can gain by a change of strategy as long as the other participants remain unchanged. It is not necessarily the best outcome for the parties involved, but it is the outcome we would most likely predict. Once again, we find that the best strategy in a world of rational self-interested people is not the one that is actually in their self-interest.

Perhaps the best example of an attempted real-life resolution to the Prisoner’s Dilemma was demonstrated in the TV ‘Golden Balls’ quiz show. In the game, two players must each select a ball which, unknown to the other player, is either a ‘Split’ or a ‘Steal’ ball. If both choose ‘Split’, they share the prize money. If both choose ‘Steal’, they each go away with nothing. If one chooses ‘Steal’ and one chooses ‘Split’, the contestant who chose ‘Steal’ wins all the money, and the contestant who chose ‘Split’ gets nothing. In this game, the Nash equilibrium among self-interested players is Steal-Steal, as Steal weakly dominates Split: against Split, Steal wins all the money rather than sharing it, while against Steal it loses nothing compared with choosing Split (the player wins nothing either way). Steal in the Golden Balls game is thus equivalent to Confess in the traditional Prisoner’s Dilemma game.

The YouTube video linked below is a classic demonstration of an attempt to resolve the dilemma.

Exercise

Is every Nash Equilibrium a Dominant Strategy Equilibrium? Is every Dominant Strategy Equilibrium a Nash Equilibrium? Illustrate your answer, using an example.

In the Golden Balls game, with no communication allowed outside the game format, is there a dominant strategy for each player? Is there a dominant strategy equilibrium? Is there a Nash equilibrium? If so, what is it?

References and Links

Social Interaction: Game Theory. CORE. https://core-econ.org/the-economy/book/text/04.html#41-social-interactions-game-theory

Equilibrium in the Invisible Hand Game. CORE. https://core-econ.org/the-economy/book/text/04.html#42-equilibrium-in-the-invisible-hand-game

The Prisoners’ Dilemma. CORE. https://core-econ.org/the-economy/book/text/04.html#43-the-prisoners-dilemma

Social Preferences: Altruism. CORE. https://core-econ.org/the-economy/book/text/04.html#44-social-preferences-altruism

Altruistic Preferences in the Prisoners’ Dilemma. CORE. https://core-econ.org/the-economy/book/text/04.html#45-altruistic-preferences-in-the-prisoners-dilemma

Social interactions: Conflicts in the choice among Nash equilibria. CORE. https://core-econ.org/the-economy/book/text/04.html#413-social-interactions-conflicts-in-the-choice-among-nash-equilibria

Social Interactions: Conclusion. CORE. https://core-econ.org/the-economy/book/text/04.html#414-conclusion

Repeated Game Strategies – in a nutshell.


If there is a set of ‘game’ strategies with the property that no ‘player’ can benefit by changing their strategy while the other players keep their strategies unchanged, then that set of strategies and the corresponding payoffs constitute what is known as the ‘Nash equilibrium’.

This leads us to the classic ‘Prisoner’s Dilemma’ problem. In this scenario, two prisoners, linked to the same crime, are offered a discount on their prison terms for confessing if the other prisoner continues to deny it, in which case the other prisoner will receive a much stiffer sentence. However, they will both be better off if both deny the crime than if both confess to it. The problem each faces is that they can’t communicate and strike an enforceable deal. The box diagram below shows an example of the Prisoner’s Dilemma in action.

                         Prisoner 2 Confesses                Prisoner 2 Denies
Prisoner 1 Confesses     2 years each                        Freedom for P1; 8 years for P2
Prisoner 1 Denies        8 years for P1; Freedom for P2      1 year each

The Nash Equilibrium is for both to confess, in which case they will both receive 2 years. But this is not the outcome they would have chosen if they could have agreed in advance to a mutually enforceable deal. In that case they would have chosen a scenario where both denied the crime and received 1 year each.

So a Nash equilibrium is a stable state that involves interacting participants in which none can gain by a change of strategy as long as the other participants remain unchanged. It is not necessarily the best outcome for the parties involved, but it is the outcome we would most likely predict.

The Prisoner’s Dilemma is a one-stage game, however. What happens in games with more than one round, where players can learn from the previous moves of the other players?

Take the case of a 2-round game. The payoff from the game will equal the sum of payoffs from both moves.

The game starts with two players, each of whom is given £100 to place into a pot. They can then secretly choose to honour the deal or to cheat on the deal, by means of giving an envelope to the host containing the card ‘Honour’ or ‘Cheat’.  If they both choose to ‘Honour’ the deal, an additional £100 is added to the pot, yielding each an additional £50. So they end up with £150 each. But if one honours the deal and the other cheats on the deal, the ‘Cheat’ wins the original pot (£200) and the ‘Honour’ player loses all the money in that round.  A third outcome is that both players choose to ‘Cheat’, in which case each keeps the original £100. So in this round, the dominant strategy for each player (assuming no further rounds) is to ‘Cheat’, as this yields a higher payoff if the opponent ‘Honours’ the deal (£200 instead of £150) and a higher payoff if the opponent ‘Cheats’ (£100 instead of zero). The negotiated, mutually enforceable outcome, on the other hand, would be to agree to both ‘Honour’ the deal and go away with £150.
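The dominance argument in the paragraph above can be sketched in a few lines. The final payoffs are taken from the game as described; the dictionary layout is just one convenient encoding:

```python
# Final payoffs (in £) for the one-shot Honour/Cheat game described above:
# key = (my move, opponent's move), value = what I end up with.
payoff = {
    ("Honour", "Honour"): 150,
    ("Honour", "Cheat"):  0,
    ("Cheat",  "Honour"): 200,
    ("Cheat",  "Cheat"):  100,
}

# 'Cheat' strictly dominates 'Honour' if it pays more against every
# possible move by the opponent.
dominates = all(
    payoff[("Cheat", opp)] > payoff[("Honour", opp)]
    for opp in ("Honour", "Cheat")
)
print(dominates)  # True
```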

But how does this change in a 2-round game?

Actually, it makes no difference. In this scenario, the second round is the final round, in which you may as well ‘Cheat’, as there are no future rounds in which to realise the benefit of any goodwill earned by honouring the deal. Your opponent knows this, so you can assume that an opponent who wishes to maximise his total payoff will ‘Cheat’ on the second move. He will assume the same about you.

Since you will both ‘Cheat’ on the second and final move, why be friendly on the first move?

So the dominant strategy is to ‘Cheat’ on the first round.

What if there are three rounds? The same applies. You know that your opponent will ‘Cheat’ on the final round and therefore the penultimate round as well. So your dominant strategy is to ‘Cheat’ on the first round, the second round and the final round. The same goes for your opponent. And so on. In any finite, pre-determined number of rounds, the dominant strategy in any round is to ‘Cheat.’

But what if the game involves an indeterminate number of moves? Suppose that after each move, you roll two dice. If you get a double-six, the game ends. Any other combination of numbers, play another round. Keep playing until you get a double-six. Your score for the game is the sum of your payoffs.
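Since a double-six has probability 1/36, a game of this kind lasts 36 rounds on average, though any particular game may end immediately or run far longer. A quick simulation sketch confirms this (the seed is fixed only for reproducibility):

```python
import random

random.seed(1)

def game_length():
    """Play rounds until two dice show a double-six; return rounds played."""
    rounds = 0
    while True:
        rounds += 1
        if random.randint(1, 6) == 6 and random.randint(1, 6) == 6:
            return rounds

trials = 100_000
average = sum(game_length() for _ in range(trials)) / trials
print(average)  # close to the theoretical expectation of 36
```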

This sort of game in fact mirrors many real-world situations. In real life, you often don’t know when the game will end.

What is the best strategy in repeated play? For the game outlined above, we shall denote ‘Honour the deal’ as a ‘Friendly’ move and ‘Cheat’ as a hostile move. But the notion of a Friendly or Hostile approach can adopt other guises in different games.

There are seven proposed strategies here.

  1. Always Friendly. Be friendly every time
  2. Always Hostile. Be hostile every time
  3. Retaliate. Be Friendly as long as your opponent is Friendly, but if your opponent is ever Hostile, be Hostile from that point on.
  4. Tit for tat. Be Friendly on the first move. Thereafter, do whatever your opponent did on the previous move.
  5. Random. On each move, toss a coin. If Heads, be Friendly. If tails, be Hostile.
  6. Alternate. Be Friendly on even-numbered moves, and Hostile on odd-numbered moves, or vice-versa.
  7. Fraction. Be Friendly on the first move. Thereafter, be Friendly if the fraction of times your opponent has been Friendly up to that point is greater than a half. Be Hostile if it is less than or equal to a half.

Which of these is the dominant strategy in this game of iterated play? Actually, there is no dominant strategy in an iterated game, but we can ask which strategy actually wins when every strategy plays every other strategy.

‘Always Hostile’ does best against ‘Always Friendly’ because every time you are Friendly against an ‘Always Hostile’, you are punished with the ‘sucker’ payoff.

‘Always Friendly’ does best against Retaliation, because the extra payoff you get from a Hostile move is eventually negated by the Retaliation.

Thus even the choice of whether to be Friendly or Hostile on the first move depends on the opponent’s strategy.

For every two distinct strategies, A and B, there is a strategy C against which A does better than B, and a strategy D against which B does better than A.

So which strategy wins when every strategy plays every other strategy in a tournament? This has been computer simulated many times. And the winner is Tit for Tat.
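Here is a minimal round-robin sketch of the tournament mechanics, using the per-round Honour (‘F’) / Cheat (‘H’) payoffs from the envelope game above, four of the listed strategies (Random, Alternate and Fraction omitted) and an illustrative fixed match length of 50 rounds. Note that in a tiny, unrepresentative field like this the blanket exploiter can still come out on top; Tit for Tat’s victories emerged in the much larger and more varied fields Axelrod studied:

```python
from itertools import combinations

# Per-round payoffs (in £): F/F -> 150 each, H/F -> 200/0, H/H -> 100 each.
PAYOFF = {("F", "F"): (150, 150), ("F", "H"): (0, 200),
          ("H", "F"): (200, 0),   ("H", "H"): (100, 100)}

# Each strategy sees its own history and the opponent's history.
def always_friendly(mine, theirs): return "F"
def always_hostile(mine, theirs):  return "H"
def tit_for_tat(mine, theirs):     return theirs[-1] if theirs else "F"
def retaliate(mine, theirs):       return "H" if "H" in theirs else "F"

def play(s1, s2, rounds=50):
    """Return total scores for one match between two strategies."""
    h1, h2 = [], []
    score1 = score2 = 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

strategies = [always_friendly, always_hostile, tit_for_tat, retaliate]
totals = {s.__name__: 0 for s in strategies}
for s1, s2 in combinations(strategies, 2):
    a, b = play(s1, s2)
    totals[s1.__name__] += a
    totals[s2.__name__] += b
print(totals)
```

Notice that Tit for Tat never beats any individual opponent (against Always Hostile it concedes exactly one ‘sucker’ round), yet it scores well across the whole field; that is the property the text describes.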

It’s true that Tit for Tat can never get a higher score than a particular opponent, but it wins tournaments where each strategy plays every other strategy. In particular, it does well against Friendly strategies, while it is not exploited by Hostile strategies. So you can trust Tit for Tat. It won’t take advantage of another strategy. Tit for Tat and its opponents both do best when both are Friendly. Look at it this way. There are two reasons for a player to be unilaterally hostile: to take advantage of an opponent, or to avoid being taken advantage of by an opponent. Tit for Tat eliminates both reasons for being Hostile.

What accounts for Tit for Tat’s success, therefore, is its combination of being nice, retaliatory, forgiving and clear.

In other words, success in an evolutionary ‘game’ is correlated with the following characteristics:

Be willing to be nice: cooperate, never be the first to defect.

Don’t be played for a sucker: return defection for defection, cooperation for cooperation.

Don’t be envious: focus on how well you are doing, as opposed to ensuring you are doing better than everyone else.

Be forgiving if someone is willing to change their ways and co-operate with you. Don’t bear grudges for old actions.

Don’t be too clever or too tricky. Clarity is essential for others to cooperate with you.

As Robert Axelrod, who pioneered this area of game theory, puts it in his book ‘The Evolution of Cooperation’: Tit for Tat’s “niceness prevents it from getting into unnecessary trouble. Its retaliation discourages the other side from persisting whenever defection is tried. Its forgiveness helps restore mutual cooperation. And its clarity makes it intelligible to the other player, thereby eliciting long-term cooperation.”

How about the bigger picture? Can Tit for Tat perhaps teach us a lesson in how to play the game of life? Yes, in my view it probably can.

Further Reading and Links

Axelrod, Robert (1984), The Evolution of Cooperation, Basic Books

Axelrod, Robert (2006), The Evolution of Cooperation (Revised ed.), Perseus Books Group

Axelrod, R. and Hamilton, W.D. (1981), The Evolution of Cooperation, Science, 211, 1390-96. http://www-personal.umich.edu/~axe/research/Axelrod%20and%20Hamilton%20EC%201981.pdf

https://en.wikipedia.org/wiki/The_Evolution_of_Cooperation

How to use game theory to take a penalty – in a nutshell.


It’s 2020 and our mythical El Clasico game between Real Madrid and Barcelona is in the 23rd minute at the Santiago Bernabeu when Lionel Messi is brought down in the penalty box. He is rewarded with a spot kick against the custodian of the Los Blancos net, Keylor Navas.

Messi knows from the team statistician that if he aims straight and the goalkeeper stands still, his chance of scoring is just 30%. But if he aims straight and Navas dives to one corner, his chance of converting the penalty rises to 90%.

On the other hand, if Messi aims at a corner and the goalkeeper stands still, his chance of scoring is a solid 80%, while it falls to 50% if the goalkeeper dives to a corner.

We are here simplifying the choices to two distinct options, for the sake of clarity.

Navas also knows from his team statistician that if he dives to one corner and Messi aims straight, his chance of saving is just 10%. But if he stands still and Messi aims at one corner, his chance of saving the penalty rises to 50%.

On the other hand, if Navas stands still and Messi aims at a corner, his chance of making the save is just 20%, while it rises to 70% if Messi aims straight.

So this is the payoff matrix, so to speak, facing Messi as he weighs up his decision.

                                  Goalkeeper – Stands still    Goalkeeper – Dives to one corner
Lionel Messi – Aims straight               30%                             90%
Lionel Messi – Aims at corner              80%                             50%

So what should he do? Aim straight or to a corner? And what should Navas do? Stand still or dive?

Here is the payoff matrix facing Navas.

                               Messi – Aims straight    Messi – Aims at a corner
Navas – Stands still                    70%                       20%
Navas – Dives to one corner             10%                       50%

Game theory can help here.

Neither player has what is called a dominant strategy in game-theoretic terms, i.e. a strategy that is better than the other, no matter what the opponent does. The optimal strategy will depend on what the opponent’s strategy is.

In such a situation, game theory indicates that both players should mix their strategies, in Messi’s case aiming for the corner with a two-thirds chance, while the goalkeeper should dive with a 5/9 chance.

These figures are derived by finding the ratio where the chance of scoring (or saving) is the same, whichever of the two tactics the other player uses.

The Proof

Suppose the goalkeeper opts to stand still, then Messi’s chance (if he aims for the corner 2/3 of the time) = 1/3 x 30% + 2/3 x 80% = 10% + 53.3% = 63.3%

If the goalkeeper opts to dive, Messi’s chance = 1/3 x 90% + 2/3 x 50% = 30% + 33.3% = 63.3%

Adopting this mixed strategy (aim for the corner 2/3 of the time and shoot straight 1/3 of the time), the chance of scoring is therefore the same. This is the ideal mixed strategy, according to standard game theory.

From the point of view of Navas, on the other hand, if Messi aims straight, his  chance of saving the penalty kick (if he dives 5/9 of the time) = 5/9 x 10% + 4/9 x 70% = 5.6% + 31.1% = 36.7%

If Messi opts to aim for the corner, Navas’ chance = 5/9 x 50% + 4/9 x 20% = 27.8% + 8.9% = 36.7%

Adopting this mixed strategy (dive to a corner 5/9 of the time and stand still 4/9 of the time), the chance of saving is therefore the same whichever option Messi chooses. This is the ideal mixed strategy, according to standard game theory.

The chances of Messi scoring and Navas making the save in each case add up to 100%, which cross-checks the calculations.
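The indifference calculations can be cross-checked with exact arithmetic, using the percentages from the payoff tables above:

```python
from fractions import Fraction

# x = probability Messi aims at the corner; indifference requires
#   80x + 30(1-x) = 50x + 90(1-x)  =>  90x = 60  =>  x = 2/3
x = Fraction(60, 90)

# y = probability Navas dives; indifference requires
#   10y + 70(1-y) = 50y + 20(1-y)  =>  90y = 50  =>  y = 5/9
y = Fraction(50, 90)

# Cross-check: Messi's scoring chance is identical against either
# goalkeeper action when he mixes with probability x.
stand = 80 * x + 30 * (1 - x)
dive = 50 * x + 90 * (1 - x)
print(x, y, stand, dive)  # 2/3 5/9 190/3 190/3
```

Using `Fraction` keeps the arithmetic exact, so the two scoring chances come out identical (190/3, i.e. 63.3%) rather than merely close after rounding.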

Of course, if the striker or the goalkeeper gives away real new information about what he will do, then each of them can adjust tactics and increase their chance of scoring or saving.

To properly operationalise a mixed strategy requires one extra element, and that is the ability to truly randomise the choices, so that Messi actually does have exactly a 2/3 chance of aiming for the corner, and Navas actually does have a 5/9 chance of diving for the corner. There are different ways of achieving this. One method of achieving a 2/3 ratio is  to roll a die and go for the corner if it comes up 1, 2, 3 or 4, and aim straight if it comes up 5 or 6. Or perhaps not! But you get the idea.

For the record, Messi aimed at the left corner, Navas guessed correctly and got an outstretched hand to it, pushing it back into play. Leo stepped forward deftly to score the rebound. Cristiano Ronaldo equalised from the spot eight minutes later. And that’s how it ended at the Bernabeu. Real Madrid 1 Barcelona 1. Honours even in El Clasico.

Appendix

Messi’s strategy

x = chance that Messi should aim at corner

y = chance that Messi should aim straight

So,

80x + 30y (if Navas stands still) = 50x + 90y (if Navas dives)

x + y = 1

So,

30x = 60y

30x = 60 (1-x)

90x = 60

x = 2/3

y=1/3

Navas’ strategy

x = chance that Navas should dive to corner

y  = chance that Navas should stand still

So,

10x + 70y (if Messi aims straight) = 50x + 20y (if Messi aims at corner)

x+y = 1

So,

10x + 70y = 50x + 20y

40x = 50y

40x = 50(1-x)

90x = 50

x = 5/9

y = 4/9

References and Links

Game Theory: Mixed Strategies Explained. https://www.theprofs.co.uk/library/pdf/mixed-strategy-game-theory-examples.pdf

Simplicity and the Search for Truth – Guide Notes.


William of Occam (also spelled William of Ockham) was a 14th-century English philosopher. At the heart of Occam’s philosophy is the principle of simplicity, and Occam’s Razor has come to embody the method of eliminating unnecessary hypotheses. Essentially, Occam’s Razor holds that the theory which explains all (or the most) while assuming the least is the most likely to be correct. This is the principle of parsimony – explain more, assume less. Put more elegantly, it is the principle of ‘pluralitas non est ponenda sine necessitate’ (plurality must never be posited beyond necessity).

Empirical support for the Razor can be drawn from the phenomenon of ‘overfitting’. In statistics, ‘overfitting’ occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting generally occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. Critically, a model that has been overfit will generally have poor predictive performance, as it can exaggerate minor fluctuations in the data.

We can also look at it through the lens of what is known as Solomonoff Induction. Whether a detective trying to solve a crime, a physicist trying to discover a new universal law, or an entrepreneur seeking to interpret some latest sales figures, all are involved in collecting information and trying to infer the underlying causes. The problem of induction is this: We have a set of observations (or data), and want to find the underlying causes of those observations, i.e. to find hypotheses that explain our data. We’d like to know which hypothesis is correct, so we can use that knowledge to predict future events. In doing so, we need to create a set of defined steps to arrive at the truth, a so-called algorithm for truth.

In particular, if all of the hypotheses are possible but some are more likely than others, how do you weight the various hypotheses? This is where Occam’s Razor comes in.

Consider, for example, the following two 32-character sequences:

abababababababababababababababab

4c1j5b2p0cv4w1x8rx2y39umgw5q85s7

The first can be written “ab 16 times”. The second probably cannot be simplified further.

Now consider the following problem. A computer program outputs the following sequence of numbers: 1, 3, 5, 7. What rule do you think gave rise to the number sequence 1, 3, 5, 7? If we know this, it will help us to predict what the next number in the sequence is likely to be, if there is one. Two hypotheses spring instantly to mind. It could be: 2n-1, where n is the step in the sequence. So the third step, for example, gives 2×3-1 = 5. If this is the correct rule generating the observations, the next step in the sequence will be 9 (2×5-1).

But it’s possible that the rule generating the number sequence is: 2n-1 + (n-1)(n-2)(n-3)(n-4). This agrees with the first rule on every step observed so far: the third step, for example, gives 2×3-1 + (3-1)(3-2)(3-3)(3-4) = 5 + 0 = 5. In this case, however, the next step in the sequence (n = 5) will be 2×5-1 + 4×3×2×1 = 33.

But doesn’t the first hypothesis seem more likely? Occam’s Razor is the principle behind this intuition. “Among all hypotheses consistent with the observations, the simplest is the most likely.”
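A short script makes the point concrete: the two candidate rules agree on every observed step and then diverge at the very next one.

```python
def simple_rule(n):
    """Hypothesis 1: 2n - 1."""
    return 2 * n - 1

def complex_rule(n):
    """Hypothesis 2: 2n - 1 + (n-1)(n-2)(n-3)(n-4)."""
    return 2 * n - 1 + (n - 1) * (n - 2) * (n - 3) * (n - 4)

# Both hypotheses fit the observed data perfectly...
print([simple_rule(n) for n in range(1, 5)])   # [1, 3, 5, 7]
print([complex_rule(n) for n in range(1, 5)])  # [1, 3, 5, 7]

# ...but they diverge at the very next step.
print(simple_rule(5), complex_rule(5))  # 9 33
```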

More generally, say we have two different hypotheses about the rule generating the data. How do we decide which is more likely to be true? To start, is there a language in which we can express all problems, all data, all hypotheses? Let’s look at binary data. This is the name for representing information using only the characters ‘0’ and ‘1’. In a sense, binary is the simplest possible alphabet. With these two characters we can encode information. Each 0 or 1 in a binary sequence (e.g. 01001011) can be considered the answer to a yes-or-no question. And in principle, all information can be represented in binary sequences. Indeed, being able to do everything in the language of binary sequences simplifies things greatly, and gives us great power. We can treat everything contained in the data in the same way.

Now that we have a simple way to deal with all types of data, we need to look at the hypotheses, in particular how to assign prior probabilities to the hypotheses. When we encounter new data, we can then use Bayes’ Theorem to update these probabilities.

To be complete, to guarantee we find the real explanation for our data, we have to consider all possible hypotheses. But how could we ever find all possible explanations for our data?

By using the language of binary, we can do so.

Here we look to the concept of Solomonoff induction, in which the assumption we make about our data is that it was generated by some algorithm, i.e. the hypothesis that explains the data is an algorithm. Now we can find all the hypotheses that would predict the data we have observed. Given our data, we find potential hypotheses to explain it by running every hypothesis, one at a time. If the output matches our data, we keep it. Otherwise, we discard it. We now have a methodology, at least in theory, to examine the whole list of hypotheses that might be the true cause behind our observations.

The first thing is to imagine that for each bit of the hypothesis, we toss a coin. Heads will be 0, and tails will be 1. Take as an example 01001101: the coin landed heads, tails, heads, heads, and so on. Because each toss of the coin has a 50% probability, each bit contributes ½ to the final probability. Therefore, an algorithm that is one bit longer is half as likely to be the true algorithm. This intuitively fits with Occam’s Razor: a hypothesis that is 8 bits long is much more likely than a hypothesis that is 34 bits long. Why bother with extra bits? We’d need evidence to show that they were necessary. So why not take the shortest hypothesis and call that the truth? Because all of the hypotheses predict the data we have so far, and in the future we might get data to rule out the shortest one. The more data we get, the easier it is likely to become to pare down the number of competing hypotheses which fit the data.
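A two-line sketch of this weighting (the bit strings are arbitrary examples):

```python
# Prior weight of a hypothesis encoded as a binary string: each bit is a
# fair coin toss, so an n-bit hypothesis gets prior probability 2**-n.
def prior(bits: str) -> float:
    return 2.0 ** -len(bits)

short = prior("01001101")   # an 8-bit hypothesis
long_ = prior("0" * 34)     # any 34-bit hypothesis
print(short / long_)        # 2**26: roughly 67 million times more probable
```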

Turning now to ‘ad hoc’ hypotheses and the Razor. In science and philosophy, an ‘ad hoc hypothesis’ is a hypothesis added to a theory in order to save it from being falsified. Ad hoc hypothesising is compensating for anomalies not anticipated by the theory in its unmodified form. For example, you say that there is a leprechaun in your garden shed. A visitor to the shed sees no leprechaun. That is because he is invisible, you say. The visitor spreads flour on the ground to see his footprints. He floats, you declare. The visitor asks him to speak. He has no voice, you say. More generally, for each accepted explanation of a phenomenon, there is generally an infinite number of possible, more complex alternatives. Each true explanation may therefore have had many alternatives that were simpler and false, but also a practically unlimited number of alternatives that are more complex and false.

This leads us to the idea of what I term ‘Occam’s Leprechaun’. Any new and more complex theory can always possibly be true. For example, if an individual claims that leprechauns were responsible for breaking a vase that he is suspected of breaking, the simpler explanation is that he is not telling the truth, but ongoing ad hoc explanations (e.g. “That’s not me on the CCTV, it’s a leprechaun disguised as me”) prevent outright falsification. An endless supply of elaborate competing explanations, called ‘saving hypotheses’, prevents ultimate falsification of the leprechaun hypothesis, but appeal to Occam’s Razor helps steer us towards the probable truth. Another way of looking at this is that simpler theories are more easily falsifiable, and hence possess more empirical content.

All assumptions introduce possibilities for error; if an assumption does not improve the accuracy of a theory, its only effect is to increase the probability that the overall theory is wrong.

It can also be looked at this way. The prior probability that a theory based on n+1 assumptions is true must be less than that of a theory based on n assumptions, unless the additional assumption is a consequence of the previous assumptions. For example, the prior probability that Jack is a train driver AND owns a Mini Cooper must be less than the prior probability that Jack is simply a train driver, unless all train drivers own Mini Coopers, in which case the prior probabilities are identical.

Again, the prior probability that Jack is a train driver and a Mini Cooper owner and a ballet dancer is less than the prior probability that he is just the first two, unless all train drivers are not only Mini Cooper owners but also ballet dancers. In the latter case, the prior probabilities of the n and n+1 assumptions are the same.
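A numerical sketch of the same point, with made-up probabilities and an assumption of independence purely for illustration:

```python
# Illustrative (made-up) probabilities: each extra independent assumption
# multiplies the prior down.
p_driver = 0.01   # P(Jack is a train driver)
p_mini   = 0.3    # P(owns a Mini Cooper), assumed independent
p_ballet = 0.05   # P(is a ballet dancer), assumed independent

p1 = p_driver
p2 = p_driver * p_mini
p3 = p_driver * p_mini * p_ballet
print(p1 > p2 > p3)  # True: each added assumption lowers the prior
```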

From Bayes’ Theorem, we know that reducing the prior probability will reduce the posterior probability, i.e. the probability that a proposition is true after new evidence arises.

Science prefers the simplest explanation that is consistent with the data available at a given time, but even so the simplest explanation may be ruled out as new data become available. This does not invalidate the Razor, which does not state that simpler theories are necessarily more true than more complex theories, but that when more than one theory explains the same data, the simpler should be accorded more probabilistic weight. The theory which explains all (or the most) and assumes the least is most likely. So Occam’s Razor advises us to keep explanations simple. But it is also consistent with multiplying entities necessary to explain a phenomenon. A simpler explanation which fails to explain as much as another more complex explanation is not necessarily the better one. So if leprechauns don’t explain anything they cannot be used as proxies for something else which can explain something.

More generally, we can now unify Epicurus and Occam. From Epicurus’ Principle we need to keep open all hypotheses consistent with the known evidence which are true with a probability of more than zero. From Occam’s Razor we prefer from among all hypotheses that are consistent with the known evidence, the simplest. In terms of a prior distribution over hypotheses, this is the same as giving simpler hypotheses higher a priori probability, and more complex ones lower probability.

From here we can move to the wider problem of induction: inference about the unknown by extrapolating a pattern from the known. Specifically, the problem of induction is how we can justify inductive inference. According to Hume’s ‘Enquiry Concerning Human Understanding’ (1748), if we justify induction on the basis that it has worked in the past, then we have to use induction to justify why it will continue to work in the future. This is circular reasoning, and therefore faulty theory. “Induction is just a mental habit, and necessity is something in the mind and not in the events.” Yet in practice we cannot help but rely on induction. We work from the idea that it works in practice, if not in theory – so far. Induction is thus related to an assumption about the uniformity of nature.

Of course, induction can be turned into deduction by adding principles about the world (such as ‘the future resembles the past’, or ‘space-time is homogeneous’). We can also assign to inductive generalisations probabilities that increase as the generalisations are supported by more and more independent events. This is the Bayesian approach, and it is a response to the perspective pioneered by Karl Popper. From the Popperian perspective, a single observational event may prove a hypothesis wrong, but no finite sequence of events can verify it correct. Induction is from this perspective theoretically unjustifiable and becomes in practice the choice of the simplest generalisation that resists falsification. The simpler a hypothesis, the easier it is to falsify. Induction plus falsifiability are, from this viewpoint, as good as it gets in science. Take an inductive inference problem where there is some observed data and a set of hypotheses, one of which may be the true hypothesis generating the data. The task then is to decide which hypothesis, or hypotheses, are the most likely to be responsible for the observations.

A better way of looking at this seems to be to abandon certainties and think probabilistically. Entropy is the tendency of isolated systems to move toward disorder, and a quantification of that disorder; e.g. assembling a deck of cards in a defined order requires introducing some energy to the system, and if you drop the deck, the cards become disorganised and won’t re-organise themselves automatically. This tendency of all systems to disorder is the Second Law of Thermodynamics, which implies that time is asymmetrical with respect to the amount of order: as a system advances through time, it will statistically become more disordered. By ‘order’ and ‘disorder’ we mean how compressed the information is that describes the system. So if all your papers are in one neat pile, the description is ‘all papers in one neat pile’. If you drop them, the description becomes ‘one paper to the right, another to the left, one above, one below’, and so on. The longer the description, the higher the entropy. According to Occam’s Razor, we want a theory with low entropy, i.e. low disorder, high simplicity. The lower the entropy, the more likely it is that the theory is the true explanation of the data, and hence the higher the probability that should be assigned to it.

More generally, whatever theory we develop, say to explain the origin of the universe, or consciousness, or non-material morality, must itself be based on some theory, which is based on some other theory, and so on. At some point we need to rely on some statement which cannot itself be proved, and which might therefore be false, even if it is actually true. We can never solve the ultimate problem of induction, but if we accept that, then Occam’s Razor combined with Epicurus, Bayes and Popper is as good as it gets. So Epicurus, Occam, Bayes and Popper help us pose the right questions, and help us to establish a good framework for thinking about the answers.

At least that applies to the realm of established scientific enquiry and the pursuit of scientific truth. How far it can properly be extended beyond that is a subject of intense and continuing debate.

References and Links

McFadden, Johnjoe. 2021. Life is Simple. London: Basic Books.

Occam’s Razor. Principia Cybernetica Web. http://pespmc1.vub.ac.be/OCCAMRAZ.html

What is Occam’s Razor. UCR Math. http://math.ucr.edu/home/baez/physics/General/occam.html

Occam’s Razor. Simple English Wikipedia. https://simple.wikipedia.org/wiki/Occam%27s_razor

Occam’s Razor. Wikipedia. https://en.wikipedia.org/wiki/Occam%27s_razor

An Intuitive Explanation of Solomonoff Induction. LESSWRONG. Alex Altair. July 11, 2012. https://www.lesswrong.com/posts/Kyc5dFDzBg4WccrbK/an-intuitive-explanation-of-solomonoff-induction

The Four Card Task – in a nutshell.


You are presented with four cards, with the face-up side on display, showing either a letter or a number. You are promised that each has a letter on one side and a number on the other.

Red Card displays the letter D

Orange Card displays the letter N

Blue Card displays the number 21

Yellow Card displays the number 16

You are now presented with the following statement: Every card with D on one side has 21 on the other side.

Exercise

a. What is the minimum number of cards needed to determine whether this statement is true? What are the colours of the cards you need to turn over to determine this?

b. Four cards are placed on a table, each of which has a number on one side and a patch of colour on the other side. The visible faces of the cards display 1, 4, red and yellow. What is the minimum number of cards you need to turn over to test the truth of the proposition that a card with an even number on one face is red on the other side?

c. Four cards are placed on a table, two of which display a number, 16 or 25. The other two cards display a soft drink and an alcoholic drink. The minimum age for drinking alcohol is 18. What is the minimum number of cards you need to turn over to test the truth of the proposition that a card with a number greater than 18 on one side has an alcoholic drink on the other side?

References and Links

The Famous Four Card Task. Social Psychology Network. https://www.socialpsychology.org/teach/wason.htm

Wason Selection Task. Wikipedia. https://en.wikipedia.org/wiki/Wason_selection_task

Biases at the racetrack – in a nutshell.

The Favourite-Longshot Bias is the well-established tendency in most betting markets for bettors to bet too much on ‘longshots’ (events with long odds, i.e. low probability events) and to relatively under-bet ‘favourites’ (events with short odds, i.e. high probability events). This is strangely counterintuitive as it seems to offer a sure-fire way to make above-average returns in the betting booth. Assume, for example, that Mr. Smith and Mr. Jones both start with £1,000. Now Mr. Smith places a level £10 stake on 100 horses quoted at 2 to 1. Meanwhile, Mr. Jones places a level £10 stake on 100 horses quoted at 20 to 1.

Who is likely to end up with more money at the end? Surely the answer should be the same for both. Otherwise, either Mr. Smith or Mr. Jones would seem to be doing something very wrong. So let’s take a look.

The Ladbrokes Flat Season Pocket Companion for 1990 provides a nicely laid out piece of evidence here for British flat horse racing between 1985 and 1989, but the same sort of pattern applies for any set of years we care to choose, or (with a few rare exceptions) pretty much any sport, anywhere.

In fact, the table conveniently presented in the Companion shows that not one out of 35 favourites sent off at 1/8 or shorter (as short as 1/25) lost between 1985 and 1989. This means a return of between 4% and 12.5% in a couple of minutes, which is an astronomical rate of interest.  The point being made is that broadly speaking the shorter the odds, the better the return. The group of ‘white hot’ favourites (odds between 1/5 and 1/25) won 88 out of 96 races for a 6.5% profit.  The following table looks at other odds groupings.

Odds          Wins     Runs     Profit        %
1/5-1/2        249      344     +£1.80     +0.52
4/7-5/4        881     1780    -£82.60     -4.64
6/4-3/1       2187     7774      -£629     -8.09
7/2-6/1       3464    21681     -£2237    -10.32
8/1-20/1      2566    53741    -£19823    -36.89
25/1-100/1     441    43426    -£29424    -67.76
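
The final column can be checked directly: with level £1 stakes, the percentage return in each band is simply the profit divided by the number of runners. A quick sketch in Python, using the figures from the table:

```python
# Reproduce the percentage-return column of the table: rate of return in
# each odds band is profit divided by runs (level £1 stakes), as a %.

bands = [
    # (odds band, wins, runs, profit in £)
    ("1/5-1/2",     249,   344,      1.80),
    ("4/7-5/4",     881,  1780,    -82.60),
    ("6/4-3/1",    2187,  7774,   -629.00),
    ("7/2-6/1",    3464, 21681,  -2237.00),
    ("8/1-20/1",   2566, 53741, -19823.00),
    ("25/1-100/1",  441, 43426, -29424.00),
]

for band, wins, runs, profit in bands:
    pct = 100 * profit / runs
    print(f"{band:>11}  {wins:>5} wins / {runs:>5} runs  {pct:+.2f}%")
```

The printed percentages match the table, and the pattern is clear at a glance: the longer the odds band, the worse the rate of return.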

An interesting argument advanced by Robert Henery in 1985 is that the favourite-longshot bias is a consequence of bettors discounting a fixed fraction of their losses, i.e. they underweight their losses compared to their gains, and this causes them to bias their perceptions of what they have won or lost in favour of longshots. The rationale behind Henery’s hypothesis is that bettors will tend to explain away and therefore discount losses as atypical, or unrelated to the judgment of the bettor. This is consistent with contemporaneous work on the psychology of gambling. These studies demonstrate how gamblers tend to discount their losses, often as ‘near wins’ or the outcome of ‘fluke’ events, while bolstering their wins.

If the Henery Hypothesis is correct as a way of explaining the favourite-longshot bias, the bias can be explained as the natural outcome of bettors’ pre-existing perceptions and preferences. There is little evidence that the market offers opportunities for players to earn consistent profits, but they certainly do much better (lose a lot less) by a blind level-stakes strategy of backing favourites instead of longshots. Intuitively, we would think that people would wise up and switch their money away from the longshots to the favourites. In that case, favourites would become less good value, as their odds would shorten, and longshots would become better value as their odds would lengthen. But it doesn’t happen, despite a host of published papers pointing this out, as well as the Ladbrokes Pocket Companion. People continue to love their longshots, and are happy to pay a price for this love.

Are there other explanations for the persistence of the favourite-longshot bias? One explanation is based on consumer preference for risk. The idea here is that bettors are risk-loving and so prefer the risky series of long runs of losses followed by the odd big win to the less risky strategy of betting on favourites that will win more often albeit pay out less for each win. Such an assumption of risk-love by bettors, however, runs contrary to conventional explanations of financial behaviour which tend to assume people like to avoid risk. It’s also been argued that bettors are actually not risk-lovers but skewness-lovers, which would also explain a preference for backing longshots over favourites.

Another explanation that has been proposed for the existence of the bias is based on the existence of unskilled bettors in the context of high margins and other costs of betting which deter more skilled agents. These unskilled bettors find it more difficult to arbitrate between the true win probabilities of different horses, and so over-bet those offered at longer odds. One test of this hypothesis is to compare the size of the bias in person-to-person betting exchanges (characterised by lower margins) and bookmaker markets (higher margins). The bias was indeed lower in the former, a finding which is at least consistent with this theory.

So far, it should be noted that these are all demand-side explanations, i.e. based on the behaviour of bettors. Another explanation of at least some of the bias is the idea that odds-setters defend themselves against bettors who potentially have superior information to bookmakers by artificially squeezing odds at the longer end of the market. Even so, the favourite-longshot bias continues to exist in so-called ‘pari-mutuel’ markets, in which there are no odds-setters, but instead a pool of all bets which is paid out (minus fixed operator deductions) to winning bets. To the extent that the favourite-longshot bias cannot be fully explained by this odds-squeezing explanation, we can classify the remaining explanations as either preference-based or perception-based. Risk love or skewness love are examples of preference-based explanations. Discounting of losses or other explanations based on a poor assessment of the true probabilities can be categorized as perception-based explanations.

The favourite-longshot bias has even been found in online poker, especially in lower-stake games. In that context, the evidence suggests that it was misperception of probabilities rather than risk-love that offered the best explanation for the bias.

In conclusion, the favourite-longshot bias is a well-established market anomaly in sports betting markets, which can be traced in the published academic literature as far back as 1949. Explanations can broadly be divided into demand-based and supply-based, preference-based and perceptions-based. A significant amount of modern research has been focused on seeking to arbitrate between these competing explanations of the bias by formulating predictions as to how data derived from these markets would behave if one or other explanation was correct. A compromise position, which may or may not be correct, is that all of these explanations have some merit, the relative merit of each depending on the market context.

Appendix

Let’s look more closely at how the Henery odds transformation works.

If the true probability of a horse losing a race is q, then the true odds against winning are q/(1-q).

For example, if the true probability of a horse losing a race (q) is ¾, the chance that it will win the race is ¼, i.e. 1 – ¾. The odds against it winning are: q/(1-q) = (3/4)/(1 – 3/4) = (3/4)/(1/4) = 3/1.

Henery now applies a transformation whereby the bettor will assess the chance of losing not as q, but as Q which is equal to fq, where f is the fixed fraction of losses undiscounted by the bettor.

If, for example, f = ¾, and the true chance of a horse losing is ½ (q=1/2), then the bettor will rate subjectively the chance of the horse losing as Q = fq.

So Q = ½ x ¾ = 3/8, i.e. a subjective chance of winning of 5/8.

So the perceived (subjective) odds of winning associated with true (objective) odds of losing of 50% (Evens, i.e. q=1/2) is 3/5 (62.5%), i.e. odds-on.

This is derived as follows:

Q/(1-Q) = fq/(1-fq) = 3/8/(1-3/8) = 3/8/(5/8) = 3/5

If the true probability of a horse losing a race is 80%, so that the true odds against winning are 4/1 (q = 0.8), then the bettor will assess the chance of losing not as q, but as Q which is equal to fq, where f is the fixed fraction of losses undiscounted by the bettor.

If, for example, f = ¾, and the true chance of a horse losing is 4/5 (q=0.8), then the bettor will rate subjectively the chance of the horse losing as Q = fq.

So Q = 3/4 x 4/5 = 12/20, i.e. a subjective chance of winning of 8/20 (2/5).

So the perceived (subjective) odds of winning associated with true (objective odds) of losing of 80% (4 to 1, i.e. q=0.8) is 6/4 (40%).

This is derived as follows:

Q/(1-Q) = fq/(1-fq) = 12/20 / (1-12/20) = 12/8 = 6/4

To take this to the limit, if the true probability of a horse losing a race is 100%, so that the true odds against winning are ∞ to 1 against (q = 1), then the bettor will again assess the chance of losing not as q, but as Q which is equal to fq, where f is the fixed fraction of losses undiscounted by the bettor.

If, for example, f = ¾, and the true chance of a horse losing is 100% (q=1), then the bettor will rate subjectively the chance of the horse losing as Q = fq.

So Q = 3/4 x 1 = 3/4, i.e. a subjective chance of winning of 1/4.

So the perceived (subjective) odds of winning associated with true (objective odds) of losing of 100% (∞ to 1, i.e. q=1) is 3/1 (25%).

This is derived as follows:

Q/(1-Q) = fq/(1-fq) = 3/4 / (1/4) = 3/1

Similarly, if the true probability of a horse losing a race is 0%, so that the true odds against winning are 0 to 1 against (q = 0), then the bettor will assess the chance of losing not as q, but as Q which is equal to fq, where f is the fixed fraction of losses undiscounted by the bettor.

If, for example, f = ¾, and the true chance of a horse losing is 0% (q=0), then the bettor will rate subjectively the chance of the horse losing as Q = fq.

So Q = 3/4 x 0 = 0, i.e. a subjective chance of winning of 1.

So the perceived (subjective) odds of winning associated with true (objective) odds of losing of 0% (0 to 1, i.e. q=0) is also 0/1.

This is derived as follows:

Q/(1-Q) = fq/(1-fq) = 0 / 1 = 0/1

This can all be summarised in a table.

Objective odds (against)       Subjective odds (against)
Evens                          3/5
4/1                            6/4
Infinity to 1                  3/1
0/1                            0/1
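
The Henery transformation can be sketched in a few lines of Python. This is a minimal illustration, assuming f = ¾ as in the worked examples above; note that 6/4 appears in lowest terms as 3/2.

```python
from fractions import Fraction

# Sketch of the Henery transformation: the bettor discounts losses, so a
# true losing probability q is perceived as Q = f*q, where f is the fixed
# fraction of losses undiscounted. Subjective odds against = Q/(1-Q).

def subjective_odds(q, f=Fraction(3, 4)):
    """Perceived odds against winning, given true losing probability q."""
    Q = f * q            # perceived probability of losing
    return Q / (1 - Q)   # subjective odds against winning

for q in [Fraction(1, 2), Fraction(4, 5), Fraction(1), Fraction(0)]:
    true_odds = "infinity" if q == 1 else q / (1 - q)
    print(f"true odds against: {true_odds} -> subjective: {subjective_odds(q)}")
```

Exact rational arithmetic (via `fractions`) reproduces the table: 3/5, 3/2 (i.e. 6/4), 3 (i.e. 3/1) and 0.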

We can now use these stylised examples to establish the bias.

In particular, the implication of the Henery odds transformation is that, for a given f of ¾, 3/5 is perceived as fair odds for a horse with a 1 in 2 chance of winning.

In fact, £100 wagered at 3/5 yields £160 (3/5 x £100, plus stake returned) half of the time (true odds = evens), i.e. an expected return of £80.

£100 wagered at 6/4 yields £250 (6/4 x £100, plus the stake back) one fifth of the time (true odds = 4/1), i.e. an expected return of £50.

£100 wagered at 3/1 would yield £400 (3/1 x £100, plus the stake back), but it wins none of the time (true odds = Infinity to 1), i.e. an expected return of £0.

It can be shown that the higher the odds the lower is the expected rate of return on the stake, although the relationship between the subjective and objective probabilities remains at a fixed fraction throughout.
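
The three worked examples above can be checked with a short sketch, under the same assumptions (f = ¾, £100 stakes); the `expected_return` helper is illustrative, multiplying the payout at the subjective odds by the true win probability.

```python
from fractions import Fraction

# Expected return on a £100 stake placed at the subjective (Henery) odds,
# when the horse actually wins with its true probability 1 - q.

def expected_return(q, stake=100, f=Fraction(3, 4)):
    Q = f * q                      # perceived losing probability
    odds = Q / (1 - Q)             # subjective odds against winning
    payout = stake * odds + stake  # winnings plus stake returned
    return (1 - q) * payout        # true win probability times payout

for q in [Fraction(1, 2), Fraction(4, 5), Fraction(1)]:
    print(f"q = {q}: expected return £{float(expected_return(q)):.0f}")
```

The sketch returns £80, £50 and £0 for the three cases, confirming that the expected rate of return falls as the odds lengthen.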

Now on to the over-round.

The same simple assumption about bettors’ behaviour can explain the observed relationship between the over-round (sum of win probabilities minus 1) and the number of runners in a race, n.

If each horse is priced according to its true win probability, then over-round = 0. So in a six-horse race, where each has a 1 in 6 chance, each would be priced at 5 to 1, so none of the losing probability is shaded by the bookmaker. Here the over-round = (6 x 1/6) – 1 = 0.

If only a fixed fraction of losses, f, is counted by bettors, the subjective probability of losing on any horse is f(qi), where qi is the objective probability of losing for horse i, and the odds will reflect this bias, i.e. they will be shorter than the true probabilities would imply. The subjective win probabilities in this case are now 1-f(qi), and the sum of these minus 1 gives the over-round.

Where there is no discounting of the odds, the over-round (OR) = n times the correct win probability, minus 1, i.e. OR = 0. Assume now that f = ¾, i.e. ¾ of losses are counted by the bettor.

If there is discounting, then the odds will reflect this, and the more runners the bigger will be the over-round.

So in a race with 5 runners, q is 4/5, but fq = 3/4 x 4/5 = 12/20, so subjective win probability = 1-fq = 8/20, not 1/5. So OR = (5 x 8/20) – 1 = 1.

With 6 runners, fq = ¾ x 5/6 = 15/24, so subjective win probability = 1 – fq = 9/24. OR = (6 x 9/24) – 1 = (54/24) – 1 = 30/24 = 1¼.

With 7 runners, fq = ¾ x 6/7 = 18/28, so subjective win probability = 1 – fq = 10/28. OR = (7 x 10/28) – 1 = 42/28 = 1½.

If there is no discounting, then the subjective win probability equals the actual win probability; in a 5-horse race, for example, each has a win probability of 1/5. Here, OR = (5 x 1/5) – 1 = 0. In a 6-horse race, with no discounting, subjective probability = 1/6. OR = (6 x 1/6) – 1 = 0.

Hence, the over-round is linearly related to the number of runners, assuming that bettors discount a fixed fraction of losses (the ‘Henery Hypothesis’).
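
With equally likely runners, the algebra collapses neatly: q = (n-1)/n, the subjective win probability is 1 – fq, and the over-round is n(1 – fq) – 1 = (n – 1)(1 – f), which is linear in n. A short sketch, assuming f = ¾ as above:

```python
from fractions import Fraction

# Over-round as a function of field size n under the Henery hypothesis,
# assuming n equally likely runners. The result is (n-1)*(1-f): linear in n.

def over_round(n, f=Fraction(3, 4)):
    q = Fraction(n - 1, n)        # losing probability of each runner
    subjective_win = 1 - f * q    # perceived win probability
    return n * subjective_win - 1

for n in range(5, 9):
    print(f"{n} runners: over-round = {over_round(n)}")
```

The sketch reproduces the worked cases (over-rounds of 1, 1¼ and 1½ for 5, 6 and 7 runners) and shows the over-round growing by a fixed ¼ per extra runner.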

Exercise

Calculate the subjective odds (against) in this table assuming that f, the fixed fraction of losses undiscounted by the bettor, is a half.

Objective odds (against)       Subjective odds (against)

Evens

4/1

Infinity to 1

0/1

References and Links

Henery, R.J. (1985). On the average probability of losing bets on horses with given starting price odds. Journal of the Royal Statistical Society. Series A (General). 148, 4. 342-349. https://www.jstor.org/stable/2981894?seq=1#page_scan_tab_contents

Vaughan Williams, L. and Paton, D. (1997). Why is there a favourite-longshot bias in British racetrack betting markets? Economics Journal, 107, 150-158.
