Further and deeper exploration of paradoxes and challenges of intuition and logic can be found in my recently published book, Probability, Choice and Reason.

A viscountess, a radio DJ, a reality star, a vlogger, a comedian, several sportspeople and an assortment of actors and presenters. These, more or less, are the celebrities lined up to compete in the 2019 season of Strictly Come Dancing.

Outside their day jobs, few people know much about them yet. Over the 13 weeks or so of shows up until Christmas, viewers will at least learn how well the contestants can dance. But how much will their success in the competition have to do with their foxtrot, and to what extent will it be, literally, the luck of the draw that sees the victors lift the trophy in December?

A seminal study published in 2010 looked at public voting at the end of episodes of the various Idol television pop singing contests and found that singers who appeared later in the running order received a disproportionately higher share of the public vote than those who had preceded them.

This was explained as a “recency effect” – meaning that those performing later are fresher in the memory of the people judging or voting. Interestingly, a different study, of wine tasting, suggested that there is also a significant “primacy effect” which favours the wines that people taste first (and, to some extent, those they taste last).

## A little bias is in order

What would happen if each performance were evaluated immediately after it finished, instead of at the end of the show? Surely this would eliminate the benefit of going last, as there would be equal recency in each case. The problem in implementing this is that the public need to see all the performers before they can choose which of them deserves their vote.

You might think the solution is to score each performer immediately after their performance, by complementing the public vote with the scores of a panel of expert judges. And, of course, Strictly Come Dancing (or Dancing with the Stars if you are in the US) does just this. So there should be no “recency effect” in the expert voting – because the next performer does not take to the stage until the previous performer has been scored.

We might expect in this case that the later performers taking to the dance floor should have no advantage over earlier performing contestants in the expert evaluations – and, in particular, there should be no “last dance” advantage.

We decided to test this out using a large data set of every performance ever danced on the UK and US versions of the show – going right back to the debut show in 2004. Our findings, published in Economics Letters, proved not just surprising but a little shocking.

## Last shall be first

Contrary to expectations, we found the same sequence order bias by the expert panel judges – who voted after each act – as by the general public, voting after all performances had concluded.

We applied a range of statistical tests to allow for differences in quality between the performers, and as a result we were able to rule out quality as the explanation for the higher marks given to later performers. This worked for all but the opening spot of the night, which we found was generally filled by one of the better performers.
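To give a feel for what "allowing for quality" can mean in practice, here is a minimal sketch in Python using made-up data. It is not the method used in our paper: the function names, the simulated scores, the random running order and the size of the last-slot bonus are all assumptions chosen purely for illustration. The idea is to subtract each contestant's own average score from every score they receive, so that what remains can be compared across positions in the running order.

```python
import random
from collections import defaultdict
from statistics import mean


def simulate_scores(n_contestants=12, n_weeks=400, last_bonus=1.0, seed=0):
    """Build synthetic (contestant, position, score) rows.

    Hypothetical model: score = contestant quality
                                + a bonus for dancing last
                                + random noise.
    The same contestants dance every week in a random order,
    which is a simplification of how a real series runs.
    """
    rng = random.Random(seed)
    quality = [rng.uniform(20, 35) for _ in range(n_contestants)]
    rows = []
    for _ in range(n_weeks):
        order = list(range(n_contestants))
        rng.shuffle(order)
        for position, contestant in enumerate(order, start=1):
            bonus = last_bonus if position == n_contestants else 0.0
            score = quality[contestant] + bonus + rng.gauss(0, 1.5)
            rows.append((contestant, position, score))
    return rows


def position_effects(rows):
    """Average score at each position after removing contestant quality.

    Each score is measured relative to that contestant's own mean,
    so strong and weak dancers are put on a common footing.
    """
    by_contestant = defaultdict(list)
    for contestant, _, score in rows:
        by_contestant[contestant].append(score)
    contestant_mean = {c: mean(s) for c, s in by_contestant.items()}

    by_position = defaultdict(list)
    for contestant, position, score in rows:
        by_position[position].append(score - contestant_mean[contestant])
    return {p: mean(v) for p, v in sorted(by_position.items())}


if __name__ == "__main__":
    effects = position_effects(simulate_scores())
    for position, effect in effects.items():
        print(f"position {position:2d}: {effect:+.2f}")
```

In this toy set-up the final position should stand out with a clearly positive adjusted score, because that is the bias built into the simulation. With real data, comparing raw averages by position would be misleading wherever better dancers are systematically placed in particular slots, which is why some adjustment for quality is needed.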

So the findings matched the Idol study in showing that the last dance slot is the one to covet, but the first to perform also scored better than expected. Plotted against running order, the scores resemble a J-curve: the opening performer and those dancing later received disproportionately high marks from the expert panel.

Although we believe the production team’s choice of opening performance may play a role in this, our best explanation of the key sequence biases is as a type of “grade inflation” in the expert panel’s scoring. In particular, we interpret the “order” effect as deriving from studio audience pressure – a little like the published evidence of unconscious bias exhibited by referees in response to spectator pressure. The influence on the judges of increasing studio acclaim and euphoria as the contest progresses to a conclusion is likely to be further exacerbated by the proximity of the judges to the audience.

When the votes from the general public augment the expert panel scores – as is the case in Strictly Come Dancing – the biases observed in the expert panel scores are amplified. All of which means that, based on past series, the best slot in which to perform is last and the worst is second.

The implications of this are worrying if they spill over into the real world. Is there an advantage in going last (or first) into the interview room for a job – even if the applicants are evaluated between interviews? The same effects could have implications in so many situations, such as sitting down in a dentist’s chair or doctor’s surgery, appearing in front of a magistrate or having your examination script marked by someone with a huge pile of work to get through.

One study, reported in the New York Times in 2011, found that experienced parole judges granted freedom about 65% of the time to the first prisoner to appear before them on a given day, and the first after lunch – but to almost nobody by the end of a morning session.

Our research confirms what has long been suspected – that the order in which performers (and quite possibly interviewees) appear can make a big difference. It is now time to look more carefully at the dangers this can pose for people’s daily lives, and at what we can do to address the problem.