Benford’s Bonus

In a fascinating article published in the New York Times, a certain Malcolm Browne relates how Dr. Theodore Hill would ask his maths students to go home and either toss a coin 200 times and record the results, or else pretend that they had done so. Either way, he would ask them to produce for him the results of their (real or imaginary) coin-tossing experiment.

Dr. Hill’s purpose in this experiment was to show just how difficult it is to fake data convincingly. It just isn’t that easy to make up a random sequence. Based on this knowledge, he would astound his students by almost unerringly picking out the fakers from the tossers!

One of the ways he would do this was to check whether heads or tails ever appeared six or more times in a row. In 200 genuine coin tosses, such a run is overwhelmingly probable. To most of his students, though, a run that long seems counter-intuitive, an example of what is often termed the Gambler’s Fallacy, i.e. the erroneous perception that independent random sequences will balance out over time, so that for example an extended sequence of heads is more likely to be followed by a tail than a head. The fakers, susceptible to the Fallacy, are thus easily exposed. Ordinary people, even mathematics students, simply can’t help introducing patterns into what should be random noise.
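The claim that a run of six or more is overwhelmingly probable in 200 tosses can be checked with a short calculation. The sketch below is one way of doing it, assuming a fair coin and independent tosses; the function name `prob_run_at_least` is made up for illustration and is not taken from Dr. Hill or the article.

```python
def prob_run_at_least(n_tosses: int, run_len: int) -> float:
    """Chance that n fair coin tosses contain a run of at least
    run_len identical outcomes (heads or tails)."""
    if n_tosses < 1:
        return 0.0
    if run_len <= 1:
        return 1.0  # any single toss is already a run of length 1

    # alive[k] = probability that the current run has length k and
    # no run of run_len has been seen so far (k = 1 .. run_len - 1).
    alive = [0.0] * run_len
    alive[1] = 1.0  # after the first toss the run length is 1

    for _ in range(n_tosses - 1):
        nxt = [0.0] * run_len
        for k in range(1, run_len):
            half = alive[k] * 0.5
            nxt[1] += half          # the coin switches face: run restarts
            if k + 1 < run_len:
                nxt[k + 1] += half  # same face again: run grows by one
            # otherwise the run reaches run_len and this mass drops out
        alive = nxt

    # Whatever probability is no longer "alive" has hit a long run.
    return 1.0 - sum(alive)


print(f"{prob_run_at_least(200, 6):.3f}")
```

For 200 tosses and a run length of six, the printed figure should come out well above 90%, which is why a sheet with no such run looks suspicious.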

This is one instance of a broader principle, that invented data rarely look like the real thing. Its best-known expression is usually referred to as Benford’s Law, which essentially states that if we randomly select a number from a table of real-life data, the possible first digits are far from equally likely. For example, the probability that the first digit will be a ‘1’ is about 30%, rather than the intuitive figure of about 11%, which assumes that all nine possible first digits are equally likely.
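The figure of about 30% follows from the usual statement of the Law, under which the chance of a first digit d is log10(1 + 1/d). A minimal sketch that tabulates this for the digits 1 to 9 is given below; the function name is hypothetical and chosen only for this example.

```python
import math


def benford_first_digit_probs() -> dict[int, float]:
    """Benford's Law: P(first digit = d) = log10(1 + 1/d), d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


for digit, p in benford_first_digit_probs().items():
    print(f"{digit}: {p:.1%}")
```

The nine probabilities add up to 1, and they fall steadily from the digit 1 down to the digit 9.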

The empirical support for this proportion can be traced to the man after whom the Law is named, physicist Dr. Frank Benford, in a paper he published in 1938, called ‘The Law of Anomalous Numbers’. In that paper he examined 20,229 numbers drawn from 20 sets of data, as diverse as baseball statistics, the areas of rivers, numbers in magazine articles and so forth, confirming the 30% rule for number 1. For information, the chance of throwing up a ‘2’ as first digit is 17.6%, and of a ‘9’ just 4.6%. A similar kind of check can be applied to trailing (i.e. last) digits, although there the expectation is different: in genuine figures the last digit tends to be spread roughly evenly across 0 to 9. It’s a great way, therefore, of checking the veracity of receipts. If, for example, there is an unusual number of trailing digit ‘7’s, there’s a decent chance that the figures are cooked. Tax authorities are alert to this.
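A trailing-digit check of this kind can be sketched as a simple chi-square comparison. The version below assumes that genuine last digits are spread roughly evenly over 0 to 9, which may not hold for every data set (prices ending in 99, for instance). The function name and the list of amounts are invented for illustration and do not describe any tax authority’s actual method.

```python
from collections import Counter

# 5% critical value of the chi-square distribution with 9 degrees of freedom.
CHI2_CRITICAL_DF9_5PCT = 16.919


def last_digit_chi2(amounts: list[int]) -> float:
    """Chi-square statistic comparing last-digit counts with an even spread.

    amounts are whole numbers, e.g. receipt totals in pence (an assumption).
    """
    if not amounts:
        raise ValueError("need at least one amount")
    counts = Counter(abs(a) % 10 for a in amounts)
    expected = len(amounts) / 10  # each digit expected equally often
    return sum((counts.get(d, 0) - expected) ** 2 / expected
               for d in range(10))


# Made-up figures with a suspicious number of trailing 7s.
suspect = [1237, 4507, 917, 2287, 6617, 3340, 8127, 577, 4907, 1297,
           3057, 7717, 2461, 9837, 1147, 5527, 6097, 2317, 4477, 8852]
stat = last_digit_chi2(suspect)
print(f"chi-square = {stat:.1f}, suspicious = {stat > CHI2_CRITICAL_DF9_5PCT}")
```

A statistic above the critical value is only a prompt for a closer look, not proof of fraud, and with so few figures the test is rough at best.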

Which makes fraudulent activity just that little bit easier to detect. For all right-minded citizens, let’s call that Benford’s Bonus.