Benford’s Law is one of those laws of statistics that defies common intuition. Essentially, it states that if we randomly select a number from a table of real-life data, the first digit is far more likely to be a low digit than a high one. For example, the probability that the first digit will be a ‘1’ is about 30 per cent, rather than the intuitive 11 per cent or so, which assumes that all digits from 1 to 9 are equally likely.

In particular, Benford’s Law applies to the distribution of leading digits in naturally occurring phenomena, such as the populations of different countries or the heights of mountains. For example, take a newspaper with a lot of numbers in it, and circle the numbers that occur naturally, such as stock prices. The lengths of rivers and the sizes of lakes could be included, but not artificial numbers like telephone numbers. About 30 per cent of these numbers will start with a 1, and it doesn’t matter what units they are in. The lengths of rivers could be expressed in kilometres, miles, feet or centimetres without it making a difference to the frequency distribution of the leading digits.

Empirical support for this distribution can be traced to the man after whom the Law is named, the physicist Frank Benford, in a paper he published in 1938 called ‘The Law of Anomalous Numbers’. In that paper he examined 20,229 numbers drawn from 20 sets of data, as diverse as baseball statistics, the areas of rivers and numbers in magazine articles, confirming the 30 per cent rule for the digit 1. For information, the chance of a ‘2’ turning up as the first digit is 17.6 per cent, and of a ‘9’ just 4.6 per cent.

This has clear implications for fraud detection. In particular, if the leading digits of the figures in declared returns or receipts deviate significantly from the Benford distribution, that is an automatic red flag, and one which those tackling fraud are, or should be, aware of.
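One way to make ‘deviate significantly’ concrete is a chi-square goodness-of-fit test on the leading digits. The sketch below is only an illustration: the function names are hypothetical, the expected shares use the standard Benford formula log10(1 + 1/d), and the 5 per cent significance level (critical value 15.507 for 8 degrees of freedom) is an assumption rather than anything prescribed here. A flag from a check like this is a prompt for closer inspection, not a finding of fraud.

```python
import math
from collections import Counter

# Benford's expected share for each leading digit 1..9: log10(1 + 1/d)
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# 5% critical value of the chi-square distribution with 8 degrees of freedom
# (nine digit classes minus one). The 5% level is an assumed threshold.
CHI2_CRITICAL_5PCT_DF8 = 15.507


def leading_digit(x):
    """Return the first non-zero digit of a non-zero number."""
    # Scientific notation puts the leading digit first, e.g. 0.0042 -> '4.2...e-03'
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(amounts):
    """Chi-square statistic of the observed leading digits against Benford."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    counts = Counter(digits)
    n = len(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p) for d, p in BENFORD.items()
    )


def is_red_flag(amounts):
    """True if the leading digits deviate from Benford at the assumed 5% level."""
    return benford_chi_square(amounts) > CHI2_CRITICAL_5PCT_DF8
```

Passing a list of declared amounts to `is_red_flag` returns `True` when the leading digits sit too far from the Benford shares, for instance when every digit from 1 to 9 appears about equally often. The test needs a reasonably large sample to be meaningful.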

To explain the basis of Benford’s Law, take £1 as a base. Assume this now grows at 10 per cent per day.

£1.10, £1.21, £1.33, £1.46, £1.61, £1.77, £1.95, £2.14, £2.36, £2.59, £2.85, £3.14, £3.45, £3.80, £4.18, £4.59, £5.05, £5.56, £6.12, £6.73, £7.40, £8.14, £8.95, £9.85, £10.83, £11.92, £13.11, £14.42, £15.86, £17.45, £19.19, £21.11, £23.23, £25.55, £28.10, £30.91, £34.00, £37.40, £41.14, £45.26, £49.79, £54.76, £60.24, £66.26, £72.89, £80.18, £88.20, £97.02 …

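So we see that the amount spends a long time with a leading digit of 1 (seven days between £1 and £2, and seven more in the teens), less time with a leading digit of 2 (four days in the 20s), and so on down to 9, and this pattern continues through three digits and so forth. The reason is that growth at a constant percentage rate moves the logarithm of the amount forward by the same step each day, so the share of time spent with leading digit n is the width of the interval from n to n + 1 on a logarithmic scale. Benford noticed that the probability that a number starts with the digit n is P(n) = log10 (n + 1) – log10 (n), so that:

NB log10 1 = 0; log10 2 = 0.3010; log10 3 = 0.4771 … log10 10 = 1.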

• 1: 30.1%
• 2: 17.6%
• 3: 12.5%
• 4: 9.7%
• 5: 7.9%
• 6: 6.7%
• 7: 5.8%
• 8: 5.1%
• 9: 4.6%
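
The percentages above, and the growth argument behind them, can be checked numerically. The following is a minimal sketch rather than a definitive implementation: the function names are hypothetical, the run of 5,000 days is an arbitrary choice, and the `scale` parameter (here 0.3048, the factor for converting feet to metres) is simply a stand-in for measuring the same quantity in different units.

```python
import math
from collections import Counter


def benford_probability(d):
    """Probability that the leading digit is d: log10(d + 1) - log10(d)."""
    return math.log10(d + 1) - math.log10(d)


def leading_digit_shares(days, rate=0.10, scale=1.0):
    """Share of days on which the balance starts with each digit 1..9.

    The balance starts at `scale` (1.0 means one pound) and grows by `rate`
    per day; `scale` stands in for a change of units.
    """
    counts = Counter()
    balance = scale
    for _ in range(days):
        balance *= 1 + rate
        # The first character of scientific notation is the leading digit
        counts[int(f"{balance:.15e}"[0])] += 1
    return {d: counts[d] / days for d in range(1, 10)}


# Compare the formula with the simulated growth, in two different "units"
pounds = leading_digit_shares(days=5000)
rescaled = leading_digit_shares(days=5000, scale=0.3048)
print("digit  Benford  pounds  rescaled")
for d in range(1, 10):
    print(f"{d:>5}  {benford_probability(d):7.1%}  {pounds[d]:6.1%}  {rescaled[d]:8.1%}")
```

Over a long enough run, both simulated columns should sit close to the Benford column, which illustrates why the choice of units does not change the distribution of leading digits.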