Numbers follow a surprising law of digits, and scientists can't explain why

May 10, 2007 By Lisa Zyga feature
This graph shows several examples of data sets from the Spanish National Institute of Statistics that follow Benford’s logarithmic law. Data from the lottery, however, is random and uniform. Credit: Jesús Torres, et al.

Does your house address start with a 1? According to a strange mathematical law, about 1/3 of house numbers have 1 as their first digit. The same holds true for many other areas that have almost nothing in common: the Dow Jones index history, size of files stored on a PC, the length of the world’s rivers, the numbers in newspapers’ front page headlines, and many more.

The law is called Benford’s law after its (second) discoverer, Frank Benford, who discovered it in 1935 as a physicist at General Electric. The law tells how often each digit (from 1 to 9) appears as the first significant digit in a very diverse range of data sets.

Besides the digit 1 consistently appearing about 30.1% of the time (roughly 1/3), the digit 2 appears with a frequency of 17.6%, the digit 3 at 12.5%, on down to the digit 9 at 4.6%. In mathematical terms, this logarithmic law is written as F(d) = log10[1 + (1/d)], where F is the frequency, d is the digit in question, and the logarithm is taken in base 10.
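
As a quick check on the percentages quoted above, the formula can be evaluated directly. The following is a minimal sketch, assuming the logarithm is base 10; the function name benford_frequency is chosen here for illustration and does not come from the letter.

```python
import math


def benford_frequency(d: int) -> float:
    """Expected share of values whose first significant digit is d (1 to 9)."""
    if not 1 <= d <= 9:
        raise ValueError("the first significant digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


if __name__ == "__main__":
    # Print the expected frequency of each leading digit as a percentage.
    for d in range(1, 10):
        print(f"{d}: {benford_frequency(d):.1%}")
```

The nine frequencies add up to 1, because the product of the terms (1 + 1/d) for d from 1 to 9 is exactly 10.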

If this sounds strange, scientists Jesús Torres, Sonsoles Fernández, Antonio Gamero, and Antonio Sola from the Universidad de Córdoba agree: they also call the feature surprising. The scientists published a letter in the European Journal of Physics called “How do numbers begin? (The first digit law),” which gives a short historical review of the law. Their paper also includes useful applications and explains that no one has been able to provide an underlying reason for the consistent frequencies.

“The Benford law has been an intriguing question for me for years, ever since I read about it,” Torres, who specializes in plasma physics, told PhysOrg.com. “I have used it as a surprising example at statistical physics classes to arouse the curiosity of my pupils.”

Torres et al. explain that, before Benford, a highly esteemed astronomer named Simon Newcomb discovered the law in 1881, although Newcomb’s contemporaries did not pay much attention to his publication. Both Benford and Newcomb stumbled upon the law in the same way: while flipping through pages of a book of logarithmic tables, they noticed that the pages in the beginning of the book were dirtier than the pages at the end. This meant that their colleagues who shared the library preferred quantities beginning with the number one in their various disciplines.

Benford took this observation a step further than Newcomb, and began investigating other groups of numbers, finding that the “first digit law” emerged in groups as disparate as populations, death rates, physical and chemical constants, baseball statistics, the half-lives of radioactive isotopes, answers in a physics book, prime numbers, and Fibonacci numbers. In other words, just about any group of data obtained by using measurements satisfies the law.
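
One of the examples above, the Fibonacci numbers, is easy to reproduce. The sketch below tallies the first digits of the first 1,000 Fibonacci numbers and prints them next to the logarithmic law; the sample size and the helper names are assumptions made for illustration, not anything taken from Benford or from the letter.

```python
import math
from collections import Counter


def fibonacci(count):
    """Yield the first `count` Fibonacci numbers, starting 1, 1, 2, 3, ..."""
    a, b = 1, 1
    for _ in range(count):
        yield a
        a, b = b, a + b


def first_digit_shares(values):
    """Share of positive integers starting with each digit 1 to 9."""
    values = list(values)
    counts = Counter(int(str(v)[0]) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}


if __name__ == "__main__":
    shares = first_digit_shares(fibonacci(1000))
    for d in range(1, 10):
        expected = math.log10(1 + 1 / d)
        print(f"{d}: observed {shares[d]:.3f}  law {expected:.3f}")
```

Python integers have arbitrary precision, so the first digit can be read straight from the decimal string even for numbers with hundreds of digits.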

On the other hand, data sets that are arbitrary or contain restrictions usually don’t follow Benford’s law. For example, lottery numbers, telephone numbers, gas prices, dates, and the weights or heights of a group of people are random, arbitrarily assigned, or restricted to a narrow range.

As Torres and his colleagues explain, scientists in the decades following Benford performed numerous studies, but discovered little more about the law other than racking up a wide variety of examples. However, scientists did discover a few curiosities. For one, when investigating the second significant digits of data sets, the law still held, but the effect was less pronounced. Similarly, for the third and fourth digits, the frequencies of the digits became progressively more equal, leveling out at a nearly uniform 10% for the fifth digit. A second discovery attracted even more scientific interest:
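
The flattening described above can be computed from the standard generalization of the first-digit formula to later digits: the frequency of digit d in position n (for n of 2 or more) is the sum of log10[1 + 1/(10k + d)] over every possible block k of leading digits. This generalization is not stated in the article, so treat the following as a sketch under that assumption; the function name is illustrative.

```python
import math


def nth_digit_frequency(n: int, d: int) -> float:
    """Expected share of values whose n-th significant digit (n >= 2) is d (0 to 9)."""
    if n < 2:
        raise ValueError("use log10(1 + 1/d) for the first digit")
    if not 0 <= d <= 9:
        raise ValueError("a digit must be between 0 and 9")
    # k runs over every possible block of the n-1 leading digits.
    start = 10 ** (n - 2)
    return sum(math.log10(1 + 1 / (10 * k + d)) for k in range(start, 10 * start))


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        low, high = nth_digit_frequency(n, 9), nth_digit_frequency(n, 0)
        print(f"digit position {n}: from {low:.5f} (digit 9) to {high:.5f} (digit 0)")
```

For the second digit the spread runs from roughly 8.5% to 12%, and by the fifth digit every value is within a hundredth of a percentage point of 10%.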

“In 1961, Pinkham discovered the first general relevant result, demonstrating that Benford’s law is scale invariant and is also the only law referring to digits which can have this scale invariance,” the scientists wrote in their letter. “That is to say, as the length of the rivers of the world in kilometers fulfill Benford’s law, it is certain that these same data expressed in miles, light years, microns or in any other length units will also fulfill it.”
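
Scale invariance can be illustrated numerically. The sketch below does not use real river lengths: it assumes a synthetic data set spread log-uniformly over six orders of magnitude (a hypothetical stand-in that is known to follow the law), converts it from kilometers to miles, and compares the first-digit shares in both units with the logarithmic law. All names in the code are illustrative.

```python
import math
import random
from collections import Counter

KM_PER_MILE = 1.609344  # definition of the international mile


def first_significant_digit(x):
    """First significant digit of a positive number, read from scientific notation."""
    return int(f"{x:.15e}"[0])


def digit_shares(values):
    """Share of values starting with each digit 1 to 9, as a list of nine floats."""
    counts = Counter(first_significant_digit(v) for v in values)
    return [counts[d] / len(values) for d in range(1, 10)]


def synthetic_lengths_km(n, seed=0):
    """Hypothetical lengths in km, log-uniform between 1 and 1,000,000 (not real data)."""
    rng = random.Random(seed)
    return [10 ** rng.uniform(0, 6) for _ in range(n)]


lengths_km = synthetic_lengths_km(100_000)
lengths_mi = [km / KM_PER_MILE for km in lengths_km]

benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
shares_km = digit_shares(lengths_km)
shares_mi = digit_shares(lengths_mi)

for d in range(1, 10):
    print(f"{d}: km {shares_km[d - 1]:.3f}  miles {shares_mi[d - 1]:.3f}  "
          f"law {benford[d - 1]:.3f}")
```

Changing the conversion factor to any other positive constant should leave the three columns in close agreement, which is the property Pinkham's result describes.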

Torres et al. also explain that in the last years of the 20th century, several important theoretical results were proven (base invariance, uniqueness, etc.), mainly by Ted Hill and other mathematicians. While some cases can be explained (for example, house numbering almost always starts at 1, so lower numbers must occur before higher numbers), there is still no general justification for all examples. The scientists also explain that there are no a priori criteria that tell when a data set should or should not obey the law.

“Nowadays there are many theoretical results about the law, but some points remain in darkness,” said Torres. “Why do some numerical sets, like universal physical constants, follow the law so well? We need to know not only mathematical reasons for the law, but also characterize this set of experimental data. For example, what are their points of contact? Where they come from? Apparently, they are independent.

“I hope the general necessary and sufficient conditions will be discovered in the future (many people are interested in the law, especially economists), but I also know it could be not possible ever,” he added, mentioning Gödel.

Nevertheless, scientists have been using the law for many practical applications. For example, because a year’s accounting data of a company should fulfill the law, economists can detect falsified data, which is very hard to manipulate so that it follows the law. (Interestingly, scientists found that in falsified data the digits 5 and 6, rather than 1, are the most prevalent first digits, suggesting that forgers try to “hide” data in the middle.)
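
The article does not say which statistical test the economists use. One common choice for comparing observed first digits with the law is Pearson's chi-square statistic, sketched below under that assumption. The amounts are made-up illustrations, and 15.507 is the standard 5% critical value for 8 degrees of freedom.

```python
import math
from collections import Counter

CRITICAL_VALUE_5_PERCENT = 15.507  # chi-square, 8 degrees of freedom


def first_significant_digit(x):
    """First significant digit of a nonzero number, read from scientific notation."""
    return int(f"{abs(x):.15e}"[0])


def benford_chi_square(amounts):
    """Chi-square statistic of the observed first digits against the logarithmic law."""
    digits = [first_significant_digit(a) for a in amounts if a != 0]
    n = len(digits)
    counts = Counter(digits)
    statistic = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        statistic += (counts[d] - expected) ** 2 / expected
    return statistic


def looks_suspicious(amounts):
    """True when the first digits deviate from the law at the 5% level."""
    return benford_chi_square(amounts) > CRITICAL_VALUE_5_PERCENT


if __name__ == "__main__":
    # Hypothetical ledgers: one spread evenly on a log scale, one crowded around 500-699.
    plausible = [10 ** (k / 500) for k in range(2000)]
    crowded = [500 + i for i in range(200)]
    print("plausible ledger flagged:", looks_suspicious(plausible))
    print("crowded ledger flagged:", looks_suspicious(crowded))
```

A large statistic only signals that the digits deviate from the law; it is a prompt for closer inspection, not proof of falsification.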

Benford’s law has also recently been applied to election results in order to detect voting anomalies. Scientists reported anomalies in the state of Florida in the 2004 US presidential election, as well as indications of fraud in Venezuela in 2004 and Mexico in 2006.

“The story about how it was discovered—twice—from dirty pages … it is almost incredible,” said Torres. “Benford's law has undeniable applications, and this useful aspect was not clear when the law was discovered. It seemed to be only a math curiosity. For me, this is an example of how simplicity can be unexpectedly marvelous.”

For more details on Benford’s law, the highly readable letter is temporarily available at:
www.iop.org/EJ/abstract/0143-0807/28/3/N04 (with free registration).

Citation: Torres, J., Fernández, S., Gamero, A., Sola, A. “How do numbers begin? (The first digit law).” Eur. J. Phys. 28 (2007) L17-25.

Copyright 2007 PhysOrg.com.
All rights reserved. This material may not be published, broadcast, rewritten or redistributed in whole or part without the express written permission of PhysOrg.com.

User comments : 3

gruff
not rated yet Aug 08, 2008
I think to say Benford's Law can't be explained by scientists is not just sensationalist, it's plain wrong. Benford's law may be unintuitive but it's been explained many times in various ways and is very well understood (but not by Torres clearly). Steve Smith's Digital Signal Processing analysis offers the most intuitive insight into it: it's an artefact produced by the operation of taking the first digit in the first place which is a logarithmic operation (in the number base used). Random numbers _may_ follow Benford's law depending on their probability distribution function.
gruff
not rated yet Aug 08, 2008
(or if you prefer, an artefact of expressing measurements in a given base when they're taken across many orders of magnitude in that base)
Nerdle
not rated yet May 08, 2009
I think gruff's pretty much got it there. Since we always put 1 at the start of a new power of 10 (10-100-1000), there will inevitably be more numbers containing a 1 as the first digit.