Indus script encodes language, reveals new study of ancient symbols

April 23, 2009
Examples of the Indus script. The four square artifacts with animal and human iconography are stamp seals that measure one or two inches per side. On the top right are three elongated seals that have no iconography, as well as three miniature tablets (one twisted). The tablets measure about 1.25" long by 0.5" wide. Credit: J. M. Kenoyer /

The Rosetta Stone allowed 19th century scholars to translate symbols left by an ancient civilization and thus decipher the meaning of Egyptian hieroglyphics.

But the symbols found on many other ancient artifacts remain a mystery, including those of a people that inhabited the Indus valley on the present-day border between Pakistan and India. Some experts question whether the symbols represent a language at all, or are merely pictograms that bear no relation to the language spoken by their creators.

A University of Washington computer scientist has led a statistical study of the Indus script, comparing the pattern of symbols to various linguistic scripts and nonlinguistic systems, including DNA and a computer programming language. The results, published online Thursday by the journal Science, found the Indus script's pattern is closer to that of spoken words, supporting the hypothesis that it codes for an as-yet-unknown language.

"We applied techniques of computer science, specifically machine learning, to an ancient problem," said Rajesh Rao, a UW associate professor of computer science and engineering and lead author of the study. "At this point we can say that the Indus script seems to have statistical regularities that are in line with natural languages."

Co-authors are Nisha Yadav and Mayank Vahia at the Tata Institute of Fundamental Research in Mumbai, India; Hrishikesh Joglekar, a software engineer from Mumbai; R. Adhikari at the Institute of Mathematical Sciences in Chennai, India; and Iravatham Mahadevan at the Indus Research Center in Chennai. The research was supported by the Packard Foundation and the Sir Jamsetji Tata Trust.

The Indus people were contemporaries of the Egyptian and Mesopotamian civilizations, inhabiting the Indus river valley in present-day eastern Pakistan and northwestern India from about 2600 to 1900 B.C. This was an advanced, urbanized civilization that left written symbols on tiny stamp seals, amulets, ceramic objects and small tablets.

"The Indus script has been known for almost 130 years," said Rao, an Indian native with a longtime personal interest in the subject. "Despite more than 100 attempts, it has not yet been deciphered. The underlying assumption has always been that the script encodes language."

In 2004 a provocative paper titled "The Collapse of the Indus-Script Thesis" claimed that the short inscriptions have no linguistic content and are merely brief pictograms depicting religious or political symbols. That paper's lead author offered a $10,000 reward to anybody who could produce an Indus artifact with more than 50 symbols.

Taking a scientific approach, the U.S.-Indian team of computer scientists and mathematicians looked at the statistical patterns in sequences of Indus symbols. They calculated the amount of randomness allowed in choosing the next symbol in a sequence. Some nonlinguistic systems display a random pattern, while others, such as pictures that represent deities, follow a strict order that reflects some underlying hierarchy. Spoken languages tend to fall between the two extremes, incorporating some order as well as some flexibility.

The new study compared a well-known compilation of Indus texts with linguistic and nonlinguistic samples. The researchers performed calculations on present-day texts of English; texts of the Sumerian language spoken in Mesopotamia during the time of the Indus civilization; texts in Old Tamil, a Dravidian language originating in southern India that some scholars have hypothesized is related to the Indus script; and ancient Sanskrit, one of the earliest members of the Indo-European language family. In each case the authors calculated the conditional entropy, or randomness, of the symbols' order.
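For readers who want to see what such a calculation involves, the following is a minimal sketch of conditional entropy for adjacent symbol pairs. It is an illustration only, not the authors' code: the function name `conditional_entropy` is hypothetical, and the estimate uses raw pair counts, whereas the published analysis may treat rare symbols and small samples differently.

```python
import math
from collections import Counter


def conditional_entropy(sequence):
    """Estimate H(next symbol | current symbol) in bits from adjacent pairs.

    Hypothetical helper for illustration; uses plain frequency counts.
    """
    pairs = list(zip(sequence, sequence[1:]))
    if not pairs:
        return 0.0
    pair_counts = Counter(pairs)
    first_counts = Counter(first for first, _ in pairs)
    total = len(pairs)
    entropy = 0.0
    for (first, _second), count in pair_counts.items():
        p_pair = count / total                       # P(current, next)
        p_next_given_first = count / first_counts[first]  # P(next | current)
        entropy -= p_pair * math.log2(p_next_given_first)
    return entropy


# A strictly alternating sequence leaves no freedom in the next symbol.
strict = conditional_entropy("ABABABAB")
# Here each symbol is followed by either symbol equally often.
flexible = conditional_entropy("AABBAABBA")
print(strict, flexible)
```

A value of zero means the next symbol is fully determined by the current one; larger values mean more freedom in choosing what comes next.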

They then repeated the calculations for samples of symbols that are not spoken languages: one in which the placement of symbols was completely random; another in which the placement of symbols followed a strict hierarchy; DNA sequences from the human genome; bacterial protein sequences; and an artificially created linguistic system, the computer programming language Fortran.
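The contrast between these kinds of samples can be sketched with synthetic data. The example below is a toy under stated assumptions, not a reproduction of the study: the four-symbol alphabet, the sequence length, the `make_sequence` generator, and the 0.8 probability of following a fixed order are all invented for illustration.

```python
import math
import random
from collections import Counter


def conditional_entropy(sequence):
    """Estimate H(next symbol | current symbol) in bits from adjacent pairs."""
    pairs = list(zip(sequence, sequence[1:]))
    if not pairs:
        return 0.0
    pair_counts = Counter(pairs)
    first_counts = Counter(first for first, _ in pairs)
    total = len(pairs)
    entropy = 0.0
    for (first, _second), count in pair_counts.items():
        entropy -= (count / total) * math.log2(count / first_counts[first])
    return entropy


def make_sequence(length, alphabet, follow_order_prob, rng):
    """Hypothetical generator: with probability follow_order_prob the next
    symbol is the successor in alphabet order, otherwise it is picked at random."""
    seq = [alphabet[0]]
    for _ in range(length - 1):
        if rng.random() < follow_order_prob:
            idx = (alphabet.index(seq[-1]) + 1) % len(alphabet)
            seq.append(alphabet[idx])
        else:
            seq.append(rng.choice(alphabet))
    return seq


rng = random.Random(0)  # fixed seed so the run is repeatable
alphabet = ["A", "B", "C", "D"]  # assumed toy alphabet
entropies = {
    "random": conditional_entropy(make_sequence(5000, alphabet, 0.0, rng)),
    "flexible": conditional_entropy(make_sequence(5000, alphabet, 0.8, rng)),
    "rigid": conditional_entropy(make_sequence(5000, alphabet, 1.0, rng)),
}
for name, value in entropies.items():
    print(f"{name}: {value:.3f} bits")
```

In this toy setup the rigid sequence scores zero, the random one approaches the maximum of two bits for four symbols, and the partly ordered one lands in between, which is the kind of intermediate position the article attributes to spoken languages.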

Results showed that the Indus inscriptions fell in the middle of the spoken languages and differed from any of the nonlinguistic systems.

If the Indus symbols are a spoken language, then deciphering them would open a window onto a civilization that lived more than 4,000 years ago. The researchers hope to continue their international collaboration, using a mathematical approach to delve further into the Indus script.

"We would like to make as much headway as possible and ideally, yes, we'd like to crack the code," Rao said. "For now we want to analyze the structure and syntax of the script and infer its grammatical rules. Someday we could leverage this information to get to a decipherment, if, for example, an Indus equivalent of the Rosetta Stone is unearthed in the future."

Provided by University of Washington
