Indus script encodes language, reveals new study of ancient symbols

Apr 23, 2009
Examples of the Indus script. The four square artifacts with animal and human iconography are stamp seals that measure one or two inches per side. On the top right are three elongated seals that have no iconography, as well as three miniature tablets (one twisted). The tablets measure about 1.25" long by 0.5" wide. Credit: J. M. Kenoyer / Harappa.com

(PhysOrg.com) -- The Rosetta Stone allowed 19th century scholars to translate symbols left by an ancient civilization and thus decipher the meaning of Egyptian hieroglyphics.

But the symbols found on many other ancient artifacts remain a mystery, including those of a people that inhabited the Indus valley on the present-day border between Pakistan and . Some experts question whether the symbols represent a at all, or are merely pictograms that bear no relation to the language spoken by their creators.

A University of Washington computer scientist has led a statistical study of the Indus script, comparing the pattern of symbols to various linguistic scripts and nonlinguistic systems, including DNA and a computer programming language. The results, published online Thursday by the journal Science, found the Indus script's pattern is closer to that of spoken words, supporting the hypothesis that it codes for an as-yet-unknown language.

"We applied techniques of computer science, specifically machine learning, to an ancient problem," said Rajesh Rao, a UW associate professor of computer science and engineering and lead author of the study. "At this point we can say that the Indus script seems to have statistical regularities that are in line with natural languages."

Co-authors are Nisha Yadav and Mayank Vahia at the Tata Institute of Fundamental Research in Mumbai, India; Hrishikesh Joglekar, a software engineer from Mumbai; R. Adhikari at the Institute of Mathematical Sciences in Chennai, India; and Iravatham Mahadevan at the Indus Research Center in Chennai. The research was supported by the Packard Foundation and the Sir Jamsetji Tata Trust.

The Indus people were contemporaries of the Egyptian and Mesopotamian civilizations, inhabiting the Indus river valley in present-day eastern Pakistan and northwestern India from about 2600 to 1900 B.C. This was an advanced, urbanized civilization that left written symbols on tiny stamp seals, amulets, ceramic objects and small tablets.

"The Indus script has been known for almost 130 years," said Rao, an Indian native with a longtime personal interest in the subject. "Despite more than 100 attempts, it has not yet been deciphered. The underlying assumption has always been that the script encodes language."

In 2004 a provocative paper titled The Collapse of the Indus-Script Thesis claimed that the short inscriptions have no linguistic content and are merely brief pictograms depicting religious or political symbols. That paper's lead author offered a $10,000 reward to anybody who could produce an Indus artifact with more than 50 symbols.

Taking a scientific approach, the U.S.-Indian team of computer scientists and mathematicians looked at the statistical patterns in sequences of Indus symbols. They calculated the amount of randomness allowed in choosing the next symbol in a sequence. Some nonlinguistic systems display a random pattern, while others, such as pictures that represent deities, follow a strict order that reflects some underlying hierarchy. Spoken languages tend to fall between the two extremes, incorporating some order as well as some flexibility.

The new study compared a well-known compilation of Indus texts with linguistic and nonlinguistic samples. The researchers performed calculations on present-day texts of English; texts of the Sumerian language spoken in Mesopotamia during the time of the Indus civilization; texts in Old Tamil, a Dravidian language originating in southern India that some scholars have hypothesized is related to the Indus script; and ancient Sanskrit, one of the earliest members of the Indo-European language family. In each case the authors calculated the conditional entropy, or randomness, of the symbols' order.

They then repeated the calculations for samples of symbols that are not spoken languages: one in which the placement of symbols was completely random; another in which the placement of symbols followed a strict hierarchy; DNA sequences from the human genome; bacterial protein sequences; and an artificially created linguistic system, the computer programming language Fortran.

Results showed that the Indus inscriptions fell in the middle of the spoken languages and differed from any of the nonlinguistic systems.
If the Indus symbols are a spoken language, then deciphering them would open a window onto a civilization that lived more than 4,000 years ago. The researchers hope to continue their international collaboration, using a mathematical approach to delve further into the Indus script.

"We would like to make as much headway as possible and ideally, yes, we'd like to crack the code," Rao said. "For now we want to analyze the structure and syntax of the script and infer its grammatical rules. Someday we could leverage this information to get to a decipherment, if, for example, an Indus equivalent of the Rosetta Stone is unearthed in the future."

Provided by University of Washington (news : web)

Explore further: Radar search to find lost Aboriginal burial site

add to favorites email to friend print save as pdf

Related Stories

Scientists trace how rivers change course

Dec 25, 2005

U.S. scientists have used laboratory techniques and sediment cores from the ocean to help explain the how rivers have changed course over millions of years.

Endangered languages threaten to disappear, researcher says

Jan 29, 2007

Endangered animal and plant species regularly make the news, but another type of endangered species is often overlooked: human languages. A University of Missouri-Columbia researcher has dedicated much of her career to studying ...

Study: Grammar ability hardwired in humans

Feb 07, 2006

University of Rochester scientists studying why characteristics of grammar are found in all languages say the use of grammar is hardwired in our brains.

3,000-year-old writing found in Mexico

Sep 15, 2006

A stone slab with 3,000-year-old writing, perhaps the oldest script ever found in the Western Hemisphere, has been discovered in Mexico, reports say.

Recommended for you

West US cave with fossil secrets to be excavated

9 hours ago

(AP)—For the first time in three decades, paleontologists are about to revisit one of North America's most remarkable troves of ancient fossils: The bones of tens of thousands of animals piled at the bottom ...

Radar search to find lost Aboriginal burial site

Jul 22, 2014

Scientists said Tuesday they hope that radar technology will help them find a century-old Aboriginal burial ground on an Australian island, bringing some closure to the local indigenous population.

Archaeologists excavate NY Colonial battleground

Jul 19, 2014

Archaeologists are excavating an 18th-century battleground in upstate New York that was the site of a desperate stand by Colonial American troops, the flashpoint of an infamous massacre and the location of the era's largest ...

User comments : 1

Adjust slider to filter visible comments by rank

Display comments: newest first

denijane
5 / 5 (1) Apr 30, 2009
Now that is a good research! I wish they did the same with Thracian symbols which are even older as age, even though, there is at least one person who claims to have translated this symbols (and in a consistent way, if I may add).

Anyway, I hope this technology one day can help us translate all those ancient languages, because I so badly want to know about those civilisations!