Scientists discover oldest words in the English language, predict which ones are likely to disappear

Feb 26, 2009

The oldest words in the English language include "I" and "who", while words like "dirty" could die out relatively quickly, British researchers said Thursday.

Scientists at the University of Reading have discovered that 'I', 'we', 'who' and the numbers '1', '2' and '3' are amongst the oldest words, not only in English, but across all Indo-European languages. What's more, words like 'squeeze', 'guts', 'stick', 'throw' and 'dirty' look like they are heading for history's dustbin - along with a host of others.

Evolutionary language scientists from the University of Reading, one of the world's leading centres in this field of research, have been investigating how languages evolve, and whether that evolution follows any rules. Until recently they believed they would not be able to track words back in time for more than 5,000 years; however, their new IBM supercomputer has enabled them to go back almost 30,000 years, and finally provide the answers.

The scientists have been able to analyse the family of Indo-European languages (of which English is a modern-day example), reconstruct the rate at which words evolve, and predict future changes to our vocabulary. The oldest words we use today have been in existence for at least 10,000 years.

Looking to the future, the less frequently certain words are used, the more likely they are to be replaced. Other simple rules have been uncovered: numerals evolve the slowest, then nouns, then verbs, then adjectives. Conjunctions and prepositions such as 'and', 'or', 'but', 'on', 'over' and 'against' evolve the fastest, some as much as 100 times faster than numerals. 'Throw', which is expected to evolve quickly, has a half-life of 900 years; there are 42 unrelated sounds for it across all the languages. In 10,000 years' time, it will likely have been replaced in 10 of them - possibly including English, unless of course we all do our part to keep the word in circulation.

"50% of the words we use today would be unrecognisable to our ancestors living 2,500 years ago. If a time-traveller came to us, and told us he wanted to go back to that period, we could arm him with the appropriate phrase book, and hopefully keep him out of trouble", explained Mark Pagel, Professor of Evolutionary Biology at the University of Reading.
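The half-life figures quoted above can be turned into rough survival estimates. The sketch below assumes simple exponential decay, which is only the textbook meaning of "half-life"; the article does not describe the researchers' actual statistical model, and the function names `surviving_fraction` and `implied_half_life` are invented here for illustration.

```python
import math


def surviving_fraction(half_life_years: float, elapsed_years: float) -> float:
    """Expected share of cases in which a word is still unreplaced after
    elapsed_years, ASSUMING simple exponential decay (not stated in the article)."""
    return 0.5 ** (elapsed_years / half_life_years)


def implied_half_life(fraction_surviving: float, elapsed_years: float) -> float:
    """Half-life implied by seeing fraction_surviving after elapsed_years,
    under the same exponential-decay assumption."""
    return elapsed_years * math.log(0.5) / math.log(fraction_surviving)


if __name__ == "__main__":
    # The 900-year half-life quoted for 'throw', at one and two half-lives.
    print(surviving_fraction(900, 900))
    print(surviving_fraction(900, 1800))
    # Reading "50% unrecognisable 2,500 years ago" as a vocabulary-wide half-life.
    print(implied_half_life(0.5, 2500))
```

Under this assumption a word is as likely as not to survive one half-life, and the chance halves again with every further half-life, so a slowly evolving numeral and a fast-evolving preposition diverge very quickly in their expected lifespans.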

The IBM supercomputer at the University of Reading, known as ThamesBlue, is now one year old. Before it arrived, it took an average of six weeks to perform a computational task such as comparing two sets of words in different languages; now these same tasks can be executed in a few hours.

Professor Vassil Alexandrov, the University's leading expert on computational science and director of the University's ACET Centre, said: "The new IBM supercomputer has allowed the University of Reading to push to the forefront of the research community. It underpins other important research at the university, including the development of accurate predictive models for environmental use. Based on weather patterns and the amounts of pollutant in the atmosphere, our scientists have been able to pinpoint likely country-by-country environmental impacts, such as the effect airborne chemicals will have on future crop yields and cross-border pollution".

Caroline Isaac, Deep Computing Executive at IBM, said: "Supercomputers are enabling the world to become increasingly interconnected, instrumented and intelligent. We have now reached a tipping point in price/performance that's allowing breakthroughs in university research that were previously unimaginable".

Provided by University of Reading

User comments : 9

3.3 / 5 (9) Feb 26, 2009
"The oldest words in circulation today have been in use for at least 10,000 years, researchers added."

And how did they come to THAT conclusion?

There are NO 10,000-year-old written records.

This kind of careless writing, so common in this type of paper, makes us question the validity of the rest of their work.

Much better to say: 'MAY have been in use for at least 10,000 years'.
4 / 5 (4) Feb 26, 2009
I figured the oldest words would be "Not tonight, I have a headache"...
4.3 / 5 (8) Feb 26, 2009
There are ways of dating pieces of a language that don't rely on written records. Archaeology and studies of human migration can give a lot of clues.

For example, suppose a word is found in the languages of two groups of people. If archaeological evidence shows that the two groups parted ways 8,000 years ago then you know the word is likely to be at least 8,000 years old.

Same goes for grammar- how are sentences structured? what order do the words come? does the language use tenses and genders? Distinctive features like this can be used to demonstrate a relationship between languages, even if the words themselves have changed.

It is an intricate and subtle business to extract facts out of the jumble of languages, but it is not guesswork.
1.4 / 5 (10) Feb 26, 2009
4.3 / 5 (3) Feb 27, 2009

Good Neil, that's a noun.
2.3 / 5 (3) Feb 27, 2009
El_Nexus, you say:

"It is an intricate and subtle business to extract facts out of the jumble of languages, but it is not guesswork."

And yet you use the word "likely" ("the word is likely to be at least 8,000 years old"). That sounds a lot like guesswork or supposition.
3 / 5 (1) Feb 27, 2009
mvg ---

likely is an adjective, so per the write-up it is "likely" to be much, much younger than 8000, probably to 1000, but that is also likely to be just an opinion. ;-) Be happy and multiply --- or at least have fun trying
4.2 / 5 (5) Feb 27, 2009
A word could be the same in two languages by coincidence (the Japanese word "so" means the same as ours, I'm told, but Japanese and English are not related). Or it could be a "Wanderwort", a word that starts in one language and migrates to another through trade and diplomacy between the two cultures. "Tomato" is such a word in our language- the tomato and the word for it both come from South America.

I suppose if you wanted to demonstrate a link between two languages based on one common word, it'd be dismissed as guesswork. But once you get fifty or a hundred common words, you've got a powerful argument for a link between the languages. Then you can start investigating for common grammatical structures, end up finding more common words and perhaps discarding a few of your original set that you're no longer so sure about. It's a process of refinement and of course there will always be a degree of uncertainty. There are controversies and arguments, just like in any field of study. But a wild stab in the dark? No.
4 / 5 (2) Feb 28, 2009

I do sincerely thank you for your last comment-and I truly have no doubt that there are linkages between languages. What I find impossible to justify is the placing of absolute (or nearly absolute) dates on linguistic branching which probably occurred in prehistoric times. (the article refers to 30,000 and 10,000 years ago). There is NO way we can infer what languages these people may have spoken, without guessing--yes we can genetically match with some degree of accuracy some migration routes--but we have no way of KNOWING that there is a one to one lineup between the genetic and linguistic mapping that we may generate.

Even during the early historical period (from about 2000 BC) there have been several lingua francas that have been imposed or adopted by peoples of quite diverse racial backgrounds.

In modern times (the last 200 years) English has rapidly become the lingua franca of the Indian subcontinent (and in the 20th century, to some extent, the world)--what would that do to some scholar's conclusions in the far distant future (say 30,000 years) of the commonality between, say, English and Hindi?

Such changes are so erratic in history--some so rapid (a few hundred years)--or so slow (thousands)--and governed by events so unpredictable--(even during periods for which we have written records)--I find it impossible to believe that such dating of prehistoric events is anything more than (educated?) guessing.

I don't mind reading about someone's theory of how things may have occurred--but it does seem a bit of "hubris" to state as fact what is said to have occurred 30,000 years ago--when it is actually only one of several POSSIBLE conclusions.