Linguists to re-think reason for short words
January 25, 2011 by Lin Edwards
(PhysOrg.com) -- Linguists have thought for many years that the length of words is related to their frequency of use, with short words used more often than long ones. Now researchers in the US have shown that word length is more closely related to the amount of information words carry than to their frequency of use.
A link between the length of words and how frequently they are used was first proposed in 1935 by George Kingsley Zipf, a Harvard University linguist and philologist. Zipf's idea was that people would tend to shorten words they used often, to save time in writing and speaking. The relationship seems intuitive, and it appears to hold in many languages: short words such as the, a, to, and, so (and their equivalents in other languages) are used frequently.
Researchers at the Massachusetts Institute of Technology (MIT), led by Steven Piantadosi, tested the Zipf relationship by analyzing word use in 11 European languages. They analyzed digitized texts for correlations between words by counting how often all pairs of words occurred in sequence. This information was then used to estimate the probability of a word occurring after a given previous word or sequence of words. They assumed that the more predictable a word is, the less information it conveys, and estimated information content using information theory, in which the information a word carries is proportional to the negative logarithm of its probability of occurring.
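As a rough illustration of that calculation, the Python sketch below counts word pairs in a tiny made-up corpus, estimates the probability of each word given the word before it, and averages the negative base-2 logarithm of that probability over every occurrence of the word. The function name `average_surprisal`, the toy corpus, and the use of unsmoothed pair counts are assumptions made for this example; it is not the researchers' code, and their analysis also used longer sequences of preceding words.

```python
import math
from collections import Counter, defaultdict


def average_surprisal(tokens):
    """Return each word's mean information content in bits.

    One occurrence contributes -log2 P(word | previous word), where the
    conditional probability is estimated from raw pair counts (no smoothing).
    The first token has no preceding word and is not scored.
    """
    context_counts = Counter(tokens[:-1])              # times each word acts as context
    bigram_counts = Counter(zip(tokens, tokens[1:]))   # times each ordered pair occurs

    totals = defaultdict(float)   # summed information per word
    occurrences = Counter()       # scored occurrences per word
    for prev, word in zip(tokens, tokens[1:]):
        p = bigram_counts[(prev, word)] / context_counts[prev]
        totals[word] += -math.log2(p)
        occurrences[word] += 1
    return {w: totals[w] / occurrences[w] for w in totals}


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
info = average_surprisal(corpus)
```

In this toy corpus "the" always follows "on" or "and" and is fully predictable there, so it scores 0 bits, while "cat", "mat", "dog" and "rug" each follow "the" one time in four and score 2 bits. A corpus this small says nothing about real languages; it only shows the arithmetic.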
Piantadosi said if the word length is directly related to information content this would make the transmission of information through language more efficient and also make speech and written texts easier to understand. This is because shorter words, carrying less information, would be scattered through the speech, essentially smoothing out the information density and delivering the important information at a steady rate.
The studies suggest that the short words are in fact the least informative and most predictable words rather than the most often used, and that word length is more closely related to the information the words contain.
The paper is soon to be published in the Proceedings of the National Academy of Sciences (PNAS). Steven Piantadosi is in the PhD program at MIT's Department of Brain and Cognitive Sciences.
More information: Piantadosi, S. T., et al. Proceedings of the National Academy of Sciences (2011). PNAS paper will appear online at http://dx.doi.org/ … s.1012551108
© 2010 PhysOrg.com
Jan 25, 2011
Rank: not rated yet
One example of simple redundancy is the prepositions [in, at, on, near, into, out, etc.], which can sometimes add essential clarity to utterances but often do not add much meaning. And in European languages, for example, there are 'necessary' agreements - embodied as inflections - between adjectives and the nouns they qualify. The success of English, with its relatively few such inflections [compared to Russian, for example], shows that much of this is just redundant. So why is it there?
Jan 25, 2011
Rank: not rated yet
Shouldn't this scientist consider all possible explanations? Certainly compound words and words with suffixes and prefixes are going to be longer and convey more meaning through building blocks (Grandmother, defensive, flammable, chemical names).
The usage argument still applies, because less "meaningful" words are necessary to use more often in speech to give it context.
Sure better examples
or
I'm sure that there are better examples than this one
Longer words with more information have to be used less, because their specific meanings don't apply as often. I would use 'inflammable' less than 'to' because it doesn't apply in as many situations.
And... let's face it, we think of simple words as simple things. Car and automobile mean the same thing, even though this study would consider one to have more meaning.
Jan 26, 2011
Rank: not rated yet
(e.g. "the tire is in the trunk" vs. "the tire is on the trunk")
The information-carrying capacity of words (especially the short ones) can't be judged only by probability of occurrence, since they are always embedded (quite literally) in a context.
Jan 27, 2011
Rank: 5 / 5 (1)
Long ago, when I was a kid, the wonder of television revealed to my little British soul that Americans always said "automobile" when they meant "car". Nowadays I never hear the word automobile in an American movie or TV program; it is always "car".
Perhaps this change has something to do with the [apparent] fact that Americans also commonly use[d?] the word "car" when referring to railway carriages and trams [as in "Streetcar named Desire"]. The decline of railways as a people-transport medium may have opened the way for car to displace automobile as it has.
Jan 30, 2011
Rank: not rated yet
"De gustibus non disputandum est" covers a lot of territory in few words.
Feb 01, 2011
Rank: not rated yet
The year I studied Latin [care of some aged soul we called "Yob"] I achieved an overall mark of 36%. Luckily nobody of importance to me thought that this was of any particular importance.
On-line translators this evening seem to imply that "est" in the quote should come before "disputandum". 36% notwithstanding, I think that makes better sense too ... :-)
Feb 01, 2011
Rank: not rated yet
Online translators cannot be trusted for more than single words.
In fact, you have to realize that ancient times were essentially times of remembering the spoken word. No citation databases, no books "Latin for dummies". Thus, the sound of the spoken word was of utmost importance. (The oldest European literature, Odyssey and Iliad, is completely written in hexameters.)
And now listen to yourself reciting first "de gustibus non disputandum est" and then "De gustibus non est disputandum". The first is a hexameter, sticking to the ear, while the second is of the kind you have forgotten before the next adage.
Feb 06, 2011
Rank: not rated yet