Decoding the dictionary: Study suggests lexicon evolved to fit in the brain

Apr 30, 2008

The latest edition of the Oxford English Dictionary boasts 22,000 pages of definitions. While that may seem far from succinct, new research suggests the reference manual is meticulously organized to be as concise as possible — a format that mirrors the way our brains make sense of and categorize the countless words in our vast vocabulary.

“Dictionaries have often been thought of as a frustratingly tangled web of words where the definition of word A refers users to word B, which is defined using word C, which ends up referring users back to word A,” said Mark Changizi, assistant professor of cognitive science at Rensselaer Polytechnic Institute. “But this research suggests that all words are grounded in a small set of atomic words — and it’s likely that the dictionary’s large-scale organization has been driven over time by the way humans mentally systematize words and their meanings.”

Dictionaries are built like an inverted pyramid. The most complex words (e.g., “albacore” and “antelope”) sit at the top and are defined by more basic words that sit lower on the pyramid. Eventually all words are linked to a small number of “atomic words” (such as “act” and “group”) that are so fundamental they cannot be defined by simpler terms. The number of levels of definition it takes to get from a word to an atomic word is called the “hierarchical level” of the word.
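To make the idea of a hierarchical level concrete, here is a minimal Python sketch. The toy definitions are invented for illustration and are not taken from any real dictionary. The rule that a word sits one level above the highest-level word in its definition is an assumption about how levels are counted, not a detail confirmed by the study.

```python
# Hypothetical toy dictionary: each word maps to the words used in its
# definition. A word with an empty definition is treated as atomic.
# Every word used in a definition must have its own entry.
TOY_DEFINITIONS = {
    "act": [],
    "group": [],
    "animal": ["act", "group"],
    "fish": ["animal"],
    "tuna": ["fish", "group"],
    "albacore": ["tuna", "fish"],
}


def hierarchical_levels(definitions):
    """Return {word: level}, with atomic words at level 0.

    Assumption: a word's level is one more than the highest level among
    the words in its definition. A circular definition raises ValueError.
    """
    levels = {}
    in_progress = set()

    def level_of(word):
        if word in levels:
            return levels[word]
        if word in in_progress:
            raise ValueError(f"circular definition involving {word!r}")
        in_progress.add(word)
        parts = definitions[word]
        result = 0 if not parts else 1 + max(level_of(p) for p in parts)
        in_progress.discard(word)
        levels[word] = result
        return result

    for word in definitions:
        level_of(word)
    return levels


levels = hierarchical_levels(TOY_DEFINITIONS)
print(levels)
```

In this toy example “act” and “group” sit at level 0 and “albacore” ends up at level 4, since its definition chain runs through “tuna,” “fish” and “animal” before reaching an atomic word.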

Changizi’s research, which was published online this week and will appear in the June print edition of the Journal of Cognitive Systems Research, indicates that the dictionaries we use every day have approximately the optimal number of hierarchical levels. According to Changizi, they also provide a visual roadmap of how the lexicon itself has culturally evolved over tens of thousands of years to lower the overall “brain space” required to encode it.

Many other human inventions — such as writing and other human visual signs — have been designed either explicitly or via cultural selection over time so as to minimize their demands on the brain, Changizi said.

Starting from the estimates that the most complex words in the dictionary total around 100,000 different terms and that the number of atomic words ranges from 10 to 60, Changizi carried out a series of calculations to derive three signature features that should be present in the most efficient dictionaries, as well as in their human counterpart, the brain.

Most importantly, he found that the total number of words across all the definitions in the dictionary (and thus the size of the dictionary) depends on the number of hierarchical levels present. Optimal dictionaries should have approximately seven hierarchical levels, according to Changizi.

“The presence of around seven levels of definition will reduce the overall size of the dictionary, so that it is about 30 percent of the size it would be if there were only two hierarchical levels,” Changizi said.

Additionally, an efficient dictionary should have progressively more words at each successive hierarchical level, and each hierarchical level should contribute mostly to the definitions of the words just one level above it, according to Changizi, who put his three predictions to the test by studying actual dictionaries.
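The two signatures described in this paragraph can be checked mechanically once every word has a level. The sketch below does so for a small invented word list. The definitions, the precomputed levels and the function names are all hypothetical, and this is not the procedure used in the study.

```python
from collections import Counter

# Hypothetical toy data: word -> words used in its definition.
DEFINITIONS = {
    "act": [],
    "group": [],
    "move": ["act"],
    "set": ["group"],
    "team": ["group", "act"],
    "run": ["move"],
    "herd": ["set", "move"],
    "squad": ["team", "group"],
    "series": ["set"],
}

# Assumed precomputed hierarchical levels (atomic words at level 0).
LEVELS = {
    "act": 0, "group": 0,
    "move": 1, "set": 1, "team": 1,
    "run": 2, "herd": 2, "squad": 2, "series": 2,
}


def words_per_level(levels):
    """Count how many words sit at each hierarchical level."""
    return dict(sorted(Counter(levels.values()).items()))


def contribution_gaps(definitions, levels):
    """Count definition uses by level gap.

    The gap is the level of the defined word minus the level of the
    word used in its definition. A gap of 1 means the word used sits
    exactly one level below the word being defined.
    """
    gaps = Counter()
    for word, parts in definitions.items():
        for part in parts:
            gaps[levels[word] - levels[part]] += 1
    return dict(sorted(gaps.items()))


counts = words_per_level(LEVELS)
gaps = contribution_gaps(DEFINITIONS, LEVELS)
print(counts)  # words at each level
print(gaps)    # definition uses grouped by level gap
```

Here the word counts grow from 2 to 3 to 4 across levels 0, 1 and 2, and 9 of the 10 definition uses have a gap of one level, which is the pattern the two signatures describe.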

The Oxford English Dictionary and WordNet, a large online lexical database of English developed at Princeton University, were found to possess all three signatures of an economically organized dictionary, meaning they are organized so as to economize the amount of dictionary space required to define the lexicon, according to Changizi.

“Somehow, over centuries, these revered reference books have achieved near-optimal organization,” Changizi said. “That optimality can likely be attributed to the fact that cultural selection pressures over time have shaped the organization of our lexicon so as to require as little mental space and energy as possible.”

Changizi believes his research has potential applications in the study of childhood learning, where scientists could analyze how students learn vocabulary words and possibly develop ways to optimize that learning process.

Source: Rensselaer Polytechnic Institute

User comments : 2

photojack
not rated yet Apr 30, 2008
Perhaps Changizi was either consciously or with a subliminal mental connection, referring to Abraham Maslow's hierarchical pyramid used to illustrate his concept of self-actualization! Excellent research with a superb conclusion and a really important goal to optimize the educational process of vocabulary acquisition and more. Bravo! I'd rather see "atomic words" than atomic bombs.
Corban
not rated yet May 01, 2008
Changizi has brought a unique perspective that I attribute to the fact that he's Chinese; the Chinese dictionary is organized around atomic components called 'radicals' as well. He's essentially superimposing that paradigm onto its Western cousin.

Question is: does it work?