Identifying the meaning of words with multiple meanings, without using their semantic context

Jul 03, 2013

Two Brazilian physicists have devised a method to automatically determine the intended meaning of words with several senses, based solely on their patterns of connectivity with nearby words in a given sentence, and not on semantics. Thiago Silva and Diego Amancio from the University of São Paulo, Brazil, show in a paper about to be published in the European Physical Journal B how they modelled classic texts as complex networks in order to derive the meaning of such words. This type of model plays a key role in several natural language processing tasks such as machine translation, information retrieval, content analysis and text processing.

In this study, the authors chose a set of ten so-called polysemous words (words with multiple meanings) such as bear, jam, just, rock or present. They then examined the patterns of connectivity of these words with nearby words in the text of literary classics such as Jane Austen's Pride and Prejudice. Specifically, they built a network model in which nodes represent words, and an edge connects two nodes if the corresponding words are adjacent in the text.
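
The network construction described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' implementation: the function name build_adjacency_network, the regular-expression tokenizer and the sample sentence are all hypothetical, and any preprocessing the paper may apply (such as removing common words or handling sentence boundaries) is left out.

```python
import re
from collections import defaultdict


def build_adjacency_network(text):
    """Map each word to the set of words that appear next to it in the text."""
    # Assumption: lowercase the text and split on anything that is not a
    # letter or an apostrophe. The paper's own tokenization may differ.
    tokens = re.findall(r"[a-z']+", text.lower())
    network = defaultdict(set)
    for left, right in zip(tokens, tokens[1:]):
        if left != right:  # skip self-loops from repeated words
            network[left].add(right)
            network[right].add(left)
    return dict(network)


# Hypothetical sample text with two senses of "bear".
sample = "The bear sat on the rock. I cannot bear the noise."
network = build_adjacency_network(sample)
print(sorted(network["bear"]))
```

In this toy network both senses of "bear" collapse into a single node, which is why the connectivity around each individual occurrence has to be examined to tell the senses apart.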

The authors then compared the results of their disambiguation exercise with the traditional semantic-based approach. They observed significant accuracy rates in identifying the appropriate meanings with both techniques. The approach described in this study, based on a so-called deterministic tourist walk characterisation, can therefore be considered a complementary methodology for distinguishing between word senses.
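
For readers unfamiliar with the term, a deterministic tourist walk is commonly described as follows: a walker repeatedly moves to the nearest neighbouring node that it has not visited within its last few steps, and the resulting path consists of a transient part followed by a repeating cycle (the attractor). The sketch below illustrates that general idea only. The article does not say how distances between words are defined or which features of the walks are used, so the edge costs, the memory convention, the tie-breaking rule and the function name tourist_walk are all assumptions made for illustration.

```python
from collections import deque


def tourist_walk(graph, start, memory=1, max_steps=1000):
    """Walk to the cheapest neighbour not among the last `memory` visited nodes.

    `graph` maps node -> {neighbour: cost}. Returns (path, transient, period);
    a period of 0 means the walker got stuck with no allowed move.
    """
    recent = deque([start], maxlen=memory)  # includes the current node
    path = [start]
    seen = {tuple(recent): 0}  # walker state -> step at which it first occurred
    current = start
    for step in range(1, max_steps + 1):
        candidates = [(cost, node) for node, cost in graph[current].items()
                      if node not in recent]
        if not candidates:
            return path, len(path) - 1, 0
        _, current = min(candidates)  # ties broken alphabetically (assumption)
        recent.append(current)
        path.append(current)
        state = tuple(recent)
        if state in seen:  # same state again: the walk is now periodic
            transient = seen[state]
            return path, transient, step - transient
        seen[state] = step
    return path, None, None  # no attractor found within max_steps


# Hypothetical toy network with made-up edge costs.
graph = {
    "jam": {"traffic": 2.0},
    "traffic": {"jam": 2.0, "heavy": 1.0},
    "heavy": {"traffic": 1.0},
}
path, transient, period = tourist_walk(graph, "jam", memory=1)
print(path, transient, period)
```

Starting from "jam", the walker takes one transient step and then bounces between "traffic" and "heavy". Quantities such as the transient length and the attractor period are the kind of walk-derived measurements that could characterise the neighbourhood of a word.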

In future work, the authors plan to devise new measures that connect not only adjacent words, but also words within a given interval, in order to enhance the ability of the model to grasp semantic factors. This approach is supported by another recent study by the same authors, showing that traditional measures mainly depend on syntax.
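
One simple way to picture connecting words "within a given interval" is a sliding window over the text. The sketch below is a guess at what such an extension could look like, not something taken from the paper: the function name build_window_network and the window parameter are hypothetical, and a window of 1 reduces to linking adjacent words only.

```python
from collections import defaultdict


def build_window_network(tokens, window=2):
    """Link each word to every word up to `window` positions after it."""
    network = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:  # skip self-loops
                network[word].add(other)
                network[other].add(word)
    return dict(network)


# Hypothetical sample sentence, already split into words.
tokens = "the bear sat on the rock".split()
adjacent_only = build_window_network(tokens, window=1)
wider = build_window_network(tokens, window=2)
print(sorted(adjacent_only["bear"]), sorted(wider["bear"]))
```

With the wider window, "bear" gains a link to "on", a word it is near but not next to, which is the kind of extra context such an extension would bring into the network.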

More information: T. C. Silva and D. R. Amancio (2013), Discriminating word senses with tourist walks in complex networks, European Physical Journal B, DOI 10.1140/epjb/e2013-40025-4. http://link.springer.com/article/10.1140/epjb/e2013-40025-4
