Investigating documents in depth

Feb 09, 2006

Keyword searches in text databases are a standard procedure today. Related content in different documents can now be analyzed on numerous levels using the software tool SWAPit. Researchers will be demonstrating at CeBIT how football news can be evaluated.

Does Ballack actually play any better now that he has signed a lucrative advertising contract? Or has his performance deteriorated instead? Has the disagreement between Kahn and Lehmann improved the two goalkeepers’ performance, or are they tending to stop fewer balls than before? And what effect does this have on their clubs? If scoop-hungry reporters are to assess these issues on a sound factual basis, rather than just relying on their gut feeling, they need to compare the news in sports magazines with up-to-date statistics, club communications and articles in the tabloids.

Such multi-layered analyses can now be prepared semi-automatically, using the software tool SWAPit developed by scientists at the Fraunhofer Institute for Applied Information Technology FIT in Sankt Augustin near Bonn. This tool makes it possible to discover related content in textual data at a glance, revealing any associated additional information.

“The name SWAPit is derived from the verb ‘to swap’,” explains Andreas Becks of the FIT. “The program challenges users to look at textual information from alternative points of view, enabling them to compare supplementary information related to the documented topics.” To make this possible the tool presents collections of texts as a kind of map, in which similar texts are grouped into clusters. When a user clicks on one of these clusters, the shared features are displayed on the monitor in a field immediately adjacent to the map. “These additional ways of looking at information allow users to analyze their data much more fully. They can compile statistics and discern patterns that were not evident before,” Becks emphasizes.
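
The idea of grouping similar texts and then showing what a group has in common can be illustrated with a minimal Python sketch. It is not SWAPit's actual method, which the researchers do not describe here. The sketch assumes a simple word-overlap (Jaccard) similarity and a greedy grouping rule, and the headlines, the stop-word list, the threshold and all function names are hypothetical.

```python
import re

# Hypothetical stop-word list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "in", "of", "and", "his", "for", "to", "on", "at"}

def tokens(text):
    """Return the set of lower-case words in a text, minus stop words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def jaccard(a, b):
    """Word overlap between two word sets, from 0.0 (none) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(docs, threshold=0.2):
    """Greedy grouping: a text joins the first cluster whose first member
    is similar enough; otherwise it starts a new cluster."""
    bags = [tokens(d) for d in docs]
    clusters = []  # each cluster is a list of indices into docs
    for i, bag in enumerate(bags):
        for members in clusters:
            if jaccard(bag, bags[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def shared_terms(docs, members):
    """Words that occur in every text of a cluster (the 'shared features')."""
    return sorted(set.intersection(*(tokens(docs[i]) for i in members)))

# Invented example headlines, not real data.
docs = [
    "Ballack scores twice after signing advertising contract",
    "Ballack advertising contract worth millions",
    "Kahn and Lehmann argue over goalkeeper spot",
    "Lehmann replaces Kahn as goalkeeper",
]

groups = cluster(docs)
for members in groups:
    print(members, shared_terms(docs, members))
```

With these four invented headlines, the two Ballack texts end up in one group and the two goalkeeper texts in another, and the shared terms of each group play the role of the field shown next to the map.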

Press research is just one possible application of the method known as integrated text and data mining. Other ways of using this software might be to analyze patents for research planning, examine documents on segments of the market or evaluate inquiries at service centers. “But at one point we even had an interdisciplinary cultural project in which SWAPit solved communication problems,” Becks reports. “It showed us how differently various disciplines define the same term.”

The researchers have already tested their prototype with industrial partners in a wide range of sectors. It is compatible with standard text formats such as doc, pdf and html, but could easily be extended to cover other formats if required for concrete marketing purposes, Becks assures us. Interested parties can learn more details at CeBIT in Hanover from March 9 to 15.

Source: Fraunhofer-Gesellschaft

