The National Science Foundation (NSF)-funded Action Science Explorer (ASE) lets users search thousands of academic papers at once, using a visualization method that determines how papers are connected, for instance by topic, date or author. The goal is to use these connections to identify emerging scientific trends and advances.
"We are creating an early warning system for scientific breakthroughs," said Ben Shneiderman, a professor at the University of Maryland (UM) and founding director of the UM Human-Computer Interaction Lab.
"Such a system would dramatically improve the capability of academic researchers, government program managers and industry analysts to understand emerging scientific topics so as to recognize breakthroughs, controversies and centers of activity," said Shneiderman. "This would enable appropriate allocation of funds, encourage new collaborations among groups that unknowingly were working on similar topics and accelerate research progress."
ASE is not itself a product, but rather "a scientific research study that shows some potent new features that could be added to bibliographic systems to support more powerful functions," said Shneiderman.
This project is unique and provides "powerful network visualization, integrated with search techniques, statistical methods and text analytics to provide automatic summarization of closely related document clusters," he said.
Shneiderman explained that ASE would be especially helpful to those who explore emerging topics, such as computer scientists who want to understand quantum computing or environmental researchers who want to explore new visualization techniques for encouraging energy conservation. ASE extends beyond science papers to include topics in any knowledge domain. The ideas behind ASE can also be applied to documents in languages other than English.
Using the ASE begins with a keyword search on a particular topic, which yields a list of related documents. Instead of having to read each document to ascertain how the search findings are related, relationships are shown through visual citation patterns. Users can interactively select clusters, see highlighted groups in textual lists, view statistics about search results and annotate their findings to share with colleagues. ASE can also cite specific key text found within the documents and compare specific text across multiple sources.
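The workflow described above, a keyword search followed by grouping the results according to their citation links, can be illustrated with a minimal Python sketch. This is not ASE's implementation: the toy corpus `PAPERS` and the functions `search` and `citation_clusters` are hypothetical names invented for illustration, and the clustering here is a plain connected-components pass rather than whatever method ASE actually uses.

```python
from collections import defaultdict, deque

# Hypothetical toy corpus (not ASE data): paper id -> (title, ids of papers it cites).
PAPERS = {
    "A": ("Dependency parsing with grammars", []),
    "B": ("Constraint-based dependency parsing", ["A"]),
    "C": ("Statistical dependency parsing", []),
    "D": ("Transition-based dependency parsing", ["C"]),
    "E": ("Quantum computing survey", []),
}


def search(keyword):
    """Return ids of papers whose title contains the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [pid for pid, (title, _) in PAPERS.items() if kw in title.lower()]


def citation_clusters(hits):
    """Group search hits into clusters linked by citations among the hits."""
    hit_set = set(hits)
    neighbors = defaultdict(set)
    # Treat each citation between two hits as an undirected link.
    for pid in hits:
        for cited in PAPERS[pid][1]:
            if cited in hit_set:
                neighbors[pid].add(cited)
                neighbors[cited].add(pid)

    clusters, seen = [], set()
    # Breadth-first search from every hit that has not been visited yet.
    for start in hits:
        if start in seen:
            continue
        seen.add(start)
        queue, cluster = deque([start]), []
        while queue:
            node = queue.popleft()
            cluster.append(node)
            for nxt in sorted(neighbors[node]):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        clusters.append(sorted(cluster))
    return clusters


if __name__ == "__main__":
    hits = search("dependency parsing")
    print(citation_clusters(hits))
```

In this toy example the search returns four papers that fall into two clusters with no citations between them, the kind of pattern a visual display makes apparent without reading each document.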
"Then users can generate an automatic summarization that we believe provides far more rapid, thorough and effective insight to emerging disciplines," said Shneiderman.
"While traditional search tools for scientific publication databases are still designed as single scrolling windows filled with text, we believe that modern information visualization practices, including graphical user interfaces, can produce breakthrough ideas," he said. "The research team brings together skills in search, text analytics and visualization, all focused on searching large databases of scientific publications, so as to accelerate the processes of scientific research."
For example, ASE was used to examine a specific area of natural language processing called dependency parsing (DP), a small field of computational linguistics dedicated to analyzing sentences based on which of their components depend on each other.
ASE shows how the field of DP evolved, by examining the publication dates of research papers on the subject. One of the findings is that the DP field evolved from two separate research groups.
The first DP research appeared from 1986 to 1998 and centered on a research group at the SITRA Foundation in Helsinki, Finland. Starting in 1996, a second group began publishing papers on DP as well.
However, aside from one paper that bridges the two groups, no papers cite the original SITRA group in later years. Instead, an explosion of research appears around the second group in 2006-2008 from conferences focused on DP.
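A finding of this kind, a single paper linking two otherwise separate groups, amounts to looking for citations that cross group boundaries. The short Python sketch below shows the idea under stated assumptions: the paper ids, group labels and citation lists are invented for illustration and are not the actual DP data, and `bridging_papers` is a hypothetical helper, not part of ASE.

```python
# Hypothetical data (not the real DP corpus): paper id -> research group label.
GROUP = {
    "p1": "first",
    "p2": "first",
    "p3": "second",
    "p4": "second",
    "p5": "second",
}

# Hypothetical citation lists: paper id -> ids of the papers it cites.
CITES = {
    "p2": ["p1"],
    "p3": ["p2"],        # the only citation that crosses from one group to the other
    "p4": ["p3"],
    "p5": ["p3", "p4"],
}


def bridging_papers(group, cites):
    """Return papers that cite at least one paper from a different group."""
    return sorted(
        paper
        for paper, refs in cites.items()
        if any(group[ref] != group[paper] for ref in refs)
    )


if __name__ == "__main__":
    print(bridging_papers(GROUP, CITES))
```

With these invented inputs only `p3` cites across groups, so it is the one paper reported as a bridge.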
Currently, the ASE project engages with commercial systems developers, professional societies and digital librarians to encourage them to adopt the system or add similar features to their own systems. Shneiderman also hopes that program managers will apply the project's design principles to further their understanding of existing and emerging scientific trends.
"In summary, we took a bold step forward, beyond what is supplied in existing systems such as specific digital libraries, commercial tools supplied by information publishers or specific research systems," said Shneiderman. "The integration of visual displays of network structures with search capabilities and text analytics yields powerful tools that we see as the next generation of search capabilities for scientific publication databases."
To read more about ASE research, see UM's Human-Computer Interaction Lab's tech report on the project.