(Phys.org)—Microsoft Research and Technion-Israel Institute of Technology have been working on software that can predict events. The pursuit could lead to a tool that can provide better information that goes beyond conclusions and forecasts drawn from human expertise, educated guesses, and intuition. The software might help mine data toward the goal of knowing when outbreaks of disease or outbreaks of violence could occur, among other kinds of information. The software collaboration has involved testing with over 20 years' worth of New York Times articles, taken from an archive from 1986 to 2007, along with various Web data sources, to establish better ways of seeing what leads to major events such as disease and violence.
Eric Horvitz, Distinguished Scientist and co-director of Microsoft Research, teamed up with Technion-Israel Institute's Kira Radinsky, a PhD researcher.
Their system was tested on data in which they found patterns and determined correlations between weather disasters, such as drought in Africa, and post-drought events such as cholera outbreaks. Following those weather events, alerts about a downstream risk of cholera could have been issued nearly a year in advance.
The researchers described the manner in which they crawled and parsed the archives of New York Times articles. "We say that a chain of events belongs to a domain D, if it consists one of the domain relevant words, denoted as wi(D). For example, for the challenge of predicting future deaths, we consider the words 'killed,' 'dead,' 'death,' and their related terms. For the challenge of predicting future disease outbreak, we consider all mentions of 'cholera,' 'malaria,' and 'dysentery.'"
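The domain test quoted above can be illustrated with a minimal Python sketch. It is not the researchers' code: the names DOMAIN_WORDS, tokenize and chain_in_domain are hypothetical, the word lists hold only the seed terms quoted in the paper excerpt (the "related terms" are left out), and an event is assumed to be a plain headline string.

```python
import re

# Seed words quoted in the paper excerpt; the "related terms" it mentions
# are not listed in the article, so they are omitted here (assumption).
DOMAIN_WORDS = {
    "death": {"killed", "dead", "death"},
    "disease": {"cholera", "malaria", "dysentery"},
}


def tokenize(text):
    """Lowercase a headline and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def chain_in_domain(chain, domain):
    """Return True if any event in the chain contains a domain-relevant word.

    chain  -- list of event headlines (plain strings, an assumption)
    domain -- a key of DOMAIN_WORDS
    """
    words = DOMAIN_WORDS[domain]
    return any(tokenize(event) & words for event in chain)


# Illustrative headlines invented for this sketch, not taken from the archive.
chain = ["Drought deepens across the region", "Cholera outbreak reported"]
print(chain_in_domain(chain, "disease"))
print(chain_in_domain(chain, "death"))
```

With these invented headlines, the chain falls in the disease domain because its second event mentions cholera, and it falls outside the death domain because none of that domain's seed words appear.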
While they are not alone in exploring the conditions surrounding disease outbreaks, the researchers noted that epidemiologists studying similar relationships frequently publish retrospective analyses rather than predictive studies. The two researchers are looking for a software tool that can guide better decisions for near-term actions.
Horvitz said the project will continue. He would like to mine more newspaper archives and digitized books. He is optimistic that a more refined version could, among other uses, assist experts at government agencies in planning humanitarian responses. "We've done some reaching out and plan to do some follow-up work with such people," he said.
In their research paper, "Mining the Web to Predict Future Events," Radinsky and Horvitz wrote that, "Beyond alerting about actionable situations based on increased likelihoods of forthcoming outcomes of interest, predictive models can more generally assist by providing guidance when inferences from data run counter to expert expectations."
They said they hoped their work would stimulate additional research on leveraging past experiences and human knowledge to provide valuable predictions about future events and interventions of importance.