If Twitter had been invented during Prohibition, Al Capone's criminal career might have been much shorter. That's one conclusion that could be drawn from Matthew Gerber's use of Twitter data to improve crime forecasting in modern-day Chicago.
A University of Virginia systems and information engineer, Gerber analyzed Twitter content originating in different areas across the Windy City and combined this analysis with traditional crime-forecasting methods. He found that the combined method was more accurate in predicting 19 out of 25 crime types in those locations than predictions based on historical patterns alone.
"Crimes are committed in a physical space," Gerber said. "Twitter gives us another way of looking at it."
Gerber's Twitter project is just one of several that fall under the auspices of his department's Predictive Technology Laboratory, which Gerber co-directs with Professor Donald Brown. "There is a common theme to the laboratory's work," he said. "We use data to create models that enable us to predict future outcomes. Our goal is to promote better decision-making."
One of the laboratory's thrusts is military and policing applications. For instance, Predictive Technology Laboratory researchers are helping the military predict insurgent activity. They are developing statistical models that combine data from previous attacks with geographic information, with the aim of not only predicting the location of the next assault but also shedding light on the attackers' thought processes. If insurgents move to a new area, the models will suggest when and where they might be active in the future.
Medical informatics is another area where the lab is making a contribution. Hospitals are now being penalized when patients are readmitted within 30 days, so they need to be able to identify patients who require more intensive follow-up care to keep them healthy. Researchers are developing techniques to identify patients at risk of readmission before they are released.
Gerber's application of Twitter to improve crime forecasting fits neatly into the laboratory's first thrust. It also builds on his dissertation research creating computer programs that automatically scan and understand text.
"I had tested them on Wall Street Journal articles," he said, "but when I came to the laboratory, we started thinking about how we could combine these two areas, crime prediction and language. We very quickly came up with the idea of investigating the potential of social media."
Twitter was an ideal choice because the company allows anyone to collect the GPS-tagged tweets authored in a designated area. Gerber chose Chicago and began by collecting more than 1.5 million public tweets tagged with GPS coordinates between January and March 2013. He also collected a data set of Chicago crime records for the period. Finally, he divided Chicago into square-kilometer neighborhoods and mapped the tweets and crime records onto this grid.
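The gridding step can be illustrated with a minimal Python sketch. It is not Gerber's code: the function names, the south-west origin point and the flat-earth distance approximation are all assumptions made for illustration.

```python
import math
from collections import Counter

KM_PER_DEG_LAT = 111.32  # approximate kilometers per degree of latitude


def grid_cell(lat, lon, origin_lat, origin_lon, cell_km=1.0):
    """Return (row, col) of the grid cell containing a GPS point.

    Rows count cells north of the origin and columns count cells east of it.
    Uses an equirectangular approximation, which is an assumption that is
    reasonable only over a city-sized area.
    """
    north_km = (lat - origin_lat) * KM_PER_DEG_LAT
    east_km = (lon - origin_lon) * KM_PER_DEG_LAT * math.cos(math.radians(origin_lat))
    return (math.floor(north_km / cell_km), math.floor(east_km / cell_km))


def count_by_cell(points, origin_lat, origin_lon, cell_km=1.0):
    """Count how many (lat, lon) points fall in each grid cell."""
    return Counter(
        grid_cell(lat, lon, origin_lat, origin_lon, cell_km) for lat, lon in points
    )


# Hypothetical origin south-west of Chicago; the coordinates are illustrative only.
ORIGIN = (41.6, -87.95)
tweet_counts = count_by_cell([(41.65, -87.90), (41.651, -87.901)], *ORIGIN)
```

Running the same function over tweet coordinates and crime-record coordinates gives two sets of counts keyed by the same cells, which is what allows the two data sets to be compared neighborhood by neighborhood.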
At the end of the first month, Gerber combined the tweets from each neighborhood, identified topics of discussion and superimposed the crime data. For instance, the topic containing the words "center," "united," "blackhawks" and "bulls," all sports-related words in Chicago (the NBA's Chicago Bulls and the NHL's Chicago Blackhawks both play in the United Center), appeared often in neighborhoods with a high incidence of criminal damage.
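A rough sketch of how a topic might be lined up against crime counts follows. It makes two simplifying assumptions: the topic is a hand-picked word list taken from the example above rather than one learned automatically, and the association is measured with a plain Pearson correlation. The function names and the sample tweets are invented for illustration.

```python
import math
from collections import defaultdict

# The four words quoted in the article; a real topic would be learned from data.
TOPIC_WORDS = {"center", "united", "blackhawks", "bulls"}


def topic_share_by_cell(tweets):
    """tweets: iterable of (cell_id, text).

    Returns {cell_id: fraction of that cell's words that are topic words}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for cell, text in tweets:
        for word in text.lower().split():
            totals[cell] += 1
            # Strip common punctuation before checking topic membership.
            hits[cell] += word.strip(".,!?#@\"'") in TOPIC_WORDS
    return {cell: hits[cell] / totals[cell] for cell in totals}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


# Invented sample tweets, keyed by a hypothetical neighborhood label.
shares = topic_share_by_cell([
    ("A", "Bulls win at the United Center!"),
    ("B", "coffee and a quiet walk"),
])
```

Passing each neighborhood's topic share and its count of a given crime type to `pearson` yields a single number between -1 and 1. As Gerber notes, a high value shows that the two move together, not why.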
Gerber then applied these correlations in conjunction with traditional forecasting techniques to predict crimes across the city for the following month. At the end of February, he discovered that his combined approach produced a sharper forecast for most crimes.
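The article does not say which statistical model combined the two signals. As one possible illustration, the sketch below fits a small logistic regression on two features per neighborhood: a historical crime density and a topic share. The model choice, the feature names and the toy numbers are all assumptions, not Gerber's data or method.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, lr=0.5, epochs=5000):
    """Fit a logistic regression by batch gradient descent.

    rows: list of (historical_density, topic_share, crime_next_month),
    where crime_next_month is 0 or 1.
    Returns (bias, weight_history, weight_topic).
    """
    b = w_h = w_t = 0.0
    n = len(rows)
    for _ in range(epochs):
        g_b = g_h = g_t = 0.0
        for hist, topic, label in rows:
            err = sigmoid(b + w_h * hist + w_t * topic) - label
            g_b += err
            g_h += err * hist
            g_t += err * topic
        b -= lr * g_b / n
        w_h -= lr * g_h / n
        w_t -= lr * g_t / n
    return b, w_h, w_t


def predict(weights, hist, topic):
    """Estimated probability of a crime in a cell with the given features."""
    b, w_h, w_t = weights
    return sigmoid(b + w_h * hist + w_t * topic)


# Invented toy data: (historical density, topic share, crime occurred next month).
TOY_CELLS = [
    (0.1, 0.0, 0), (0.2, 0.1, 0), (0.3, 0.0, 0),
    (0.8, 0.1, 1), (0.9, 0.0, 1), (0.7, 0.7, 1),
    (0.2, 0.9, 1), (0.1, 0.8, 1),
]
weights = fit_logistic(TOY_CELLS)
```

In this toy data, two cells with little crime history but heavy topic activity still saw a crime, so the fitted model scores such cells higher than a history-only forecast would. That is the kind of gain a combined approach is meant to capture.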
"My methods don't explain why specific topics and crimes are correlated," Gerber said. "Certainly most people aren't going to tweet about crimes they intend to commit or have just committed. But their plans for an evening, for example, might set the stage for illegal activities." For instance, rowdy sports fans sometimes damage public property after a game.
After learning of the Chicago results, the New York Police Department contacted Gerber and asked him to collect tweets from the boroughs of Brooklyn and Queens to see whether crime predictions could be improved there as well. The next step for Gerber is to see if using this technology to shape patrol routes and officer allocations will actually reduce crime.
"We have this predictive tool," Gerber said. "I'd like to know if it helps us in real life. If we can demonstrate this, it will give police departments a reason to adopt it."