September 30, 2016

Using Big Data to monitor societal events shows promise, but the coding tech needs work

In the age of Big Data, automated systems can track societal events on a global scale. These systems code and collect vast stores of real-time "event data"—happenings gleaned from news articles covering everything from political protests to ecological shifts around the world.

In new research published Thursday in the journal Science, Northeastern network scientist David Lazer and his colleagues analyzed the effectiveness of four global-scale databases and found they are falling short when tested for reliability and validity.

Misclassification and duplication

The fully automated systems studied were the Integrated Crisis Early Warning System, or ICEWS, maintained by Lockheed Martin, and the Global Database of Events, Language, and Tone, or GDELT, developed and run out of Georgetown University. The others were the hand-coded Gold Standard Report, or GSR, generated by the nonprofit MITRE Corp., and the Social, Political, and Economic Event Database, or SPEED, at the University of Illinois, which uses both human and automated coding.

First the researchers tested the systems' reliability: How well did they detect the same protest events in Latin America? The answer was "not very well." ICEWS and GDELT, they found, rarely reported the same protests, and ICEWS and SPEED agreed on just 10.3 percent of them.
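To make the idea of agreement between two event databases concrete, here is a minimal Python sketch. It is an illustration only, not the method used in the study: it assumes each event can be reduced to a hypothetical (country, date) key and measures agreement as the share of all distinct events that both systems report (Jaccard overlap). The records are invented toy data.

```python
from datetime import date


def agreement_rate(events_a, events_b):
    """Share of distinct events reported by either system that both report.

    Assumption: an event is identified by a hashable key such as
    (country, date). Real systems would need fuzzier matching.
    """
    a, b = set(events_a), set(events_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical toy records, not data from the study.
system_a = [
    ("Brazil", date(2013, 6, 17)),
    ("Chile", date(2011, 8, 4)),
    ("Mexico", date(2012, 7, 7)),
]
system_b = [
    ("Brazil", date(2013, 6, 17)),
    ("Argentina", date(2012, 11, 8)),
]

rate = agreement_rate(system_a, system_b)
print(f"Agreement: {rate:.1%}")
```

With these toy records, only one of four distinct events appears in both lists. Exact key matching is the simplest possible choice; a looser rule (for example, allowing dates to differ by a day) would raise the measured agreement.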

Next they assessed the systems' validity: Did the protest events reported actually occur? Here they found that only 21 percent of GDELT's reported events referred to real protests. ICEWS' track record was better, but the system reported the same event more than once, inflating the protest count.
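The two validity problems described here, reports that are not real protests and real protests counted more than once, can be separated with a small Python sketch. It assumes a hypothetical hand check in which each reported record has been given an event key and a true/false label; the keys, labels, and function name are invented for illustration and do not come from the study.

```python
def validity_summary(records):
    """Summarize a hand-checked sample of reported events.

    records: list of (event_key, is_real_protest) pairs. Both fields
    are hypothetical outputs of a manual review.
    """
    if not records:
        return {"precision": 0.0, "raw_count": 0, "unique_real_events": 0}
    real_keys = [key for key, is_real in records if is_real]
    return {
        # Share of reported records that describe a real protest.
        "precision": len(real_keys) / len(records),
        # What a naive count of reports would show.
        "raw_count": len(records),
        # Real protests after collapsing repeated reports.
        "unique_real_events": len(set(real_keys)),
    }


# Hypothetical toy sample, not data from the study.
checked = [
    ("BR-2013-06-17", True),
    ("BR-2013-06-17", True),   # same protest reported twice
    ("CL-2011-08-04", True),
    ("MX-2012-07-07", False),  # not actually a protest
]

summary = validity_summary(checked)
print(summary)
```

In this toy sample a raw count would show four protests, while only two distinct real protests remain once the false report is dropped and the duplicate is collapsed.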

The systems were also vulnerable to missing news. "If something doesn't get reported in a newspaper or a similar outlet, it will not appear in any of these databases, no matter how important it really is," says Lazer, Distinguished Professor of Political Science and Computer and Information Sciences, who also co-directs the NULab for Texts, Maps, and Networks.

"These global-monitoring systems can be incredibly valuable, transformative even," adds Lazer. "Without good data, you can't develop a good understanding of the world. But to gain the insights required to tackle global problems such as national security and climate change, researchers need more reliable event data."

And what about the reported protests that actually weren't protests at all? "Automated systems can misclassify words," says Lazer. For example, the word "protest" in a news article can refer to an actual political demonstration, but it can also refer to, say, a political candidate "protesting" comments from a rival candidate.

"It's so easy for us as humans to read something and know what it means," says Lazer. "That's not so for a set of computational rules."
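The misclassification Lazer describes is easy to reproduce with a deliberately naive rule. The Python sketch below is a simplified illustration, not the coding logic of any of the systems studied: it flags a sentence as a protest event whenever a form of the word "protest" appears. The pattern, function name, and example sentences are all invented for this purpose.

```python
import re

# Naive rule: any form of "protest" counts as a protest event.
PROTEST_PATTERN = re.compile(r"\bprotest(s|ed|ing|ers?)?\b", re.IGNORECASE)


def naive_is_protest_event(sentence: str) -> bool:
    """Return True if the sentence contains a form of the word 'protest'."""
    return PROTEST_PATTERN.search(sentence) is not None


sentences = [
    "Thousands gathered downtown to protest the fare increase.",  # real demonstration
    "The candidate protested comments made by her rival.",        # verbal objection
    "The central bank held interest rates steady.",               # unrelated
]

flags = [naive_is_protest_event(s) for s in sentences]
for sentence, flagged in zip(sentences, flags):
    print(flagged, "-", sentence)
```

The rule flags the second sentence even though no demonstration took place, which is the kind of error a human reader would avoid without effort.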

Analysis begets policy

The researchers' policy recommendations include building a community among scholars and forming multidisciplinary groups, within which teams could compete against one another to spur innovation.

"Transparency is key," says Lazer. In the best-case scenario, the development methods, the software, and the source materials would be available to everyone involved. "But many of the source materials have copyright protection, and so they can't be shared widely," he says. "So one question is: How do we develop a large publicly shareable corpus?"

Participants should be able to test their varying coding methods on open, representative sets of event data to see how the methods compare, Lazer says. Contests could serve as a catalyst. Finally, the researchers recommend establishing a consortium to balance the business needs of the news providers with the source needs of the developers and event-data users.

The authors suggest that reliable data-tracking systems can be used to build models that anticipate the escalation of conflicts, forecast the progression of epidemics, or trace the effect of global warming on the ecosystem.

More information: W. Wang et al. Growing pains for global monitoring of societal events, Science (2016). DOI: 10.1126/science.aaf6758

Journal information: Science

Provided by Northeastern University

Citation: Using Big Data to monitor societal events shows promise, but the coding tech needs work (2016, September 30) retrieved 11 July 2024 from https://phys.org/news/2016-09-big-societal-events-coding-tech.html

