Big-data visualization experts make using scatter plots easier for today's researchers

February 19, 2016
Participants grouped 247 scatter plots based on their similarity, taking into account the visible patterns and trends.

Scatter plots, which use horizontal and vertical axes to plot data points and show how strongly one variable is related to another, have long been employed by researchers in various fields, but the era of Big Data, and of high-dimensional data in particular, has made them harder to use. A data set containing thousands of variables, now exceedingly common, yields an unwieldy number of scatter plots, since each pair of variables can be plotted against each other. If a researcher is to extract useful information from the plots, that number must be winnowed in some way.
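
To give a sense of scale, the short Python sketch below counts the scatter plots a data set would produce. It assumes one plot for every unordered pair of variables, which is an illustrative convention rather than something stated in the study; the function name is hypothetical.

```python
from math import comb


def pairwise_plot_count(num_variables: int) -> int:
    """Count scatter plots, assuming one plot per unordered pair of variables."""
    return comb(num_variables, 2)


# Print the plot count for data sets of increasing width.
for num_variables in (10, 100, 1000):
    print(num_variables, pairwise_plot_count(num_variables))
# 10 45
# 100 4950
# 1000 499500
```

Under that assumption, a data set with 1,000 variables already yields nearly half a million plots, far more than anyone could inspect by eye.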

In recent years, algorithmic methods have been developed in an attempt to detect plots that contain one or more patterns of interest to the researcher analyzing them, thereby providing a measure of guidance as the data is explored. While those techniques represent an important step, little attention has been paid to validating their results by comparing them to those achieved when human observers and analysts view large sets of plots and their patterns.
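
As a rough illustration of how an algorithm can steer an analyst toward promising plots, the sketch below scores every pair of variables and sorts the pairs from most to least interesting. The score used here, the absolute Pearson correlation, is only a simple stand-in chosen for brevity; it is not the measure used in the study, and the function names and sample data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_plots(columns):
    """Return (score, var_a, var_b) tuples, strongest linear pattern first."""
    scored = [
        (abs(pearson(columns[a], columns[b])), a, b)
        for a, b in combinations(sorted(columns), 2)
    ]
    return sorted(scored, reverse=True)


# Made-up data: "a" and "b" are perfectly linear, "c" is unrelated noise.
data = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 1, 4, 2, 3],
}
ranked = rank_plots(data)
print(ranked[0][1], ranked[0][2])  # the pair an analyst would look at first
```

A score like this rewards only straight-line trends. Clusters, outliers, or curved shapes would need other measures, and whether any such score agrees with what people actually see in a plot is a separate question.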

Members of the data-visualization group at NYU's Tandon School of Engineering, headed by Professor Enrico Bertini, have conducted a study showing that results obtained through algorithmic methods, such as those known as scagnostics, do not necessarily correlate well with the perceptual judgments people make when asked to group scatter plots by similarity. While the team identified several factors that drive such perceptual judgments, they assert that further work is needed to develop perceptually balanced measures for analyzing large sets of plots, in order to better guide researchers in fields such as medicine, aerospace, and finance, who are regularly confronted with high-dimensional data.


More information: Rahul Singh et al. Towards human-computer synergetic analysis of large-scale biological data, BMC Bioinformatics (2013). DOI: 10.1186/1471-2105-14-S14-S10
