Powerful new software plug-in detects bugs in spreadsheets

October 24, 2014, University of Massachusetts Amherst
Doctoral student Dan Barowy's CheckCell plug-in automatically finds data errors in spreadsheets. He, Dimitar Gochev and their advisor Emery Berger at UMass Amherst developed it as a plug-in for Microsoft's Excel program. Credit: UMass Amherst

An effective new data-debugging software tool dubbed "CheckCell" was released to the public this week in a presentation by University of Massachusetts Amherst computer science doctoral student Daniel Barowy. He spoke at the premier international computer programming language design conference known as OOPSLA, in Portland, Ore.

CheckCell, which automatically finds data errors in spreadsheets, was developed as a plug-in for Microsoft's popular Excel program. Its release at the highly respected Object-Oriented Programming, Systems, Languages and Applications (OOPSLA) conference this week means that it is now freely available to anyone who wants to use it.

Spreadsheet data errors can be consequential, Barowy says. "Consider the case of a paper written by Harvard economists Carmen Reinhart and Kenneth Rogoff a couple of years ago. The paper was influential, lending credibility to government austerity measures in Europe and the United States. But in 2013, UMass Amherst economist Thomas Herndon and colleagues found, in combing through the data by hand, that methodological errors undermined Reinhart and Rogoff's argument. In particular, Reinhart and Rogoff exaggerated the impact of key data values in a spreadsheet."

The CheckCell group wondered whether software might be developed to find these kinds of errors automatically. The answer is a definite yes, says UMass Amherst School of Computer Science professor Emery Berger, Barowy's advisor. CheckCell found a number of the same errors that Herndon had found by hand.

Berger explains, "Our work for the first time combines data analysis and program analysis. Poor quality data costs everyone money. CheckCell helps users avoid costly mistakes."

He adds, "Basically, CheckCell identifies data points that have a big impact on the final result, even if the impact is super subtle and difficult to detect. CheckCell immediately flags data points that are very suspicious, the ones that deserve a second look. It's like having a helper who says, 'pay attention to these cells, they really matter.'"

For example, if a teacher has an "A" student who would be expected to get a 94 on a test and the spreadsheet says that student got a 49, CheckCell will flag it, the computer scientist says. "It tells you that you need to make sure this value is correct."

To develop CheckCell, Berger and graduate students Barowy and Dimitar Gochev used a combination of statistical analysis and data flow analysis to flag inputs that have an unusual impact on the program's output. They evaluated the procedure against a collection of real-world spreadsheets such as budgets and student grades. They introduced common errors into the spreadsheets, then asked the plug-in tool to find them.
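The article does not spell out CheckCell's algorithm, so what follows is only a minimal Python sketch of the general idea of flagging inputs by their impact on an output, not the tool's actual method. It assumes a single numeric input range and a hypothetical `formula` function standing in for the spreadsheet's formulas. The function names, the replace-with-the-mean strategy, the factor-of-three cutoff and the grade values are all illustrative assumptions.

```python
from statistics import mean, median

def impact_scores(inputs, formula):
    """Score each input by how far the output moves when that input is
    replaced with the mean of the other inputs (an assumed strategy)."""
    baseline = formula(inputs)
    scores = []
    for i in range(len(inputs)):
        others = inputs[:i] + inputs[i + 1:]
        trial = inputs[:i] + [mean(others)] + inputs[i + 1:]
        scores.append(abs(formula(trial) - baseline))
    return scores

def flag_unusual(inputs, formula, factor=3.0):
    """Return indices whose impact exceeds `factor` times the median
    impact. The cutoff rule is an assumption made for this sketch."""
    scores = impact_scores(inputs, formula)
    cutoff = factor * median(scores)
    return [i for i, score in enumerate(scores) if score > cutoff]

# Made-up grades for one student; the 49 at index 3 is a likely typo for 94.
grades = [92, 95, 94, 49, 93, 96, 91, 94]
flagged = flag_unusual(grades, mean)
print(flagged)
```

With these made-up grades, only the entry of 49 shifts the average far more than its neighbors do, so it is the one a user would be asked to double-check.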

The technique uses what Berger calls "a threshold of unusualness." CheckCell marks hidden, high-impact data points in red and asks the designer to check them. If they are indeed correct, they turn green and will not be flagged in subsequent analyses, he notes.
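The review loop described here can be pictured with a small Python sketch. It is a hypothetical illustration rather than CheckCell's code: the cell names, impact scores and threshold value are invented, and `review` is an assumed helper name.

```python
def review(scores, threshold, verified):
    """Map each cell of interest to a color: 'green' if the user has
    already confirmed it, 'red' if its score exceeds the threshold."""
    colors = {}
    for cell, score in scores.items():
        if cell in verified:
            colors[cell] = "green"
        elif score > threshold:
            colors[cell] = "red"
    return colors

# Invented impact scores keyed by cell address.
scores = {"B2": 0.6, "B5": 5.6, "B7": 3.1}
threshold = 2.5   # the assumed "threshold of unusualness"
verified = set()  # cells the user has confirmed as correct

first_pass = review(scores, threshold, verified)

# The user checks B7 and confirms the value is right.
verified.add("B7")
second_pass = review(scores, threshold, verified)
print(first_pass, second_pass)
```

In this sketch, keeping confirmed cells in a separate set is what stops them from being flagged again on later runs.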

In the future, his team, working with UMass Amherst colleague Alexandra Meliou, plans to extend CheckCell's use to large-scale data sets.


More information: Download the plug-in free at: www.CheckCell.org


