Clone wars—finding buggy code copies

June 11, 2018 by Kris Foster, University of Saskatchewan
Chanchal Roy, associate professor in the Department of Computer Science. Credit: Kris Foster

Code is ubiquitous and most industries around the world rely on code-based software to keep day-to-day operations running, said Chanchal Roy, associate professor in the Department of Computer Science.

"The simplest functions use code, and bad code can have a massive impact," said Roy, who joined the College of Arts and Science in 2009. "Unfortunately, the way developers copy code can result in lots of bugs or errors, something my research addresses."

It is common practice for software developers to copy, paste and modify a fragment of existing code to suit the task or tool they are working on. This is called cloning, and the resulting code from the copy-and-paste process is, of course, called a clone.

"There are valid reasons why cloning is so common," said Roy, whose research is supported by a Natural Sciences and Engineering Research Council of Canada Accelerator Grant. "It saves time, there is low risk in using stable code, and it results in faster development. There is no need to reinvent the wheel."

The problem, Roy is quick to point out, is that often cloning code results in cloning unknown "bugs" as well, and these errors can spread quickly.

"If you have a bug in the original code, you are copying errors over and over again," he said. "Even if you find one instance of the bug, it is nearly impossible to find all of them … which results in a lot of industries using outdated code over new code that potentially has bugs."

In part because of the issues related to cloning and the resulting buggy clones, up to 85 per cent of the cost of software development can go towards software maintenance, including clone detection.

"It is a double-edged sword," said Roy. "Cloning is common because of the benefits to programmers, but clones can carry bugs that are also really troublesome."

Clone detection, an area to which Roy has dedicated a lot of research time, means finding similar code fragments in order to resolve bugs. In its simplest form, it is like doing a document search for specific words. In its most complex form, it is like searching for a needle in a haystack, especially if the original code has been modified (which is the most common form of cloning) and is in a program containing millions of lines of code.
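To make the difference between exact and modified clones concrete, here is a minimal sketch in Python. It is not NICAD or any of Roy's systems; the function names, the sample fragments and the 0.7 threshold are all assumptions chosen for illustration. It strips comments and extra whitespace from two fragments, then measures how many lines they share.

```python
import difflib
import re


def normalize(fragment):
    """Return the fragment's lines with comments, blank lines and extra spaces removed."""
    lines = []
    for line in fragment.splitlines():
        line = re.sub(r"#.*", "", line)           # naive: also cuts a '#' inside a string
        line = re.sub(r"\s+", " ", line).strip()  # collapse runs of whitespace
        if line:
            lines.append(line)
    return lines


def similarity(a, b):
    """Share of matching lines between two fragments, from 0.0 to 1.0."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_clone(a, b, threshold=0.7):
    """Treat two fragments as clones when their similarity reaches the threshold.

    The threshold value is an illustrative assumption, not a published setting.
    """
    return similarity(a, b) >= threshold


# Hypothetical fragments: an original, a copy-and-modify clone, and unrelated code.
original = """
def total(items):
    result = 0
    for item in items:
        result += item.price
    return result
"""

modified = """
def total(items):
    result = 0   # running sum
    for item in items:
        result += item.price * item.quantity
    return result
"""

unrelated = """
def greet(name):
    print("Hello, " + name)
"""

if __name__ == "__main__":
    print(is_clone(original, modified))
    print(is_clone(original, unrelated))
```

A plain text search for the original fragment would miss the modified copy, because one line changed and a comment was added. Comparing normalized lines against a threshold still pairs the two, which is the kind of "similar but not identical" match the article describes.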

To address this issue, Roy and his research collaborator James Cordy of Queen's University have developed a number of clone detection systems that search for similar fragments of code. There are two main criteria for a good clone detection system: precision, the share of reported clones that really are clones; and recall, the percentage of clones detected out of the total number of clones present. Roy and Cordy have developed the first clone detection system, called NICAD, that excels in both precision and recall.
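The two criteria can be worked through with a small sketch. The clone pairs below are made-up labels, not results from NICAD or any benchmark, and the function name is an assumption for illustration.

```python
def precision_recall(reported, actual):
    """Return (precision, recall) for a set of reported clone pairs.

    precision = correct reports / all reports
    recall    = correct reports / all clones that really exist
    """
    correct = len(reported & actual)
    precision = correct / len(reported) if reported else 0.0
    recall = correct / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical ground truth: four clone pairs exist in the code base.
actual_clones = {("A", "B"), ("A", "C"), ("D", "E"), ("F", "G")}

# Hypothetical detector output: three pairs reported, two of them real.
reported_clones = {("A", "B"), ("D", "E"), ("X", "Y")}

if __name__ == "__main__":
    precision, recall = precision_recall(reported_clones, actual_clones)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case two of the three reported pairs are real (precision of about 0.67) and two of the four real pairs were found (recall of 0.50). A detector can push one number up at the expense of the other, for example by reporting only exact copies, which is why doing well on both matters.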

"Once we define what similarities to search for, NICAD can detect modified clones," Roy said, noting that a great amount of human testing, including vetting over nine million cloned fragments, has gone towards ensuring the clone detection system is accurate.

Through his evaluation of clone detection, Roy has also become a world leader in benchmarking clone detection tools, with the development of the BigCloneBench benchmark.

The potential of Roy's clone detection systems and benchmarking work is not going unnoticed. Roy and Cordy have recently received two Most Influential Paper awards, in recognition of the "lasting impact of contributions made within the previous 10 years." Their work on benchmarking and their work on NICAD were recognized by the International Conference on Software Analysis, Evolution and Reengineering and the International Conference on Program Comprehension, respectively.

Looking ahead to the next decade, Roy said he would like to develop a "safe system" that not only detects corrupt clones, but is also able to advise on how to fix bugs in the system, or even remove them automatically.

"This has the potential to save a lot of time and money, but I am not sure I can do this even in the next 20 years," said Roy with a slight smile and laugh.
