An oracle for object-oriented programmers

Oct 07, 2011 by Larry Hardesty

In the last 40 years, the major innovation in software engineering has been the development of what are called object-oriented programming languages. “Objects” are, effectively, repositories for the computational details of a program, which let the programmer concentrate on the big picture. A complex computer program, with millions of lines of code, can be distilled into some fairly intuitive interactions between objects.

For programmers building a large application from scratch, object-oriented programming is a boon, allowing them to add new functions or make major revisions by changing just a few lines of code. But for a programmer dropped into the middle of a massive development project, trying to navigate the thicket of existing objects can be bewildering. Learning what the objects are and what they do might take days or even weeks.

At the Association for Computing Machinery’s SPLASH conference at the end of the month, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) will present a new system that automatically determines how objects in a large software project interact, so it can inform latecomers which objects they will need to design certain types of functions. The system could be of particular use to programmers working with open-source software, whose licensing terms require that its underlying code be publicly disclosed. Someone wishing to simply add a function to a common open-source program, for instance, may not want to spend the week it takes to get up to speed on all the program’s objects.

“Part of the promise of open source is that if you don’t like what it does, you can go in and change it,” says Armando Solar-Lezama, the NBX Career Development Assistant Professor of Computer Science and Engineering, who led the work. “But if you have this huge learning curve, then you’re not going to be able to do that. You’re going to end up with a small group of experts who go and do all the stuff, and everyone else just uses it.”

Objectivity

The idea of the object is easiest to understand when the programmer’s object — a cluster of data and a set of associated functions — corresponds to a virtual object on-screen. A programmer wishing to add a new window to an application, for instance, simply writes a line of code calling up a new window object; the window comes complete with things like scroll bars and size-adjustment tabs and a display line for text. If the programmer wants to add a button to the window, she calls up a new button object.

But after that, things can get more complicated. To describe the layout for the window, the programmer may have to invoke an object called Layout; to enable the button to register mouse clicks, she may have to invoke an object called EventListener. These don’t appear on-screen as virtual objects, but in the programmer’s sense, they’re objects nonetheless.

“In some respects, this is a great design,” Solar-Lezama says. “It’s beautifully engineered to allow you to just take out little pieces of the functionality and replace them without having to go and write lots of code. But the price of that is that you have to know how it works before you can use it.”

Solar-Lezama and his students Zhilei Xu and Kuat Yessenov have developed a new system they call Matchmaker, because it takes as input the names of two objects and describes how to get them to interact with each other. To demonstrate how it works, the researchers applied it to an open-source program called Eclipse, which computer scientists use to develop programming tools for new computer languages.

In the Eclipse framework, the window that displays code written in the new language is called an Editor; a function that searches the code for symbols or keywords is called a Scanner. That much a seasoned developer could probably glean by looking over the Eclipse source code. But say you want to add a new Scanner to Eclipse, one that allows you to highlight particular symbols. It turns out that, in addition to your Editor and Scanner objects, you would need to invoke a couple of objects with the unintuitive names of DamageRepairer and PresentationReconciler and then override a function called getPresentationReconciler in yet a third object called a SourceViewerConfiguration.
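
The wiring described above can be sketched with stand-in classes. This is only an illustration in Python, not Eclipse's actual Java API: the class names are borrowed from the article, while the method names (apart from the reconciler getter, rendered here in Python style) and all of the method bodies are hypothetical.

```python
# Hypothetical stand-ins for the objects named in the article.
# Eclipse itself is written in Java; nothing here is its real API.

class Scanner:
    """Finds the symbols that should be highlighted."""
    def __init__(self, symbols):
        self.symbols = set(symbols)

    def scan(self, text):
        return [word for word in text.split() if word in self.symbols]


class DamageRepairer:
    """Asks a scanner which parts of the text need restyling."""
    def __init__(self, scanner):
        self.scanner = scanner

    def repair(self, text):
        return self.scanner.scan(text)


class PresentationReconciler:
    """Holds the repairer that the editor consults when text changes."""
    def __init__(self):
        self.repairer = None

    def set_repairer(self, repairer):
        self.repairer = repairer


class SourceViewerConfiguration:
    """Default configuration: no highlighting."""
    def get_presentation_reconciler(self):
        return PresentationReconciler()


class HighlightingConfiguration(SourceViewerConfiguration):
    """Overrides one method to plug a new scanner into the editor."""
    def __init__(self, scanner):
        self.scanner = scanner

    def get_presentation_reconciler(self):
        reconciler = PresentationReconciler()
        reconciler.set_repairer(DamageRepairer(self.scanner))
        return reconciler


class Editor:
    """Displays text, using whatever its configuration provides."""
    def __init__(self, configuration):
        self.reconciler = configuration.get_presentation_reconciler()

    def highlighted(self, text):
        repairer = self.reconciler.repairer
        return repairer.repair(text) if repairer else []


editor = Editor(HighlightingConfiguration(Scanner({"if", "while"})))
print(editor.highlighted("while x : if y"))
```

The point of the sketch is that the Editor and the Scanner never refer to each other directly. The link between them runs through three intermediate objects, which is the kind of connection a newcomer would have trouble guessing.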

With Matchmaker, the developer would simply type the words “editor” and “scanner” into the query fields, and the program would return the names of the objects that link them and a description of the modifications required to any existing functions.

Observe and detect

Matchmaker builds up its database of object linkages in a program’s source code by monitoring the program’s execution. In the case of Eclipse, it noticed that every time a Scanner was invoked, so were the other objects.
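
A toy version of that idea: record which objects are invoked during each run of the program, then report the objects that appear in every run containing both queried objects. The traces and the co-occurrence rule below are invented for illustration. The article does not spell out Matchmaker's actual analysis, which may differ substantially.

```python
def connecting_objects(traces, first, second):
    """Return objects seen in every trace that contains both queried objects."""
    relevant = [set(t) for t in traces if first in t and second in t]
    if not relevant:
        return set()
    common = set.intersection(*relevant)
    return common - {first, second}


# Hypothetical execution traces: the objects invoked during each run.
traces = [
    ["Editor", "Scanner", "DamageRepairer", "PresentationReconciler",
     "SourceViewerConfiguration", "Toolbar"],
    ["Editor", "Scanner", "DamageRepairer", "PresentationReconciler",
     "SourceViewerConfiguration"],
    ["Editor", "Toolbar"],  # no Scanner here, so this run is ignored
]

links = connecting_objects(traces, "Editor", "Scanner")
print(sorted(links))
```

In this made-up data, Toolbar is dropped because it is absent from one of the runs that used a Scanner, while the three intermediate objects survive because they show up every time.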

On occasion, Matchmaker’s inferences may turn out to be wrong. But even in those cases, Solar-Lezama argues, some guidance is better than none. To test that thesis, Solar-Lezama and his colleagues did a user study with eight programmers new to Eclipse. All of them were asked to perform the same task, which required linking up two different types of objects. Four of them were allowed to use Matchmaker, and four weren’t. Moreover, the example was specifically chosen so that the instructions provided by Matchmaker were incomplete: They left out one crucial step. Nonetheless, Solar-Lezama says, the programmer who completed the task most quickly without Matchmaker still took longer than the slowest of the programmers who used it.

“I think the user study is the linchpin that makes this seem like it’s got real practical implications,” says Jeff Foster, an associate professor at the University of Maryland who’s part of the Department of Computer Science’s Programming Languages group. “If you can hand the tool to somebody else and they can be better programmers when they’re using it, that’s a great result.”

Foster points out that Matchmaker can’t answer all the questions that a programmer new to an application might have. Matchmaker is useful, he says, “in cases where you can guess that you needed an X and Y, and you could find the names of those components, and you can ask how they’re connected. But there’s a whole other set of questions where you don’t even know what components to use.”

Still, Foster says, in the instances where Matchmaker is applicable, “you’re going to get a result that’s vastly superior. You’re going to get a result in seconds or minutes instead of having to search on Google, filter out a lot of bad answers, figure out what people meant when they explained various things. If the tool works for the problem, it’s very useful.”


This story is republished courtesy of MIT News (http://web.mit.edu/newsoffice/), a popular site that covers news about MIT research and teaching.

User comments : 29

antialias_physorg
5 / 5 (2) Oct 07, 2011
Learning what the objects are and what they do might take days or even weeks.

Try months.
And for really big projects using large frameworks you can only hope to ever grasp a subset of object/object interactions and their capabilities.

But the price of that is that you have to know how it works before you can use it.

That's not entirely true. You have to know what interface you have to provide for your new object so that it can replace another object. This is (or should be) provided in the interface description - which is already common practice and without which no largish project can survive for long without becoming unmanageable.

But generally such representations of object interdependencies as described in the article are helpful. But they are not particularly new. There's a whole host of tools out there that already do this to some degree.

The main (and unsolved) problem is often to find out _what_ object you need to replace/derive from in the first place.
arri_guy
5 / 5 (2) Oct 07, 2011
As a retired software development consultant, I can't say how many times I was dumped into the middle of a system that had been worked on by dozens of programmers sometimes over a decade or more. Management thought that one could grasp the totality of the system in a day or two. MY tools were FORTRAN or PL/1 or BASIC or C, and line-oriented command languages. Even toward the end of my career, I couldn't believe that "spaghetti code" and uncommented source were tolerated. "KISS" (keep it simple stupid) was my mantra, since I had to maintain the damn thing. Hardware is comparatively easy. Software is HARD. BTW, have we achieved a compiler for massively parallel systems yet?
CHollman82
5 / 5 (5) Oct 07, 2011
I'm a professional software/firmware engineer and all I have to say about this is that it will never really be used by anybody and will fall into obscurity. There have been many attempts to automate the modelling of object oriented designs to present a high level view to programmers unfamiliar with the design but the reality is it often takes longer to learn to use these systems and to bypass their flaws than it does to simply learn the details of the objects and interfaces manually by reading the documentation that should exist for them already.
CHollman82
5 / 5 (2) Oct 07, 2011
Arri_Guy, I like your KISS mantra, and to it I'll add RTFM, "It takes as long as it takes", and "don't reinvent the wheel".
Nanobanano
5 / 5 (1) Oct 07, 2011
Even toward the end of my career, I couldn't believe that "spaghetti code" and uncommented source were tolerated


I'm not even a professional programmer, but any time I screw around and write a program, or even a script, trigger or level design for a game or some other goofy thing, I comment everything.

I comment the beginning and end of control structures, functions, objects, variables, etc. Explain what it does, what the input parameters are, what the output should be, and sometimes why I did it that way instead of some other way.
Noumenon
4.7 / 5 (50) Oct 07, 2011
In my day, spaghetti code was an art. :)
antialias_physorg
5 / 5 (2) Oct 07, 2011
have we achieved a compiler for massively parallel systems yet?

Erlang

There have been many attempts to automate the modelling of object oriented designs to present a high level view to programmers unfamiliar with the design

I have to disagree. Systems that do graphical representations of large frameworks and their interdependencies (like you can do with doxygen) are extremely helpful. Just look at the documentation of Qt or VTK. As a novice you can actually dive right in with relatively modest effort.

That said: Such good documentation treatment is (unfortunately) far from the norm.

In my day, spaghetti code was an art.

In my (programming) day spaghetti code was and is a disease. And that 'day' already lasts for 20 years. Anyone who uses a 'goto' (or similar control structure) should be shot. Or better:

diediedie:
shoot(programmer);
goto diediedie;
Zomax
5 / 5 (3) Oct 07, 2011
First,
Show me an example output.

Then,
Show me the doxy.
Show me the Class level Diagram.
Show me the Use Case Diagram.

Let me decide.
Noumenon
4.7 / 5 (48) Oct 07, 2011
@antialias_physorg, LOL. Yes it was a nightmare, but being asked to develop a full property management system on a VERY limited toy computer, ... one ends with a mixture of assembly language and basic spaghetti code. I would eventually write assembly code to load and execute models of a drive to make it work, which was published in Compute mag. Fun days.

Now I did get to use COBOL,... once finished documenting, sometimes I forgot the code.
antonima
5 / 5 (1) Oct 07, 2011
more like
In the last 40 years, the major innovation has been software engineering PERIOD
El_Nose
4 / 5 (1) Oct 07, 2011
NO - objects are a way of combining data and functions together in a logical single entity - large programs can sometimes be distilled into objects but most often it is broken down into modules, or other complete programs that are called from a main program
Noumenon
4.7 / 5 (47) Oct 07, 2011
NO - objects are a way of combining data and functions together in a logical single entity - large programs can sometimes be distilled into objects but most often it is broken down into modules, or other complete programs that are called from a main program


Are you referring to my comment? If so, yes, I know what object oriented programming is as I have written extensively in C(plus plus)* ,.. not in the above case however.

* editor does not allow the plus symbol.
jamesrm
not rated yet Oct 07, 2011
Sounds like a version of predictive text for objects?
Nanobanano
3.3 / 5 (3) Oct 07, 2011
Sounds like a version of predictive text for objects?


It's just an application that attempts to help the programmer figure out how to use a library or update and edit the objects of an existing program if they are not familiar with it.

Like if a company hired you to add a new feature to an application, and you have no idea how any of the existing functions and objects work, and let's say the comments aren't very good, this is supposed to help you figure it out.

Predictive text is quite different, and has actually been in c family editors and compilers for at least 10 or 11 years now...
jimbo92107
5 / 5 (1) Oct 07, 2011
What a disappointment has been the evolution of software programming. I decided to avoid the whole mess after seeing the dismal state of the art of Basic, C and Pascal. That was thirty years ago. I figured eventually people would be snapping objects together visually with a drag and drop interface like Visio, where the dreary details would be selectable from menus and charts with a right click and you could connect objects with whatever functions you needed with arrows.

Software tells hardware what to do. What a shame that nobody has invented a way to show humans what software is doing.
CHollman82
5 / 5 (5) Oct 07, 2011
What a disappointment has been the evolution of software programming. I decided to avoid the whole mess after seeing the dismal state of the art of Basic, C and Pascal. That was thirty years ago. I figured eventually people would be snapping objects together visually with a drag and drop interface like Visio, where the dreary details would be selectable from menus and charts with a right click and you could connect objects with whatever functions you needed with arrows.


You can certainly do just that with a variety of tools, the problem is that engineering a complex system is by definition complex, so any tool used to simplify will also simplify the result... it's the difference between building a machine from scratch starting with the basic elements on the periodic table and building one out of Legos. The lego version might APPROXIMATE the model you were looking for, but to make it exactly what you want you have to get your hands dirty with chemistry and physics.
CHollman82
5 / 5 (2) Oct 07, 2011
Oh, and as far as the goto statement, there are legitimate uses... The stigma comes from people using it when it is wholly unnecessary and who aren't used to mixed c/asm. Most of what I write is very low level; in higher level code goto can almost always be replaced by a more appropriate control structure.

Just for fun I searched for "goto" in the project I have open right now (~10 million line optical time domain reflectometer firmware) and came up with 29 occurrences, 6 of them were mine, the rest were in libraries I'm using.
CodeMunkey
5 / 5 (1) Oct 07, 2011
You know, it is kinda funny. I know people that will trip all over themselves for this stuff and make it all sound good. But I am one that will run, run, and run.
GDM
5 / 5 (3) Oct 07, 2011
Good to see all the s/w engineers out there. After 40 years of programming from 360 assembler, fortran, cobol, ADA, c (all flavors) VBasic, OO programming, and on and on, it has been "a long strange trip". Mostly what I see now is something much larger than the fabled tower of babel and it doesn't look like it is getting any simpler. Talk about job security! So, I still write code for the hell of it, building web pages out of M/S VB (mostly in linear code rather than OO) and I write very few comments. If you can't read the code, the comments mean nothing, and over time, the comments tend to mislead, rather than enlighten, because if written, they are often neglected and not updated. Of all the stuff I've written, my best was in COBOL, forming bit-mapped, variable-length, variable-format messages. It was so complex and had so many possibilities that I had 2 GOTO DEPENDING statements, each with 256 paragraphs. It was perhaps the easiest thing in the system to understand.
kaasinees
2.1 / 5 (7) Oct 08, 2011
I HATE comments in code. They make code very unreadable, together with shitty formatting ... A file or a class header is no big deal though and can be handy. In non-OO languages it might make more sense to put comment headers on functions because they don't get to put descriptive class names or namespaces. IMHO if you can't read the code, either the code is very badly written or you don't know how to code well enough, and I think this applies to more people than anyone realizes. Some people just hack code together and don't know what they are doing very well; I have seen this happen. I remember hacking around in Unreal Engine (the first one) and I had no trouble navigating the code. The code was not that big, though, and the objects were properly named. Sometimes it was a hassle to figure out how to do something, but after a bit of research (minutes, hours tops) it was no trouble. Does anyone else have the ability to debug code while reading/writing it? I seem to have it. It's neat!
monique_bizzell
5 / 5 (1) Oct 08, 2011
VS 2010 Ultimate or higher has tools of this nature, for about 2 years. On OO programming I love the MVVM paradigm. Also, how can you do asm w/o spaghetti?
antialias_physorg
4 / 5 (2) Oct 08, 2011
VS 2010 Ultimate or higher

Ultimate is already the highest version (and costs about 12000$ a pop).

I believe the Architect version already has these tools, though.

Certainly it depends on the application. Spaghetti code is probably fine if all of the below apply:
- the program is extremely small
- developed by one person
- never needs to be serviced
- never needs to be upgraded
- requires no documentation (i.e. will never be passed on to another person to service/upgrade)
- deployed on hardware where memory is at a premium.
- written in a language that does not support OO
- does not have to pass an industrial testing process or conform to stringent laws (medical, aviation, ... )

But using any kind of higher language and not using OO is just plain foolish.
monique_bizzell
not rated yet Oct 08, 2011
Enterprise is the highest version.
HarshMistress
5 / 5 (3) Oct 08, 2011
Unlike some commenters here, I'd always appreciated well commented code. Reading uncommented code is time consuming and ambiguous, no matter how good a coder you are.

Call me romantic, but I was enjoying myself 30-40 years ago while reading the code and comments written by DEC's systems software developers of the RSX-11M operating system. Those people were not only clever, they were also so witty I'd frequently laughed out loud over their code. My point is: on top of being useful, comments add a very important humane touch to the product.

Ever since then, I was commenting my code generously, as a token of my benevolence towards the hapless people who would inherit my code - which doesn't necessarily mean my code was bad.
Callippo
2.3 / 5 (3) Oct 08, 2011
Object-oriented programming isn't actually optimal for dynamic situations, when a project must be adapted flexibly. We should realize that OOP design simplifies code sharing and reuse, but for the same reason it makes organic changes more difficult. For agile programming it's often better to use paradigms that are easier to refactor.

Regarding code comments, OOP design encapsulates functionality into small blocks that shouldn't often require supervision, so commenting the code isn't as important here as documenting the interfaces. I also prefer compact and brief code, so I place my comments on the right side of the screen, where they remain available but don't interrupt the code flow.
NMvoiceofreason
5 / 5 (1) Oct 09, 2011
In the late 1980's I went to a seminar on programming languages. "I don't know whether block common will be allowed, or database api's, or object oriented classes and methods will be part of the language structure in 2010. But I do know we will call it FORTRAN."
Callippo
2 / 5 (3) Oct 09, 2011
I do know we will call it FORTRAN
Actually this language is called BASIC by now. It's the only language, which survived all stages of evolution of computer programming, while keeping its syntax and it's used widely by now. Albeit Fortran 2003 and another newer Fortran clones support some OOP traits too. We should realize, the OOP comes with performance penalty (memory overhead in particular) and the Fortran was designed as a high performance language from its very beginning.
I am Sancho
5 / 5 (1) Oct 09, 2011
Good spaghetti code will never die. The secret is to immediately return the spaghetti back to the hot pot after draining with some sauce. Al dente always. And save some of the water for later.