Computer scientists develop 'mathematical jigsaw puzzles' to encrypt software

Jul 29, 2013 by Matthew Chin
Concept illustration of a mathematical jigsaw puzzle. Credit: UCLA Engineering

(Phys.org) — UCLA computer science professor Amit Sahai and a team of researchers have designed a system to encrypt software so that someone can use a program only as intended while being prevented from deciphering the code behind it. This is known in computer science as "software obfuscation," and it is the first time it has been accomplished.

Sahai, who specializes in cryptography at UCLA's Henry Samueli School of Engineering and Applied Science, collaborated with Sanjam Garg, who recently earned his doctorate at UCLA and is now at IBM Research; Craig Gentry, Shai Halevi and Mariana Raykova of IBM Research; and Brent Waters, an assistant professor at the University of Texas at Austin. Garg worked with Sahai as a student when the research was done.

Their peer-reviewed paper will be formally presented in October at the 54th annual IEEE Symposium on Foundations of Computer Science, one of the two most prominent conferences in the field of theoretical computer science. Sahai has also presented this research in recent invited talks at Stanford University and the Massachusetts Institute of Technology.

"The real challenge and the great mystery in the field was: Can you actually take a piece of software and encrypt it but still have it be runnable, executable and fully functional?" Sahai said. "It's a question that a lot of companies have been interested in for a long time."

According to Sahai, previously developed techniques for obfuscation presented only a "speed bump," forcing an attacker to spend some effort, perhaps a few days, trying to reverse-engineer the software. The new system, he said, puts up an "iron wall," making it impossible for an adversary to reverse-engineer the software without solving mathematical problems that would take hundreds of years to work out on today's computers—a game-changer in the field of cryptography.

The researchers said their mathematical obfuscation mechanism can be used to protect intellectual property by preventing the theft of new algorithms and by hiding the vulnerability a software patch is designed to repair when the patch is distributed.

"You write your software in a nice, reasonable, human-understandable way and then feed that software to our system," Sahai said. "It will output this mathematically transformed piece of software that would be equivalent in functionality, but when you look at it, you would have no idea what it's doing."

The key to this successful obfuscation mechanism is a new type of "multilinear jigsaw puzzle." Under this mechanism, attempts to find out why and how the software works yield only a nonsensical jumble of numbers.

"The real innovation that we have here is a way of transforming software into a kind of mathematical jigsaw puzzle," Sahai said. "What we're giving you is just math, just numbers, or a sequence of numbers. But it lives in this mathematical structure so that these individual pieces, these sequences of numbers, can only be combined with other numbers in very specified ways.

"You can inspect everything, you can turn it upside-down, you can look at it from different angles and you still won't have any idea what it's doing," he added. "The only thing you can do with it is put it together the way that it was meant to interlock. If you tried to do anything else—like if you tried to bash this piece and put it in some other way—you'd just end up with garbage."
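The "interlocking pieces" intuition can be sketched in code. The following is a toy illustration of Kilian-style matrix re-randomization, a far simpler relative of the paper's multilinear jigsaw puzzles, not the authors' actual construction: a computation written as an ordered matrix product is split into randomized pieces that individually look like random noise, yet multiply out to the original answer only when assembled in the intended order.

```python
# Toy "jigsaw" sketch (NOT the paper's multilinear-map construction):
# each piece M_i is replaced by R_{i-1}^{-1} * M_i * R_i for random
# invertible R_i, so single pieces look random, but the full in-order
# product telescopes back to the original computation.
import random

P = 1_000_003  # small prime modulus

def mat_mul(A, B):
    # 2x2 matrix product mod P
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % P
             for j in range(2)] for i in range(2)]

def mat_inv(A):
    # inverse of a 2x2 matrix mod P via the adjugate formula
    det = (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % P
    d = pow(det, P - 2, P)  # modular inverse of det (Fermat's little theorem)
    return [[A[1][1] * d % P, -A[0][1] * d % P],
            [-A[1][0] * d % P, A[0][0] * d % P]]

def rand_invertible():
    while True:
        A = [[random.randrange(P) for _ in range(2)] for _ in range(2)]
        if (A[0][0] * A[1][1] - A[0][1] * A[1][0]) % P:
            return A

def product(ms):
    out = [[1, 0], [0, 1]]
    for m in ms:
        out = mat_mul(out, m)
    return out

I = [[1, 0], [0, 1]]
program = [rand_invertible() for _ in range(4)]   # stand-in "computation"

# Randomize: piece_i = R_{i-1}^{-1} * M_i * R_i, with R_0 = R_4 = I
Rs = [I] + [rand_invertible() for _ in range(3)] + [I]
pieces = [mat_mul(mat_mul(mat_inv(Rs[i]), program[i]), Rs[i + 1])
          for i in range(4)]

assert product(pieces) == product(program)        # assembled in order: same answer
assert product(pieces[::-1]) != product(program)  # wrong order: garbage (w.h.p.)
```

Intended-order assembly recovers the computation; any other combination of the pieces "just ends up with garbage," which is the behavior Sahai describes.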

Functional encryption

The new technique for obfuscation paved the way for another breakthrough called functional encryption. With functional encryption, instead of sending an encrypted message, an encrypted function is sent in its place. This offers a much more secure way to protect information, Sahai said. Previous work on functional encryption was limited to supporting very few functions; the new work can handle any computable function.

For example, a single message could be sent to a group of people in such a way that each receiver would obtain different information, depending on characteristics of that particular receiver. In another example, a hospital could share the outcomes of treatment with researchers without revealing details such as identifying patient information.

"Through functional encryption, you only get the specific answer, you don't learn anything else," Sahai said.
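The interface the article describes can be sketched as follows. This toy Python class is NOT a secure scheme (a trusted authority object stands in for the actual cryptography, and the names here are illustrative inventions); it only models the semantics: a key issued for a function f decrypts a ciphertext of x to f(x), and nothing else.

```python
# Semantics-only sketch of functional encryption.  NOT a secure scheme:
# the trusted ToyFE object stands in for the real cryptography.  The
# point is the interface: a key for function f opens Enc(x) to f(x).
import hashlib, hmac, json, os

class ToyFE:
    def __init__(self):
        self._msk = os.urandom(16)                 # master secret key

    def encrypt(self, x):
        blob = json.dumps(x).encode()
        tag = hmac.new(self._msk, blob, hashlib.sha256).digest()
        return blob, tag                           # "ciphertext"

    def keygen(self, f):
        msk = self._msk
        def fkey(ct):                              # decryption key for f
            blob, tag = ct
            if not hmac.compare_digest(
                    tag, hmac.new(msk, blob, hashlib.sha256).digest()):
                raise ValueError("invalid ciphertext")
            return f(json.loads(blob))             # reveals f(x) only
        return fkey

authority = ToyFE()
record = {"id": "patient-17", "outcome": "recovered", "age": 54}
ct = authority.encrypt(record)

# A researcher's key reveals only the treatment outcome, not the identity.
outcome_key = authority.keygen(lambda r: r["outcome"])
assert outcome_key(ct) == "recovered"
```

This mirrors the hospital example: each issued key answers one specific question about the encrypted record, matching Sahai's "you only get the specific answer."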


More information: Paper: eprint.iacr.org/2013/451


User comments: 22


adam_russell_9615
1 / 5 (7) Jul 29, 2013
So it transforms it into spaghetti code?
packrat
2 / 5 (13) Jul 29, 2013
and probably makes it run even slower...
visualhawk
1.8 / 5 (15) Jul 29, 2013
Seeing that CPUs cannot run mangled code, I simply hook in at the level of the CPU (can be done via CPU emulators or in-circuit debuggers) and look at the code being loaded.

Then, if I'm clever enough, I will have the knowledge and/or tools to interpret the code. Encryption is a fool's paradise—if it has to run on a CPU, the code will at some time reside in an unencrypted form.

It may thwart the guy down the street from reverse engineering my p-code based program but it will never prevent governments or funded crime syndicates from breaking the code.
Expiorer
1.3 / 5 (14) Jul 30, 2013
I will buy it if you give a downloadable demo and a million dollar prize to crack it.
MikeBowler
2.5 / 5 (8) Jul 30, 2013
I will buy it if you give a downloadable demo and a million dollar prize to crack it.

buy what? did you read the article?
Expiorer
1 / 5 (12) Jul 30, 2013
I will buy it if you give a downloadable demo and a million dollar prize to crack it.

buy what? did you read the article?


That means I will believe in that Mike.
And "...and then feed that software to our system..." could mean it will not be for free.
Mayor__Dooley
1 / 5 (7) Jul 30, 2013
The article is obviously missing a lot of detail, otherwise their system would only prevent the most casual of cracking and their research would be nothing but hot air.
Maybe it is designed to work on some hypothetical processor that has secured execution? Maybe it runs on some slow virtual processor and only prevents modification of the application binary.
cathar_seamus
3 / 5 (4) Jul 30, 2013
visualhawk wrote:
Seeing that CPUs cannot run mangled code, I simply hook in at the level of the CPU (can be done via CPU emulators or in-circuit debuggers) and look at the code being loaded.

Then, if I'm clever enough, I will have the knowledge and/or tools to interpret the code. Encryption is a fool's paradise—if it has to run on a CPU, the code will at some time reside in an unencrypted form.

Ah, but then you completely missed the point: the code always runs in encrypted form; it is similar to homomorphic encryption but more sophisticated. Yes, you can try to untangle the obfuscated graph to crack the software, but the problem is equivalent to hard decryption, complexity-wise
mzso
1 / 5 (9) Jul 30, 2013
Seeing that CPUs cannot run mangled code, I simply hook in at the level of the CPU (can be done via CPU emulators or in-circuit debuggers) and look at the code being loaded.

Then, if I'm clever enough, I will have the knowledge and/or tools to interpret the code. Encryption is a fool's paradise—if it has to run on a CPU, the code will at some time reside in an unencrypted form.

It may thwart the guy down the street from reverse engineering my p-code based program but it will never prevent governments or funded crime syndicates from breaking the code.


It might be mangled. And it would just jump to different positions in the code seemingly randomly (aka jigsaw). You could only assemble it by going through every state.
visualhawk
1 / 5 (11) Jul 30, 2013

Ah, but then you completely missed the point: the code always runs in encrypted form; it is similar to homomorphic encryption but more sophisticated. Yes, you can try to untangle the obfuscated graph to crack the software, but the problem is equivalent to hard decryption, complexity-wise


I never said it would be easy or that the average person in the street could do it—on the other side, the NSA and crime syndicates with millions at their disposal are known for being able to compute today's "difficult to factor" problems in real-time.

And yes, design a CPU where the decryption takes place in the CPU data & code pipelines and the code will always remain mangled - for the 99.99% of CPUs in use though - not an option.

The whole point of what I said was that you should not rely on encryption unless it is quantum in nature. And there are even indications that ways exist to eavesdrop on that.
Expiorer
1 / 5 (11) Jul 30, 2013
All these comments will be some "old ladies' talk at the bus stop" until we have a functional demo. Right now we can choose to believe.
antialias_physorg
4.7 / 5 (3) Jul 30, 2013
the NSA and crime syndicates with millions to their disposal are known for being able to compute today's "difficult to factor" problems in real-time.

That is a pretty big statement. Care to back that up with anything?
The math on some of these (inverse) problems applied in encryption is pretty airtight.
To paraphrase Scotty:
"Even the NSA canna change the laws of physics". Why do you think they tried to have the PGP keys delivered to them by law?

And yes, design a CPU where the decryption takes place in the CPU data & code pipelines and the code will always remain mangled

You don't need to design a CPU for this. This will run on any old CPU. As cathar_seamus points out: It's a variant of homomorphic algorithms already in use (where the computation is done on a cloud based service on your encrypted data - I.e. the service provider never has access to the unencrypted data at any point)
http://en.wikiped...cryption
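A standard, minimal example of computing on encrypted data (unrelated to this paper's construction, but the property homomorphic encryption relies on): textbook unpadded RSA is multiplicatively homomorphic, so ciphertexts can be multiplied without ever decrypting the operands. Unpadded RSA is insecure in practice; this only demonstrates the homomorphic property.

```python
# Textbook (unpadded) RSA is multiplicatively homomorphic:
#   Enc(a) * Enc(b) mod n  ==  Enc(a * b mod n)
# Unpadded RSA is insecure in practice; this only shows the property.
p, q, e = 61, 53, 17                    # toy parameters
n = p * q                               # modulus, 3233
d = pow(e, -1, (p - 1) * (q - 1))       # private exponent (Python 3.8+)

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

a, b = 7, 6
c = enc(a) * enc(b) % n                 # multiply ciphertexts only
assert dec(c) == a * b                  # 42, computed without decrypting a or b
```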
cathar_seamus
3 / 5 (4) Jul 30, 2013

The whole point of what I said was that you should not rely on encryption except if it is of a quantum type in nature. And there is even indications that ways exists to eavesdrop on that .


Not even the NSA can crack AES-256. The only attacks they can do are brute-forcing passphrases or eavesdropping on private keys. As of 2013, encryption still works. Quantum computers and their polynomial-time prime factoring could change that, but it hasn't happened yet
vega12
5 / 5 (1) Jul 31, 2013
I think some people are misunderstanding what they are saying. There is no actual encryption of the commands sent to the CPU. The encryption is in the actual algorithms used. Normal reverse engineering procedures can build up a pretty decent, human understandable code to go from in understanding exactly what is going on. But here they have found a way to obfuscate the actual logic of the code, so standard reverse engineering methods will surely produce "correct" code, but it will be so twisted and messed up that pulling out "understanding" of what is actually going on would prove to be extremely difficult.

And yes, it would most definitely add overhead to the execution. I don't think this would be used for speed intensive stuff.
Expiorer
1 / 5 (12) Aug 01, 2013
I think some people are misunderstanding what they are saying. There is no actual encryption of the commands sent to the CPU. The encryption is in the actual algorithms used. Normal reverse engineering procedures can build up a pretty decent, human understandable code to go from in understanding exactly what is going on. But here they have found a way to obfuscate the actual logic of the code, so standard reverse engineering methods will surely produce "correct" code, but it will be so twisted and messed up that pulling out "understanding" of what is actually going on would prove to be extremely difficult.

And yes, it would most definitely add overhead to the execution. I don't think this would be used for speed intensive stuff.

go #uck your self nobody will believe in this until a demo
PhyOrgSux
1 / 5 (9) Aug 03, 2013
This can become somewhat of a problem if used by malware. On the other hand, it can also be used to obfuscate software from an oppressive government.
dtxx
1 / 5 (9) Aug 03, 2013
All that would really be needed here is just to enumerate all of the functions of the software and note the changes in CPU and RAM states. They might be able to obfuscate the exact code, but that's already the case with opaque binaries. You can build a pseudo-code version of the program from your enumeration, then re-code it into whatever language you find suitable.

What's more interesting is that this may prevent (i.e. make much more difficult) buffer overflows, heap sprays, etc.

Also, cryptographic systems are very vulnerable to flaws in implementation. While a perfect implementation of this system may represent a hard problem (in the CS sense), most likely the implementation can be attacked.
dav_daddy
1 / 5 (10) Aug 04, 2013
Doesn't this mean you could have malicious code doing God knows what with your machine and sensitive data and you would have absolutely no way of detecting/stopping/removing said code?
Moebius
1 / 5 (5) Aug 04, 2013
It sounds to me like you can't have malicious code. In order to write a virus you have to understand the code you want to invade and change. This could prevent anyone from being able to design a virus or malware. Yes, it sounds like any program twisted like this will run slower but if it becomes immune to malware it will be worth it.
Moebius
1 / 5 (6) Aug 04, 2013
All that would really be needed here is just to enumerate all of the functions of the software and note the changes in CPU and RAM states. They might be able to obfuscate the exact code, but that's already the case with opaque binaries. You can build a pseudo-code version of the program from your enumeration, then re-code it into whatever language you find suitable........


And when you recode it, you have what? The same jumble as the encrypted version but in a different language.
germaine_rizak
not rated yet Aug 07, 2013
That's neat; it's pretty much how I envisioned school, while reading a non-textbook in class, while listening to the teacher, especially since it was history
antialias_physorg
not rated yet Aug 07, 2013
It sounds to me like you can't have malicious code. In order to write a virus you have to understand the code you want to invade and change.

Unless you write a homomorphic code bit that replaces another code bit (but does a little bit of 'bad stuff' on the side). That way you don't even need to know what you're replacing. As long as its effects (without the added, malicious side-effects) are the same you're good.

The construction of such a piece of code is harder (but may be automated if you know the algorithm). On the other hand it suddenly opens up the possibility of replacing ANY part of the code without it being noticeable.