The math of the Rubik's cube

Jun 29, 2011 by Larry Hardesty
Erik Demaine's collection of Rubik's-cube-type puzzles includes cubes with five, six, and seven squares to a row, as well as one of the original cubes, signed by its inventor (close-up below). Photo: Dominick Reuter

Last August, 30 years after the Rubik’s cube first appeared, an international team of researchers proved that no matter how scrambled a cube got, it could be solved in no more than 20 moves. Although the researchers used some clever tricks to avoid evaluating all 43 quintillion of the cube’s possible starting positions, their proof still relied on the equivalent of 35 years’ worth of number crunching on a good modern computer.

Unfortunately, for cubes bigger than the standard Rubik’s cube — with, say, four or five squares to a row, rather than three — adequately canvassing starting positions may well be beyond the computational capacity of all the computers in the world. But in a paper to be presented at the 19th Annual European Symposium on Algorithms in September, researchers from MIT, the University of Waterloo and Tufts University establish the mathematical relationship between the number of squares in a cube and the maximum number of moves necessary to solve it. Their method of proof also provides an efficient algorithm for solving a cube that’s in its worst-case state.

Computer science is concerned chiefly with the question of how long algorithms take to execute, but computer scientists measure the answer to this question in terms of the number of elements the algorithm acts upon. The execution time of an algorithm that finds the largest number in a list, for instance, is proportional to the length of the list. A “dumb” algorithm for sorting the numbers in the list from smallest to largest, however, will have an execution time proportional to the square of the length of the list.
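
To make the contrast concrete, here is a minimal Python sketch (ours, not from the article; the names are illustrative): a single pass that finds the largest number in time proportional to the list's length, next to a "dumb" selection sort whose nested scans take time proportional to the square of the length. Doubling the list roughly doubles the work of the first function but quadruples the work of the second.

    def find_largest(numbers):
        # One pass over the list: work grows linearly with its length, O(n).
        largest = numbers[0]  # assumes a non-empty list
        for x in numbers[1:]:
            if x > largest:
                largest = x
        return largest

    def dumb_sort(numbers):
        # Selection sort: for each position, rescan the remainder for the
        # smallest element -- roughly n*n comparisons in all, O(n^2).
        items = list(numbers)
        for i in range(len(items)):
            smallest = i
            for j in range(i + 1, len(items)):
                if items[j] < items[smallest]:
                    smallest = j
            items[i], items[smallest] = items[smallest], items[i]
        return items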

Solution with a twist

Erik Demaine, an associate professor of computer science and engineering at MIT; his father, Martin Demaine, a visiting scientist at MIT’s Computer Science and Artificial Intelligence Laboratory; graduate student Sarah Eisenstat; Anna Lubiw, who was Demaine’s PhD thesis adviser at the University of Waterloo; and Tufts graduate student Andrew Winslow showed that the maximum number of moves required to solve a Rubik’s cube with N squares per row is proportional to N²/log N. “That that’s the answer, and not N², is a surprising thing,” Demaine says.

The standard way to solve a Rubik’s cube, Demaine explains, is to find a square that’s out of position and move it into the right place while leaving the rest of the cube as little changed as possible. That approach will indeed yield a worst-case solution that’s proportional to N². Demaine and his colleagues recognized that under some circumstances, a single sequence of twists could move multiple squares into their proper places, cutting down the total number of moves.

But finding a way to mathematically describe those circumstances, and determining how often they’d arise when a cube was in its worst-case state, was no easy task. “In the first hour, we saw that it had to be at least N²/log N,” Demaine says. “But then it was many months before we could prove that N²/log N was enough moves.”
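
Where does the log N come from? A counting sketch (ours, in broad strokes; the paper’s proof is more careful) gives the lower bound. A cube with N squares per row has on the order of N² movable squares, and each one ranges over only a constant-size set of positions, so there are exp(Θ(N²)) reachable configurations in all. A single twist, meanwhile, is a choice among only O(N) possible moves, so k moves can reach at most O(N)^k configurations. Reaching every configuration therefore forces

    O(N)^k \ge e^{\Theta(N^2)} \quad\Longrightarrow\quad k = \Omega\!\left(\frac{N^2}{\log N}\right),

which is the bound the team saw “in the first hour.” The months of work went into the matching upper bound: showing that squares can be batched together often enough that the whole cube really is solvable at that rate.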

Because their method of analysis characterizes the cases in which multiple squares can be moved into place simultaneously, it provides a way to recognize those cases, and thus an algorithm for solving a disordered cube. The algorithm isn’t quite optimal: it always requires a few extra moves. But as the number of squares per face increases, those moves dwindle in significance.

Go configure

The Rubik’s cube is an instance of what’s called a configuration problem, the best-known example of which involves finding the most efficient way to reorganize boxes stacked in a warehouse. It’s possible, Demaine says, that the tools he and his colleagues have developed for studying the Rubik’s cube could be adapted to such problems.

But Demaine is also a vocal defender of research that doesn’t have any obvious applications. “My life has been driven by solving problems that I consider fun,” he says. “It’s always hard to tell at the moment what is going to be important. Studying prime numbers was just a recreational activity. There was no practical importance to that for hundreds of years until cryptography came along.”

But, he adds, “the aesthetic is not just to look at things that are fun but also look at problems that are simple. I think the simpler the mathematical problem, the more likely that it’s going to arise in some important practical application in the future. And the Rubik’s cube is kind of the epitome of simplicity.”

“Erik is always very interested in extending the reach of popular mathematics,” says Marc van Kreveld, an associate professor in the Department of Information and Computing Sciences at Utrecht University in the Netherlands, who designs puzzles in his spare time. “That’s really one of the things that he tries to do, to bring across that mathematics is not just some boring area of study, but it’s actually fun, and you can do a lot with it, and it’s beautiful.”

“Erik’s a very brilliant person,” van Kreveld adds. “He is already very successful in his hard-core research. But the popularizing is also very necessary, I think. You should not underestimate the importance of motivating students to learn.”

User comments (43)

hush1
1.2 / 5 (18) Jun 29, 2011
"The execution time of an algorithm that finds the largest number in a list, for instance, is proportional to the length of the list."

Huh?

I have (pretty thin, one-dimensional) spaghetti. Each strand corresponds uniquely to a number (a label for its length).

I execute my 'algorithm' - I scoop up and set the spaghettis upright. That will take one second. The longest spaghetti will project the furthest upright. Regardless of how many spaghettis I have, my 'algorithm' will always take one second.

No matter what finite-size numbers (lengths) your super pocket calculators generate, my swoop and stand-on-end algorithm is faster.

Nice try, though.
Next!
ncsq
not rated yet Jun 29, 2011
When N=3, the formula N²/log N gives an answer close to 19 [i.e. about 20 moves, as stated], but in that case N² itself is only 9. This is LESS than the result obtained using the formula, not more, as is implied by the article. [see para 4]

This formula only gives a result smaller than N² when N is greater than 10.

This somewhat defeats the reasoning in para 4. The formula looks OK, but the argument presented to explain its effect is not.
ncsq
not rated yet Jun 29, 2011
I wonder whether the actual formula should contain ln N rather than log N. In that case it would give a result smaller than N² for all N > e [i.e. not 10], which means for all integral N >= 3.

Presumably the proportionality factor would then raise the result from 9.88 up to 20.

This could at least make the reasoning in para 4 make sense.

ncsq
not rated yet Jun 29, 2011
Oops, that should be 8.19 not 9.88, but I hope you get my point.
hush1
1 / 5 (5) Jun 29, 2011
lol
bugmenot, Deesky: feedback, guys. Off topic? Wrong? Snarky ending?
Just curious.
ncsq
4.7 / 5 (3) Jun 29, 2011
Not snarky at all, just pointing out that, as written, there is something wrong with the logic. Then postulating what that error might be. Science articles should make sense. This one does not.
hush1
1.8 / 5 (5) Jun 29, 2011
Thks ncsq.
The ratings on my first comment prompted me to ask the raters whether their objections had anything to do with the objection I took to the line I quoted.

Strictly speaking, I can only be faster than a calculator finding the largest number (longest spaghetti) when the length of the list is large - "the number of elements the algorithm acts upon." My 'algorithm' acts on all elements at once - parallelism to the extreme, I guess.
Javinator
5 / 5 (1) Jun 29, 2011
hush,

The more spaghetti there is, the longer it would take you to scan all of it to see which one is the longest.

If you looked widthwise you'd have to move your head across to scan all the spaghetti.

If you looked along the spaghetti you could see how long the longest one was, but you wouldn't be able to pick out which position it's in.
Javinator
5 / 5 (1) Jun 29, 2011
Also, one-dimensional spaghetti doesn't exist. If you're suggesting you can bundle all the spaghetti in your hand no matter how many and pick out the longest one... well, that's not a realistic comparison, since a computer would have at least two dimensions. One dimension would be the magnitude of the numbers in the list and the other would be that number's place in the list.
RalphWaldoEmerson
not rated yet Jun 29, 2011
Ncsq, please note that the article says that the number of moves required is *proportional* to N^2/log N, not that it's equal to it.
AntonyJ
5 / 5 (1) Jun 29, 2011

"I execute my 'algorithm' - I scoop up and set the spaghettis upright. That will take one second. The longest spaghetti will project the furthest upright. Regardless of how many spaghettis I have, my 'algorithm' will always take one second." - hush1

Sorry, your approach is naive. Parallelism doesn't magically give you the answer - how is your brain finding the longest strand in the bundle? By doing a search - it's just subconscious and highly parallelised. The more strands the longer it'll take you to locate the longest and if there are several almost the same length it could take a very long time indeed.
However you are right that N isn't the most efficient algorithm if you can use parallelism. Dividing into columns and rows and finding the longest in each row in parallel and then finding the longest row completes in x + y time [formally 2*sqrt(N)]. So instead of 256 strands taking 256 (16*16) steps, it takes 32 (16 + 16) providing you can execute 16 steps in parallel.
hush1
2 / 5 (4) Jun 29, 2011
The example illustrates method only. Not practicability or realization. Many thought experiments begin this way. The method illustrates an 'act', an algorithm acting on all elements at once. Surely your imagination 'allows' this - just as 'real' as the 'cardinality' of an infinite set.
Royale
not rated yet Jun 29, 2011
Well put, Javinator. I was going to go with the point that computers need to use N notation because you need to count the steps of an algorithm. As for hush's theoretical 'algorithm': while it makes sense in theory, it cannot be compared to computer computation.
ncsq
not rated yet Jun 29, 2011
RalphWaldoEmerson

I had taken that into account. If the relationship is proportional, then there will be a proportionality constant - let's call it A.

Then the actual equation for calculating the number of moves would be:

either NUM = A * N**2

or NUM = A * N**2 / ln N [using natural, rather than base 10, logs]

The point the article was making is that the first equation was expected, but the second turned out to be the fact.

Now, whatever the value of the constant A [assuming it is > 0 - a negative value would be nonsense], A * N**2 can only be > A * N**2 / ln N if ln N > 1.

The argument in para 4 says that the real observed value IS less than A * N**2, viz "...cutting down the total number of moves."

If N=3 then this only works if you use ln 3 = 1.0986... It doesn't work if you use log 3 = 0.477...

This is the point I am making, and it is not changed by the value of A.

Cube
not rated yet Jun 29, 2011
hush, your example is far different from this: spaghetti strands are physical, numbers are not. You have to search each number in the list individually. The only thing that could speed up an optimized algorithm would be to split the list into smaller parts, then send each part to a separate processor to find the largest number, then find the largest number among the largest numbers found by each processor.

It's not a limitation of the algorithm, it's a limitation of the way computers work.
bugmenot23
not rated yet Jun 29, 2011
"I execute my 'algorithm' - I scoop up and set the spaghettis upright. That will take one second. The longest spaghetti will project the furthest upright. Regardless of how many spaghettis I have, my 'algorithm' will always take one second."

You have one billion spaghettis. Your algorithm will take more than one second.

Congratulations; you've illustrated not only parallelism, but also an algorithm that's optimal for only a certain range of problem sizes.

Algorithms usually assume certain constraints, like "you can only compare two values at a time". Changing those assumptions sometimes supports improved algorithms.
kaasinees
2.3 / 5 (3) Jun 29, 2011
"hush, your example is far different from this [...] It's not a limitation of the algorithm, it's a limitation of the way computers work." - Cube

Or you can save a lot of time and just find the biggest number on insertion, or order the numbers by size on insertion.
hush1
2 / 5 (4) Jun 29, 2011
Royale correctly sees in theory the sense, as well as the insight that this has nothing to do with computers.
"Spaghetti" is not the best way to picture this.
SemiNerd
not rated yet Jun 29, 2011
hush1 - the basic problem is that you have assumed that the process of gathering the spaghetti is a constant-time algorithm. This isn't so. The more you have, the longer it will take to set them on end, in proportion to the number of them.

To visualize this, look at the difference in the amount of time to set up one billion strands and 2 billion strands. It takes about twice as long to set up the 2 billion as the 1 billion.

This is a common mistake people make when analyzing algorithm complexity.
ouroboros2
not rated yet Jun 29, 2011
ncsq, the article is using "Big O" notation, the use of which is so pervasive that it's just assumed when you are talking algorithmic complexity. The formal definition would go something like: f(n) < g(n) if there exist some A and n0 such that for all n > n0, f(n) < A * g(n) (or f(n) is an element of O(g(n)), which is the most correct way to say it).
Algorithmic complexity is really only concerned with the way the function grows near infinity, not the absolute relationship at every point. Also, as such, the different log bases are usually just ignored, because in that world the difference between lg, ln and plain old log is negligible in most cases.
cdkeli
1 / 5 (1) Jun 29, 2011
The point is that the answer they have for the maximum number of moves is "proportional", not exact. The issue that needs to be addressed is estimating the complexity of arrangement of the faces, or the average degree of random displacement - why is it n^2 and not n^3, given a 3-dim object? Because we only solve one face at a time, or need to? There are probably other solutions - some less accurate, others more accurate.
Pyle
2.3 / 5 (3) Jun 29, 2011
Everyone who rated hushpuppie's initial comment a 1 should reflect on the quality of discussion it triggered. It wasn't inflammatory, nor was it derogatory. It was somewhat philosophical though, and I guess that is a point against it.

@hushie: Still not very funny, but it was a fun thought experiment.

@everyone else: I think what hush1 is getting at is that there will be a method that doesn't involve the digitization of the algorithm and uses a more "natural" solving technique. My feeling is that expanding all solutions and picking the shortest is pretty inefficient. The key is finding a way to pose the problem so that a route can be mapped more intuitively than twisting and turning the cube, thereby immediately identifying the best paths. Am I close, hush?
hush1
1 / 5 (3) Jun 29, 2011
SemiNerd,
Setting aside spaghetti for a moment, there is an algorithm that acts on all elements at once for infinite sets: order.

Establishing order for a set of infinite elements takes no time at all.

'Realistically' establishing order for sets of infinite elements will take 'time'.

Proofs of correspondence attempt to circumvent the 'time' factor.

Cardinality being just one case in point.
hush1
1 / 5 (3) Jun 29, 2011
Thks Pyle.
I owe you. Always my anchor in the motion of seas.
Cube
not rated yet Jun 29, 2011
hush and pyle, I think what you guys have to understand when it comes to computers is that reading the list is like picking up the straws.

Imagine a floor scattered with these minimal-thickness lines of yours; your task is to find the longest one. You would first go and pick up all of the lines, and you would obviously be able to hold them all since they are all really small. You would then probably tap the bottoms of them on the floor to level off the bases, and measure the longest one.

But you still had to go pick up the lines.

The point we're trying to show you is that no matter the way you go about PROCESSING the data (in your case, you process it by bundling the lines and visually finding the longest one), you still have to COLLECT the data. It is the collection process which adds to the time.

Cube
not rated yet Jun 29, 2011
Now, the way to get what you're saying to work would be like this: imagine that one tip of each line is a magnet. You have a big electromagnet on the ceiling; you turn it on and pick up all the lines SIMULTANEOUSLY. Now you've eliminated the dead time in picking up the straws. This is what parallel computing is like: you reduce the amount of time taken to 'pick up the straws'.
tcm
not rated yet Jun 29, 2011
hush1, your comment infuriates me because you state, in an arrogant way, a comment that shows you are massively ignorant:

"'The execution time of an algorithm that finds the largest number in a list, for instance, is proportional to the length of the list.'

Huh?

I have [...] spaghetti. Each strand corresponds uniquely to a number (a label for its length).

I execute my 'algorithm' - I scoop up and set the spaghettis upright. That will take one second. The longest spaghetti will project the furthest upright.
[...]
Nice try, though.
Next!"

The fastest algorithm to find the largest element of a set is IN FACT proportional to the size of the set. To find the largest, you must view each element of the set once (the algorithm is O(n), n = size of set). Your spaghetti algorithm is worse than O(n). Your brain is doing more than O(n) work to identify each piece of spaghetti and move them.

Please learn about computer science before you act like you know anything about it. I hate laymen.
Vendicar_Decarian
5 / 5 (1) Jun 29, 2011
"I execute my 'algorithm' - I scoop up and set the spaghettis upright. That will take one second. The longest spaghetti will project the furthest upright. Regardless of how many spaghettis I have, my 'algorithm' will always take one second." - Hush1

Claptrap.

Suppose you have 10,000 pieces all the same length, another 10,000 that are 1/1000th of a cm longer, and still one more that is 1/1000th of a cm longer yet.

Nothing quickly stands out and you are going to have to scan through them all.

The parallelism in your method doesn't come from upending the strands. It comes from the parallelism of your eyes that enables you to quickly identify a strand that is very much longer than the others.

But your method fails, when your eyes fail.

Hush1 = 0
hush1
1.8 / 5 (5) Jun 29, 2011
Cube's suggestion approaches practicality. The fastest algorithm to find the largest element of a set is not proportional to the size of the set.

You can use the weight of the spaghetti strands. Place them all on a sensitive elastic membrane and the longest strand will cause the greatest deformation - all are being 'weighed' at the same time.

The emphasis is on the act of measuring all elements of the set at the same time, not the practicalities of realization.

To ascertain the greatest deformation quickly, an inelastic membrane sensitive to contact is placed on the deformation field. The first signal of contact is the longest strand.

The solution can be visualized, regardless of whether a physical implementation exists or not.
FroShow
5 / 5 (2) Jun 30, 2011
What with all the discussion about parallelism, I'm surprised no one has yet noted that parallel processing doesn't reduce the number of steps in an algorithm.
Each processor still needs to execute a portion of the steps a single processor would have executed.
Parallel processing merely reduces the time needed by divvying up the work.
In fact parallel processing ADDS steps by having to compare the work executed by each processor.

After reading through all the comments (so many!) I get the sense that the term 'algorithm' (as applied to computation) is poorly understood.

I did however find the increasingly creative methods of sorting spaghetti highly amusing - even though I can't imagine how any of them could be translated into a computer algorithm.
FroShow
5 / 5 (2) Jun 30, 2011
"The fastest algorithm to find the largest element of a set is not proportional to the size of the set." - hush1

I'd agree with this, only because comparing "algorithms" to "set size" is like comparing "books" to "letters of the alphabet".
The number of steps required to find the largest element IS however proportional to the set size. No ifs, ands, or buts. Fact.
CaptainSlog
2 / 5 (1) Jun 30, 2011
"The execution time of an algorithm that finds the largest number in a list, for instance, is proportional to the length of the list."

"Huh? I have [...] spaghetti. I execute my 'algorithm' - I scoop up and set the spaghettis upright. That will take one second. The longest spaghetti will project the furthest upright. Regardless of how many spaghettis I have, my 'algorithm' will always take one second. [...] Nice try, though. Next!" - hush1


Next, you need to lay them flat on a plane, and the longest one must be next to the next longest, which is next to the 3rd longest... which is next to the shortest.
MediocreSmoke
1 / 5 (3) Jun 30, 2011
"The execution time of an algorithm that finds the largest number in a list, for instance, is proportional to the length of the list."

"Huh? I have [...] spaghetti. I execute my 'algorithm' - I scoop up and set the spaghettis upright. That will take one second. [...] Regardless of how many spaghettis I have, my 'algorithm' will always take one second. [...] Nice try, though. Next!" - hush1


Jesus, you are a special kind of retarded where you're armed with enough information to confuse people that are actually stupid.
And you seem to be trolling, what I'd expect out of an angry uneducated troll.
You're totally wrong, for the record. Ask someone who gets paid to think about this kind of stuff for a living.
Next!
hush1
2.3 / 5 (3) Jun 30, 2011
"paid to think" - MS
"about this kind of stuff for a living" - MS

There are very few obstacles greater than money to prevent honesty, sincerity and integrity in research, science and teaching.

You will be taught and told anything if the price is right.
Royale
5 / 5 (3) Jun 30, 2011
I can't see how this argument is still going on... obviously big-O notation was thought of, at least for all my comments. But Jesus, hush has made it clear he understands that there is no practical way to do his thought problem in real life, and he has expressed that his 'algorithm' is not like a traditional one. Can't you guys agree to disagree? I was a CS major and I get the computer part, I get big-O notation, but I can still see what hush is thinking. Yes, you cannot do it with a computer, and some remarks seemed a bit snarky, but come now; it's obvious he understands. It's been obvious since about 10 posts in!
FroShow
not rated yet Jun 30, 2011
Nothing is obvious unless explicitly stated. Welcome to the internet: the greatest medium for causing misinterpretations.
(I think it's because people just want to be heard and don't care enough HOW they're heard.)(Yes, I do notice the irony of my comment.)
Royale
not rated yet Jul 01, 2011
Haha. Great point, Fro. My first post simply referred to N notation, which I would have thought any of my fellow Comp Sci/Engineering folks would understand as big-O. But you're right, it's not explicitly stated, and thus open to misinterpretation.
TJ_alberta
5 / 5 (1) Jul 03, 2011
As hush1 says in so many words, "computers" don't necessarily have to be digital. Sometimes analog is more appropriate.
BillFox
5 / 5 (1) Jul 03, 2011
Two-dimensional spaghetti would cut your hand pretty badly when picking it up...
slash
2 / 5 (1) Jul 04, 2011
The idea behind hush's suggestion is the basis of what is called an analogue computer: a contraption that stores values physically and performs algorithms by executing physical experiments involving simple mechanics, optics or the like.

The 'algorithms' developed for such analogue computers are designed to deliver 'instant' results, but that doesn't necessarily mean they're actually more efficient than digital algorithms:

In the spaghetti example, each spaghetti has a finite diameter, and thus the whole bundle of n spaghetti has a width that is proportional to sqrt(n). The time light takes to pass over the bundle is proportional to that, and the time to sweep a 'detection ray' over the width of the bundle is also proportional to that, so the total time required to scan the top of the bundle for the highest tip is at least proportional to sqrt(n)*sqrt(n) = n
slash
2 / 5 (1) Jul 04, 2011
Maybe a better example than spaghetti would have been a system of vertical, long tubes that contain different amounts of water, corresponding to the values they represent. Each tube also holds a cork which is topped by a 45 degree mirrored cone that causes beams coming from the sides to be reflected upward.

Store these tubes all in a long line, use colored water to absorb passing light, then shine parallel beams of light from both sides. Only the highest 'cork-mirror' will be hit by beams from both sides, reflecting both beams upward. When you look from above, the mirror on top of the tube containing the most water will shine (almost) twice as bright as any other. It shouldn't be too hard to send the upward beams through individual masks that identify the tube, and suppress the other beams.

Of course, the time required for the beams to pass through a contraption with a width of n tubes will still be O(n) ...
Shamus123
not rated yet Aug 17, 2011
before one can properly analyze the spaghetti, one must eat the meatballs that are sitting on top of it.
hush1
1 / 5 (1) Aug 18, 2011
lol
Meatball size is proportional to the size of the ingesting orifice.

Cardinality doesn't have "an algorithm that finds the largest number in a list that is proportional to the length of the list."