# Winning While Losing: New Strategy Solves 'Two-Envelope' Paradox

(PhysOrg.com) -- Researchers from Australia have taken a step toward resolving a seemingly simple yet unsolved paradox known as the "two-envelope" problem. They’ve worked out a new strategy that lets a player beat the game by increasing their expected payoff. The strategy could have applications in optimizing gains in investments and other areas.

Mark McDonnell of the University of South Australia and Derek Abbott of the University of Adelaide have published their results in a recent issue of Proceedings of the Royal Society A.

In the two-envelope paradox, a player must choose between two envelopes, one of which contains twice as much money as the other. The player can open the envelope they choose, and then they have the option of switching envelopes. The other envelope, of course, has either twice or half as much money as the first envelope, but the player does not know which.

It may seem that, since a player has a 50-50 chance of choosing either envelope, they have an equal chance of gaining or losing money whether they decide to switch or keep the original envelope. However, probability theory seems, confusingly, to show that it’s always better to switch.

For example, say the first envelope you pick has \$10, so that the other envelope has either \$20 or \$5. Then you can calculate the expected value (i.e. the probability-weighted sum of the possible values) of the second envelope, assuming that each possibility has a 50% chance: (0.5 x \$5) + (0.5 x \$20) = \$12.50. Since \$12.50 is more than \$10, it seems to make sense to switch. No matter which numbers you use, the expected value for envelope two always comes out as 5/4 times the value of the original envelope: if c is the value of the original envelope, the expected value of the second envelope is (0.5 x [0.5c]) + (0.5 x [2c]) = (5/4)c. The factor of 5/4 follows from the ratio between the envelopes’ values, but it still doesn’t make sense to switch every time: the player could just as well have started out with the second envelope, and would then also be advised to switch.
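The arithmetic above, and the reason it cannot be the whole story, can be checked with a short Python sketch. The function names and the fixed pair of amounts (m and 2m) are illustrative assumptions, not taken from the paper: `naive_expected_value` reproduces the 5/4 calculation, while `average_payoff` fixes the two amounts beforehand and averages over which envelope is picked first.

```python
from fractions import Fraction


def naive_expected_value(c):
    """Expected value of the other envelope, ASSUMING it is equally
    likely to hold c/2 or 2c once the first envelope shows c."""
    c = Fraction(c)
    return Fraction(1, 2) * (c / 2) + Fraction(1, 2) * (2 * c)


def average_payoff(m, always_switch):
    """Average payoff when the envelopes hold the fixed amounts m and 2m
    and the first pick is a fair coin flip (illustrative assumption)."""
    m = Fraction(m)
    total = Fraction(0)
    for first, other in ((m, 2 * m), (2 * m, m)):
        total += other if always_switch else first
    return total / 2


print(naive_expected_value(10))                 # 25/2, i.e. 12.50
print(average_payoff(10, always_switch=True))   # 15
print(average_payoff(10, always_switch=False))  # 15
```

With the amounts fixed in advance, always switching and never switching give the same average, which is why the blanket advice to switch cannot be right.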

Mathematicians have been trying to figure out the problem (or some variation of it) since 1930, though it was not expressed in the two-envelope format until 1988 by Harvard mathematics professor Sandy Zabell. Though several researchers have claimed to have found solutions to the paradox, no consensus has been reached and so the problem is still considered unsolved.

## Randomized Switching

Perhaps, as McDonnell and Abbott suggest, the key to the paradox may occur when the player looks inside the first envelope; knowing this information breaks the symmetry, since the envelopes are not identical anymore. To demonstrate this idea, the researchers have worked out a formula that can increase a player’s chance of picking the envelope with the greater amount of money, if played repeatedly.

The researchers named the new method Cover’s strategy, since it originated with a suggestion by Stanford engineering professor Tom Cover during lunch. In the strategy, a player randomly switches envelopes with a probability that depends on the amount of money in the first envelope. The larger the amount, the less likely it is that a player should switch, even without knowing how high or low the values might be (the distribution). Over 20,000 simulations, this strategy increased a player’s payoff compared with simple switching. The researchers also found that a deterministic switching strategy - where a player switches whenever the value of the first envelope is smaller than some predetermined threshold - also leads to a gain compared with never switching.
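A minimal sketch of randomized switching follows. The article does not give the switching function or the distribution of amounts, so both are assumptions here: the player switches with probability exp(-x) after seeing amount x, the smaller amount is drawn uniformly from (0, 2), and the deterministic rule uses a threshold of 2. None of these choices is claimed to match the paper's simulations.

```python
import math
import random


def switch_probability(amount):
    """Hypothetical switching rule: decreases as the observed amount grows."""
    return math.exp(-amount)


def expected_payoff(m, p=switch_probability):
    """Exact expected payoff for envelopes holding m and 2m when the
    player switches with probability p(observed amount)."""
    see_small = m * (1 - p(m)) + 2 * m * p(m)          # opened m first
    see_large = 2 * m * (1 - p(2 * m)) + m * p(2 * m)  # opened 2m first
    return 0.5 * see_small + 0.5 * see_large


def simulate(trials=100_000, seed=0):
    """Monte Carlo comparison of never switching, randomized switching,
    and a fixed threshold rule, all played on the same random envelopes."""
    rng = random.Random(seed)
    never = randomized = threshold = 0.0
    for _ in range(trials):
        m = rng.uniform(0.0, 2.0)  # assumed distribution of the smaller amount
        first, other = (m, 2 * m) if rng.random() < 0.5 else (2 * m, m)
        never += first
        randomized += other if rng.random() < switch_probability(first) else first
        threshold += other if first < 2.0 else first  # assumed threshold of 2
    return never / trials, randomized / trials, threshold / trials
```

Calling `simulate()` returns the three average payoffs in the order never, randomized, threshold. Expanding `expected_payoff` gives 1.5m + 0.5m(p(m) - p(2m)), so under these assumptions any strictly decreasing switching probability adds a positive amount to the 1.5m obtained by never switching.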

“The apparent paradox arose before because it didn't seem to make sense that opening an envelope and seeing \$10 actually tells you anything, and therefore it seemed strange that your expected value of winning is \$12.50 by switching,” Abbott told PhysOrg.com. “But we resolve this by explaining it in terms of symmetry breaking. Before the envelopes are opened, the situation is symmetrical, so it doesn't matter if you switch envelopes or not. However, once you open an envelope and use Cover's strategy, you break that symmetry, and then switching envelopes helps you in the long run (with multiple plays of the game).”

The researchers explained that the strategy draws on recent advances in the study of two-state switching phenomena in physics, engineering, and economics. For example, in stochastic control theory, random switching between two unstable states can result in a stable condition.

“When I had lunch with Tom Cover in 2003 and he suggested that his strategy ought to work, I thought he was nuts and refused to believe it,” Abbott said. “It was that counterintuitive that I thought it was crazy. But I went back to Australia and slowly came around to Cover's viewpoint after careful thought over the years. My expertise in Brownian ratchets was the key to me understanding the physical picture behind it.”

As Abbott explained, a Brownian ratchet is a physical device that can organize random particles to flow in a particular direction. “The trick with a Brownian ratchet is that again it uses the idea of breaking symmetry,” he said. “It is this idea that is behind the principle of the well-known ‘Parrondo's paradox,’ which shows that you can mix two losing games and yet win. This solution to the two-envelope problem is a breakthrough in the field of Parrondo's paradox.”
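The claim that two losing games can be mixed into a winning one can be illustrated with the capital-dependent coin games commonly used to introduce Parrondo's paradox. This sketch is not taken from the McDonnell and Abbott paper: the win probabilities and the bias `EPS` are the usual textbook values, assumed here for illustration. Each play wins or loses one unit, and the long-run average gain per play is computed from the stationary distribution of the capital modulo 3.

```python
EPS = 0.005  # small bias that makes games A and B individually losing


def drift(p_zero, p_other, iterations=2000):
    """Long-run average gain per play for a +1/-1 game whose win
    probability is p_zero when capital is divisible by 3, else p_other."""
    win = [p_zero, p_other, p_other]
    dist = [1 / 3, 1 / 3, 1 / 3]         # distribution of capital mod 3
    for _ in range(iterations):          # power iteration to stationarity
        nxt = [0.0, 0.0, 0.0]
        for s in range(3):
            nxt[(s + 1) % 3] += dist[s] * win[s]        # win: capital + 1
            nxt[(s - 1) % 3] += dist[s] * (1 - win[s])  # lose: capital - 1
        dist = nxt
    return sum(dist[s] * (2 * win[s] - 1) for s in range(3))


game_a = drift(0.5 - EPS, 0.5 - EPS)
game_b = drift(0.1 - EPS, 0.75 - EPS)
# Picking A or B by a fair coin each round averages their win probabilities.
mixed = drift((0.5 + 0.1) / 2 - EPS, (0.5 + 0.75) / 2 - EPS)

print(game_a < 0, game_b < 0, mixed > 0)  # True True True
```

Played alone, each game drifts downward; choosing between them at random changes how often the capital sits on the unfavorable multiple of 3, and the drift turns positive.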

## Winning While Losing

Although a player can use the random switching strategy to win money when having prior knowledge of the statistical distribution of the envelopes’ values, the significant point is that this knowledge isn’t necessary. “What is surprising is that our analysis shows that you can always improve your gain using Cover's method with ignorance of the ‘house limit’ (the highest value of money allowed) and of the statistical distribution the numbers obey,” Abbott said. “That is rather amazing. And the reason it is of importance is that engineers often have to consider what are called ‘blind optimization’ problems. And so our solution may stimulate new work in this area.”

Another type of optimization method that shares similarities with the two-envelope problem is financial investing in the stock market. For instance, in "volatility pumping," switching between poor investments can result in winning an exponentially increasing amount of money.

“Volatility pumping is a ‘toy model’ that you can't use exactly in its present form on the stock market,” Abbott explained. “However, it is a toy model that illustrates underlying mechanisms that are useful. It suggests the power of changing your portfolio of stocks periodically, buying low and selling high. Both the two-envelope process plus volatility pumping appear closely related to Brownian ratchet phenomena. They both exploit the interaction of asymmetry with randomness.”
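One common toy version of volatility pumping can be sketched in a few lines. The asset below, which doubles or halves each period with equal probability, and the cash that earns nothing are assumptions chosen for illustration, not details from Abbott's work. Neither holding grows in the long run on its own, yet rebalancing to a fixed split every period does.

```python
import math


def log_growth_rate(fraction_in_asset):
    """Expected log growth per period when a fixed fraction of wealth is
    rebalanced into an asset that doubles or halves with equal probability
    (assumed toy asset); the remainder sits in cash earning nothing."""
    f = fraction_in_asset
    up = 1 + f * (2.0 - 1)    # wealth multiplier if the asset doubles
    down = 1 + f * (0.5 - 1)  # wealth multiplier if the asset halves
    return 0.5 * math.log(up) + 0.5 * math.log(down)


for f in (0.0, 0.5, 1.0):
    print(f, round(log_growth_rate(f), 4))
```

All cash (f = 0) and all asset (f = 1) both have zero expected log growth, while the 50/50 rebalanced mix grows at 0.5 ln(9/8), roughly 0.059 per period. Rebalancing forces a sale after each rise and a purchase after each fall, which is the "buying low and selling high" mechanism in the quote.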

This insight also brings with it a number of open questions. For example, when playing a sequence of games, a player could modify the details of the strategy by continually updating the estimated distribution from which the envelopes’ values are chosen. Also, since the strategy relates to two-state switching in other fields, perhaps it may be possible to explain all these phenomena with a common mathematical framework.

More information: Mark D. McDonnell and Derek Abbott. “Randomized switching in the two-envelope problem.” Proceedings of the Royal Society A. doi:10.1098/rspa.2009.0312



Aug 18, 2009
"(0.5 x \$10) + (0.5 x \$20) = \$12.50"
?!?!?!?!?!?!

0.5 x \$10 = \$5
0.5 x \$20 = \$10
\$10 + \$5 = \$15 ***** wtf ?

Aug 18, 2009
http://en.wikiped..._problem

The math expressed on this wiki is correct, and produces the \$12.50 you were looking for.

0.5*2*A + 0.5*0.5*A

A = 10, solution = \$12.50

Sorry for all of my comments, I was having a hard time accepting Physorg made this error, jeez. /embarrassed that this is a science website.

(The wiki exposes other flaws in the original reasoning at that.)

Aug 18, 2009
The original two-envelope problem is not a paradox. It simply reflects the fact that each envelope has a positive value, and the fact that the second choice (after having picked an initial envelope) is not a double or nothing bet.

After picking the first envelope, you are guaranteed at least half of the first envelope (not zero), with the prospect of getting twice that. In order for there to be no expected gain from switching, the second envelope would need to have a 50-50 probability of having either zero dollars or double the first envelope.

What is most notable is that the expected value of the entire game does not change after the first and second choices. Before the first choice, there is a 50% chance of getting \$c and a 50% chance of getting \$2c. Thus, the expected value of the game -- before any selection is made -- is \$1.25c. This doesn't change after the initial choice has been made, as noted in the article.

What would be more paradoxical is if the expected value of the game did change after the selection of the first envelope. Since picking an envelope conveys no information about its value relative to the other envelope, there shouldn't be any change in the expected value of the game.

A variation on this game that destroys the so-called paradox would be to have someone pick one of two colors of chips (e.g., red and green). One of the chips is worth \$c and the other has a 50-50 probability of being worth either \$0 or \$2c. In this case, the expected value of the game before the first selection is \$c [or .5(\$c) + .5((.5)(\$0) + (.5)(\$2c))]; and it remains \$c regardless of whether you switch chips or not, since each chip individually has an expected value of \$c.

Aug 18, 2009
This is hilarious. Cover's strategy "works" because whoever puts the money in the envelopes has a societal bias regarding what is "a little bit of money" versus what constitutes "a lot of money." Therefore, if the envelope you choose has what you and the stuffer think is "more" than an average amount, you should stay put.

Now, if the "money" in each envelope was truly randomized (a number between 0 and infinity) then I think Cover's strategy would break down. It all depends on whether there is a social bias in play.

"Seeming" (a result of unconscious conclusions reached after several plays) can become a valid factor in deciding which envelope probably has more money than the other. Did they filter for that factor? I don't see how.

Aug 18, 2009
The article suggests a strategy that works better than random in the case that you get to apply it over many trials. If instead, your strategy was to have a higher probability of keeping the money in the first envelope whenever the amount of money is small relative to that in the other trials, you would find that you get less money than always going 50/50. The probability of switching that they're using is a function of the amount of money, and the probability of keeping increases as the amount of money increases.

I believe it does make the assumption that there is some (although it does not have to be strictly so) uniformity to the probability driving the amounts of money put in the envelopes. Thankfully (for this strategy), the strategy will tend to work if such a uniformity exists even if you in no way understand it.

Aug 18, 2009
I think I muddled what I was trying to say. The function that dictates the probabilities you should follow in your actions can exist before you see any money values. As long as it increases with the amount of money seen (the parameter supplied to the function), it will bring you better success than always going 50/50. The fact that this successful function can be produced without having first looked at any of the actual amounts is the thing that people might have a hard time considering. Its success does however, I believe, depend on the amounts of money in the envelopes being driven by a somewhat uniform probability distribution.

Aug 18, 2009
No, there is no gain in switching. Their result is most likely due to inability to properly simulate unbound random variables.

This paradox is easily resolved if one understands that if there is a conflict between predictions the prediction which employs full knowledge of the system takes precedence over one which employs only partial knowledge.

In this case the person playing the game has partial knowledge and based on this partial knowledge it does indeed seem to him that there is a gain in switching but this gain is illusory as is clearly seen from the perspective of the person who put the money in the envelopes, he knows the amounts and that no matter which one the player picks and how many times he switches he has equal chance to pick the higher and the lower amount and will as a result win half of the sum on average.

Now the above is only true if the possible amount of money is truly unbound as it is in the original paradox, if there is any limit explicit or implicit (and there always is when humans play such games) switching does make sense in certain circumstances and the strategy described in the article might indeed work.

Aug 18, 2009
You have to first look at the Expected Value of the 2 envelopes, before any opening takes place, and given the half or double parameters.
Half the time you'll get X dollars; a quarter of the time you'll get 2*X; a quarter of the time you'll get X/2.
So [(1/2)(x)] [(1/4)(2x)] [(1/4)(x/2)] = EV = 9/8, which is greater than one, which is why this is not a paradox at all because you are not starting with EV=1 to begin with.

...

Aug 18, 2009
The addition signs didn't take, so here it is with 'plus' written in:
Expected Value =
[(1/2)(x)] plus [(1/4)(2x)] plus [(1/4)(x/2)] = (9/8)x.
So given the parameters the expected value is greater than x dollars before any envelope is opened in the first place.

Aug 18, 2009
This sort of paradox tends to only exist when a theory is misapplied or broken (more often misapplied), and ends up running up against a contradictory result which isn't too hard to see. Of course the paradox is in the reasoning rather than in reality, and in the context of corrected reasoning ceases to exist.

Aug 18, 2009
I find it interesting that so many people claim there is no paradox here. The paradox is that our current model of probability theory states that each envelope has a higher expected value than the other, since you could pick either envelope at first. I see that a few people are arguing that there is no paradox because the initial expected value is greater than x, where x is either of the envelopes. Why is this a resolution to the paradox? I would argue that this in fact exacerbates the problem. This analysis shows that, regardless of which envelope's value x represents, the expected value is greater than x--it's like rolling a six-sided die and getting an expected value of 7. It makes no sense. The issue is to resolve this problem, which I think the concept of asymmetry in randomness as described in this article does.

Aug 18, 2009
I suppose I can't help feeling that the word paradox should refer to something more profound than an unexpected (to many) contradiction following from logic (often due to confused reasoning).. though that is actually much closer to the proper definition. So admittedly, it is a paradox. Though I could invent any number of paradoxes by intentionally creating really dumb logical arguments. I'm not sure if you were implying that I was saying there is no paradox.. but I was just meaning to point out that the paradox is in the broken logic, so isn't something as mystical as is often implied in sensational media.

Aug 18, 2009
Just as an FYI, you can use the symbol for addition, as well as other symbols, if you enter the encoded values into the comment text box with HTML escape characters.

+ is entered with '& #43;' (remove the space.)

More:

http://www.integr...ters.htm

Aug 18, 2009
Surely the outcome depends upon the state of mind of the player when they see the amount enclosed in the first envelope. A gambler would probably open the second while a non gambler would not if the value within the first envelope was of sufficient importance to that person.

Aug 18, 2009
But isn't all this probability calculation irrelevant? We are dealing with two discrete possibilities here,
1. You get more.
2. You get less.

So it is a 50/50 chance.

Applying those probability calculations doesn't make sense because of this - in the same way that applying calculations designed for continuous statistical data doesn't work for discrete statistical data.

Aug 18, 2009
When you switch envelopes you have a 100% chance of losing the first \$10.00.

Aug 19, 2009
You have a 50% chance of winning 100% more and a 50% chance of winning 50% less - always a good bet.
This reminds me of the famous Monty Hall problem.

Aug 19, 2009
Gaining \$10 by switching is better than losing \$5 when the odds are fair.

This is what the expected value is telling you.
(20 + 5) / 2 = 12.5

There are two distinct concepts at play - your money and your chances of winning.

Aug 19, 2009
Surely it's because the values aren't symmetrical - 5, 10, 20
and not 10 and then either 15 or 5 ie Gain 5 or lose 5

Aug 19, 2009
Once you have won the first \$10.00, the "max" benefit to switching is another \$10.00, not \$20.00.

Aug 19, 2009
For example, say the first envelope you pick has \$10, so that the other envelope has either \$20 or \$5.

In this case there are 3 (NOT 2) possibilities. One envelope for each of the denominated amounts 5, 10 and 20.
The way they have set this up is not the way people think it's set up - most people believe that there are only two kinds of envelope. They've been led to believe this by the conjuror's trick of distraction - the photograph, the frequent talk about 2 possibilities etc.

In other words it's a con.

Fix the envelopes BEFOREHAND at 3 denominations and calculate the odds - no difference when switching.
Likewise, fix the envelopes beforehand at 2 denominations and calculate the odds - no difference when switching.

Aug 19, 2009
The premise to the whole game is flawed. If the amount of money in the envelope is truly random, with no preset bound, then it will always be infinite. 2*infinity = infinity, infinity/2 = infinity, therefore it makes no difference which envelope you choose. If they are assuming that there is some upper bound to the amount of money in each envelope, then it is not *truly* random.

In other words, in a range of numbers 0...infinity, the odds are 0 that you will pick a non-infinite number, just like the odds are 0 that if you *randomly* pick a real number between 0 and 1, the odds are 0 that it will be rational, because there are an infinite number of real numbers between each rational number.

Aug 21, 2009
It would be more interesting to observe the effect with very large values. Instead of \$10 in the envelope, the first envelope contains a check for \$100 million. Do you trade?

Aug 22, 2009
Cover's strategy is well-known. It is equivalent to picking a test value T according to a prescribed distribution, and switching if the amount in the first envelope is less than T. If we call the amounts M and 2M, then this strategy wins if T is between M and 2M, and it gives a 50% chance of winning otherwise. The strategy is described in the rec.puzzles FAQ at http://www.faqs.o...ecision/
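The threshold description in this comment can be turned into a short calculation. The sketch below assumes, purely for illustration, that the test value T is drawn from an exponential distribution; the function names are hypothetical. The player ends up with the larger envelope for certain when T falls between M and 2M, and with probability one half otherwise.

```python
import math


def win_probability(m, threshold_cdf):
    """Chance of ending with the larger envelope (2m rather than m) when the
    player switches whenever the opened amount is below a random threshold T.
    threshold_cdf(x) must return P(T <= x)."""
    p_between = threshold_cdf(2 * m) - threshold_cdf(m)  # P(m < T <= 2m)
    # T between the amounts: always right. Otherwise: a coin flip.
    return p_between * 1.0 + (1 - p_between) * 0.5


def exponential_cdf(x, rate=1.0):
    """Assumed threshold distribution: T ~ Exponential(rate)."""
    return 1 - math.exp(-rate * x)


for m in (0.1, 1.0, 10.0):
    print(m, win_probability(m, exponential_cdf))
```

Any threshold distribution that puts positive probability on every interval (M, 2M] pushes the win probability above one half, although the margin becomes very small when M lies far out in the tail of the distribution.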


Aug 22, 2009
Any street kid knows every bet has two sides. If I have one of the envelopes and Bob has the other, we both have the same deal, same odds. If we switch, we both switch, and these guys are saying it gives us both a better deal? Hehehehe. And then they say if I have stock abc and Bob has stock bca, if we switch we will both make more money? Only question is why these geniuses are trying to peddle this deal. It's a dumb variation of the old change-for-a-twenty con. Send them my way, I have some deals for them right here.

Aug 23, 2009
I haven't worked through the
symmetry-breaking argument properly, but it seems to me to be one of those
"paradoxes" which involve 0 or infinity in a disguised form. If, say, one of
the numbers involved is selected at random from ALL numbers, then there's
only an infinitesimal chance that it will be less than the number of
particles in the universe. I vaguely remember Q Comp stuff saying that it
would then be theoretically non-computable(?). ie not simply impractical.

So the alternative would be to bias the selection of the number towards
smaller numbers. Then if the number in the first envelope was large, by some
criterion, then there would be

Aug 23, 2009
Oops!

But maybe mathematicians have ignored that, because the criterion seems to
me to have to be a social one. (How you do the experiment IN PRACTICE - how
big the numbers really would be. How could the bias not be social rather
than mathematical?

Aug 24, 2009
I figured it out.

The number you get in first envelope is not a true random. There is 75% probability that the number is smaller than half of maximum amount and 25% probability that it is bigger than half of maximum amount.

with true random you can even play double or nothing and win just about 0. In this case there is double or half and you can win about 0 on average.

Aug 24, 2009
...forgot to say that it is in case when one envelope has random number and another has double that amount.

Aug 24, 2009
First envelope has \$10 , the other one could have either 5 or 20.

3 possible scenarios played out.

1 keep the tenner!!!
2 swap and get a 5
3 swap and get a 20

The probability of 2 and 3 are equal ...

However the average return is not 12.5.. it would be approximately 11.25.

My reasoning ...

A series of plays of the game where the envelopes contain 10 and 5, assuming you always open the 10 and swap half the time, results in an average return of (10 + 5) / 2 = 7.50.

A series of plays of the game where the envelopes contain 10 and 20, assuming you always open the 10 and swap half the time, results in an average return of (10 + 20) / 2 = 15.

And for a series of the game where 10 is always opened and swapped all of the time, and assuming a 5 or 20 half the time, the average return would be (7.5 + 15) / 2 = 11.25.

Aug 25, 2009
Imagine a game show in which you've won the right to open one of two envelopes, and switch if you want. You open one with 1000 pounds in it. You then have to guess what the budget of the show is to know if it's worth switching. In practice they would rather give away 500 pounds than 2000!

Aug 25, 2009
Interesting that everyone seems intent on expressing their own take on this. I've just got round to reading the other posts, and it seems that superhuman amongst others have got it. Reality breaks the symmetry of infinity? :-P

Aug 28, 2009
It is an interesting question.

I think the game is supposed to be analogous to some class of physical problem. It seems likely that the game rules, as stated, don't model a real-world problem sufficiently to be very enlightening. It is under-constrained.

With additional constraints one could make better decisions. For example: envelopes may only contain a quantity of dollar bills. Envelopes cannot contain more than N bills. Envelopes look the same regardless of the quantity of bills contained. The object of the game is to maximize the quantity of bills collected for a set number of trials. Etc.

Without the additional rules the game model just isn't descriptive enough to match a real problem.

It sounds as if the Cover strategy is adaptive, discovering and remembering the 'house limit' as it goes along, effectively adding a constraint to the model (knowledge of the 'house limit').

They don't describe how it goes about this, but for the sake of discussion, I'll assume a simple model of remembering the greatest quantity of money found in an envelope and mapping the probability of switching to the range of zero to the remembered value.

Such a strategy could be improved upon by expanding it further.

For example, if such a strategy were used against another player, that player might initially put very large quantities in the envelope in order to train the algorithm to expect a high house limit. The player would then proceed by putting only low values in the envelope, thereby causing the Cover strategy to very frequently choose to switch, greatly reducing its effectiveness.

The strategy can combat this by maintaining a memory of all the values it has seen, recognizing distribution patterns and choosing a switch probability based on the distribution discovered and the current value.

Again, this adds more constraints to the model (knowledge of the distribution of values).

Sep 06, 2009
The value (the ratio) of the switch vs. not-switch is 1. That is, the probability of "\$12.5" is not telling what money will come, but what the relative values of switching vs. not-switching is.

First half of @paulthebassguy Aug.18,2009 sounds like this, too. Right. But, that is not about discrete vs. continuous, but value-judgment vs. absolute-value, I think.

If you would think as a long-term (only-probability, not value-weighing) view, then you have to reflect from the start.
1) drawDoubleFirst vs. drawSingleFirst
2) switch vs. not-switch
The "paradox" ignores the first step. The question of switching's value must sum both.
X1 = firstWasDouble * switch + firstWasSingle * switch
X2 = firstWasDouble * notswitch + firstWasSingle * notswitch
X1 = X2
Thus, there is no paradox, again. No gain through a strategy of long-term switchings like that.

@Prsn Aug.19,2009 sounds like telling this, too.

Thus, there is no paradox. Do we three suffice?

Sep 06, 2009
Oops. Rewriting. The first is better expressible like this:

The value (the ratio) of the switch vs. not-switch is 1. That is, the probability of "\$12.5" is not telling what money will come, but what the value is. That value is the same, if you would reflect "what is the value of not-switching?" That is \$12.5, too. That is, holding \$10 has a value of \$12.5 of not-switching, too. Thus, the question is not absolute-value (in dollar terms), but for weighing alternative options, by their relative-values.

Good for reflecting among a few options. Not necessarily only two.