## The Traveler’s Dilemma

Now for some real content. I came across this article in Scientific American about the Traveler’s Dilemma. To explain briefly, the TD is a game in which two players are each asked to select a number within certain boundaries (2 and 100, in the example). If both players select the same number, each is awarded that number of points. (In the example, each point is worth $1, which makes the game of more than academic interest.) If one player’s number is lower, each player is awarded points equal to the lower number, modified by a reward for the player who selected the lower number and a penalty for the player who selected the higher number (2 points each, in the example). So, for instance, if you choose (48) and I choose (64), you get 50 points and I get 46 points.
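The payoff rule is easy to mistranslate into numbers, so here is a minimal sketch of it in Python. The function name and the default reward/penalty of 2 are my own framing, not from the article:

```python
def payoff(a, b, bonus=2, penalty=2):
    """Traveler's Dilemma payoffs for players choosing a and b.

    Both players receive the lower number; the player who chose it
    gets a bonus on top, and the other player pays a penalty.
    """
    if a == b:
        return (a, b)
    low = min(a, b)
    if a < b:
        return (low + bonus, low - penalty)
    return (low - penalty, low + bonus)

# The example from the text: you choose 48, I choose 64.
print(payoff(48, 64))  # -> (50, 46)
```

Running the example reproduces the (50, 46) split described above.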

The intuition that I had upon reading the rules of this game was that it would be “best” for both players to choose (100). That is certainly true from a utilitarian point of view: (100, 100) results in the highest total number of points being given out – 200. The runners-up are (99, 99), (100, 99), and (99, 100), with 198. However, there are two small problems – here’s the dilemma part – that prevent (100, 100) from being the “best” choice: one, the players are not allowed to communicate, and two, the (100, 99) and (99, 100) plays result in one player receiving 101 points – an improvement, for that player, over a 100 point reward.

So, the reasoning goes, if player one predicts that her opponent will play (100), she should play (99) in order to catch the 101 point reward. Her opponent, however, ought to use this same strategy, and also play (99), in which case player one ought to play (98) in order to trump her opponent, and so on and so forth. This reasoning degenerates to a play of the minimum number – in the example, (2). According to Basu, the author of the article, “Virtually all models used by game theorists predict this outcome for TD.”

However, reality does not follow these models. When people are asked to play the TD, many of them choose 100. Many of them choose other high numbers. Some seem to choose at random. Very few choose the “correct” solution – (2) – predicted by game theory. Something’s up.

Basu takes this to mean that all of our assumptions about rational behavior need to be questioned. With my philosophical background, I happen to have different assumptions about rational behavior than the mainstream, and so for me the results of the TD are not surprising in any way. But perhaps the best way to explain why the results do not surprise me is that I am a gambling man.

Gambling is all about odds and expected value. That’s why I don’t play roulette. Roulette has a negative expected value – that is, over time, any given player will tend to lose money. Of course, actual events always contain random fluctuations that allow some people to come out even or ahead in the short run. People play roulette because they enjoy taking a chance at winning big money even though the odds are that they will lose. And that’s the same reason why (100) is a natural first choice for the TD.

Sure, 100 is “beaten” by every other play – if the opponent plays (99), the (100) player gets 97. If the opponent plays (2), the player gets 0. But what are the odds the other player’s going to play (2)? Not very high. According to intuition and research, odds are the other player will play a high number.

But let’s fix the odds by playing some roulette. Let’s say the opponent selects a number uniformly at random. What is your expected value?

It turns out this is governed by an equation. If you hate math, feel free to just believe that the following checks out and skip ahead. Let n = your choice, k = the maximum play, p = the penalty (and reward), and v = your expected value. (It should be noted that the minimum play should be set equal to the penalty; otherwise complexities enter into this equation.) The three terms in the numerator cover the opponent playing below you, tying you, and playing above you. Then

v = [(0 + 1 + 2 + … + (n - p - 1)) + n + (k - n)(n + p)] / (k - p + 1)

Now, to maximize the expected value for any given choice, we take the derivative of v with respect to n, keeping the other values constant (the denominator is a constant, so it doesn’t affect where the maximum falls and we can work with the numerator),

v’ = -n – 2p + k + 1/2

and set the result, v’, equal to zero. This gives us

n = k + 1/2 – 2p.

In our example, k is equal to 100 and p is equal to 2, which puts the peak at 96.5. Since plays must be whole numbers, (96) and (97) tie for the highest expected value against a random opponent, and so a high play like (97) is the “correct” play. Or, to generalize: the highest expected value comes from subtracting roughly twice the penalty or reward from the maximum play.
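As a sanity check on that 96.5 figure, we can brute-force the expected value against an opponent drawing uniformly from the minimum play through the maximum, rather than trusting the calculus. The function name and structure here are my own sketch:

```python
def expected_value(n, k=100, p=2):
    """Expected payoff of playing n against an opponent who picks
    uniformly at random from p..k (minimum play equal to the penalty)."""
    total = 0
    for m in range(p, k + 1):
        if m < n:
            total += m - p   # the opponent undercut us: lower number minus penalty
        elif m == n:
            total += n       # tie: both players get n
        else:
            total += n + p   # we undercut the opponent: lower number plus reward
    return total / (k - p + 1)

# The discrete maximum brackets the continuous optimum of 96.5:
best = max(expected_value(n) for n in range(2, 101))
winners = [n for n in range(2, 101) if expected_value(n) == best]
print(winners)  # -> [96, 97]
```

The brute force agrees with the formula: the whole-number plays (96) and (97) sit on either side of 96.5 and tie for the best expected value.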

So we’ve got two models of “correct” behavior. Game theory says we should assume an opponent who is “rational” – whatever that means – and who will play according to that rationality, and therefore play (2). Casino theory says that if we assume an opponent who plays randomly, we should play (97). Guess which one more closely models real life? Casino logic beats game theory logic. My theory on why this should be the case is simple: if people are presented with a game in which money is at stake, they will tend to act on their experience with other games in which money is at stake, i.e., casino games. The less “rational” players will go for the thrill of trying to win big, à la roulette. The more “rational” players will evaluate the odds and go for the biggest expected payoff, like a poker player calculating pot odds or a blackjack player counting cards.

Complicating the situation is the fact that the odds are not even. What I mean is, there is not an equal probability that each number will be selected. In fact, there appears to be a higher probability that higher numbers will be chosen – a fact that shifts expected returns for higher numbers up. Why should this be the case? Well, because, as I’ve explained, people tend to go for the big payoff and evaluate odds, which pushes them toward the highest expected value choices.

The casino theory model also explains why sufficiently large penalties push player choices down towards the minimum play. Simply put, large penalties discourage risks. Large penalties also mean large values of p, which means that expected value is maximized at lower and lower numbers. At a penalty of 49.25 or greater, (2) becomes the play with the highest expected value.
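Here is a quick sketch of that threshold, again assuming an opponent playing uniformly at random. The helper names are mine, and the minimum play is held fixed at 2 while the penalty varies (a slight departure from the minimum-equals-penalty assumption used in the formula above):

```python
def ev(n, p, k=100, low=2):
    """Expected payoff of playing n when the opponent draws uniformly
    from low..k and p is both the reward and the penalty."""
    total = 0
    for m in range(low, k + 1):
        if m < n:
            total += m - p   # undercut by the opponent
        elif m == n:
            total += n       # tie
        else:
            total += n + p   # we undercut the opponent
    return total / (k - low + 1)

def best_play(p):
    """Lowest play that maximizes expected value for penalty p."""
    return max(range(2, 101), key=lambda n: ev(n, p))

for p in (2, 25, 50):
    print(p, best_play(p))  # the best play falls as the penalty rises
```

Under these assumptions the best play slides from the high 90s down to the minimum as the penalty grows, collapsing to (2) once the penalty passes roughly 49.
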

Given these considerations, there doesn’t seem to be anything strange at all about the results of the Traveler’s Dilemma experiments. The only thing left is to dispute the supposed “implications” of the TD for economics.

According to Basu’s somewhat incoherent analysis, libertarian economic theory claims that people will act rationally and selfishly and in doing so produce efficient outcomes. However, according to him, the rational, selfish play is (2), which, if selected by both players, leads to a total point distribution of 4, which is far less efficient in terms of outcome than the 200 points distributed by a (100, 100) play. What Basu, and many contemporary economists, fail to understand is that acting rationally does not mean doing what game theory says is best, and that selfishness does not mean striving for the greatest number of points. In point of fact, when left to their own devices in the TD, people do choose high numbers and do get an efficient outcome. What the TD really shows, then, is that in the case of the TD, people left to their own devices produce efficiency, while people who listen to self-proclaimed experts on “rationality” produce egregiously wasteful outcomes. Of course, how applicable the TD really is to other real-life situations is something worth exploring further.

I’ll have more on this topic when I get into my social cooperation theories. After all, (100) is a highly cooperative play – you’re gambling on the other player also cooperating by playing (100). Perhaps small penalties encourage cooperation and efficiency, while large penalties discourage these things – a conclusion that is certainly supported if you consider taxes to be a form of penalty and trading a form of cooperation.

This post bodes very well for the future of this blog. Excellent!

Comment by Matt | January 24, 2008 |

Personally, I’m playing a larger game — life. The payoff matrix is a lot more like the iterated prisoner’s dilemma — except that it’s a little easier because there are various ways of signalling that I’m going to be initially cooperative. One such way is joining a society of any kind (such as a society of game theorists), or being successful generally (since very few truly self-interested people get very far in life). So I don’t actually think that the results are necessarily non-rational.

Comment by novalis | January 24, 2008 |
