January 15, 2013

Speaking of inventive octogenarians

I won't pretend to understand this, but physicist Freeman Dyson, now 89, recently helped come up with what may be a breakthrough in evolutionary theory:
Iterated Prisoner's Dilemma Contains Strategies that Dominate Any Evolutionary Opponent

10 comments:

anony-mouse said...

Not to mention those really expensive vacuum cleaners.

Anonymous said...

Or those really expensive shells-around-suns.

V said...

Since I do understand that particular paper (and it's a fun one!) I'll try and explain it simply:

You and I are both programming simulated robots. The robots live together, and each day each robot either "cooperates" or "defects," and gets points based on how both robots acted. The robots don't have very much memory: they only remember what happened on the previous day.

Most of the programming is already done for us; we just have to specify four probabilities of cooperating, one for each of the four things that could have happened the previous day. If both of us cooperated yesterday, maybe I should cooperate with probability .99; if you defected and I cooperated, maybe I should cooperate with probability 0. The 'fair' strategy is tit-for-tat, which cooperates with probability 1 if the other player cooperated last round, and cooperates with probability 0 if the other player defected last round.
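
Here is a minimal sketch of such a robot in Python. The payoff numbers (3 for mutual cooperation, 5 and 0 when one side defects on a cooperator, 1 for mutual defection) are the conventional ones and are an assumption on my part, as are the names `PAYOFF`, `TIT_FOR_TAT` and `play`; the only thing fixed by the description above is that a strategy is four probabilities.

```python
import random

# Assumed conventional payoffs, keyed by (my move, your move).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# A strategy maps (my last move, your last move) to my probability of cooperating.
TIT_FOR_TAT = {("C", "C"): 1.0, ("C", "D"): 0.0, ("D", "C"): 1.0, ("D", "D"): 0.0}


def play(strategy_a, strategy_b, days, rng, first=("C", "C")):
    """Simulate `days` rounds and return the average daily score of each robot.

    `first` is the (assumed) pair of moves on day one: (robot A, robot B).
    """
    a, b = first
    total_a = total_b = 0
    for _ in range(days):
        total_a += PAYOFF[(a, b)]
        total_b += PAYOFF[(b, a)]
        # Each robot only looks at what happened today to decide tomorrow's move.
        next_a = "C" if rng.random() < strategy_a[(a, b)] else "D"
        next_b = "C" if rng.random() < strategy_b[(b, a)] else "D"
        a, b = next_a, next_b
    return total_a / days, total_b / days
```

With these assumptions, `play(TIT_FOR_TAT, TIT_FOR_TAT, 100, random.Random(0))` returns `(3.0, 3.0)`: two tit-for-tat robots that start out cooperating never stop.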

Since this is a really simple setup, we can run through the simulated days really quickly. In fact, if we just type in our four probabilities, we can immediately calculate the long-run average score each of our robots will get! (The mechanics are unimportant, and I'm sweeping some caveats under the rug; the important part is that we can go from the 8 probability numbers to 2 scores, one for each robot.)
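
One way to do that calculation, sketched in Python: treat the four possible outcomes of a day as the states of a Markov chain and let the chain settle. The payoffs (3, 0, 5, 1) and the function name `long_run_scores` are assumptions for illustration, and plain repeated multiplication stands in for whatever closed form the paper uses. One of the caveats shows up here: the answer is only independent of the starting day when the chain forgets where it started, which is not the case for tit-for-tat against tit-for-tat.

```python
# Outcomes are ordered CC, CD, DC, DD, written as (my move, your move).
X_PAY = [3, 0, 5, 1]  # assumed payoffs to me in each outcome
Y_PAY = [3, 5, 0, 1]  # assumed payoffs to you in each outcome
SWAP = [0, 2, 1, 3]   # the same outcome as seen from your side


def long_run_scores(p, q, iterations=10_000):
    """Long-run average scores for my probabilities p and your probabilities q.

    Each of p and q lists four cooperation probabilities, indexed by
    yesterday's outcome from that player's own point of view.
    """
    rows = []
    for i in range(4):
        px, qy = p[i], q[SWAP[i]]
        rows.append([px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)])
    # Start from a uniform guess and push it through the chain repeatedly.
    v = [0.25] * 4
    for _ in range(iterations):
        v = [sum(v[i] * rows[i][j] for i in range(4)) for j in range(4)]
    score_x = sum(v[i] * X_PAY[i] for i in range(4))
    score_y = sum(v[i] * Y_PAY[i] for i in range(4))
    return score_x, score_y
```

For example, `long_run_scores([0.99, 0.1, 0.9, 0.1], [0.9, 0.2, 0.7, 0.1])` turns eight probabilities into two scores, with no day-by-day simulation needed.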

Now that we understand the robots, let's take a step back, and look at us. We get to look at the 8 probabilities and the 2 scores and adjust our 4 probabilities. Suppose we get a constant stream of money based on our robot's long-term average score, instantaneously calculated from the current probabilities, and we can adjust the probabilities in real-time.

Suppose I decide that it would take way too much of my time to continuously monitor the scores and probabilities, and so I set up a simple optimization program that looks at your four probabilities, determines which probabilities I should input to maximize my score, and then enters them. This allows me to walk away and do other things. (This is the evolutionary strategy, which does not have a theory of mind.)

You've now hit the jackpot, because you can choose your probabilities to be arbitrarily slanted in your favor. Instead of both of us having an average score of 3 (for both playing tit-for-tat), you can have an average score of 4, but my average score will then only be .7. The combined score is lower, but more of it is in your pocket!
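
To make the slant concrete, here is a sketch of one extortionate choice of probabilities in Python. It assumes the conventional payoffs R=3, S=0, T=5, P=1; the names `extortionate`, `chi` (how hard the extortioner pushes) and `phi` (a scaling knob that keeps the probabilities between 0 and 1) are mine. The exact averages depend on the payoffs and on `chi`, so they will not match the illustrative 4 and .7 above.

```python
R, S, T, P = 3, 0, 5, 1  # assumed conventional payoffs


def extortionate(chi, phi):
    """Four cooperation probabilities (after CC, CD, DC, DD) that pin
    (my score - P) to chi times (your score - P)."""
    return [
        1 - phi * (chi - 1) * (R - P),
        1 - phi * ((P - S) + chi * (T - P)),
        phi * ((T - P) + chi * (P - S)),
        0.0,
    ]


def long_run_scores(p, q, iterations=10_000):
    """Long-run average scores, found by letting the four-outcome chain settle."""
    swap = [0, 2, 1, 3]  # the same outcome seen from the other side
    rows = []
    for i in range(4):
        px, qy = p[i], q[swap[i]]
        rows.append([px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)])
    v = [0.25] * 4
    for _ in range(iterations):
        v = [sum(v[i] * rows[i][j] for i in range(4)) for j in range(4)]
    pay_x, pay_y = [R, S, T, P], [R, T, S, P]
    return (sum(v[i] * pay_x[i] for i in range(4)),
            sum(v[i] * pay_y[i] for i in range(4)))


# You extort with chi = 3; my optimizer ends up cooperating every day.
you = extortionate(chi=3, phi=1 / 26)
your_score, my_score = long_run_scores(you, [1.0, 1.0, 1.0, 1.0])
print(round(your_score, 2), round(my_score, 2))  # 3.73 1.91
```

Under these assumptions, whatever four probabilities I enter, your surplus over the mutual-defection payoff stays three times mine, so the best my optimizer can do for me is also the best it can do for you.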

But suppose that both of us are sitting at the computer, ready to change the probabilities whenever the other one does. Now, when you try to extort me, I can stonewall. You want to get 4 on average? Well, how about I defect all the time, and you get no points? How do you like being greedy now?

But this is where the theory of mind comes in: it only makes sense for me to sacrifice my own points if I think you're sitting at your computer! If you set up an extortionate strategy and then flew to the Bahamas, I get to choose between earning no points and earning some points, and my decision of which to prefer depends on how I think that'll impact your behavior in the future.

The paper's contribution is a simple way of looking at IPDs (iterated prisoner's dilemmas), together with a demonstration that long-term players with a theory of mind can take advantage of short-term players without one. (In particular, evolution is short-term and doesn't have a theory of mind.)

Steve Sailer said...

Dear V:

Thanks.

What are some implications?

Anonymous said...

Freeman Dyson is one of the great thinkers of the 20th Century, and it's great to see him so sharp and vital 13 years into the 21st.

bdoran said...

I have actually solved the real thing, as did the shall we say "other player." If you're working class you've been hearing it since childhood, although not so much anymore.

Here it is: Don't rat.

Don't betray.

Somehow this simple concept utterly eludes our elites.

That this game theory so dominates the thinking and time of our educrat elites tells me all I need to know about them, not that there isn't plenty to learn. But you see it all points the same direction.

Anonymous said...

Interesting stuff.

"What are some implications?"

Heroes are necessary.

Bill said...

Ah, this is great stuff -- extremely useful for understanding and applying legal strategy. Thanks for the tip.

Anonymous said...

There was another strategy mentioned elsewhere, called a zero-determinant strategy, whereby an agent with superior access to information could exploit simpler-minded tit-for-tat agents, but I don't think it can do anything about always-defect agents.

http://golem.ph.utexas.edu/category/2012/07/zerodeterminant_strategies_in.html

Anonymous said...

Hmm, it seems that this paper is talking about that strategy after all.