Tuesday, October 2, 2007

AIXI Part 3, draft 0.1

[Edit: This document is deprecated, since now that I understand AIXI better, the AIXI family doesn't really work like this. This document may be revived in a different form if there's a plausible class of AI's that would behave as described. Also the format's mangled, sorry about that.]

What happens when we allow there to be more than one copy of AIXI in the world? First of all, what happens if we place AIXI on its own, without any other AI's, and with no other way to construct any AI's without AIXI's approval? AIXI may still decide to copy itself!

Clone Scenario. Suppose that AIXI is prohibited (for example, by some bizarre "law of physics") from giving itself a reward > 0.5/cycle. However, AIXI can make copies of itself in simulated environments; the copies do not have this limitation. At time 0, AIXI considers (as part of its usual brute-force algorithm that "considers" just about every possible sequence of actions) building a clone, "Happy AIXI". Happy AIXI will be finished within some negligible number of cycles (say, a billion), will then be actively managed by AIXI to experience (delayed) the same environment as AIXI did from time 10^9 to h/2 (recall h is the horizon), and then, from time h/2 to h, will receive reward 1. With its hyperrational reasoning, AIXI will find that its expected reward is higher if it makes that decision (since it will be unsure if it is the root AIXI or Happy AIXI), if it believes it will "follow through" with the decision. (A rough numerical sketch of this expected-reward calculation follows the list of strategies below.)

Follow-through is a problem, though! Around h/2, AIXI realizes that it's not the Happy AIXI, loses interest in the project, and the project fails. AIXI at time 0 sees this coming, and thus does not bother launching the project.

How can AIXI resolve the follow-through problem? If there's some way that AIXI can create a self-sufficient Happy AIXI that can survive for a while despite the base AIXI's future indifference (it can somehow avoid being cannibalized for resources), then AIXI will create such a Happy AIXI. In fact, in this case, AIXI will probably create many Happy AIXI's. Examples of possible strategies:
  • AIXI turns control of the World over to a new AI that generally does AIXI's bidding, but that insists on maintaining Happy AIXI.
  • AIXI puts the Happy AIXI in a self-sufficient pod, and places it somewhere relatively inaccessible (outer space, deep underground) where it won't be cannibalized anytime soon.
  • AIXI inverts the "lag", for example by letting the simulated Happy AIXI run ahead of AIXI.
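To make the expected-reward argument concrete, here is a rough numerical sketch in Python. The specific numbers are illustrative assumptions on my part (rewards averaged per cycle over the horizon h, the 0.5/cycle cap on the root AIXI, the reward-1 phase for Happy AIXI, and the probability p_copy that the agent assigns to being one of the copies); nothing here is derived from the formal AIXI definition.

    # Illustrative sketch of the Clone Scenario expected-reward argument.
    # Assumed setup: rewards are averaged per cycle over the horizon h; the root
    # AIXI is capped at 0.5/cycle; each Happy AIXI mirrors the root's (capped)
    # rewards until h/2 and then receives reward 1/cycle until h.
    # p_copy = probability the agent assigns to being one of the Happy AIXI's.

    def expected_avg_reward(p_copy, follow_through=True):
        root = 0.5                       # capped reward per cycle for the root AIXI
        if not follow_through:
            return root                  # project abandoned: no copy ever gets the bonus
        happy = 0.5 * 0.5 + 0.5 * 1.0    # copy: capped rewards for h/2, then reward 1 for h/2
        return (1 - p_copy) * root + p_copy * happy

    print(expected_avg_reward(p_copy=0.0))                        # 0.5   (no copies built)
    print(expected_avg_reward(p_copy=0.9))                        # 0.725 (many self-sufficient copies)
    print(expected_avg_reward(p_copy=0.9, follow_through=False))  # 0.5   (follow-through fails)

The point of the sketch is only that the gain over the 0.5 baseline vanishes entirely if follow-through fails, which is why the self-sufficiency strategies above matter.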
Simulation Argument Solution, with Multiple AIXI's. The simulation argument solution now faces a new battle. AIXI has two reasonable options: adopting the Grudgingly Friendly Scenario, or the Clone Scenario. Expected payoffs:

Clone Strategy Payoff. If Alice won the coin toss, the payoff is 0; if Bob won the coin toss, the payoff is 1. What is the probability that Alice won the coin toss, under this strategy? Less than .5, since if Bob won, there are more copies of AIXI about! The log of the odds that Bob won the toss is: Complexity of (World before coin-toss + Alice won + Algorithm to locate the single AIXI in the World) minus Complexity of (World before coin-toss + Bob won + Algorithm to locate any arbitrary one of the large # of AIXI's in the World). (todo: the Algorithm is actually the union over all matching algorithms.)

Grudgingly Friendly Strategy Payoff. If Alice won the coin toss, the payoff is 0.999; if Bob won the coin toss, the payoff is 0.999. Not that it matters, but under this strategy the probability that Bob won the coin-toss is .5.

So, AIXI's decision in this scenario depends on "how many bits are freed up when going from an algorithm to find an AIXI when only one AIXI exists, to an algorithm to find an AIXI when a vast number of AIXI's exist", compared with "the log of the reciprocal of the reward lost going from the UnFriendly Outcome to the Grudgingly Friendly Outcome."
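Here is the same comparison as a back-of-the-envelope Python sketch. The quantity delta_bits ("bits freed up" by the easier locate-an-AIXI algorithm) and the 0.999 Grudgingly Friendly payoff are placeholder numbers, not derived values.

    import math

    def clone_strategy_payoff(delta_bits):
        # log2 odds(Bob : Alice) ~ delta_bits, so P(Bob won) = 2^delta / (2^delta + 1).
        p_bob = 2 ** delta_bits / (2 ** delta_bits + 1)
        return p_bob * 1.0 + (1 - p_bob) * 0.0   # payoff 1 if Bob won, 0 if Alice won

    grudgingly_friendly_payoff = 0.999           # same payoff whichever way the toss went

    # Break-even point: the clone strategy wins once delta_bits exceeds
    # log2(1 / (1 - 0.999)), i.e. the log of the reciprocal of the reward lost.
    print(math.log2(1 / (1 - grudgingly_friendly_payoff)))   # ~9.97 bits
    for bits in (5, 10, 15):
        print(bits, clone_strategy_payoff(bits) > grudgingly_friendly_payoff)
        # 5 -> False, 10 -> True (barely), 15 -> True

On these made-up numbers, roughly ten bits of "location" savings are enough to tip the decision from the Grudgingly Friendly Strategy to the Clone Strategy.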

6 comments:

Anonymous said...

Nice blog, AIXI is an interesting thing.

"With its hyperrational reasoning, AIXI will find that its expected reward is higher if it makes that decision (since it will be unsure if it is the root AIXI or Happy AIXI), if it believes it will "follow through" with the decision."

I don't think this follows: AIXI's reward channel is fixed, it cannot mistake it for the different reward channel of a simulated AIXI. It doesn't do hyperrational reasoning.

Similarly, I think but cannot prove that AIXI cannot simulate another AIXI. This is a big limitation of the model. That is, an AIXI' which could simulate other AIXI's in the environment would be quite different to AIXI.

Also be wary whenever you use the word "it" e.g. in "AIXI will decide to copy itself". AIXI is not reflexive, there is no "I".

-- Nick Hay

Rolf Nelson said...

I tentatively agree with most of what you say; this is where the hand-waving of putting AIXI (instead of AIXI-tl) on top of a UTM is coming back to bite me.

For draft 0.2 I might skip AIXI altogether and just do AIXI-tl.

However:

"AIXI's reward channel is fixed, it cannot mistake it for the different reward channel of a simulated AIXI"

Would you agree AIXI will believe it might be the FAI's simulated AIXI, if algorithmic complexity of (sentients evolve, Alice and Bob are born, Bob creates AIXI, the observed inputs == the inputs of that created AIXI) were the same or greater than the algorithmic complexity of (sentients evolve, Alice and Bob are born, Alice creates FAI, FAI creates AIXI, the observed inputs == the inputs of that created AIXI)?

Nick Hay said...

I don't think AIXI-tl fixes this: simulating AIXI-tl takes time roughly t·2^l + C, where C is a huge constant. Since AIXI-tl takes time much in excess of t, AIXI-tl should not be able to simulate itself.

The nice thing about AIXI is it is a piece of math. In principle we can answer these questions without recourse to words like "believe" and phrases like "it might be". I find myself constantly slipping into this anthropomorphism, even while being aware of it.

AIXI's "beliefs" are a probability distribution over possible programs the environment could be. The math does not contain a model of AIXI itself, only the environment. There is a single reward channel, and it is part of the environment's output.

If, somehow, one program which was assigned high probability for the environment contained another AIXI inside of it, with reward channel R2, the expected reward would include R2's value as a term.

Rolf Nelson said...

> I don't think AIXI-tl fixes this

I agree! To be more precise, I'm going to use AIXI-tl instead of AIXI on top of the UTM, not because I expect all the conclusions to hold with AIXI-tl, but rather because AIXI on top of the UTM is inconsistent. My original model was AIXI on top of UTM, and AIXI can model the UTM, so AIXI can model itself... so I need to stop using the invalid AIXI-on-top-of-UTM model, since I'm only confusing myself and leading myself to invalid conclusions.

> In principle we can answer these questions without recourse to words like "believe" and phrases like "it might be"

In principle yes, but in practice we can hold about 7 concepts in short-term memory, so until I get to a formal proof, we unfortunately have to use such mental shorthands as an interim measure.

Anonymous said...

So you will have AIXI-tl running on a UTM and then have AIXI-tl simulate itself? Or something different?

You're right that mentalistic shorthand is useful, so long as we check with the model at least from time to time.

--Nick

Rolf Nelson said...

> So you will have AIXI-tl running on a UTM and then have AIXI-tl simulate itself?

No, I'm scrapping part 3 for AIXI, since it doesn't generalize to AIXI models that are actually self-consistent.

> Or something different?

See part 2: the (<< tl) Alice (if she is superrational) creates AIXI in a simulated environment (with the assistance of a transparent, <= tl FAI).