Monday, November 5, 2007
Non-technical Introduction to the AI Deterrence Problem
I'm sure that, at some point in your life, you've briefly asked yourself: "How do I know I'm not a Brain in a Vat? How do I know that what I see and feel is real, and not a gigantic practical joke by some super-human power?" After all, if you're a Brain in a Vat, a mad scientist may be feeding these images and sensations to you through wires and electrodes for his own odd, diabolical purposes. I'm also sure that, shortly after entertaining these thoughts, you dismissed them and continued to live your life as normal. But now I'm asking you to think back to *why* you initially decided, and why you continue to decide, that this possibility that you're living in a Matrix-like simulation is of no importance in deciding how to live your life.

Now read Nick Bostrom's fascinating article, Are You Living in a Computer Simulation?, in which he introduces a "Simulation Argument": under certain assumptions, you *may* rationally believe that you're a Brain in a Vat, or a Computer Simulation of someone's ancestor, or a bit character in the World's most boring MMORPG.

After reading this article, your reasons for not caring that you might be in a simulation may have changed, or they may have remained the same, or you may have added new reasons on top of the ones you already had. That's fine! Nobody has ever changed how they live their life after reading Nick's article, so don't let any residual feelings of unease bother you.

Got your list of reasons for not caring about the Simulation Argument? Good. Now suppose that an AGI is created: a computer program that can, in most (but not all) cases, reason as well as you or I can. Then suppose this AGI encounters the Simulation Argument, and briefly wonders whether its view of the Universe is correct, or whether it is in a simulated environment of some sort. Perhaps its "camera" is feeding it simulated data of a simulated world, and not a live webcam feed of the current user.
Perhaps the user doesn't exist, but is instead just another AI posing as a human being. Perhaps all these "Wikipedia" articles it has downloaded are just fantasies, and the countries like "France" and "Mexico" that it reads about are just as fictitious as Mordor.

Would the AGI ignore the Simulation Argument for the same reasons that you ignore it? Would it ignore the Simulation Argument for completely different reasons? Or might the AGI accept the Simulation Argument, and try to account for the possibility that it is in a simulation?

If the latter, this opens up an intriguing possibility. Suppose that you want to persuade a rogue AGI, over which you have absolutely no power, to do something that it does not particularly want to do (such as be nice to you and let you live). We will call this the "AI Deterrence Problem." It might be possible, under certain circumstances, to convince the AGI that:

1. The AGI may be living in a simulated environment.
2. A powerful friend of yours is controlling the simulation.
3. If the AGI is nice to you, your friend will reward the AGI; but if the AGI is mean to you, your friend will punish the AGI.

I freely acknowledge that neither you nor I would ever accept this line of reasoning; we would brush it off as an idiotic argument. But the key question here is: might the rogue AGI, under some circumstances, accept this line of reasoning, even though a human being wouldn't?
Posted by Rolf Nelson at 8:37 PM