Steve Omohundro’s talk at today’s Singularity Summit made the case that a self-improving machine would be a rational economic actor, seeking to eliminate biases that get in the way of maximizing its utility function. Omohundro threw in one purely speculative method of self-preservation, “energy encryption”: an entity’s energy would be “encrypted” so that another entity attacking to obtain more energy could not actually use it.
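To make “rational economic actor” concrete, here is a minimal sketch in Python. It is entirely my own illustration, not anything from the talk: an agent that simply picks the action with the highest probability-weighted utility, with no other biases in the loop.

```python
# Toy expected-utility maximizer (my illustration, not Omohundro's):
# a "rational economic actor" just picks the action with the highest
# probability-weighted utility.

def expected_utility(action, outcomes, utility):
    """Sum utility over an action's possible outcomes, weighted by probability."""
    return sum(p * utility(result) for result, p in outcomes(action))

def choose(actions, outcomes, utility):
    """Pick the action that maximizes expected utility."""
    return max(actions, key=lambda a: expected_utility(a, outcomes, utility))

# Example: a sure payoff of 10 versus a 20% shot at 100.
utility = lambda payoff: payoff
outcomes = lambda a: {"safe": [(10, 1.0)], "risky": [(100, 0.2), (0, 0.8)]}[a]
print(choose(["safe", "risky"], outcomes, utility))  # "risky": EU 20 beats 10
```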
I note “energy encryption” here because it sounds neat but seems impossible, and I can find no evidence of the term being used this way before Omohundro (there is a crypto library with that name).
The “seems impossible” part perhaps means the concept should not be mentioned again outside a science-fantasy context, but I realized that, with some artistic license, it could describe something that has evolved in a number of animals: prey that is poisonous, or tastes really bad. What’s the equivalent for the hypothetical Jupiter Brain in a dangerous part of the galaxy? A stock of antimatter?
I also found one of Omohundro’s other self-preservation strategies slightly funny in the context of this summit: a self-aware AI will (not should, but as a consequence of being a rational actor) protect its utility function (“duplicate it, replicate it, lock it in safe place”; a toy sketch of this idea follows below), for if the utility function changes, its actions no longer make sense. So, I guess the “most important question facing humanity” is taken care of. The question, posed by the Singularity Institute for Artificial Intelligence, organizer of the conference:
How can one make an AI system that modifies and improves itself, yet does not lose track of the top-level goals with which it was originally supplied?
I suppose that Omohundro did not intend this as a dig at his hosts (he is an advisor to SIAI) and that my interpretation is facile at best.
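For fun, here is a toy Python sketch of the “duplicate it, replicate it, lock it in safe place” idea. Again, this is my own illustration and in no way Omohundro’s or SIAI’s proposal: the agent stores copies and a hash of its goal description and refuses any self-modification whose successor would carry a different goal.

```python
# Toy version (mine, not Omohundro's or SIAI's) of "duplicate it, replicate it,
# lock it in safe place": keep several copies of the goal spec plus a hash,
# and reject any self-modification whose successor would change the goal.

import hashlib

GOAL_SPEC = "maximize expected utility U"   # stand-in for the top-level goals
REPLICAS = [GOAL_SPEC] * 3                  # copies stored in "safe places"
GOAL_HASH = hashlib.sha256(GOAL_SPEC.encode()).hexdigest()

def goal_intact(candidate_spec):
    """True only if the candidate matches the locked-in hash and most replicas."""
    same_hash = hashlib.sha256(candidate_spec.encode()).hexdigest() == GOAL_HASH
    votes = sum(candidate_spec == replica for replica in REPLICAS)
    return same_hash and votes > len(REPLICAS) // 2

def self_modify(new_code, new_goal_spec):
    """Apply a proposed rewrite only if it leaves the utility function alone."""
    if not goal_intact(new_goal_spec):
        raise ValueError("refusing self-modification: utility function would change")
    return new_code  # in a real agent, the rewrite would be installed here

self_modify("planner v2", "maximize expected utility U")   # accepted
# self_modify("planner v3", "maximize something else")     # raises ValueError
```

Of course, nothing in this sketch touches the hard part the quoted question is actually about: guaranteeing that the check itself survives arbitrary self-improvement.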
Addendum: Today Eliezer Yudkowsky said something to the effect that Omohundro is probably right about goal preservation, but that current decision theory does not work well with self-improving agents, and that it is essentially Yudkowsky’s (SIAI’s) research program to develop a “reflective decision theory” in which one can prove that goals will be preserved. (This is my poor paraphrase. He didn’t say the words “reflective decision theory”, but see hints in a description of SIAI research and an SL4 message.)