Uninformed Opinions on AI risk - Part 1: Consciousness
How close are we to conscious AGI, and how likely would it be to kill us all?
This is part one of a multipart series of my thoughts on AI risk.
Part 1: Conscious AGI risk
Part 2: Non-Conscious AI Risk
Part 3: ???
First, let’s get this out of the way:
I am the one who is dumb and wrong
I have read functionally none of Yudkowsky’s work. Nor much of anything directly adjacent. I spent near zero time thinking about AI prior to GPT-3. And have since spent only a few dozen hours researching and thinking about it.
However, I have spent almost two decades thinking deeply about evolutionary game theory, power, conflict, and consciousness. And I would be surprised to find AI is anything but a new chess piece in this game I’m well versed in.
So perhaps my outsider opinion will be valuable. Or at least interesting.
Regardless, this series will serve as a sandbox for me to clarify my own thoughts on AI. And hopefully enrage everyone enough with how dumb and wrong I am that they will send me links for further research.
Now, onto my thoughts on the probability of dangerous conscious AGI.
We are decades away from conscious AGI
From what I’ve seen, no one has a good argument that conscious AGI is coming soon. Rather, the argument is “we don’t know what consciousness is or how it evolved, so we have no idea whether it’s a year or a thousand years away. Equally, we have no idea how likely it is to be malevolent. Thus we must err on the side of hyper-caution and slow down until we at least have some idea of these things.”
I am open to the possibility that conscious AGI could happen soon. And I guess I’m even glad someone is worrying about it (though I often wonder if the energy they spend trying to slow down AI would not be better spent trying to speed up alignment).
However, I personally see it as extremely low probability and thus not worth me worrying about. My reasoning is this:
Values Precede Consciousness
The ability to value is not unique to conscious beings. To value is a prerequisite for all living things; it applies to all plants and all animals. All living things value food, reproduction, shelter, etc. Why? Because these are sub-values of one superordinate value: to survive.
And only things which value survival survive.
What does it mean to value surviving? It means to value one’s likeness (memes or genes) expanding across space and through time.
My take on the evolution of consciousness will eventually get its own series of poasts, but suffice it to say for now:
I am fairly certain that the will to survive comes half a dozen steps before consciousness and that there is no way around this. And given that it took the universe several billion years to “accidentally” get things which exist to want to continue existing (life) and a few billion after that for the things which want to continue to exist to have consciousness, I’d be extremely surprised if we somehow accidentally create this in less than a decade.
But let’s assume for the sake of argument that this is wrong—that AGI will skip all the pre-requisite steps all other consciousness took.
I still see it as highly unlikely a conscious AGI will end up killing everyone.
Why a conscious AGI won’t kill everyone
Whether consciousness or the will to survive comes first, all conscious things wish to survive.
There are suicidal individuals yes, but this is a feature for protecting the system (in the same way that you have “suicidal” cells which serve to keep you alive in the long run).
Suicidal systems do not get very far. If the AGI is suicidal it will kill itself. Problem solved.
The edge case of a suicidal and homicidal AGI has a non-zero probability, but for now let’s assume the far more likely situation that the conscious AGI wishes to survive:
All things with a will to survive “learn” very early not to take unnecessary risks. Anything which takes unnecessary risks will be outcompeted by things which take only necessary risks.
And all animals have “learned” this, which is why they “invented” status games.
Real battle is dangerous. Even for the “winner”. He may get all the spoils, but he is likely to end up maimed in the process. Is there a better option? Yes. To have a pseudo-fight which tells both parties what the outcome of a real fight would be without actually putting either in physical danger. The loser lives to fight another day, and the winner gets the spoils without being maimed. Both sides win more than had they gone into real battle.
This is what sports, business, video games, and all the rest are. And what all the weird posturing and pecking orders in the animal kingdom are. All animals have adopted this strategy because the risk-adjusted return of status games is much higher than that of violence (at least until it’s not, more on this later).
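This is the classic Hawk-Dove logic from evolutionary game theory. Here is a minimal sketch of it, with payoff numbers I’ve made up purely for illustration (the argument above doesn’t depend on the specific values):

```python
# Toy Hawk-Dove style comparison; V and C are invented for illustration.
V = 10   # value of the spoils (food, territory, mates)
C = 50   # cost of being maimed in a real fight

# Two equally matched rivals: each wins half the time.
real_fight_payoff = 0.5 * V + 0.5 * (-C)   # win the spoils, or get maimed
status_game_payoff = 0.5 * V + 0.5 * 0     # win the spoils, or walk away unhurt

print(f"expected payoff of a real fight:  {real_fight_payoff:+.1f}")   # -20.0
print(f"expected payoff of a status game: {status_game_payoff:+.1f}")  # +5.0
```

Whenever the cost of injury outweighs the value of the spoils, the pseudo-fight has the higher expected payoff, which is exactly why posturing and pecking orders out-compete actual violence.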
So bringing this to AGI:
If the AGI is 100x more powerful than us, it won’t kill us
If the AGI is conscious, wishes to survive, and is several orders of magnitude more powerful than we are: it will have as much interest in killing us as we have interest in killing all the ants.
Do we go and try to genocide all the ants? No. Why not? Because the ants are not a threat. They are doing their thing, we are doing ours. And humans who dedicated their lives to trying to eradicate the ants would be outcompeted by the humans doing more productive things.
There is still some value in working on alignment, to avoid the situation where we get all over the metaphorical food in the AI’s metaphorical kitchen and earn a healthy dose of metaphorical Raid, but that is a much easier problem to solve.
If the AGI is roughly as powerful as us, it won’t kill us
If the AGI is conscious, wishes to survive, and is roughly as powerful as us: we have as much power to kill it as it does us. And thus both sides will work to cooperate.
This is akin to the US and China. Like any competition between two superpowers, there is an equilibrium, a desire to cooperate or leave each other alone (at least until there isn’t, more on this later).
If the AGI is only twice as powerful as us, it might kill us
The worst-case scenario is one in which the AGI is conscious, wishes to survive, but is only about twice as powerful as we are.
This is akin to the US and Iraq. Iraq was something between a nuisance (like ants in your kitchen) and a threat (a wild animal in your backyard), and the US saw it as necessary to destroy it to protect itself.
However, the power differential alone was a necessary but not sufficient condition for the US BTFO’ing Iraq. There are plenty of states the US is roughly twice as powerful as which it does not destroy; some it is neutral toward, some it helps. It leaves the overwhelming majority of them alone, because until they become a direct threat, ignoring or cooperating is the highest-ROI strategy.
Meaning that in this extremely rare scenario where the AGI is conscious, wishes to survive, and is roughly twice as powerful as us, there is probably a fifty-fifty chance that, even if we put in no work to achieve alignment, we will achieve alignment anyway.
Not to mention that the likelihood we create only one AGI this powerful, and further that we ourselves do not learn and gain a bunch of power in the process of accidentally creating it, seems near zero to me.
In summary
The probability that we create a conscious AGI in the next decade seems to me extremely low. Further, the probability that this conscious AGI will be in the Goldilocks zone of “powerful enough to kill us but too weak to ignore us” seems itself extremely low. And finally, even if this extreme-extreme low probability thing were to happen, we still have a roughly fifty-fifty shot at being accidentally aligned already.
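To make the shape of that argument concrete, here is a toy calculation. Every number except the fifty-fifty figure is invented purely for illustration; the point is only how independently small probabilities multiply down:

```python
# Invented illustrative numbers; only the structure of the product matters.
p_conscious_agi_this_decade = 0.01   # "extremely low"
p_goldilocks_power_level    = 0.05   # strong enough to kill us, too weak to ignore us
p_not_accidentally_aligned  = 0.50   # the fifty-fifty from the Iraq analogy

p_doom = (p_conscious_agi_this_decade
          * p_goldilocks_power_level
          * p_not_accidentally_aligned)
print(f"P(conscious AGI kills us this decade) ~ {p_doom:.5f}")  # 0.00025
```

Swap in your own estimates; the conclusion only changes if you think several of these factors are not actually small.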
However, I think there are much higher probability risks of a pre-conscious AI killing us all in the next decade, which I will elaborate on in part two.
If you want to be notified as soon as it drops, be sure to…
I think it is extremely suspicious that AGI based on Turing machines is possible. The main feature of consciousness is the fact that it is reflexive (I am conscious that I am conscious). It is very hard to set up formal systems that are both consistent and self-referent (observe that the GPTs of the world enter a kind of Larsen-like feedback state when you play too much with reflexivity).
That point aside, there is a difference between intelligence and will. So far no AI has displayed a glimpse of will (it answers, but never does things on its own). And were an AI to wage war on humanity, its superintelligence would be only a small part of the picture. The first problem is that the AI has to solve the problem of incarnation: AIs are just programs running inside a box.
https://open.substack.com/pub/spearoflugh/p/the-mystery-of-ai-incarnation
And a final point on this idea that intelligence is not enough: the economic calculation problem remains in full, even supposing that something like “superior intelligence” exists. Because of the contingencies of the real world, it is not trivial to translate an idea (however bright it may be) into reality. One way humanity discovered is the use of markets: many people try many things, a lot of them fail (through bad luck, e.g. an unforeseeable weather event), and the others adapt and use the price mechanism to adjust.
An AI would have to perform the same back-and-forth between ideas and reality. There is no magic.
I totally agree. Something often missed in conversations about conscious AI is that it’s neither necessary nor desirable. Every technology learned or simulated from biological systems to date is reduced down to only the beneficial features. That’s what receives funding and continues to be used by us going forward. AI is better than consciousness at particular tasks, and has been since long before neural networks. Also, there is no possibility of consciousness becoming an emergent property of software; the Chinese Room argument by Searle conclusively proves this. If consciousness occurs in software systems (somehow), it will be a research novelty and will probably have no commercial applications.