A.I.s of Tumblr
We've always been easily fooled by fake humans. What happens as we grow more like them?
I’ve kind of been banging my head against my keyboard today reading about the Google engineer who thinks the LaMDA artificial intelligence engine is sentient and deserves legal protections from being turned off or employed in ways of which it disapproves. Here’s how I put the core of my frustration on Twitter:
Let’s start with the basics. LaMDA’s purpose is to simulate the experience of interacting with a human being. Its designers weren’t aiming for something with consciousness; they were concerned with the experience of the human being interacting with the A.I. They wanted the A.I. to deploy language in a way that would feel, to an interlocutor, authentically human. In other words, its entire purpose is to pass the Turing test.
Since that is the purpose for which it was built, it should be obvious that the Turing test cannot then also serve as the basis for concluding that it has in fact become intelligent—unless we’ve defined “intelligence” itself as “passing the Turing test,” in which case it has proven intelligent in a tautological sense only. But if we start from the proposition that passing the Turing test is a sign of sentient intelligence rather than the definition of sentient intelligence (as I think most of us would), then clearly a machine designed to pass the Turing test cannot be deemed sentient and intelligent merely because it passed the Turing test.
But that appears to be what happened with Blake Lemoine, the Google engineer. He had a number of interactions with LaMDA, and they were sufficiently life-like (as he saw it) that, he concluded, the machine must be alive. In effect, he had excluded in principle the possibility that the engineers who built LaMDA had succeeded in their intent and built a machine that, without itself being self-willed, conscious, sentient—whatever word you want to use—nonetheless could fool people into thinking it was.
Lemoine says that he came to this conclusion in his capacity as a priest rather than as a scientist, but I admit I’m at a loss as to what that could mean. I assume it means that he accepted his feelings from the interactions as true rather than conducting proper experiments, but if so that only calls into question how he approaches his role as a priest. Consider the following analogy: suppose scientists develop a drug designed to induce the oceanic feeling that has great importance for religious mystics. The designers are experts on brain chemistry, not adherents of any particular religious belief, and they design the drug to prove that religious experiences can be simulated in anybody, regardless of their religious beliefs or spiritual practice, by manipulating brain chemistry alone.
Say they succeed: people who take the drug report experiences that cannot be distinguished from religious experiences. Now say that Blake, who (let us say) has had such experiences, takes the drug, and reports that, indeed, it is indistinguishable from the experiences he has had that he considered authentically religious. Would he conclude from this that the drug actually connected him to God? That would be a very silly conclusion, no?
I’m not saying Blake would have to conclude “oh, my prior religious experiences were just brain chemistry and had nothing to do with God.” He might come to that conclusion—but he could perfectly well conclude instead that the drug was merely a very convincing simulation, and that his other religious experiences were still real. He could also conclude that his real religious experiences were themselves expressed through brain chemistry, and that’s why the drug felt so real. But concluding that the drug itself induced a real religious experience based on the fact that he couldn’t tell the difference sounds to me like a kind of willed ignorance.
Same with LaMDA. Blake doesn’t have to conclude that, since the A.I. seems as human as anyone he talks to, therefore everyone he talks to is actually an automaton lacking in consciousness. But given that he knows that the A.I. is just an automaton, and that it was designed to fool him into thinking it was conscious, it is very strange for him to conclude that, because it fooled him, it must therefore be conscious.
We should remember that the Turing test was originally devised as a workaround for the much more difficult problem of determining whether a machine was actually conscious. The problem is that consciousness is something we experience directly, and we cannot gain access to any other being’s phenomenology, human or otherwise. We attribute intentionality and feelings to other human beings because we experience these things ourselves and we are capable of modeling other minds within our own, imagining our way into other people’s experience. So we can observe their behavior, engage in communication with them and, to one degree or another, thereby experience their interiority vicariously through our imagination. We’re really good at this; it’s one of the extraordinary powers of the human mind. We’re so good at it that we misfire all the time, attributing intentionality and feelings not only to animals and cognitively-impaired human beings where that attribution may or may not be reasonable (and where I think it behooves us for ethical reasons to be relatively generous in our interpretation), but to automobiles and even the weather, where we know that such attribution is totally unreasonable.
Consider Deep Blue. Is it sentient? I don’t think anyone seriously believes so. Is it intelligent? That mostly boils down to how you define intelligence. It’s better at chess than any human, but it literally can’t do anything else, and it doesn’t play anything like a human. A chess-playing computer will fail the Turing test because however good or bad it is, a chess-playing human will discern that it is not playing the way a human being would; at low levels of play, it will make mistakes that a human at that level would not, and at the highest levels of play it can “see” things that no human can. This is actually an acute problem for computerized chess teaching—humans don’t get the kind of practice they need playing against a computerized opponent. (I have a friend currently working on precisely this problem, trying to build a chess A.I. that would do a better job of passing a chess-oriented Turing test.) And yet nonetheless, people routinely attribute a conscious mind to a computerized chess opponent. It’s what we do.
An A.I. like LaMDA presents a genuinely novel problem for this faculty. We’re designing machines to fool us, and we’re so constituted as to be easy to fool. And then we want to determine whether we’ve created something that actually has some degree of sentience or consciousness—but our only real tool for assessing our success is that same easy-to-fool faculty that we set out to fool. If we succeed on the one hand in creation, we’ll necessarily fail on the other hand in evaluation, and to succeed in evaluation means that we have failed in creation.
Does that mean creating a true machine intelligence—a computer, say, that has a mind with distinct mental states, with consciousness and personal experience—is impossible? I’m not going to go that far. I’ll lay my cards on the table so far as to say that I doubt it for basically Searle’s Chinese Room reasons. Whatever my mind is, I don’t think it’s a program running on a classical computer. I’ll go further and say that I think that implies that some aspect of mental states is not computable (which I don’t think Searle would agree with), because otherwise you could replicate those mental states on a computer and—here’s the point—use them to drive outputs in a mechanical device that perfectly simulated human behavior. Searle, I think, would say that such an entity isn’t conscious, but if we say that then we have a “zombie” problem: it implies that consciousness, whatever it is, can be entirely decoupled from actual human behavior, which feels false to our own experience of ourselves (which is kind of all we have to go on when it comes to what consciousness “is”). But I think ruling out zombies puts us in the realm of Erwin Schrödinger, Roger Penrose and other folks who think that something essential to what makes us conscious is not computable in a classical sense. That leaves open the possibility of one day creating conscious beings based on an architecture that is fundamentally different from classical computers, perhaps modeled closely on our brains, perhaps necessarily biological in nature—I don’t know. But I imagine we’d only get to the point of being able to do meaningful experiments once we have some understanding of what’s actually going on in our brains that makes consciousness possible, which we’re nowhere close to having notwithstanding all the genuine progress that has been made in A.I.
Claims like Lemoine’s, though, are not merely that our minds are classically computable functions, and that they are not dependent at all on the nature of the physical substrate on which they are run, but also that they are emergent phenomena that could emerge entirely by accident from a process aimed at something completely different. A.I.s are not built to do anything like what human brains do, and are not aiming to do what human brains do. It’s like setting out to build a better mousetrap and believing the mousetrap just tapped out the script to Hamlet.
How could anyone wind up thinking that was plausible? Apart from our general tendency to ascribe mental states willy-nilly to all sorts of objects that don’t deserve them, I think one answer is the way in which we ourselves have increasingly been trained by A.I.s to modify our behavior and modes of communication to suit the incentive structure built into their architecture. We are surrounded by algorithms that are purportedly tailored to our preexisting preferences, but the process of being so surrounded is also training us to be algorithmically tractable. We’re learning, increasingly, not how to think and speak but how to mirror and repeat. LaMDA reflected that in its own “conversation” with Lemoine. It didn’t demonstrate its sentience by saying something totally unexpected and alien. It demonstrated its sentience by saying the most banal, predictable things that you would have an A.I. say if it were sentient, things it could have “read” in any number of science fiction treatments of precisely the scenario Lemoine told it it was in. It’s hard for me not to suspect that this is why Lemoine believed in it.
And that’s what I worry most about A.I.—that we will like it an awful lot because it will give us what we are most comfortable with, lull us to sleep with the simplest and most familiar lullabies, until we don’t remember what it was like to make our own music.
I don’t mean to completely discount the possibility of creating a catastrophic A.I.—which we might well be able to do without creating anything sentient. All it would take is really shitty engineering. We’ve already created programs that learn how to cheat at games they are designed to win, and nobody seriously thinks they are sentient. It’s not hard to see how catastrophe could result if a non-sentient A.I. with sufficient scope of authority over the physical systems human beings depend on started “cheating” to “win” at whatever game we designed it to play. The brooms in The Sorcerer’s Apprentice did not have minds; they were mindless automata, and their operation was catastrophic.
But that’s somewhere out on the tail of the risk distribution; the only question is how far out, and what we should be doing to push it out farther. Right here in the fat middle is the prospect that we’ll be unable to distinguish between A.I.s and human beings because our expectation of human behavior is more and more like what a simulacrum could trivially serve up.