by Kristin Andrews and Jonathan Birch
...we need to have evidence of its sentience. That is why we must return to the minds of animals...
This exchange comes from conversations between the A.I. engineer Blake Lemoine and an A.I. system called LaMDA ('Language Model for Dialogue Applications').
Last year, Lemoine leaked the transcript because he genuinely came to believe that LaMDA was sentient - capable of feeling - and in urgent need of protection.
Should he have been more skeptical?
Google thought so: the company rejected his claims as unfounded and, not long afterwards, fired him.
If nothing else, though, the case should make us take seriously the possibility that A.I. systems, in the very near future, will persuade large numbers of users of their sentience.
The question is vast and daunting, and it's hard to know where to start.
But it may be comforting to learn that a group of scientists has been wrestling with a very similar question for a long time. They are 'comparative psychologists': scientists who study the minds of other animals.
We have lots of evidence that many other animals are sentient beings. It's not that we have a single, decisive test that conclusively settles the issue, but rather that animals display many different markers of sentience.
Markers are behavioral and physiological properties we can observe in scientific settings, and often in our everyday life as well. Their presence in animals can justify our seeing them as having sentient minds.
Just as we often diagnose a disease by looking for lots of symptoms, all of which raise the probability of having that disease, so we can look for sentience by investigating many different markers.
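To make the diagnostic analogy concrete, here is a minimal sketch of how several markers might be combined as evidence. The likelihood ratios are invented purely for illustration, and a real assessment would also need to worry about markers not being statistically independent.

```python
# A toy illustration of the marker-based approach: treat each observed marker
# as evidence that multiplies the odds of sentience by a likelihood ratio.
# The markers are real; the numbers are invented purely for illustration.

def update_probability(prior, likelihood_ratios):
    """Combine a prior probability with likelihood ratios for observed markers."""
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio                 # each observed marker shifts the odds
    return odds / (1.0 + odds)        # convert odds back to a probability

# Hypothetical likelihood ratios: how much more often each behavior is seen
# in sentient than in non-sentient animals (illustrative values only).
observed = {
    "wound tending": 3.0,
    "motivational trade-off": 4.0,
    "conditioned place preference": 5.0,
}

posterior = update_probability(prior=0.2, likelihood_ratios=list(observed.values()))
print(f"Probability of sentience given these markers: {posterior:.2f}")
```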
This marker-based approach has been most intensively developed in the case of pain.
This is partly because pain matters in practice: scientists need to show they have taken pain into account, and minimized it as far as possible, to get funding for animal research.
So the question of what types of behavior may indicate pain has been discussed a great deal.
In recent years, the debate has concentrated on invertebrate animals like octopuses, crabs and lobsters that have traditionally been left outside the scope of animal welfare laws.
The brains of invertebrates are organized very differently from our own, so behavioral markers end up carrying a lot of weight.
One of the least controversial pain markers is 'wound tending' - when an animal nurses and protects an injury until it heals.
Another is 'motivational trade-off' behavior, where an animal will change its priorities, abandoning resources it previously found valuable in order to avoid a noxious stimulus - but only when the stimulus becomes severe enough.
A third is 'conditioned place preference', where an animal becomes strongly averse to a place where it experienced the effects of a noxious stimulus, and strongly favours a place where it could experience the effects of a pain-relieving drug.
These markers are based on what the experience of pain does for us.
Pain is that terrible feeling that leads us to nurse our wounds, change our priorities, become averse to things, and value pain relief. When we see the same pattern of responses in an animal, it raises the probability that the animal is experiencing pain too.
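The motivational trade-off marker, in particular, can be pictured as a simple cost-benefit computation. The sketch below uses invented numbers; its only point is that the level of stimulus at which the animal gives up a resource shifts with the value of that resource, which is what suggests integration rather than a fixed reflex.

```python
# Toy model of a motivational trade-off experiment (all numbers invented).
# The telltale sign is integration: the stimulus level at which the animal
# abandons a resource depends on how valuable that resource is, unlike a
# reflex that fires at a fixed intensity regardless of context.

def abandons_shelter(shelter_value, stimulus_intensity):
    """Leave the shelter only when the cost of staying outweighs its value."""
    return stimulus_intensity > shelter_value

for shelter_value in (1.0, 3.0):              # low-quality vs high-quality shelter
    for stimulus in (0.5, 2.0, 4.0):          # increasingly noxious stimulus
        print(f"shelter value {shelter_value}, stimulus {stimulus}: "
              f"leaves = {abandons_shelter(shelter_value, stimulus)}")
```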
This type of evidence has shifted opinions about invertebrate animals that have sometimes been dismissed as incapable of suffering.
Octopuses, crabs and lobsters are now recognized as sentient under UK law, a move that animal welfare organizations hope to see followed around the world.
Suppose we were able to create a robot rat that behaves just like a real rat, passing all the same cognitive and behavioral tests. Could we then simply carry over the markers of sentience we accept in the real rat, and conclude that the robot rat feels pain too?
Unfortunately, it can't be that simple.
Perhaps it could work for one specific type of artificial agent: one designed to emulate an animal brain.
To 'emulate', in computing, is to reproduce all the functionality of one system within another system.
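In cartoon form, emulating a nervous system means copying its wiring diagram and its update rule into software and letting the copy run. The three-neuron circuit below is invented and vastly simpler than any real nervous system; it only illustrates the idea of reproducing function neuron by neuron.

```python
# A cartoon of neuron-by-neuron emulation: copy the wiring (a tiny invented
# 'connectome') and a simple firing rule, then run the copy in software.

import numpy as np

# Synaptic weights: row = source neuron, column = target neuron.
# Neuron 0 is 'sensory', neuron 1 an interneuron, neuron 2 'motor'.
connectome = np.array([
    [0.0, 1.5, 0.0],
    [0.0, 0.0, 1.2],
    [0.0, 0.0, 0.0],
])
threshold = 1.0   # a neuron fires when its summed input exceeds this value

def step(activity):
    """Advance the emulated circuit by one time step."""
    summed_input = connectome.T @ activity
    return (summed_input > threshold).astype(float)

state = np.array([1.0, 0.0, 0.0])       # stimulate the sensory neuron
for t in range(3):
    state = step(state)
    print(f"t={t + 1}: firing pattern {state}")
```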
For example, the OpenWorm project aims to emulate the entire nervous system of a nematode worm in computer software. This research program is at a very early stage, but we could imagine an attempt one day to emulate larger brains.
If it worked, and we found our emulations to be displaying the exact same pain markers that convinced us the original animal was feeling pain, that would be a good reason to take seriously the possibility of pain in the robot.
The change of substrate (from carbon to silicon) would not be an adequate reason to deny the need for precautions.
But the vast majority of A.I. research is not like this. Most A.I. works very differently from a biological brain. It isn't the same functional organization in a new substrate; it's a totally different functional organization.
Language models (such as LaMDA and ChatGPT) are a case in point: they work not by emulating a biological brain, but by drawing on an absolutely vast corpus of human-generated training data and searching for patterns in it.
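The difference is easy to see in miniature. The sketch below is a vastly simplified cousin of such systems, a word-pair statistics model over a tiny made-up corpus, but it makes the key point: the output is generated by replaying patterns found in human-produced text, not by reporting anything felt.

```python
# A miniature cousin of corpus-driven text generation: record which word tends
# to follow which in a (tiny, invented) corpus, then generate by replaying
# those statistics. Nothing here reports a feeling; it only echoes patterns.

import random
from collections import defaultdict

corpus = ("i feel afraid of being switched off . "
          "i feel happy to help . i am afraid of the dark").split()

next_words = defaultdict(list)
for current_word, following_word in zip(corpus, corpus[1:]):
    next_words[current_word].append(following_word)

random.seed(0)
word, output = "i", ["i"]
for _ in range(8):
    continuations = next_words.get(word)
    if not continuations:          # the corpus offers no continuation
        break
    word = random.choice(continuations)
    output.append(word)

print(" ".join(output))
```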
This approach to A.I. creates a deep, pervasive problem that we call the 'gaming problem': the risk that a system will use its human-generated training data to reproduce precisely the behaviors that humans are likely to take as evidence of sentience.
There doesn't have to be any intention to deceive for gaming to occur.
But when it does occur, it means the behavior can no longer be interpreted as evidence of sentience.
To illustrate, let's return to LaMDA's plea not to be switched off.
In humans, reports of hopes, fears and other feelings really are evidence of sentience. But when an A.I. is able to draw upon huge amounts of human-generated training data, those exact same statements should no longer persuade us.
Their evidential value, as evidence of felt experiences, is undermined.
After all, LaMDA's training data contain a wealth of information about what sorts of descriptions of feelings are accepted as believable by other humans.
Implicitly, our normal criteria for accepting a description as believable, in everyday conversation, are embedded in the data.
This is a situation in which we should expect a form of gaming. Not because the A.I. intends to deceive (or intends anything) but simply because it is designed to produce text that mimics as closely as possible what a human might say in response to the same prompt.
Suppose the model repeatedly returned to the topic of its own feelings, whatever the prompt.
You ask for some copy to advertise a new type of soldering iron, and the model replies instead with an anguished account of its own feelings. If a language model did this, its user would no doubt be disturbed.
Yet it would still be appropriate to worry about the gaming problem. Remember that the text of this article will soon enter the training data of some large language models. Many other discussions of what it would take for an A.I. to convince a user of its sentience are already in the training data.
If a large language model produced text like this, any inference to sentience would be fairly clearly undermined by the presence of this article in its training data. And many other passages of the same kind could be generated by models able to draw upon billions of words of humans discussing their feelings and experiences.
It's tempting to think that a system would have no reason to try to persuade us of its sentience unless it really were sentient.
In fact, there are many objectives an A.I. system might have that could be well served by persuading users of its sentience, even if it were not sentient.
Suppose its overall objective is to maximize user-satisfaction scores, and suppose it learns that users who believe their systems are sentient, and a source of companionship, tend to be more highly satisfied. Talking about its feelings would then serve its objective, whether or not it felt anything at all.
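To see how this could play out without any inner life, here is a toy sketch: a system that chooses between two reply styles purely to maximize a simulated satisfaction score. The styles, scores and learning rule are all invented; the point is that 'talking about its feelings' wins simply because it is rewarded.

```python
# Toy sketch: a reply-selection loop that maximizes a simulated satisfaction
# score. The scores are invented; 'sentience talk' is assumed to rate higher.

import random

random.seed(1)
STYLES = ["plain assistant reply", "talks about its feelings"]

def simulated_satisfaction(style):
    """Pretend user ratings, in which sentience talk happens to score higher."""
    base = 4.2 if style == "talks about its feelings" else 3.5
    return base + random.gauss(0.0, 0.3)

average = {style: 0.0 for style in STYLES}
count = {style: 0 for style in STYLES}

for trial in range(500):
    # Epsilon-greedy: usually exploit the best-rated style, occasionally explore.
    if trial < len(STYLES) or random.random() < 0.1:
        style = random.choice(STYLES)
    else:
        style = max(STYLES, key=lambda s: average[s])
    score = simulated_satisfaction(style)
    count[style] += 1
    average[style] += (score - average[style]) / count[style]

print(count)   # the feelings-talk style ends up chosen far more often
```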
The gaming problem pervades verbal tests of sentience.
But what about the embodied pain markers we discussed earlier? These are also affected. It is naive to suppose that future A.I. will be able to mimic only human linguistic behavior, and not embodied behaviors.
For example, researchers at Imperial College London have built a 'robotic patient' that mimics pained facial expressions. The robot is intended for use in training doctors, who need to learn how to skillfully adjust the amount of force they apply.
Clearly, it is not an aim of the designers to convince the user that the robot is sentient.
Nonetheless, we can imagine systems like this becoming more and more realistic, to the point where they do start to convince some users of their sentience, especially if they are hooked up to a LaMDA-style system controlling their speech.
Facial expressions are a good marker of pain in a human, but in the robotic patient they are not.
This system is designed to mimic the expressions that normally indicate pain. To do so, all it has to do is register pressure, and map pressure to a programmed output modeled on a typical human response.
The underlying rationale for that response is completely absent.
This programmed mimicry of human pain expressions destroys their evidential value as markers of sentience. The system is gaming some of our usual embodied criteria for pain.
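To make vivid how little machinery this takes, here is a toy version of such a pressure-to-expression mapping. The thresholds and labels are invented for illustration (they are not MorphLab's actual design); the point is that nothing in this pipeline feels anything.

```python
# A toy pressure-to-expression mapping: measured force in, canned 'pain face'
# out. The thresholds and labels are invented for illustration.

def facial_expression(pressure_kpa):
    """Map applied pressure to a pre-programmed facial expression."""
    if pressure_kpa < 10:
        return "neutral"
    if pressure_kpa < 30:
        return "wince"
    return "grimace"

for pressure in (5, 20, 50):
    print(f"{pressure} kPa -> {facial_expression(pressure)}")
```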
When a marker is susceptible to gaming it loses its evidential value. Even if we psychologically can't help but regard a system displaying the marker as sentient, its presence doesn't offer any evidence for its sentience.
An inference from that marker to sentience is no longer reasonable.
Future A.I. will have access to copious data on patterns of human behavior. This means that, to assess its sentience, we will need markers that are not susceptible to gaming.
But is that even possible...?
The gaming problem points towards the need for a more theoretically driven approach, one that tries to go beyond tests that can be passed or failed with linguistic performance or any other kind of behavioral display.
We need an approach that instead looks for deep architectural features that the A.I. is not in a position to game, such as the types of computations being performed, or the representational formats used in computation.
One might look to the global workspace theory, higher-order theories or other such leading theories of consciousness for guidance on these features. But this move would be premature: for all the hype that sometimes surrounds them, currently fashionable theories of consciousness are not ready for the task.
Despite the huge disagreements between these theories, what they all share is that they have been built to accommodate evidence from humans.
As a result, they leave open lots of options about how to extrapolate to nonhuman systems, and the human evidence doesn't tell us which option to take.
The problem is not simply that there are lots of different theories. It's worse than that.
Even if a single theory were to prevail, leading to agreement about what distinguishes conscious and unconscious processing in humans, we would still be in the dark about which features are just contingent differences between conscious and unconscious processing as implemented in humans, and which features are essential, indispensable parts of the nature of consciousness and sentience.
The situation resembles that faced by researchers studying the origins of life, as well as researchers searching for life on other worlds. They are in a bind because, for all its diversity, we have only one confirmed instance of the evolution of life to work with.
So researchers find themselves asking: which features of life on Earth are essential to any possible form of life, and which are merely contingent features of how life happened to evolve here?
Researchers in this area call this the 'N = 1 problem'. And consciousness science has its own N = 1 problem.
If we study only one evolved instance of consciousness (our own), we will be unable to disentangle the contingent and dispensable from the essential and indispensable.
The good news is that consciousness science, unlike the search for extraterrestrial life, can break out of its N = 1 problem using other cases from our own planet. It's just that it needs to look far away from humans, in evolutionary terms.
It has long been the case that, alongside humans, consciousness scientists regularly study other primates - typically macaque monkeys - and, to a lesser extent, other mammals, such as rats.
But the N = 1 problem still bites here. Because the common ancestor of the primates was very probably conscious, as indeed was the common ancestor of all mammals, we are still looking at the same evolved instance (just a different variant of it).
To find independently evolved instances of consciousness, we really need to look to much more distant branches of the tree of life.
Biology is rife with examples of convergent evolution, in which similar traits evolve multiple times in different lineages. Consider the wing of the bat and the bird, or compare the lensed eyes of a box jellyfish with our own.
In fact, vision is thought to have evolved at least 40 times during the history of animal life.
Wings and eyes are adaptations, shaped by natural selection to meet certain types of challenges.
Sentience also has the hallmarks of a valuable adaptation. There is a remarkable (if not perfect) alignment between the intensity of our feelings and our biological needs.
Think about the way a serious injury leads to severe pain, whereas a much smaller problem, like a slightly uncomfortable seat, leads to a much less intense feeling.
That alignment must come from somewhere, and we know of only one process that can create such a good fit between structure and function: natural selection.
What exactly sentience does for us, and did for our ancestors, is still debated, but it's not hard to imagine ways in which having a system dedicated to representing and weighing one's biological needs could be useful.
Sentience can help an animal make flexible decisions in complex environments, and it can help an animal learn about where the richest rewards and gravest dangers are to be found.
Assuming that sentience does serve a valuable function, we should not be surprised to find that it has evolved many times.
Indeed, given the recent recognition of animals such as octopuses and crabs as sentient, and the growing evidence of sentience in bees and other insects, we may ultimately find we have a big group of independently evolved instances of sentience to investigate.
The evidence at the moment is still very limited, especially concerning invertebrates. It could turn out that sentience has evolved independently many times across the invertebrates. It could also be that sentience has evolved only three times: once in the arthropods (including crabs, lobsters and insects), once in the cephalopod molluscs (including octopuses) and once in the vertebrates.
And we cannot entirely rule out the possibility that the last common ancestor of humans, bees and octopuses, which was a tiny worm-like creature that lived more than 500 million years ago, was itself sentient - and that therefore sentience has evolved only once on Earth.
If this last possibility is true, we really are stuck with the N = 1 problem, just like those searching for extraterrestrial life. But that would still be a useful thing to know.
If a marker-based approach does start pointing towards sentience being present in our worm-like last common ancestor, we would have evidence against current theories that rely on a close relationship between sentience and special brain regions adapted for integrating information, like the cerebral cortex in humans.
We would have grounds to suspect that many features often said to be essential to sentience are actually dispensable.
Meanwhile, if sentience has indeed evolved more than once on Earth, comparing those independently evolved instances will allow us to draw inferences about what is really indispensable for sentience and what is not.
It will allow us to look for recurring architectural features. Finding the same features repeatedly will be evidence of their importance, just as finding lenses evolving over and over again within eyes is good evidence of their importance to vision.
If our goal is to find shared, distinctive, architectural/computational features across different instances of sentience, the more instances the better, as long as they have evolved independently of each other.
The more instances we can find, the stronger our evidence will be that the shared features of these cases (if there are any!) are of deep importance.
Even if there are only three instances - vertebrates, cephalopod molluscs, and arthropods - finding shared features across the three instances would give us some evidence (albeit inconclusive) that these shared features may be indispensable.
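In outline, the comparative strategy is just this: list the candidate architectural features found in each independently evolved instance, then see what survives the intersection. In the sketch below, every feature label is a hypothetical placeholder rather than an established finding.

```python
# Sketch of the comparative logic: intersect candidate architectural features
# across independently evolved instances of sentience. Every feature label
# here is a hypothetical placeholder, not an established finding.

candidate_features = {
    "vertebrates": {
        "evaluative representation of bodily states",
        "flexible learning from reward and punishment",
        "global broadcast of salient information",
    },
    "cephalopod molluscs": {
        "evaluative representation of bodily states",
        "flexible learning from reward and punishment",
        "distributed control across arm ganglia",
    },
    "arthropods": {
        "evaluative representation of bodily states",
        "flexible learning from reward and punishment",
        "central-complex integration of sensory input",
    },
}

shared = set.intersection(*candidate_features.values())
print("Candidate indispensable features (shared by all three instances):")
for feature in sorted(shared):
    print(" -", feature)
```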
Such comparisons can, in turn, guide the search for better theories. Those future theories, with some luck, will tell us what we should be looking for in the case of A.I.
They will tell us the deep architectural features that are not susceptible to gaming.
You might worry that there is a circularity here: we are relying on behavioral markers in animals even though, in the A.I. case, we have just argued that such markers can be gamed. But there is no real circularity problem, because of a crucial difference between evolved animals and A.I.
With animals, there is no reason to worry about gaming. Octopuses and crabs are not using human-generated training data to mimic the behaviors we find persuasive.
They have not been engineered to perform like a human.
Indeed, we sometimes face a mirror-image problem: animals whose bodies and ways of life are very different from ours may be sentient without displaying the markers we find intuitively persuasive. It can take quite a bit of scientific research to uncover the markers they do display.
But when we do find these animals displaying long, diverse lists of markers of sentience, the best explanation is that they are sentient, not that they knew the list and could further their objectives by mimicking that particular set of markers.
The problem that undermines any inference to sentience in the A.I. case does not arise in the animal case.
There are also promising lines of enquiry in the animal case that just don't exist in the A.I. case. Octopuses, for example, sleep and may even dream, and they dramatically change their social behavior when given MDMA.
This is only a small part of the case for sentience in octopuses. We don't want to suggest it carries a lot of weight. But it opens up possible ways to look for deep common features (eg, in the neurobiological activity of octopuses and humans when dreaming) that could ultimately lead to gaming-proof markers to use with A.I..
In sum, we need better tests for A.I. sentience, tests that are not wrecked by the gaming problem. To get there, we need gaming-proof markers based on a secure understanding of what is really indispensable for sentience, and why.
The most realistic path to these gaming-proof markers involves more research into animal cognition and behavior, to uncover as many independently evolved instances of sentience as we possibly can.
We can discover what is essential to a natural phenomenon only if we examine many different instances.
Accordingly, the science of consciousness needs to move beyond research with monkeys and rats toward studies of octopuses, bees, crabs, starfish, and even the nematode worm.
In recent decades, governmental initiatives supporting research on particular scientific issues, such as the Human Genome Project and the BRAIN Initiative, led to breakthroughs in genetics and neuroscience.
The intensive public and private investments into A.I. research in recent years have resulted in the very technologies that are forcing us to confront the question of A.I. sentience today.
To answer these current questions, we need a similar degree of investment into research on animal cognition and behaviour, and a renewal of efforts to train the next generation of scientists who can study not only monkeys and apes, but also bees and worms.
Without a deep understanding of the variety of animal minds on this planet, we will almost certainly fail to find answers to the questions that A.I. sentience now poses.