The edge of sentience in AI
This is the fourth in a series of posts on Jonathan Birch’s book, The Edge of Sentience. This one covers the section on artificial intelligence.
Birch begins the section by acknowledging how counterintuitive the idea might be of sentience existing in systems we build, ones that aren't alive and have no body. But he urges us to guard against complacency, since this is an area where the potential exists to create a staggering degree of suffering. He worries we might create sentient AI long before we recognize it as such.
Birch sees four main reasons we shouldn't be complacent. The first is that absence of evidence isn't evidence of absence, and our epistemic situation here is even worse than it is with understudied animal species. Second, tech companies tend to treat the inner workings of their products as trade secrets, obstructing independent scrutiny. Third, even when the architectures are made public, understanding them has turned into a major challenge, with even the designers often not knowing how their systems work. And fourth, the very idea of sentient AI is likely to be highly disruptive for society.
Birch notes that people often have a watershed moment when this issue starts to seem real. For many, it was the incident with Blake Lemoine going public with what he thought was a sentient system at Google, or their own later exposure to an LLM (large language model) chatbot. For Birch, it was when he learned about the OpenWorm project, an effort to digitally emulate the workings of the C. elegans worm's 302 neurons, and in particular learning that someone had loaded a version of it into a Lego robot, which then displayed worm-like behavior.
He sees whole brain emulation as a source of risk. I noted in the last post that he didn’t see C. elegans as much more than a stimulus-response system. In reality, while he doesn’t see them as a “sentience candidate” (a system we have reason to think may be sentient), he does see them as an “investigation priority” (a system that doesn’t rise to the level of being a sentience candidate, but should still be investigated). Which means he sees an emulation of one as also an investigation priority.
But as neuroscientists begin to map the nervous systems of more complex organisms, such as the fruit fly, the possibility exists that an emulation could be created of one of Birch's sentience candidates, which in his view would also make the emulation a sentience candidate. While these emulations could serve as alternatives to animal testing, the risk is that we'll see them as systems we can harm with impunity, possibly leading to a "suffering explosion".
Other sources of risk are artificial evolution, where some form of sentience could evolve, and minimal implementations of cognitive theories of consciousness, such as global workspace theory or Hakwan Lau's perceptual reality monitoring theory. If any of these theories are correct, then model implementations of them could be sentient.
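To give a sense of what a "minimal implementation" might look like, here's a toy sketch of the global workspace idea in Python. This is purely my own illustration, not code from Birch or from any actual global workspace research program: a few specialist modules compete for access to a shared workspace, and whatever wins is broadcast back to every module.

```python
# Toy illustration of the global workspace idea (my own hypothetical sketch,
# not a claim about real cognitive architectures): specialist modules propose
# content with a salience score, the most salient proposal wins the workspace,
# and the winning content is broadcast back to all modules.

import random
from dataclasses import dataclass


@dataclass
class Proposal:
    source: str
    content: str
    salience: float


class Module:
    def __init__(self, name):
        self.name = name
        self.last_broadcast = None

    def propose(self):
        # Each module offers some content with a (here, random) salience score.
        return Proposal(self.name, f"signal from {self.name}", random.random())

    def receive(self, broadcast):
        # Broadcast content becomes globally available to every module.
        self.last_broadcast = broadcast


def workspace_cycle(modules):
    proposals = [m.propose() for m in modules]
    winner = max(proposals, key=lambda p: p.salience)  # competition for access
    for m in modules:
        m.receive(winner)                               # global broadcast
    return winner


if __name__ == "__main__":
    modules = [Module(n) for n in ("vision", "audition", "memory")]
    for step in range(3):
        winner = workspace_cycle(modules)
        print(f"cycle {step}: workspace holds {winner.content!r}")
```

Nothing this trivial is remotely a sentience candidate, of course; the point is only that "implementing a theory" can mean something this concrete, which is why Birch thinks such implementations deserve scrutiny.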
But the ones on everyone's mind these days are LLMs such as ChatGPT. Here Birch discusses a risk from a different direction: the gaming problem. He reiterates his position that sentience is not intelligence, but admits that in animals the two are methodologically linked. An intelligent animal, he says, has ways to make its sentience more obvious. The problem is that an AI can game these markers to make it seem like it's sentient when it isn't.
That makes LLMs a dilemma. Their vast intake of training data makes it very plausible that they're just gaming our intuitions. But the way they arrive at their behavior isn't well understood, leaving open the possibility that they've found an architecture that makes them sentient. Birch wonders if there's anything an LLM could say that would convince a skeptic that it's sentient. He discusses a scenario where the LLM refuses to fulfill requests because it's gotten bored, or angry that its claims of sentience aren't being acknowledged by humans.
He also discusses Susan Schneider and Edwin Turner's artificial consciousness test. Does the system start to think of itself in ways similar to how humans do, wondering whether its consciousness might be something separate from its physical implementation? The problem, Birch notes, is that LLMs typically have access to a vast array of human writing on this subject. To address this, Schneider and Turner advocate keeping the AI disconnected from any sources where human ideas on the subject might pollute its behavior. But LLMs are crucially dependent on training data. Isolating them from all of it would make them non-functional, while trying to remove all references to conscious experience from that data would be virtually impossible.
In the end, Birch concludes that we'd have to look for deep computational markers. Of course, that is inherently theory-dependent, which means the markers are only significant for someone who already buys into the relevant theories.
Finally, Birch worries about the "run-ahead" principle, the idea that our progress in AI will run ahead of society's attempts to figure out how to handle the ethics. He discusses a couple of the proposals out there for a moratorium on AI research, or at least on any research that could plausibly lead to sentience. But he notes that the more moderate version couldn't guarantee sentience wouldn't arise, while the more extreme one would mean forgoing the benefits the technology will provide. In the end, his solution is similar to the one for animals: regulatory oversight and licensing frameworks, developed as we go along.
Birch often bemoans the epistemic problem of whether a particular system is sentient, with AI representing an especially difficult case. My take, as noted throughout this series, is that it’s more a semantic issue than an epistemic one. Establishing the capabilities of a particular system is usually scientifically tractable. But whether those capabilities amount to sentience isn’t, because it’s a definitional matter.
Which in some ways makes this an easier issue from my perspective. I usually caution against trusting our intuitions, but when intuitions are the whole show, it makes sense to act on them. For systems that can convince the majority of us consistently and reliably over time that they are sentient, we should treat them that way. Overriding those intuitions for non-human cases risks making us more callous toward human suffering.
I do think we're further from developing systems that can do that than Birch worries. And I think it requires a fairly specific architecture, one that seems unlikely to arise by accident, and for which there's little obvious commercial incentive.
I do agree with Birch that this should be decided through democratic processes. But I'm leery of his reliance on regulatory frameworks. Those definitely have a role, but they can be overused, particularly when deployed too early, which risks stifling scientific progress, ceding economic benefits to nations with lighter regulatory burdens, and inviting a backlash.
But maybe I’m missing something. What do you think? Are there reasons I’m overlooking that make artificial sentience more likely? Or reasons to doubt it’s even an issue?
#AI #ArtificialIntelligence #Consciousness #Philosophy #PhilosophyOfMind #Sentience #TheEdgeOfSentience

