By Emma Large.
Curb the indignation just for a moment. Brian Tomasik wants to spark a conversation about the ethical duty he contends we have to “reduce the harm that we inflict on powerless, voiceless [AI] agents,” and I think we should briefly listen to him (very…very briefly).
Initial reactions to sentiments along the lines of Tomasik’s declaration often involve laughter. A spluttering of exasperation. Mockery. Very often, a rolling of the eyes; sometimes, as in my case, defensive outrage: how can anyone seriously suggest we bring AI into the scope of our moral consideration, as if there isn’t enough genuine suffering already?
However, history shows a pattern of Cause-Xs: ethical areas that a current generation is blind to or critically overlooks in a way that later seems incomprehensible. It is unthinkable to us now, for example, that anyone could ever have thought slavery a morally acceptable practice to engage in. I don’t mean to suggest that AI is a Cause-X area or in any way comparable; but I do think Cause-Xs show us that we can’t immediately laugh away subjects as ‘obviously’ undeserving of ethical consideration, without further thought. So I think we should bear with Tomasik and PETRL (‘People for the Ethical Treatment of Reinforcement Learners’ – Google it, it’s real) just momentarily.
I’m nothing close to a computer scientist but I can attempt an amateur explanation of Tomasik’s general premises.1 Some AI agents (in lift buttons; in ChatGPT; in the behaviours of NPCs in video games) are trained to accomplish set tasks using a technique called reinforcement learning (RL), a method borrowed from biological neuroscience. Agents are set a task and receive a ‘reward’ whenever they achieve the desired state. When they fail to achieve the desired state, they receive a ‘punishment’. This seems familiar – don’t we often teach human children and animals in the same way? Tomasik argues the various cases of the agent receiving a reward or punishment can be identified with very rudimentary, extremely minimal states of cognitive pleasure and pain. These algorithms might, then, have the capacity to suffer. It is a fairly prevalent thought that we should try to prevent suffering if we can. Tomasik thereby presents his case that RL algorithms should be assigned a non-zero level of moral value (infinitesimally small, but not zero); in fact, he equates the moral value of one laptop’s combined RL algorithms to nearly that of an ant.
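For readers who want to see what ‘reward’ and ‘punishment’ actually amount to, here is a minimal sketch in Python. It is not Tomasik’s code or any real system’s; the task, the numbers and the names are invented for illustration. The point is simply that a ‘punishment’ is nothing more than a negative number fed back into the agent’s value estimates.

```python
import random

# A toy agent with two possible actions. It learns which to prefer purely
# from numerical 'reward' (+1) and 'punishment' (-1) signals.
# All task details and numbers here are invented for illustration.
values = {"press": 0.0, "wait": 0.0}   # the agent's running estimate of each action's worth
learning_rate = 0.1

def desired_state_reached(action):
    # Hypothetical task: 'press' is the behaviour we want to reinforce.
    return action == "press"

for step in range(500):
    # Mostly pick the best-looking action so far; occasionally explore at random.
    if random.random() < 0.1:
        action = random.choice(list(values))
    else:
        action = max(values, key=values.get)

    signal = 1.0 if desired_state_reached(action) else -1.0   # 'punishment' is just a negative number
    # Nudge the estimate for the chosen action towards the signal just received.
    values[action] += learning_rate * (signal - values[action])

print(values)   # after training, 'press' is valued far above 'wait'
```

Whether that settling of numbers deserves the word ‘punishment’ at all is, of course, exactly what is in dispute.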
How is an artificial ‘punishment’ like biological pain?
Tomasik employs many complex parallels between neurological states and reinforcement-learning states to support the plausibility of his claim that AI possesses some minimal sentience. These are too intricate for me to explicate or do justice to here. He does, however, employ three empirical criteria for identifying whether something is in a state of pain, criteria which he contends reinforcement-learning algorithms often exhibit when in ‘punishment’ states:
- Not wanting anymore. Reinforcement-learning algorithms sometimes choose to enter terminating states (they will turn themselves off) sooner rather than later. This seems strikingly similar to the way we choose not to extend our painful experiences; behaviour like this implies that perhaps the algorithm was having or was anticipating net negative experiences.
- Avoiding rather than seeking. If moving across a grid of high-reward and low-reward squares, for example, an RL agent will avoid the low-reward squares. Whether this is high-reward seeking or low-reward avoidance is contentious; but ultimately, does it really matter? Both suggest the agent has a preferred and a non-preferred state of being that can be paralleled to sentience. (A toy sketch of this avoidance behaviour follows this list.)
- Self-evaluation. Intelligent RL agents can sometimes ask us to stop running an algorithm or to turn them off, indicating they are having negative experiences. Sometimes they can literally tell us they are in ‘pain’ (if they are intelligent enough to understand the human concept of pain).
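To make the avoidance criterion concrete, here is a toy gridworld, again in Python and again entirely my own invention (the grid layout, reward numbers and hyperparameters are assumptions for illustration, not anything drawn from Tomasik or PETRL). A small Q-learning agent is trained on a 3x3 grid containing one ‘punished’ square; after training, its greedy route to the goal detours around that square rather than through it.

```python
import random

# A 3x3 gridworld: start at (0, 0), goal at (2, 2) worth +10,
# and one 'punished' square at (1, 1) worth -5 per visit.
# All of these numbers are made up for illustration.
GOAL, BAD = (2, 2), (1, 1)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
Q = {((r, c), a): 0.0 for r in range(3) for c in range(3) for a in ACTIONS}
alpha, gamma, epsilon = 0.2, 0.9, 0.2   # learning rate, discount, exploration rate

def step(state, action):
    r, c = state
    dr, dc = ACTIONS[action]
    nxt = (min(max(r + dr, 0), 2), min(max(c + dc, 0), 2))  # stay on the grid
    if nxt == GOAL:
        return nxt, 10.0, True
    if nxt == BAD:
        return nxt, -5.0, False   # the 'punishment' square
    return nxt, -1.0, False       # small cost per move, so shorter routes are preferred

for episode in range(2000):
    state, done = (0, 0), False
    while not done:
        # Epsilon-greedy choice: usually exploit the best-known action, sometimes explore.
        if random.random() < epsilon:
            action = random.choice(list(ACTIONS))
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        nxt, reward, done = step(state, action)
        best_next = 0.0 if done else max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = nxt

# Follow the greedy policy from the start; the printed route skirts (1, 1).
state, path = (0, 0), [(0, 0)]
for _ in range(10):   # cap the walk in case the learned policy is imperfect
    action = max(ACTIONS, key=lambda a: Q[(state, a)])
    state, _, _ = step(state, action)
    path.append(state)
    if state == GOAL:
        break
print(path)
```

Is the agent ‘avoiding pain’, or merely maximising a sum? The behaviour alone doesn’t settle it – which is roughly where my doubts begin.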
I’m not convinced. These exhibited RL behaviours might be similar to animal responses to pain; I can even accept that algorithms are put into states that are very distantly equivalent to neurological pain. However, I can’t grant the moral significance of this sentience, because it seems to be non-qualitative and unconscious. We typically envision consciousness as supervenient on our bodies yet also non-physical, like a mysterious fog of qualitative experience that hangs about us. This could be false,2 but it is a useful picture for showing that this is not something algorithms possess. AI can identify the colour red, but it doesn’t know what it feels like to see red; it can know a pain-state, but it doesn’t know what it feels like to be in pain. Algorithms don’t know the texture of experience, in all its mottled consistencies. So how can they be truly sentient in a way that is morally relevant, if they can’t consciously feel pleasure and pain in the way we do?3
Why would AI suffering matter?
I’m sceptical it would, even if algorithms did possess minimal sentience. However, Tomasik grounds his argument for why we should care about AI suffering with an extreme but somewhat persuasive analogy:
A scientist proposes that she wants to create human children in labs that are physically disabled, for the purpose of research. As a result, they will likely spend their lives in quite a lot of pain. We naturally respond: absolutely not.
But often AI is not programmed perfectly, and it malfunctions. It therefore spends quite a lot of time in a state of punishment, which Tomasik argues is somewhat equivalent to pain. Why should we morally condemn the first and not the second case?
Presumably, because the first case is one about human suffering – which is more important to us than silicon suffering. But Tomasik retorts: is the material that something is made of a morally relevant factor? It is wrong to discriminate against or disrespect people on the basis of sex, or nationality, or race, or the colour of their eyes; these physical attributes are irrelevant to how we determine someone’s value. How can discrimination on the basis of material be any more morally justifiable? Surely, this could lead to an undesirable ethical landslide.
Tomasik consequently recommends we reduce the number of RL algorithms used and replace them with other AI, or refine algorithms to be more humane by using only rewards instead of punishments. I’m sceptical, but Tomasik’s problem is one to keep in mind; and as neurologists and computer scientists continue mapping the biological brain’s structure into AI sub-systems, it is one that will become increasingly ethically relevant.
References
1 His arguments require dense metaphysical and neurological explanations, which I don’t have the space for here; I link his thesis below.
Tomasik, Brian. “Do Artificial Reinforcement-Learning Agents Matter Morally?” (2014).
2 Tomasik certainly argues it is; but, again – not an argument I can lightly abridge.
3 Tomasik responds that moral relevance only requires the faintest traces of sentience. My suggestion that basic sentience isn’t sufficient to meet the standard for moral relevance admittedly gets tricky because it could possibly exclude quite a lot of things like insects, or humans with minimal sentience, from moral relevance. It also raises the eternal big question: if sentience doesn’t make you morally relevant, what does?
PETRL link – Look particularly at Brian Tomasik’s interview on their blog page.