Design Policy of the Excluded Middle: Don't create AI systems of disputable moral status. Doing so courts the risk of either underattributing or overattributing rights to the systems, and both directions of error are likely to carry serious moral costs.
Violations of the Design Policy of the Excluded Middle are especially troubling when some well-informed experts reasonably hold that the AI systems fall far short of humanlike moral standing while other well-informed experts reasonably hold that the systems deserve moral consideration similar to that of humans. The policy comes in various strengths, depending on (a.) how wide a range of uncertainty it tolerates and (b.) how high a bar it sets for legitimate disputability.
Emotional Alignment Design Policy: Design AI systems so that ordinary users have emotional reactions appropriate to the systems' genuine moral status.
Joanna Bryson articulates one half of this design policy in her well-known (and in my view unfortunately titled) article "Robots Should Be Slaves". According to Bryson, robots -- and AI systems in general -- are disposable tools and should be treated as such. User interfaces that encourage people to think of AI systems as anything more than disposable tools -- for example, as real companions, capable of genuine pleasure or suffering -- should be discouraged. We don't want ordinary people fooled into thinking it would be morally wrong to delete their AI "friend". And we don't want people sacrificing real human interests for what are basically complicated toasters.
Now to be clear, I think tools -- and even rocks -- can and should be valued. There's something a bit gratingly consumerist about the phrase "disposable tools" that I'm inclined to use here. But I do want to highlight the difference between the type of moral status possessed by, say, a beautiful automobile and that possessed by a human, a cat, or maybe even a garden snail.
The other half of the Emotional Alignment Design Policy, which goes beyond Bryson, is this: If we do someday create AI entities with real moral considerability, similar to that of non-human animals or of humans, we should design them so that ordinary users will react to them emotionally in a way that is appropriate to their moral status. Don't design a human-grade AI capable of real pain and suffering, with humanlike goals, rationality, and thoughts of the future, and put it in a bland box that people would be inclined to casually reformat. And if the AI warrants an intermediate level of concern -- similar, say, to a pet cat -- then give it an interface that encourages users to give it that amount of concern, and no more.
I have two complementary concerns here.
One -- the nearer-term concern -- is that tech companies will be motivated to create AI systems to which users become emotionally attached. Consider, for example, Replika, advertised as "the world's best AI friend". You can design an avatar for the Replika chatbot, give it a name, and buy it clothes. You can continue conversations with it over the course of days, months, even years, and it will remember aspects of your previous interactions. Ordinary users sometimes report falling in love with their Replika. With a paid subscription, you can get Replika to send you "spicy" selfies, and it's not too hard to coax it into erotic chat. (This feature was apparently toned down in February after word got out that children were having "adult" conversations with Replika.)
Now I'm inclined to doubt that ordinary users will fall in love with the current version of Replika in a way that is importantly different from how a child might love a teddy bear or a vintage automobile enthusiast might love their 1920 Model T. We know to leave these things behind in a real emergency. Reformatting or discontinuing Replika might be upsetting to people who are attached, but I don't think ordinary users would regard it as the moral equivalent of murder.
My worry is that it might not take many more steps of technological improvement before ordinary users become confused and form emotional connections that are inappropriate to the type of thing that AI currently is. If we put our best chatbot in an attractive, furry, pet-like body, give it voice-to-text and text-to-speech interfaces so that you can talk with it orally, give it an emotionally expressive face and tone of voice, and give it long-term memory of previous interactions as context for new interactions -- well, then maybe users really do start to fall more seriously in love with it, or at least treat it as having the moral standing of a pet mammal. This might be so even with technology not much different from what we currently have, about which there is general expert consensus that it lacks meaningful moral standing.
It's easy to imagine how tech companies might be motivated to encourage inflated attachment to AI systems. Attached users will have high product loyalty. They will pay for monthly subscriptions. They will buy enhancements and extras. We already see a version of this with Replika. The Emotional Alignment Design Policy puts a lid on this: It should be clear that this is an interactive teddy bear, nothing more. Buy cute clothes for your teddy bear, sure! But forgo the $4000 cancer treatment you might give to a beloved dog.
The longer-term concern is the converse: that tech companies will be inclined to make AI systems disposable even if those AI systems eventually really are conscious or sentient and really do deserve rights. This possibility has been imagined over and over in science fiction, from Asimov's robot stories through Star Trek: The Next Generation, Black Mirror, and Westworld.
Now there is, I think, one thing that's a bit unrealistic about those fictions: The disposable AI systems are designed to look human or humanoid in a way that engages users' sympathy. (Maybe that's an artifact of the fictional medium: From a fiction-writing perspective, humanlike features help engage readers' and viewers' sympathy.) More realistic, probably, is the idea that if tech companies want to minimize annoying protests about AI rights, they will give their robots or AI systems bland, not-at-all-humanlike interfaces that minimize sympathetic reactions, such as the shipboard computer in Star Trek or the boxy robots in Interstellar.
The fundamental problem in both directions is that companies' profit incentives might misalign with AI systems' moral status. In some cases, companies might be incentivized to trick users into overattributing moral status, so as to extract additional money from overly attached users. In other cases, companies might be incentivized to downplay the moral status of their creations -- for example, if consciousness or sentience proves to be a useful feature to build into the most sophisticated future AI workers.
The Emotional Alignment Design Policy, if adhered to, would reduce both of these moral risks.