Why 'hallucination' is a problematic term (hint - it's not just because it anthropomorphizes the technology!) and what to do about it.

If you've read an article about ChatGPT of late, you might have noticed something odd: the word "hallucinate" is everywhere. The origin of the word is (h)allucinari, to wander in mind, and Dictionary.com defines it this way: "a sensory experience of something that does not exist outside the mind." Now, ChatGPT doesn't have a mind, so to say it "hallucinates" is anthropomorphizing the technology, which, as I've written before, is a big problem.
"Hallucinate" is the wrong word for another important reason: it implies an aberration, a mistake of some kind, as if the model isn't supposed to make things up. But that's exactly what generative models do: given a bunch of words, the model probabilistically makes up the next word in that sequence. Presuming that AI models are making a mistake when they're actually doing what they're supposed to do has profound implications for how we think about accountability for harm in this context. Let's dig in.
In March, tech journalist Casey Newton asked Google's Bard to give him some fun facts about the gay rights movement. Bard responded in part by saying that the first openly gay person elected to the presidency of the United States was Pete Buttigieg in 2020. Congratulations, Pete! This response was referred to by many as a "hallucination," as if the response wasn't justified by its training data. But since Bard was largely trained on data from the internet, that data likely includes a lot of sequences where the words "gay," "president," "United States," "2020," and "Pete Buttigieg" are close to one another. So on some level, claiming that Buttigieg was the first openly gay president isn't all that surprising: it is a plausible response from a probabilistic model.
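To make that concrete, here is a minimal, hypothetical sketch of next-word sampling in Python. The word counts below are invented for illustration, and real models like Bard learn vastly richer statistics over billions of documents, but the core move is the same: pick a statistically likely continuation, with no step that checks whether the result is true.

```python
import random

random.seed(0)  # make the sketch reproducible

# Hypothetical next-word counts, standing in for the co-occurrence statistics a
# model might absorb from web text where "gay," "president," "United States,"
# "2020," and "Pete Buttigieg" often appear near one another.
NEXT_WORD_COUNTS = {
    ("the", "first"): {"openly": 8, "person": 2},
    ("first", "openly"): {"gay": 9, "elected": 1},
    ("openly", "gay"): {"president": 6, "senator": 3, "mayor": 1},
    ("gay", "president"): {"was": 7, "of": 3},
    ("president", "was"): {"Pete": 6, "elected": 4},
    ("was", "Pete"): {"Buttigieg": 10},
    ("Pete", "Buttigieg"): {"in": 7, "today": 3},
    ("Buttigieg", "in"): {"2020": 9, "Indiana": 1},
}


def sample_next(context):
    """Pick the next word in proportion to how often it followed this context."""
    counts = NEXT_WORD_COUNTS.get(context)
    if not counts:
        return None
    words = list(counts)
    weights = [counts[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]


def generate(prompt, max_new_words=10):
    """Extend the prompt one sampled word at a time; nothing here checks facts."""
    tokens = prompt.split()
    for _ in range(max_new_words):
        nxt = sample_next(tuple(tokens[-2:]))
        if nxt is None:
            break
        tokens.append(nxt)
    return " ".join(tokens)


print(generate("the first openly gay"))
# A typical output: "the first openly gay president was Pete Buttigieg in 2020"
# Statistically plausible, factually false, and exactly what the model is built to do.
```

Scale those toy counts up to transformer-sized statistics and the dynamic is the same: the output is fluent because fluency is what is being optimized, not accuracy.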
Now, this example didn't lead to real-world harm, but who or what should be held accountable when it does? Helen Nissenbaum, a professor of information sciences, explains that we're quick to "blame the computer" because we anthropomorphize it in ways we wouldn't with other inanimate objects. Nissenbaum was writing in 1995 about clunky computers, and this problem has become much, much worse in the intervening years. As Nissenbaum wrote then, "Here, the computer serves as a stopgap for something elusive, the one who is, or should be, accountable." Today, the notion that AI is hallucinating serves as such a stopgap.
Paradoxically, users or operators of the technology often absorb a disproportionate amount of blame. This is what Madeleine Clare Elish, a cultural anthropologist, calls a "moral crumple zone" wherein "responsibility for an action may be misattributed to a human actor who had limited control over the behavior of an automated or autonomous system." Traditionally, a "crumple zone" is the part of the car designed to absorb the brunt of a crash in order to protect the driver. Elish argues that historically "a moral crumple zone" has protected the technological system at the expense of the human user or operator. Remember when New York Times journalist Kevin Roose went viral for a very weird and unsettling back-and-forth with Sydney, Bing's new chatbot? The back-and-forth ended with the chatbot proclaiming its love for Roose. In the aftermath, many commentators argued that Roose pushed Sydney too far; that he was to blame for how the chatbot responded.
The people often conspicuously left out of this discussion of potential blame are those building and making key decisions about the model. Engineers, AI researchers, developers, corporate officers, etc. have historically avoided blame for a few reasons. The first is the weird idea that we've come to accept errors or bugs in code as normal. Nissenbaum calls this "the problem of bugs" and shows how it leads to an obvious problem: if imperfections are perceived as inevitable, then we can't hold those designing the system accountable. But that doesn't hold up: in industries like car and aircraft manufacturing, where the cost of an error is very high, we've proven this idea mostly wrong.
Then there is "the problem of many hands" that Nissenbaum describes, which is the notion that in modern organizational arrangements, rarely does blame for a decision lie with one person. There are lots of cooks in the proverbial decision-making kitchen, each with varying degrees of authority and power. Also, the "kitchen" has become more complex since Nissenbaum's writing, as machine learning processes introduce more dynamic steps and different stakeholders. In any case, this effect is compounded by the assumption that engineers can't actually explain what's happening inside the model. If they can't explain why specific inputs combined with model decisions contribute to certain outputs, then how can we blame them? That's not quite the right question, though. One, it's kind of insane not to assign blame just because they don't understand what's happening; if anything, that sounds like all the more reason to dole out a li'l accountability, or at least preemptive thresholds. I argued for this in "A critique of tech-criticism," writing:
"The government has a long history of requiring companies and industries to meet a certain standard before launching a product. I wouldn't drive a car if federal standards didn't prevent serious injuries. Nor would I hop on a plane so often if we didn't render crashes nearly obsolete […] So what to do? Well, the government could require that AI companies be able to explain how their model produced a result before releasing it. It's not clear that interpretability is possible, but right now, we're not even asking that companies try."
Furthermore, "cause" is an especially high bar for blame. Joel Feinberg, a noted moral, social, and political philosopher, described a set of conditions under which one would be considered "morally blameworthy" for a given harm even if they didn't intend to cause the harm. Nissenbaum summarizes Feinberg's clauses this way:
"We judge an action reckless if a person engages in it even though he foresees harm as its likely consequence but does nothing to prevent it; we judge it negligent, if he carelessly does not consider probable harmful consequences."
In other words, an engineer might deserve blame and accountability, even if they didn't mean to cause harm, if they were reckless or negligent in how they built the technology. For example, researchers have been documenting the harms of AI and LLMs for years. It's simply no longer reasonable to say "ah, we didn't see that harm coming, the machine must have 'hallucinated.'" That sounds pretty negligent to me. Moreover, in a totally insane 2022 survey, AI researchers were asked the question, "What probability do you put on human inability to control future advanced A.I. systems causing human extinction or similarly permanent and severe disempowerment of the human species?" The median reply was 10 percent. I personally find the question itself hyperbolic, but yeah, it's fair to say that AI researchers "foresee harm." So the question of recklessness comes down to what they're doing to proactively prevent it.
Finally, and most importantly, the pursuit of explainability looks to the model itself for answers, when the model is entangled with both its users and creators. Purely from a technical perspective, we can only half explain why a model's outputs are the way they are. As we saw with Bard's alleged first gay president, probabilistic models will produce weird outputs that don't technically exist in the training data, because they are just making up sentences based on which words are likely to follow the ones that came before them.
The other half of the explanation exists in how people, culture, norms, race, and gender inform how the training data is constructed, and then, more obviously, how the prompt is created. In some small way, Roose was to blame for Sydney's response: his prompts were inputs to what the chatbot generated. So we're left with a complex system with multiple inputs (engineers, users, and the technology) dynamically interacting, each adapting, updating, and changing their actions, decisions, and outputs in response to one another.
What can be done about all this?
We need to accept that the complexity of the engineer-user-tech interaction does not absolve everyone from responsibility for error and harm. As Nissenbaum writes, "Instead of identifying a single individual whose faulty actions caused the injuries, we find we must systematically unravel a messy web of interrelated causes and decisions." Right, we can start to disentangle these systems and assign partial blame and accountability to the appropriate stakeholder. This will require a lot (!) more information about the models themselves and the decisions engineers are making. I've long had a complicated relationship with "transparency initiatives," but much more transparency will be required if we're to move closer to accountability. It will also require banishing from our brains the cultural assumption that bugs are inevitable. But the first step along this path is to close the gap between our own expectations of what LLMs can do and what they're actually doing. They aren't hallucinating; we are.