ChatGPT is behaving weirdly (and you’re probably reading too much into it)
The chatbot has been giving what OpenAI calls ‘unexpected responses’ since yesterday.
Ask ChatGPT to answer a question and you’re likely not expecting it to reply “— and it is” over and over. Nor are you likely to have much patience if it starts apologizing for replying in gibberish, saying “the cogs en la tecla might get a bit whimsical.” Yet both instances have occurred in the past 24 hours, leaving users of the chatbot befuddled and asking what the hell is going on.
On X (formerly Twitter) and Reddit, users are scratching their heads as the normally reliable conversation companion has turned a bit weird. “I hope it’s not in too much distress,” wrote one user, managing to anthropomorphize a machine that does little more than pattern-matching. “This legit has schizophrenia vibes and even making the safe assumption that it is not conscious, it still makes me feel sad to see.”
OpenAI, the company behind ChatGPT, has acknowledged that something is amiss, saying it has been “investigating reports of unexpected responses from ChatGPT” since 3:40 p.m. PST on February 20. The company declined to elaborate when asked by Fast Company, though its status page says it is still monitoring the situation today.
“That this happened to OpenAI in production models is surprising and quite revealing, especially given a less-publicized incident around the new year of the model becoming ‘lazy,’ outputting responses refusing to answer queries,” says Willie Agnew, an AI researcher at the Paul G. Allen School of Computer Science and Engineering at the University of Washington. “OpenAI is supposedly a leader in creating and evaluating models, and at this point, has years of user-interaction data and supposedly years of R&D on safety, testing, and evaluation, yet they seemingly can’t predict or control what their production models do.”
One thing it’s unlikely to be is the first tremor of an AI uprising. Conversations about AI sentience and artificial general intelligence remain more science fiction than anything else. Existential risk remains a distant worry, if it is ever a likely one at all.
Instead, it’s likely to be a simple issue that spirals into a larger one when you’re overseeing the world’s most popular chatbot. “I think some component of the ChatGPT pipeline got updated and it created a cascade of effects that, under some conditions, produces gibberish output,” says Sasha Luccioni of Hugging Face, an AI company. While Luccioni admits that it’s hard to say what’s going on—one of the things about AI systems is that even their creators don’t always know what goes on under the hood—it’s possible that a model or filter that contributes to the chatbot’s output got updated “and it brought the house of cards crashing down,” she says.
That supposition is based on rumors that ChatGPT is a mixture-of-experts (MoE) system. “That means there’s several LLMs running concurrently under the hood, plus safety filters and intent classifiers and such,” says Luccioni. The fact that the most egregious errors seem to happen only in certain contexts suggests to Luccioni that one of the large language models within the MoE system could be going haywire.
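To make that picture concrete, here is a minimal, entirely hypothetical sketch of the mixture-of-experts idea in Python. The expert functions, the keyword-based router, and the “broken” component are all invented for illustration; nothing here reflects how ChatGPT is actually built or routed.

```python
# Hypothetical sketch of the mixture-of-experts idea Luccioni describes:
# a router sends each request to one of several "experts," so a bad update
# to a single component only corrupts outputs in the contexts routed to it.
# All names and behavior here are invented; this is not ChatGPT's design.

def code_expert(prompt: str) -> str:
    return f"[code expert] Here is a snippet addressing: {prompt}"

def chat_expert(prompt: str) -> str:
    return f"[chat expert] Sure! Regarding '{prompt}'..."

def broken_expert(prompt: str) -> str:
    # Stand-in for a component that a faulty update sent haywire.
    return "— and it is " * 5

def route(prompt: str) -> str:
    """Toy router: picks an expert using crude keyword rules."""
    lowered = prompt.lower()
    if "function" in lowered or "code" in lowered:
        return code_expert(prompt)
    if "poem" in lowered:
        # Only prompts routed to the faulty component come back garbled.
        return broken_expert(prompt)
    return chat_expert(prompt)

if __name__ == "__main__":
    prompts = [
        "Write a function to sort a list",
        "Write a poem about spring",
        "What's the capital of France?",
    ]
    for p in prompts:
        print(f"> {p}\n{route(p)}\n")
```

In a toy setup like this, only requests sent to the misbehaving component produce gibberish while everything else looks normal, which echoes Luccioni’s point that the errors seem confined to certain contexts.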
The fact that some have taken what could be a single misfiring component and ascribed to it any number of human-like ailments is a worry, Luccioni adds. “People are definitely starting to treat LLMs like humans, and that’s really problematic,” she says. Humanizing AI models subliminally encourages people to put more trust in them and to become more complacent about their errors.
The errors themselves raise concerns about the current “move fast and break things” approach of the generative AI revolution. “It really calls into question the foundations of evaluations and red-teaming as an effective means for understanding or regulating models if small updates to GPT can both be noticeably broken while evading all of their internal evaluation and red-teaming efforts,” says the University of Washington’s Agnew.