Elon Musk’s selling point for ChatGPT competitor Grok may be its fatal flaw, experts say
Using the corpus of X posts as training data sounds like a good idea—until you look at the tone of conversation on the social platform.
Just four months after launching xAI, Elon Musk’s AI company has its first product.
Grok was launched in beta overnight, with Musk promising a personality- and humor-filled chatbot that’s been, in part, trained on the vast volumes of user data on X, the platform formerly known as Twitter.
That ability to tap into a real-time stream of commentary and analysis on the social network has been posited by Musk as a game changer. Unlike ChatGPT and other chatbots powered by large language models, which are often frozen with historical knowledge dating back a year or more, Grok should, in theory, be able to address queries about events as they happen. Indeed, Musk tweeted screenshots showing Grok answering questions about Sam Bankman-Fried's recent conviction as evidence of the chatbot's up-to-date prowess. (Musk didn't mention that Grok got a simple factual point about the trial wrong: how long the jury deliberated.)
Unsurprisingly, not everyone is convinced of Grok's unvarnished excellence. Carissa Véliz, an associate professor at the University of Oxford's Institute for Ethics in AI, takes issue with Musk's premise that rival chatbots are trained on politically correct data. "These assertions are worrying for many reasons," says Véliz, adding that Musk's claim that LLMs should try to speak the truth, rather than echo a liberal viewpoint, is misleading.
"LLMs are not truth-tracking," she says. "They make statistical guesses. There's a big difference." She also worries that in pursuing Musk's goal of absolute truth, xAI's chatbot could make sexist or racist claims.
For an example of how Grok could veer off course and start producing harmful content, we might look to Microsoft's Tay experiment in 2016. That year, Microsoft's Twitter-based chatbot, Tay, began spewing misogynistic and racist remarks within hours, after Twitter users realized it could be manipulated through the data it was fed. "I assume Grok will quickly become one of the most abused LLMs—exactly the opposite of what Musk says he wants," says Keegan McBride, a departmental research lecturer in AI, government, and policy at the Oxford Internet Institute.
Musk's much-professed use of posts shared on X to train and inform Grok's output could be as much a pitfall as a massive benefit, Véliz believes. "It could be a boon because that's a lot of data, and at least until recently, Twitter used to be the preferred social media platform of journalists and academics," she says. However, she's concerned that the format of Twitter posts—from the character limits imposed on nonpaying users to the way conversations often devolve into insult battles—could negatively impact the LLM's outputs.
In theory, Grok could draw on X posts about key news events that are wrong or contain mis- or disinformation. The platform was criticized in October for its inability to sift fact from fiction during the Israel-Hamas war. (Neither xAI nor Musk responded to an interview request.)
Véliz wonders whether Musk's big selling point for Grok—that it mines X to produce real-time answers to questions about live world events—could come back to bite him. "More worrisome, that Grok will have access to real-time data from X introduces a much higher risk of it being used to create or peddle misinformation," she says. "You can imagine a troll farm peddling misinformation on X and then getting amplified through Grok."