Google AI’s hilariously bad answers aren’t the big problem
When a search engine is ridiculously wrong, we can deal with it. But AI’s subtler mistakes and general sloppiness are a more serious long-term problem.
Contrary to anything you’ve been told recently, putting glue on pizza is not a good idea. Cats have not been to the moon. Most doctors don’t recommend eating rocks.
None of these topics were subject to much discussion until last week, after Google began broadly rolling out “AI Overviews” at the top of some of its search results. Users soon discovered examples of these machine-generated summaries that were not just inaccurate but shockingly, absurdly so—in part because Google’s algorithm placed far too much stock in fodder such as ancient Reddit posts (the glue-on-pizza tip) and Onion satires (the rock diet). A few more deeply troubling instances of the feature getting things wrong also surfaced, such as it parroting the fringe talking point that former President Barack Obama is Muslim.
The latest in a string of Google AI mishaps, AI Overviews’ tendency toward fabulism inspired many people to experiment with odd queries and then share the results on X and other social networks. The more bizarre results certainly aren’t typical, and among the lessons of this whole kerfuffle is that human beings excel at spreading misinformation without the help of generative AI. At least one of the most widely shared examples, involving the worst possible advice for someone who’s feeling depressed, was a pretty obvious hoax. (As I write, the fabricated summary is still cited as if it were real in articles on multiple websites that show up in—wait for it—Google search results.)
Google says it’s working to refine AI Overviews and has removed inaccuracies as it’s learned about them. It’s possible, though, that the company didn’t fully anticipate that the feature could make such a lousy first impression. Unlike Microsoft’s Copilot-infused Bing, the summaries have far more guardrails than a chatbot such as ChatGPT or Google’s own Gemini. They’re also more limited in scope than Perplexity, a startup AI search engine that bills itself as “Where knowledge begins” but is far too prone to hallucination in my experience. Google’s summaries are brief, anodyne in tone rather than chatty, and complementary to search in its traditional form instead of a wholesale replacement of it. Oftentimes, they don’t feel radically different from the snippets that have long appeared at the top of many Google results pages.
I will save for another newsletter the question of whether AI Overviews are going to destroy the media business by discouraging clicks on sites such as, well, the one I work for. And I’m not saying that my own experience with the summaries has been awful. The vast majority I’ve seen have been adequately accurate and clear. Still, even in relatively low-key form, the whole act of generating AI responses to search queries is rife with potential to mislead rather than inform. (Over at Tedium, Fast Company contributor Ernie Smith provided a useful tip for grinding down your Google searches to the simple list of web links that made the search engine popular in the first place.)
At least in its current state, the way generative AI rehashes stuff without truly understanding what it’s saying makes it inherently sloppy, in a way that clashes with Google’s mission of “organizing the world’s information and making it universally useful and accessible.” Being almost right but not quite isn’t close enough. For instance, contrary to one AI Overview I got, Dr. Seuss didn’t write a book titled King Grimalken and the Wishbones, and I can’t even find any web pages that claim he did—it was a magazine story.
Moreover, Google’s bar should be a lot higher than merely avoiding outright error. The AI Overview I got for “How do instant cameras work?” isn’t patently off, and therefore won’t become a social-media meme. But it did omit some critical details. I’d give it a C- as an explanation, and hope it doesn’t become the world’s default understanding of how a Polaroid picture develops (which—if people grow less likely to click on the results below the AI Overviews—it could).
Google is contending with a widespread impression that it’s lagging behind companies such as OpenAI and Microsoft in productizing AI. That puts it under tremendous pressure to bake the technology into its offerings. The company hasn’t shared any details about the percentage of AI Overviews that are less than stellar, or what it would consider to be a reasonable figure. At Google Search’s scale, however, even a minuscule percentage of responses that run off the rails would impact millions of people. With its namesake search engine’s reputation for excellence already a tad shopworn, Google risks damaging it further by seeming too willing to accept too many subpar responses as a necessary consequence of embracing generative AI.
Perversely, the company may have done us a favor by falling on its face in so public a fashion. Unthinkingly trusting AI-generated material’s accuracy is a terrible mistake; we need all the reminders of its fallibility we can get. The glue-on-pizza bug is wonderfully easy to comprehend, no computer-science degree required. Of course a mathematical algorithm might fail to comprehend a joke that even the dimmest Reddit user would grasp. Just keep in mind: It’s the errors that don’t call attention to their ridiculous selves that could do the most damage to Google Search and everyone who relies on it.