
AI bots are too ‘creative’ for their own good

ChatGPT, Bing, and Bard excel at making stuff up—even if you just want facts.

[Source photo: Rawpixel]

If you say that a human being is creative, the odds are pretty good that you mean it as praise. But in the burgeoning generative AI-infused chatbot wars, “creative” is a convenient code word for “playing fast and loose with the facts.”

At least that’s the impression I’ve been developing lately. Last week, for instance, when Google began letting the public try Bard, its competitor to OpenAI’s ChatGPT and Microsoft’s new Bing, it went out of its way to lower expectations and stress that Bard is not a substitute for Google search in its conventional form. The Bard page labels it as an experiment, includes a disclaimer that it may display inaccurate information, and links to a FAQ full of provisos about the possibility of error. All of which is good, because the bot is so prone to error that I wouldn’t trust anything it said without checking it first.

But the way Bard introduces itself is with a cheery, “I’m Bard, your creative and helpful collaborator.” And indeed, if you use the bot to brainstorm raw ideas for you to refine—say, “Give me five ideas for blog posts on the history of theatrical performances”—its creativity might actually be an asset.

As for the new Bing, Microsoft launched it in a loose-cannon mode that got stuff wildly wrong and didn’t like having that pointed out. After it became clear just how erratic Bing was, Microsoft made it default to terser and less conversational responses, which greatly reduced the amount of fantasy it spewed in place of fact. But you can still switch Bing chat into a “more creative” setting. As far as I can see, that leaves it much more likely to make confident declarations that are at odds with reality, though Microsoft doesn’t state that explicitly.

I’m not saying there’s no room for creativity with these chatbots. Paying users of ChatGPT Plus can now try a version using GPT-4, the newest version of OpenAI’s large language model, which OpenAI says is both “40% more likely to produce factual responses than GPT-3.5 on our internal evaluations” and “more creative and collaborative than ever before.” Regardless of whether ChatGPT can pass a bar exam, I still find its purely factual responses to be pockmarked with statements that sound like they could be true, but aren’t. Last weekend, however, I had an exhilarating time playing text adventure games that ChatGPT generated for me on the fly. It was easily the most rewarding experience I’ve had with any AI chatbot.

And in some ways, today’s most useful generative AI tools are the ones that focus purely on creativity. Last week, for instance, Adobe unveiled Firefly, its answer to AI image generators such as DALL-E, Stable Diffusion, and Midjourney. The whole category is rife with potential pitfalls, from the possibility of it plagiarizing human illustrators and photographers to it being used to fake seemingly real news imagery. But everything Firefly produces is unabashedly synthetic, which makes its creativity an unalloyed virtue, as long as Adobe can prevent its willful misuse. For now, maybe the bots would also be better off picking a lane.


For the March 15 Plugged In, I explained why I’ve held onto as much of my old email as possible—everything back to mid-2008, plus a huge chunk of earlier correspondence. I got lots of feedback on the piece, much of it from kindred spirits. “I felt vindicated and heard,” wrote Rich Schineller. “When my kids see my inbox holding 370,000 messages, they are aghast.”

Some of my fellow email packrats shared details on their own preservation techniques. “Gmail sucks for offline storage,” said Lila Zubik. “I’ve been a Microsoft geek and use Outlook, which allows offline storage via PST files which I [create] regularly.” Sue Lee told me about her dead-tree archive of conversations with friends early in this century. “Of course, emails work from most recent to oldest in the string of conversation,” she explained. “So I’d have to wait till the last comments were made before printing out the whole conversation thread, reversing the page order so that you could read it from beginning to end chronologically, then stapling the pages together and storing it.”

Most importantly, several readers brought up a sensible question I hadn’t considered. “Have you given any consideration to the carbon footprint of keeping all that email?” asked one, requesting to be identified as Amy M. “If everyone did as you suggest, that would be a lot of necessary servers sucking up energy to keep a lot of emails no one will ever look at again.”

Amy is undeniably right. Google is working on a new Workspace feature that will apparently let me measure the impact of my email trove quite precisely. I’m ready to be chastened and willing to rethink my approach moving forward.

For now, I did enough research to realize that my Gmail footprint is tiny in comparison to that of other stuff I have stashed in Google Workspace and Dropbox, including vast quantities of photos and videos I’ll never need again—some of which are up there in duplicate or triplicate. Deleting the detritus sounds like a worthy project.



Harry McCracken is the technology editor for Fast Company, based in San Francisco. In past lives, he was editor at large for Time magazine, founder and editor of Technologizer, and editor of PC World.
