
If ChatGPT doesn’t get a better grasp of facts, nothing else matters

OpenAI’s astoundingly glib bot is going to change the world. Before it does, let’s hope it gets far better at knowing what it’s talking about.


Snap quiz: What was the first TV cartoon? That’s easy: The Flintstones, which debuted on ABC in 1960.

But wait. Eleven years before that, there was NBC’s The New Adventures of Pinocchio, created by Guido Martina and Angelo Bioletto. And Max Fleischer’s The Adventures of Peg o’ the Ring aired way back on April 30, 1933. Even that came after Bray Studios’ 1930 experimental cartoon The Creation, which itself was preceded by 1929’s The Funk Popilin Show.

Actually, all of these nominees are dead wrong. They come from OpenAI’s free AI chatbot ChatGPT, which provided a different answer every time I asked it “What was the first TV cartoon?” The Flintstones, as you know, was a real cartoon—it just wasn’t the first one made for TV. The others either didn’t exist or differed wildly from ChatGPT’s descriptions of them. The Adventures of Peg o’ the Ring, for example, was a 1916 theatrical live-action film serial, not a work of animation, and Max Fleischer had nothing to do with it.

(What was the first TV cartoon? Crusader Rabbit, which aired in syndication starting in 1950, traditionally gets the nod, and ChatGPT mentioned it in passing a couple of times. There are a couple of earlier candidates, and if you can name either, you know far more about the topic than any bot.)

It’s only been six weeks since OpenAI made ChatGPT available to the public. But its uncanny ability to understand requests and reply in clear, well-organized prose that reads like it was written by a human already makes its introduction feel like an epoch-shifting moment for the tech industry—if not humanity itself. I can’t think of another piece of new computer science that’s seemed so . . . well, impossible at first blush. (Sorry, 2007 iPhone.)

But whenever I chat with ChatGPT about any subject I know much about, such as the history of animation, I’m most struck by how deeply untrustworthy it is. If a rogue software engineer set out to poison our shared corpus of knowledge by generating convincing-sounding misinformation in bulk, the end result might look something like this. It’s prone to botching the chronological order of events, conflating multiple people with similar backgrounds, and—like an unprepared student—lobbing vague pronouncements that don’t require it to know anything about the topic at hand. How it comes up with gems like an imaginary 1929 cartoon called The Funk Popilin Show, I’m still not sure.

Even when ChatGPT is mostly accurate, as it sometimes is, it often makes a fundamental mistake or two. For example, it told me that former Apple CEO John Sculley was responsible for the iPod, a product released eight years after he left the company. Some people will immediately spot that as an error; others will not. But will anyone bother to fact-check it?

Countless think pieces have been written about the possibility of students cheating by having ChatGPT do their homework for them. I’m more concerned about people earnestly using it as a research tool, buying into its inaccuracies, and spreading them—at which point it will be awfully hard to strike them from the record. Small wonder that Google, whose researchers invented the “transformer” AI that makes ChatGPT possible, is skittish about incorporating anything similar into its search engine.

To be fair to OpenAI, it’s careful to tamp down expectations about the current version of its bot. CEO Sam Altman has tweeted that it’s “a preview of progress” and there’s “lots of work to do on robustness and truthfulness.” The company’s FAQ warns that ChatGPT “can occasionally produce incorrect answers”—a welcome admission of fallibility, though it does so way more often than occasionally in my experience.

ChatGPT’s days as an admittedly flawed experiment may be numbered. Microsoft, which has already invested $1 billion in OpenAI, is contemplating chipping in a further $10 billion in return for a 49 percent stake in the company, report Semafor’s Liz Hoffman and Reed Albergotti. According to The Information’s Aaron Holmes and Kevin McLaughlin, the software giant plans to embed the technology at ChatGPT’s heart into an array of applications, where it could do everything from making Bing more conversational to allowing Word to automatically draft business documents.

By the time such features are live, they could be powered by GPT-4, the next version of OpenAI’s large language model, which is said to be radically more powerful than GPT-3. Let’s hope that it represents a breakthrough in accuracy.

For now, ChatGPT is most impressive when you steer it in directions that give it some leeway on matters of fact. One of my favorite examples of it being genuinely helpful is its response to a question from Platformer’s Casey Newton: “What are some styles of shoes that every man should have in his wardrobe?” You don’t need to be a footwear expert to judge the quality of the bot’s recommendations, which included oxfords, loafers, sneakers, and Chelsea boots. You might well find them a useful starting point for further research. And even if ChatGPT were to offer advice that wasn’t great—or was just plain weird—there’s no chance of it being a corrupting influence on society.

Or you could just invite it to make stuff up. For example, my colleague Jared Newman has been entertaining his kids by turning them into the stars of ChatGPT-generated bedtime stories. It’s tough to think of a nobler purpose for software that already spins fantasy with such aplomb.


ABOUT THE AUTHOR

Harry McCracken is the technology editor for Fast Company, based in San Francisco. In past lives, he was editor at large for Time magazine, founder and editor of Technologizer, and editor of PC World.
