The most believable robots are missing this simple trait

Robots don’t need to be smarter, they need to be faster.

Long before the Czech writer Karel Čapek introduced the term “robot” in his 1920 play R.U.R., humans were obsessed with the idea of creating life out of inanimate objects. From the ancient golems and the mythical bronze man who defended Crete against pirates around 400 BC to Yoshiyuki Tomino’s Gundams and George Lucas’s C-3PO, our stories have featured beings of metal, clay, or protoplasm that were “alive” just like us. Today, watching the video of the ChatGPT-brained Figure 01 robot interacting with a person, it seems clear to me that humanity is on the brink of turning into gods. There’s only one element that is missing.

The element is not artificial general intelligence. It’s not a human appearance either. Sure, having an AGI brain capable of comprehending and adapting to the physical world will be key to making synthetic life more lifelike. And yes, eventually we will get to the point where we can create perfect “replicants” like those in Blade Runner, Westworld, and The Bicentennial Man.

But all of that doesn’t matter now.

In the video released by Figure, a company founded by Brett Adcock with the financial backing of OpenAI, Nvidia, Microsoft, Intel, and Jeff Bezos, a robot that has neither AGI nor a human appearance makes you think, for a few brief seconds, that it’s alive. It chats with a man, gives him an apple, sorts out some trash while explaining why it gave the man the fruit, then tidies up the counter, placing a glass and some dishes in a drying rack.

For those brief seconds, the voice and interaction felt so real that I thought it might be pre-choreographed, like the dances and acrobatics that Boston Dynamics stages with its Atlas robot. Listening to the doubt in its voice, the natural cadence of the hmms and eeers, I even thought that perhaps a human was actually voicing the bot behind the scenes. But no. It was the real thing and, for the moments Figure 01 was speaking, I believed it was a real being. For those brief seconds, the humanoid broke the barrier, and I connected. I connected like I would connect with any other person.

But the brief illusion broke with the long pauses between answers. That’s when I realized that we already have almost everything we need to make a robot that can connect with us the way sci-fi artificial beings like HAL 9000, the Terminator, or Bender do in the movies. It comes down to the timing: the speed. That’s what we are missing right now. Everything else is already here, and the video shows it.

Through millions of years of evolution, our brains have been trained to expect a response from the living beings that surround us. In fact, this expectation is hardwired into our cerebral structure and influences our perception of the world. If it isn’t met, if there’s no response to our poking and talking, we automatically think something is off. That’s why, when we are having a conversation and we don’t get an instant response (which doesn’t have to be verbal, but does have to be instant), our brains just can’t buy it. It doesn’t matter if the answer that follows is the smartest, most illuminating thing that anyone has ever said. It just falls flat, as it does in comedy, where timing is perhaps more important than in any other kind of communication.

Speed and timing are the key reasons why we watch the acrobatic videos of Boston Dynamics’ Atlas robot and we all say, “Oh, looks just like a human!” It’s the same with HAL 9000’s responses: His natural conversation flow is what makes him as threatening as a human sociopath in 2001: A Space Odyssey. There’s C-3PO being insufferable to Han Solo in Star Wars. Or Rachael making Deckard fall in love with her. It’s all about the flow of the conversations. But of course that works because everything is scripted, choreographed, and performed by humans.

Here in the real world, the next big barrier is not getting smarter AIs or human-like looks, but getting the timing right so our conversations feel natural. Perhaps the best proof of this is the fast-growing popularity of AI-based chat applications, which work great because we are already trained to expect waiting times in our text-based communications with other humans. Here, the pause is normal. And here’s another proof, in reverse: I’ve met plenty of fast-yapping humans who are stupider than ChatGPT, yet I would never mistake those humans for robots.

When it comes to direct, live communication, timing is the key to the user experience: the moments when you need to order a coffee at the counter, give instructions to your babysitting bot, or talk about a good restaurant to take a date to. And perhaps that’s why Nvidia is working on hardware that will let robots move and communicate as fluently as humans do. At the speed things are going, I wouldn’t be surprised if we cross this threshold in less than a year, Dave.

ABOUT THE AUTHOR

Jesus Diaz founded the new Sploid for Gawker Media after seven years working at Gizmodo, where he helmed the lost-in-a-bar iPhone 4 story. He's a creative director, screenwriter, and producer at The Magic Sauce and a contributing writer at Fast Company.