
Why the ‘one AI model to rule them all’ myth needs to die

The path to superintelligence involves diverse systems of models working together.


Welcome to AI Decoded, Fast Company’s weekly newsletter that breaks down the most important news in the world of AI. You can sign up to receive this newsletter every week here.

The myth of scaling to reach AGI is wearing thin

The first two years of the AI boom were all about monolithic large language models (LLMs) such as those OpenAI used to power ChatGPT. But the trick that produced the big intelligence gains in those models—scaling up their size and giving them more compute power—is now yielding diminishing returns. It turns out that models can learn only so much from pretraining on huge swaths of internet content. Yes, models continue getting smarter, but experts say that the intelligence gains are now coming mainly from human intelligence, such as human-labeled training data or human feedback on outputs.

It appears, then, that single-frontier models are not the path to artificial general intelligence (AGI), in which AI is smarter than humans across a wide variety of tasks. Rather, as Google DeepMind CEO Demis Hassabis suggested to me last year, AGI might be achieved using a system of AI models, in which LLMs would work in concert with other kinds of AI models—some of them generative, some not.

Large language models are built on top of neural networks, large math equations that take their inspiration from the workings of the neurons in the human brain. Many thinkers in the industry believe that AI systems must borrow a lot more from the human brain in order to reach higher levels of synthetic intelligence. (See: Numenta’s Thousand Brains Project.)
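To make the “large math equation” framing concrete, here is a minimal sketch in Python of two stacked neural-network layers: each artificial “neuron” computes a weighted sum of its inputs, adds a bias, and passes the result through a nonlinearity. The layer sizes, random weights, and ReLU activation are illustrative choices, not a description of any production model.

import numpy as np

# A neural-network layer: a weighted sum of inputs plus a bias, passed through
# a nonlinearity (ReLU here). Stacking many such layers, with vastly larger
# matrices, is the "large math equation" an LLM is built from.
rng = np.random.default_rng(0)

def layer(x, weights, bias):
    return np.maximum(0, weights @ x + bias)  # ReLU activation

x = rng.normal(size=8)          # toy input vector
W1 = rng.normal(size=(16, 8))   # first layer's weights (random, i.e. untrained)
b1 = rng.normal(size=16)
W2 = rng.normal(size=(4, 16))   # second layer's weights
b2 = rng.normal(size=4)

hidden = layer(x, W1, b1)       # first layer of "neurons"
output = layer(hidden, W2, b2)  # second layer
print(output)

Training consists of nudging those weight matrices, across billions of parameters, until the model’s outputs become useful predictions.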

Andrew Filev, founder and CEO of Zencoder, points out that the human brain is a far more diverse system than the LLMs we have today. “It’s not a unified system. . . . There is some specialization in the brain,” Filev says. “What the hippocampus does is very different than what the prefrontal cortex does. It’s a complex system with multiple feedback loops, multiple different levels with which it operates, multiple independent agents working in the background . . . so it’s only logical that AI is going to end up being similarly complex.”

An LLM capable of processing many types of media (text, images, audio, video, etc.) will likely play a key role in such an AI system, as will “critic” models whose main purpose is to evaluate and give feedback on the LLM’s output.

While OpenAI hasn’t been explicit about the architecture of its o1 model family, it’s likely an early pointer to the AI systems of the future. The company has said that o1 is capable of trying different approaches to a problem, and has a mechanism for knowing when it needs to take a few steps backward and chart a new path to a solution. The company doesn’t say whether that mechanism is a separate “critic” model, but there’s reason to suspect it is. (LLMs are capable of some level of self-criticism, but it’s limited.)
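OpenAI hasn’t published o1’s internals, so the following Python sketch is purely hypothetical: it shows one way a generator model and a separate critic model could be wired together so that low-scoring steps trigger backtracking. The function names, scoring threshold, and toy stand-ins at the bottom are all invented for illustration and don’t reflect any real system.

# Hypothetical generator/critic loop with backtracking. `generate_step` and
# `critique` stand in for calls to two different models; nothing here reflects
# OpenAI's actual, unpublished o1 architecture.

def solve(problem, generate_step, critique, max_attempts=5, threshold=0.7):
    path = []  # the partial solution (sequence of accepted steps) so far
    for _ in range(max_attempts):
        step = generate_step(problem, path)       # generator proposes a next step
        score = critique(problem, path + [step])  # critic rates the extended path
        if score >= threshold:
            path.append(step)                     # accept the step
            if step.get("is_final"):
                return path                       # solution complete
        elif path:
            path.pop()                            # backtrack: discard the last accepted step
    return path                                   # best partial solution found

# Toy stand-ins so the sketch runs end to end.
def generate_step(problem, path):
    n = len(path)
    return {"text": f"step {n + 1}", "is_final": n >= 2}

def critique(problem, path):
    return 0.9 if len(path) <= 3 else 0.2  # arbitrary scoring rule

print(solve("toy problem", generate_step, critique))

The design point worth noticing is that the critic, not the generator, decides when a line of reasoning gets abandoned.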

The idea of “one model to rule them all”—one supergenius model to do all things—is likely on the way out. It seems more likely that organizations will ultimately draw on a whole ecosystem of synthetic brains in order to achieve the transformative effects of AGI.

Amazon debuts a new family of models, emphasizing “choice”

Amazon was not initially one of the star players in the generative AI boom. On Tuesday, it showed signs of catching up. The company announced a new family of AI models called “Nova” that perform at near-state-of-the-art levels compared to models from OpenAI, Anthropic, and Google. Amazon AGI VP Vishal Sharma tells Fast Company that his company has had its foot on the gas over the past year to push the performance of Amazon’s AI models, and has in fact been using the Nova models to run aspects of Amazon’s business.

Sharma wouldn’t divulge the size of the Nova models in parameter count, but the Nova Premier model, the largest, is likely in the hundreds of billions of parameters. That model can be used for more advanced AI tasks that require reasoning and agentic capabilities. The Nova line also includes smaller models (Lite and Micro) that are faster and less expensive to operate, as well as a new image generator (Canvas) and video generator (Reel).

Amazon operates differently than pure-play AI labs like OpenAI and Anthropic (though it is a major investor in Anthropic). Its core business is Amazon Web Services, which sells cloud data storage and app hosting. But AI increasingly powers the apps AWS customers build, so AWS needs to offer a broad selection of models, including bleeding-edge frontier models. AWS has said that selling AI services is already a “multibillion-dollar” business.

So, while the new Nova models are important for Amazon’s place in the AI race, they are just one menu item within AWS’s AI offerings. AWS has sold homegrown AI services for more than a decade, covering everything from video analysis to chatbots and, now, generative AI. AWS customers can also choose to build their apps on top of third-party LLMs such as Anthropic’s Claude 3.5 Sonnet, Meta’s Llama 3.2, and Cohere’s Command R.
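As a rough illustration of what that choice looks like from a developer’s seat, here is a hedged sketch using the AWS SDK for Python (boto3) and Bedrock’s Converse API. The model ID, region, and prompt below are placeholders; the point is that the same call shape applies whether the ID refers to a Nova model or a third-party one, so check AWS’s documentation for the identifiers actually enabled on your account.

import boto3

# Minimal sketch: send one prompt to a model hosted on Amazon Bedrock via the
# Converse API. Swapping MODEL_ID between an Amazon Nova model and, say, a
# Claude or Llama model is the "choice" AWS is selling.
MODEL_ID = "amazon.nova-lite-v1:0"  # placeholder; verify against AWS's model list

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId=MODEL_ID,
    messages=[
        {"role": "user", "content": [{"text": "Summarize this week's AI news in one sentence."}]}
    ],
    inferenceConfig={"maxTokens": 256, "temperature": 0.5},
)

print(response["output"]["message"]["content"][0]["text"])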

Amazon looks at AI in a very utilitarian way. “It’s sort of typical classic Amazon for you, which is looking really hard at how people are going to use something and not trying to be flashy for the sake of flashiness or doing demos, but looking actually at how things are getting realized on the ground in a very practical, useful way,” Sharma says. “And you know pretty much every business in existence and every developer who has been working with this for a while is going to tell you that costs are very important.”

A look at our AI 20 series

We’re proud to announce the launch of our second annual AI 20 series, which spotlights the most interesting technologists, entrepreneurs, corporate leaders, and creative thinkers shaping the world of artificial intelligence. Among those profiled as part of the series is AI pioneer Noam Shazeer, who returned to Google this year to help guide the company’s Gemini models toward artificial general intelligence.

When I spoke to Shazeer for the profile, he expressed confidence that Google is in a strong position to emerge as the leader in large frontier models. After all, he pointed out, the company developed and owns much of the technology behind the LLM revolution.

“Google has a culture of empowering every human with all the world’s information, creating trillions of dollars of value for the company and tens of trillions of dollars of value for the world,” he told me. “Replace ‘information’ with ‘intelligence’ and ‘trillions’ with ‘quadrillions,’ and this is my goal for Google’s future.”

ABOUT THE AUTHOR

Mark Sullivan is a senior writer at Fast Company, covering emerging tech, AI, and tech policy. Before coming to Fast Company in January 2016, Sullivan wrote for VentureBeat, Light Reading, CNET, Wired, and PCWorld.
