Are large language models the problem, not the solution?

Why smaller may be better when it comes to AI.

There is an all-out global race for AI dominance. The world’s largest and most powerful companies are investing billions in unprecedented computing power, and the most powerful countries are dedicating vast energy resources to support them. The race is centered on one idea: that transformer-based large language models (LLMs) are the key to winning. What if they are wrong?

What we call intelligence evolved in biological life over hundreds of millions of years, starting with simple single-celled organisms like bacteria interacting with their environment. Life gradually developed into multicellular organisms that learned to seek what they needed and avoid what could harm them. Ultimately, humans emerged with highly complex brains: billions of neurons and exponentially more neural connections, shaped to respond to our needs, interactions, and associations with each other and the world. Creating an artificial form of that likely involves more than cleverly generating language with tools trained on massive repositories of largely uncurated text and marketing it as intelligence.

What if aggregating the vast collective so-called wisdom accumulated on the internet and statistically analyzing it with complex algorithms to mindlessly respond to human prompts is really just an unimaginably expensive and resource-intensive exercise in garbage in, garbage out? At best, it may be a clever chronicler of common wisdom. At worst, it’s an unprecedented and unnecessary waste of resources with potentially harmful consequences. Eerily foreshadowing a critique of current mainstream AI, Immanuel Kant famously wrote in his landmark Critique of Pure Reason, “Thoughts without content are empty, intuitions without concepts are blind.” Put another way, can eons of evolved intelligence really be reduced to the world’s greatest parrot or the mother of all autocompletes?

With all of the global power, hype, and resources behind this one approach, you may have the impression that it is the only viable way to create an artificial form of human intelligence. Fortunately, it is not.

Incrementalism

On the incrementalist end of the spectrum of AI research and development are approaches that seek to make more efficient use of resources, such as pairing small language models (SLMs) with AI agents (https://www.fastcompany.com/91281577/autonomous-ai-agents-are-both-exciting-and-scary) to allow more focused, economical queries and responses (see “Small Language Models are the Future of Agentic AI,” arXiv, https://arxiv.org/abs/2506.02153). The theory is simple: employ flexible, efficient AI agents (software that can autonomously interact with its environment and perform tasks without human supervision) to call on SLMs: smaller, more targeted, and less resource-intensive language models.
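
To make that division of labor concrete, here is a minimal sketch of an agent routing narrow tasks to a small model and escalating everything else to a large one. The function names, task labels, and model calls are illustrative placeholders, not a real API:

```python
# Hypothetical sketch: route narrow tasks to a small language model (SLM),
# escalate open-ended ones to a large model (LLM). The functions below are
# stand-ins, not real model APIs.

def small_model_generate(prompt: str) -> str:
    # Placeholder for a call to a small, task-specific model.
    return f"[SLM answer: {prompt}]"

def large_model_generate(prompt: str) -> str:
    # Placeholder for a call to a large, general-purpose (costlier) model.
    return f"[LLM answer: {prompt}]"

# Tasks assumed narrow enough for a small model (illustrative only).
NARROW_TASKS = {"extract_date", "classify_intent", "summarize_field"}

def agent_route(task: str, prompt: str) -> str:
    """Send well-defined tasks to the SLM; fall back to the LLM."""
    if task in NARROW_TASKS:
        return small_model_generate(prompt)
    return large_model_generate(prompt)

print(agent_route("classify_intent", "Cancel my subscription"))
print(agent_route("plan_strategy", "Plan a product launch"))
```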

The underlying theory is the same for SLMs and LLMs: aggregate data and statistically model it to generate text or other output. SLMs are simply a smaller, more efficient (but inherently more limited) way of doing this. The approach can incorporate additional technology, such as retrieval-augmented generation (RAG), to achieve greater accuracy. RAG can draw on more targeted, verifiable, and, critically, real-time information rather than relying on static (pretrained) data alone.
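
In schematic terms, RAG retrieves relevant documents first and then grounds the model’s prompt in what it found. The sketch below uses naive keyword overlap in place of the vector-embedding search a production system would use; the documents and helper names are hypothetical:

```python
# Minimal RAG sketch: retrieve relevant text, then ground the prompt in it.
# Real systems use vector embeddings; keyword overlap stands in here.

DOCUMENTS = [
    "The refund policy allows returns within 30 days of purchase.",
    "Support hours are 9 a.m. to 5 p.m., Monday through Friday.",
    "Premium plans include priority support and an uptime guarantee.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by how many words they share with the query."""
    words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: -len(words & set(d.lower().split())))
    return ranked[:k]

def build_prompt(query: str) -> str:
    # The model answers from retrieved context, not static training data.
    context = "\n".join(retrieve(query, DOCUMENTS))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

print(build_prompt("What is the refund policy?"))
```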

A whole greater than the sum of its parts

A more significant alternative to LLM and GPT architecture, one that more closely simulates how we think, attempts to replicate evolutionary biology. One company pioneering such work is Softmax (named for a statistical function used in machine learning), led by Twitch cofounder Emmett Shear, who briefly served as interim CEO of OpenAI. The approach is modeled on cellular biology and the idea that individual parts (cells) working in alignment with each other can form a whole with greater coordinated functionality than any of the parts alone. A human being is made up of individual but synchronized cells that, on their own, don’t function like us, yet somehow cohere to let us think and act as human beings. In this approach, AI agents are the equivalent of cells: components that, in theory at least, can work together to form a greater functioning, learning entity.
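
For reference, the softmax function that gives the company its name converts a list of raw scores into probabilities that sum to 1; a minimal illustration:

```python
import math

def softmax(scores: list[float]) -> list[float]:
    # Exponentiate each score, then normalize so the results sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([2.0, 1.0, 0.1]))  # -> roughly [0.66, 0.24, 0.10]
```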

If the current domination of LLM and GPT architecture continues and other innovative approaches fall (or are pushed) by the wayside, it wouldn’t be the first time in the history of computing that commercial forces overruled potentially better alternatives (see Why bad ideas linger in software, Alan Kay, 2012, address to the Congress on the Future of Engineering Software).

Albert Einstein is often quoted as saying that if he had an hour to save the world, he would spend 55 minutes defining the problem and five minutes solving it. The massive entities pushing the current dominant approach to AI development have yet to define the problem they are trying to solve. LLMs and GPT architecture have proven able to perform tasks that people find useful, and they will likely continue to do so. The question is: what, if anything, does that have to do with intelligence, human or otherwise?
