The Future of AI is Small (Language Models)

Lost amidst the rush of media excitement and corporate investment in Large Language Models (LLMs) like OpenAI's ChatGPT or Anthropic's Claude, which consume billions of dollars in training and inference costs, is the question of whether this is the right approach in the first place. The Economist magazine had a recent piece exploring this issue.
To wit, most individuals and businesses don't need the entire ocean of human-written content to perform the task at hand. A programmer working in the Python programming language can do fine with a model tuned for that specific task. Ditto a cardiologist working just in cardiology. A risk manager in a bank. And so on.
It’s not just that Small Language Models (SLMs) can perform almost as well as LLMs these days; there are the additional factors of security and cost. Let’s start with security. Many institutions have large amounts of proprietary data they’d like accessible to workers to speed up everyday tasks. They do not want this information made public, let alone scraped and consumed by LLMs that hungrily crawl all available human text for ever more training data. Tools are rapidly improving, enabling developers to build SLMs using proprietary data and fine-tune models for specific tasks.
Then there is cost. Training costs for frontier-level LLMs currently run in the billions of dollars, and then billions more to serve, since an average chatbot response is far more computationally expensive than a comparable Google search or traditional database-backed response.
SLMs can increasingly be trained using the output of LLMs (a process known as distillation) and act far more nimbly than their bloated LLM cousins. They can also be served in-house by a company’s existing IT hosting solutions, on a user’s laptop, or even on a mobile device like an iPhone. SLMs aren’t quite there on that last point–witness Apple’s aborted recent launch of Apple Intelligence–but it’s likely coming soon.
There are still great risks in the ecosystem if only a handful of companies can train leading LLM models, but the broader sea of SLMs and even broader use cases suggest that AI is rapidly transforming in ways that might make it unrecognizable in a few years’ time.
Finally, left unsaid in the Economist piece, but something I wonder about: what impact will ads have on all this? We know that OpenAI serves around 800 million users every week, with only a small percentage (around 2%, by the last estimate I saw) paying. Given the economics of training and serving these models, that is not sustainable. Ads seem unavoidable.
What happens when LLM makers’ focus turns not to providing the best result, but instead towards turning the knobs on engagement and, by proxy, the number of ads served? We only have to look at Yahoo and then Google to see how this plays out. In the late 1990s, Yahoo.com was the leading search engine and notorious for ads, banners, and a dense directory-style layout. When Google.com launched in 1998, it was almost a blank page with a single search box; the contrast couldn’t have been more striking.
Fast forward to today, where it is difficult even to find an organic search result on Google.com. The top results are now Gemini (its AI offering), paid links, YouTube (which it owns), and then maybe way down at the bottom of the page, the first organic link.
I suspect we will look back on this current age of free, uncluttered LLM chatbots as a fleeting moment. As LLMs become increasingly cluttered and muddied in their goals, SLMs seem poised to gain further traction and reach.