Admittedly, it’s a provocative title. However, the comparison isn’t originally mine; it’s borrowed from Andrew Ng, who first made it back in 2017. While electricity has indeed become a commodity, it’s also an indispensable part of modern life. Will foundational models (e.g., from OpenAI, Anthropic) eventually follow a similar trajectory, becoming as fundamental and taken for granted as electricity, with outputs from different providers as interchangeable as those of utility companies, competing mainly on cost?
In the following sections, I’ll reflect on what justifies the unparalleled investment inflows into foundational model providers and whether there will be an eventual payback or if this is indeed a race to the next commodity.
The market speaks a clear and quite bullish language: the investment fervor surrounding foundational or frontier model startups like OpenAI, Anthropic and others is red hot. With funding reaching astronomical figures in the double-digit billions, it’s clear the stakes are high, and investors’ expectations for groundbreaking advancements and broad adoption of AI are even higher.
Why these huge ticket sizes? The answer lies partly in the past successes of large language models (LLMs), whose performance has improved predictably with increases in model size, compute, and training data—developments that, unsurprisingly, require hefty financial backing.
And that requires more than open-source charity. In a plot twist worthy of a Silicon Valley telenovela, OpenAI, once the open-source angel of the tech world, morphed into its final form: a profit-churning, closed-source Godzilla backed by Microsoft. Not even its own board could prevent this metamorphosis. Meanwhile, Elon Musk's recent lawsuit over OpenAI's fate is the cherry on top.
Who is providing the capital? For starters, we have the usual VCs (e.g., Andreessen Horowitz, Sequoia) and a number of corporations with their Corporate Venture Capital (CVC) arms.
Analyzing the corporate players reveals insights into their AI strategies: for instance, it’s hardly shocking that NVIDIA has become a familiar face on the cap tables of several leading companies. It’s also a strategic move: NVIDIA has a stake in ensuring the tech ecosystem remains a competitive battleground and doesn’t become too concentrated, which is why it has primarily invested in the challengers rather than the “incumbents” (i.e., OpenAI and Anthropic). Although, in a notable gesture, Jensen Huang donated NVIDIA’s first DGX server to OpenAI (then still backed by Musk) as far back as 2016.
Yet, as we project the evolution of frontier models, we encounter a series of complications that show the path forward might not be as straightforward as simply scaling up.
Eroding Differentiation
On the surface, LLMs built on transformer architectures seem to be approaching a performance plateau. Despite more training data and compute, further investments in these models are yielding diminishing returns. This trend suggests that foundational models may become less of a market differentiator in the future. In a market where every model offers similar levels of excellence, competition would shift from “who’s better” to “who’s more cost-effective,” similar to the commoditization of electricity.
Another model or architecture breakthrough (or even entirely new concepts, like liquid neural networks or hybrid approaches) might unlock the next leap in performance (and give the inventor an edge).
Competitive pressure and free alternatives
The competitive landscape is fiercely crowded, with over 25 public LLMs offered by a diverse array of providers, hinting at an upcoming market consolidation.
Additionally, open-source models are gaining traction, particularly those from Meta, which challenge proprietary models by providing high-quality alternatives at no cost. Let me emphasize: one of the leading AI research companies, Meta, is giving away its models for free.
Why would Meta offer its models for free? While there are plenty of reasons for open sourcing software, a few stick out for Meta:
It is also worth noting that Llama’s license agreements explicitly restrict large competitors from using the models and forbid using Llama outputs to improve competing models.
Low defensibility and lock-in challenges
From a technical user standpoint, switching between LLM APIs is as simple as changing a line of code. However, integrating a new model into existing applications requires thorough re-testing and re-tuning, a task easier said than done (and the source of many new startups trying to tackle LLM validation and ops).
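To make the "one line of code" point concrete: if calls are routed through a thin wrapper, the provider becomes a single config value. Below is a minimal sketch of that pattern; the backend functions are hypothetical stand-ins for the real OpenAI and Anthropic SDK calls, which take different parameters and return different response objects in practice (exactly the kind of mismatch that makes re-testing necessary).

```python
# Sketch: abstracting LLM providers behind one wrapper, so switching
# is a one-line config change. The backends here are STUBS standing in
# for real SDK calls (e.g., OpenAI's chat.completions.create or
# Anthropic's messages.create).
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class LLMConfig:
    provider: str  # "openai" or "anthropic" -- the one line you change
    model: str


def openai_stub(model: str, prompt: str) -> str:
    # Stand-in for a real OpenAI SDK call.
    return f"[openai:{model}] response to: {prompt}"


def anthropic_stub(model: str, prompt: str) -> str:
    # Stand-in for a real Anthropic SDK call.
    return f"[anthropic:{model}] response to: {prompt}"


BACKENDS: Dict[str, Callable[[str, str], str]] = {
    "openai": openai_stub,
    "anthropic": anthropic_stub,
}


def complete(cfg: LLMConfig, prompt: str) -> str:
    # Route the request to whichever provider the config names.
    return BACKENDS[cfg.provider](cfg.model, prompt)


# Switching providers is just this one line:
cfg = LLMConfig(provider="openai", model="gpt-4")
print(complete(cfg, "Summarize our Q3 results"))
```

The easy part is the routing shown above; the hard part is that prompts tuned for one model rarely behave identically on another, which is why the swap still triggers a full evaluation cycle.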
Because of this, companies like OpenAI are working hard to create ecosystems that encourage developer and user loyalty through services that offer higher lock-in, such as custom models, fine-tuning capabilities and marketplaces (e.g., custom GPTs and plugins).
Specialization vs. Generalization
While the advantage of using state-of-the-art models like GPT-4 for prototyping is undeniable (they are called “frontier” for a reason), many specialized applications can achieve satisfactory results with smaller (and cheaper!) fine-tuned models. This nuance suggests that foundational model providers might be overlooking significant opportunities in the production and inference segments going forward, potentially missing out on valuable revenue and data.
Despite these challenges, it’s premature to dismiss foundational models as commodities. They continue to evolve, showing improvements in cost-efficiency and performance, along with unpredictable emergent capabilities that defy expectations. These breakthroughs are a recurring reminder of AI’s inherent potential to surprise and outperform, even in areas previously thought years away (e.g., Sora, or at the time of its release: ChatGPT).
And as progress in AI remains highly unpredictable, it is entirely possible that the next performance leap, based on a new or existing architecture, could happen tomorrow. Countless new papers published every day (and an increasing number kept private) could be the seed for it (e.g., 1-bit LLMs, to name just one).
One might wonder whether there is a reasonable performance limit for models. Can a model become “too smart” to be applicable? I don’t think so. For many use cases (e.g., trading or business strategy, essentially everything within game theory), a better model will get better results. Therefore, as model capabilities advance, the company with the best frontier model will be able to serve use cases that others cannot.
A classic strategy that every decent McKinsey slide will warmly recommend when corporates are building new ventures: Make it an ecosystem!
Successful ecosystems can create a moat against the competition, lock in users (reducing churn) and unlock new revenue streams. Here are some specific examples of how foundational model providers can leverage ecosystems to achieve these benefits:
Lock in end-users (B2C):
Lock in devs and B2Bs:
Collect user feedback and data (aka fire up the data flywheel)
If you can’t beat them at the same game, play a different one! The ability to stand out and offer unique solutions will be critical, especially for more “niche” foundational model companies. Some examples of how to differentiate:
1. Support different languages or niche use cases:
Support a wider range of languages, catering to users from diverse linguistic backgrounds, e.g., Aleph Alpha (German) and Mistral (French), or possibly a model for Chinese (from Alibaba?)
2. Optimize for compliance:
E.g., optimize the frontier model for compliance with (future) EU regulations or "sovereign" LLMs located in certain jurisdictions
3. Other functional and non-functional differentiators:
E.g., develop models with explainable AI capabilities (like Aleph Alpha promotes), enhanced IP protection or more seamless cloud/enterprise integration (see also ecosystem ideas above)
The meteoric rise of foundational AI models and the massive investments pouring into their development make for a high-risk, high-reward game.
The fierce competition between foundational model startups will continue and will likely be repeatedly disrupted by future breakthroughs, as transformers are probably not the last architecture.
In all of that, only one thing is for sure: What an exciting time to be alive!