Admittedly, it’s a provocative title. However, the comparison isn’t originally mine; it’s borrowed from Andrew Ng, who first made it back in 2017. While electricity has indeed become a commodity, it’s also an indispensable part of modern life. Will foundational models (e.g., from OpenAI, Anthropic) eventually follow a similar trajectory, becoming as fundamental and taken for granted as electricity, with outputs from different providers as interchangeable as those of utility companies, competing mainly on cost?
In the following sections, I’ll reflect on what justifies the unparalleled investment inflows into foundational model providers and whether there will be an eventual payback or if this is indeed a race to the next commodity.
The market speaks a clear and quite bullish language: the investment fervor surrounding foundational or frontier model startups like OpenAI, Anthropic and others is red hot. With funding reaching astronomical figures in the double-digit billions, it’s clear the stakes are high, and investors’ expectations for groundbreaking advancements and broad adoption of AI are even higher.
Why these huge ticket sizes? The answer lies partly in the past successes of large language models (LLMs), whose performance has improved predictably with increases in model size, compute, and training data—developments that, unsurprisingly, require hefty financial backing.
And that requires more than open-source charity. In a plot twist worthy of a Silicon Valley telenovela, OpenAI, once the open-source angel of the tech world, morphed into its final form: a profit-churning, closed-source Godzilla backed by Microsoft. Not even its own board could prevent this metamorphosis. Meanwhile, Elon Musk's recent lawsuit over OpenAI's fate is the cherry on top.
Who is providing the capital? For starters, we have the usual VCs (e.g., Andreessen Horowitz, Sequoia) and a number of corporations with their Corporate Venture Capital (CVC) arms.
Analyzing the corporate players reveals insights into their AI strategies: for instance, it’s hardly shocking that NVIDIA has become a familiar face on the cap tables of several leading companies. It’s also a strategic move: NVIDIA has a stake in ensuring the tech ecosystem remains a competitive battleground and doesn’t become too concentrated, which is why it has primarily invested in the challengers rather than the “incumbents” (i.e., OpenAI and Anthropic). Although, in a notable gesture, Jensen Huang donated NVIDIA’s first DGX server to OpenAI (then still backed by Musk) as far back as 2016.
Yet, as we project the evolution of frontier models, we encounter a series of complications that show the path forward might not be as straightforward as simply scaling up.
Eroding Differentiation
On the surface, LLMs built on transformer architectures seem to be approaching a performance plateau. Despite more training data and compute, further investments in these models are yielding diminishing returns. This trend suggests that foundational models may become less of a market differentiator in the future. In a market where every model offers similar levels of excellence, competition would shift from “who’s better” to “who’s more cost-effective,” similar to the commoditization of electricity.
Another model or architecture breakthrough (or even entirely new concepts, like liquid neural networks or hybrid approaches) might unlock the next leap in performance (and give the inventor an edge).
Competitive pressure and free alternatives
The competitive landscape is fiercely crowded, with over 25 public LLMs offered by a diverse array of providers, hinting at an upcoming market consolidation.
Additionally, open-source models are gaining traction, particularly those from Meta, which challenge proprietary models by providing high-quality alternatives at no cost. Let me emphasize: one of the leading AI research companies, Meta, is giving away its models for free.
Why would Meta offer its models for free? While there are plenty of reasons for open sourcing software, a few stick out for Meta:
It is also worth noting that Llama’s license agreements explicitly restrict large competitors from using the models and forbid using Llama outputs to improve competing models.
Low defensibility and lock-in challenges
From a technical user standpoint, switching between LLM APIs is as simple as changing a line of code. However, integrating a new model into existing applications requires thorough re-testing and re-tuning, a task easier said than done (and the source of many new startups trying to tackle LLM validation and ops).
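To make the "one line of code" point concrete: if calls are routed through a thin wrapper, the provider becomes a single config value. Below is a minimal sketch of that pattern; the backend functions are hypothetical stand-ins for the real OpenAI and Anthropic SDK calls, which take different parameters and return different response objects in practice (exactly the kind of mismatch that makes re-testing necessary).

```python
# Sketch: abstracting LLM providers behind one wrapper, so switching
# is a one-line config change. The backends here are STUBS standing in
# for real SDK calls (e.g., OpenAI's chat.completions.create or
# Anthropic's messages.create).
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class LLMConfig:
    provider: str  # "openai" or "anthropic" -- the one line you change
    model: str


def openai_stub(model: str, prompt: str) -> str:
    # Stand-in for a real OpenAI SDK call.
    return f"[openai:{model}] response to: {prompt}"


def anthropic_stub(model: str, prompt: str) -> str:
    # Stand-in for a real Anthropic SDK call.
    return f"[anthropic:{model}] response to: {prompt}"


BACKENDS: Dict[str, Callable[[str, str], str]] = {
    "openai": openai_stub,
    "anthropic": anthropic_stub,
}


def complete(cfg: LLMConfig, prompt: str) -> str:
    # Route the request to whichever provider the config names.
    return BACKENDS[cfg.provider](cfg.model, prompt)


# Switching providers is just this one line:
cfg = LLMConfig(provider="openai", model="gpt-4")
print(complete(cfg, "Summarize our Q3 results"))
```

The easy part is the routing shown above; the hard part is that prompts tuned for one model rarely behave identically on another, which is why the swap still triggers a full evaluation cycle.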
Because of this, companies like OpenAI are working hard to create ecosystems that encourage developer and user loyalty through services that offer higher lock-in, such as custom models, fine-tuning capabilities and marketplaces (e.g., custom GPTs and plugins).
Specialization vs. Generalization
While the advantage of using state-of-the-art models like GPT-4 for prototyping is undeniable (they are called “frontier” for a reason), many specialized applications can achieve satisfactory results with smaller (and cheaper!) fine-tuned models. This nuance suggests that foundational model providers might be overlooking significant opportunities in the production and inference segments going forward, potentially missing out on valuable revenue and data.
Despite these challenges, it’s premature to dismiss foundational models as commodities. They continue to evolve, showing improvements in cost-efficiency and performance, along with unpredictable emergent capabilities that defy expectations. These breakthroughs are a recurring reminder of AI’s inherent potential to surprise and outperform, even in areas previously thought years away (e.g., Sora, or at the time of its release: ChatGPT).
And as progress in AI remains highly unpredictable, it is entirely possible that the next performance leap, based on a new or existing architecture, could happen tomorrow. Countless new papers published every day (and an increasing number kept private) could be the seed for it (e.g., 1-bit LLMs, to name just one).
One might wonder whether there is a reasonable performance limit for models. Can a model become “too smart” to be applicable? I don’t think so. For many use cases (e.g., trading or business strategy, essentially everything within game theory), a better model will get better results. Therefore, as model capabilities advance, the company with the best frontier model will be able to serve use cases that others cannot.
A classic strategy that every decent McKinsey slide will warmly recommend when corporates are building new ventures: Make it an ecosystem!
Successful ecosystems can create a moat against the competition, lock in users (reducing churn) and unlock new revenue streams. Here are some specific examples of how foundational model providers can leverage ecosystems to achieve these benefits:
Lock in end-users (B2C):
Lock in devs and B2Bs:
Collect user feedback and data (aka fire up the data flywheel)
If you can’t beat them at the same game, play a different one! The ability to stand out and offer unique solutions will be critical, especially for more “niche” foundational model companies. Some examples of how to differentiate:
1. Support different languages or niche use cases:
Support a wider range of languages, catering to users from diverse linguistic backgrounds, e.g., Aleph Alpha (German) and Mistral (French), or possibly a model for Chinese (from Alibaba?)
2. Optimize for compliance:
E.g., optimize the frontier model for compliance with (future) EU regulations or "sovereign" LLMs located in certain jurisdictions
3. Other functional and non-functional differentiators:
E.g., develop models with explainable AI capabilities (like Aleph Alpha promotes), enhanced IP protection or more seamless cloud/enterprise integration (see also ecosystem ideas above)
The meteoric rise of foundational AI models and the massive investments pouring into their development make for a high-risk, high-reward game.
The fierce competition between foundational model startups will continue and will likely be repeatedly disrupted by future breakthroughs, as transformers are probably not the last architecture.
In all of that, only one thing is for sure: What an exciting time to be alive!