Foundation Model
A large AI model trained on vast amounts of general data, designed to be the starting point for many different applications rather than built for a single task.
Added May 17, 2026 · 2 min read
For most of AI's history, you built a separate model for each task. One model to translate languages. Another to summarise documents. Another to answer customer service questions. Each required its own data, its own training, its own maintenance. This was expensive and slow, and each model was brittle - good at one thing, useless at everything else.
Foundation models changed this. The idea is to train one very large model on an enormous, diverse dataset - text from books, websites, code, scientific papers, conversations - so that it develops general capabilities that can then be adapted to almost any task. Instead of starting from scratch for each application, you start from this shared foundation.
The scale of training involved is hard to overstate. Building a frontier foundation model requires months of computation across thousands of specialised chips, at a cost that can run to hundreds of millions of dollars. This is why only a handful of organisations - Anthropic, OpenAI, Google, Meta, a few others - are actually building models at the frontier. Almost everyone else is building on top of them.
Once a foundation model exists, it can be adapted in different ways: fine-tuned on specialised data, given access to specific tools, or simply prompted in different ways for different purposes. A single foundation model might power a coding assistant, a customer service bot, a document analysis tool, and a creative writing app - all at the same time, across different companies.
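The simplest of these adaptations, prompting one shared model differently for different purposes, can be illustrated with a minimal sketch. Everything in it is hypothetical: `toy_foundation_model` is a stand-in that only echoes its prompt, and the `Application` class and its instructions are invented for illustration. A real product would call an actual model, typically through a provider's API.

```python
from dataclasses import dataclass
from typing import Callable


def toy_foundation_model(prompt: str) -> str:
    """Hypothetical stand-in for a foundation model: it just echoes the prompt."""
    return f"[model output for: {prompt}]"


@dataclass
class Application:
    """A product built on top of a shared model (illustrative only)."""
    name: str
    instructions: str              # task-specific prompt wrapped around user input
    model: Callable[[str], str]    # the shared foundation model

    def run(self, user_input: str) -> str:
        # Adaptation by prompting: same model, different instructions.
        prompt = f"{self.instructions}\n\nInput: {user_input}"
        return self.model(prompt)


# Two different products, both backed by the same underlying model.
coding_assistant = Application(
    "coding assistant", "Explain what this code does.", toy_foundation_model
)
support_bot = Application(
    "customer service bot", "Answer the customer's question politely.", toy_foundation_model
)

print(coding_assistant.run("print(1 + 1)"))
print(support_bot.run("Where is my order?"))
```

Both applications hold a reference to the same model; only the instructions differ. Fine-tuning and tool access change more than the prompt, so they are not covered by this sketch.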
The foundation model approach has concentrated enormous power in a small number of model providers, which is one of the defining tensions in the AI industry today. Access to foundation models - how much they cost, who can use them, what restrictions apply - shapes what AI products are even possible to build.
Analogy
A Swiss Army knife versus a drawer full of single-purpose tools. A foundation model is the Swiss Army knife - broad enough to be adapted for many different jobs. Fine-tuning is like sharpening one of the blades for a specific purpose. The underlying tool is the same; the specialisation makes it better at particular tasks.
Real-world example
When a hospital builds an AI system to assist with reading medical scans, it rarely builds a model from zero. It starts with a foundation model that already handles language and images in general, then fine-tunes it on medical data. The foundation provides the base capability; the specialised training provides the domain expertise.
Why it matters
The foundation model paradigm is what determines who has structural power in AI. The organisations that can afford to train frontier foundation models set the terms for everything built on top of them - pricing, access, acceptable use, safety constraints. It is a small group, and the gap between them and everyone else is getting wider, not narrower.
Related concepts
Fine-tuning
Taking a general-purpose AI model and giving it additional training on a specific subject, so it becomes noticeably better at that particular domain.
RLHF (Reinforcement Learning from Human Feedback)
A training technique that teaches AI to produce responses humans actually prefer, by having real people rate different outputs and using those ratings to improve the model.
Transformer
The AI architecture that powers virtually every major language model today - the underlying design that makes GPT, Claude, Gemini, and most other modern AI systems work.