latentbrief

Concept

Vector Database

A database built to store and search content by meaning rather than exact words - the engine that powers AI search and most retrieval systems.

A traditional database is very good at finding exact matches. If you search for "quarterly revenue report 2024," it will find every document that contains those exact words. What it cannot do is find documents about "Q4 financial performance" or "annual earnings summary" - because the words are different, even though the meaning is essentially the same. For many real-world searches, this limitation matters enormously.

Vector databases solve this by storing content not as words but as numerical representations of meaning - the same embeddings that AI models produce when they process text. Each document, paragraph, or sentence gets converted into a long list of numbers that captures what it means, not just what words it uses. Documents about similar topics produce similar numbers, regardless of the specific vocabulary used.

When you search a vector database, your query also gets converted into this numerical form, and the database finds the stored items whose numbers are closest to yours. "Quarterly revenue report 2024" and "Q4 financial performance" produce similar numbers, so they match each other - even though they share no words. This meaning-based search is dramatically more useful for finding relevant content in large document collections.
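The matching step described above can be sketched in a few lines of Python. The vectors here are hand-made toys with three dimensions so the idea is visible at a glance - real embeddings come from a model and have hundreds or thousands of dimensions - but the search logic (convert everything to vectors, then rank by similarity) is the same:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 = more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings"; the dimensions loosely mean [finance, time-period, food].
store = {
    "quarterly revenue report 2024": [0.90, 0.80, 0.00],
    "Q4 financial performance":      [0.85, 0.75, 0.05],
    "best pizza recipes":            [0.00, 0.10, 0.95],
}

# Pretend embedding of the query "annual earnings summary".
query = [0.88, 0.70, 0.00]

# The database returns the stored item whose vector is closest to the query's.
best = max(store, key=lambda doc: cosine_similarity(query, store[doc]))
print(best)  # → quarterly revenue report 2024
```

Note that the query shares no words with either finance document, yet both score far above the pizza one - the match happens entirely in vector space.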

Vector databases are specifically engineered to do this kind of search very quickly across millions or billions of stored items. Finding which of a million documents is most similar to a query has to happen in milliseconds for it to be useful in a real application, and that speed requires specialised data structures - approximate nearest-neighbour indexes such as HNSW graphs - that regular databases are not built for.
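To see why specialised indexes are needed, here is the naive alternative: an exact, brute-force scan over a stand-in corpus of random vectors. This is a sketch, not how a production vector database works - it exists to show the linear scan that approximate nearest-neighbour indexes are designed to avoid:

```python
import math
import random

random.seed(0)
DIM, N = 16, 5000  # tiny compared with a real collection

# Stand-in corpus of random vectors; in practice these would be
# embeddings produced by a model.
corpus = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]

def top_k(query, vectors, k=3):
    """Exact nearest-neighbour search by cosine similarity.

    This compares the query against EVERY stored vector, so cost grows
    linearly with collection size. Vector databases replace this scan
    with approximate indexes (e.g. HNSW graphs or IVF partitions) that
    trade a little accuracy for sub-linear search time.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(x * x for x in b)))
    ranked = sorted(range(len(vectors)),
                    key=lambda i: cos(query, vectors[i]), reverse=True)
    return ranked[:k]

# Sanity check: a stored vector's nearest neighbour is itself.
print(top_k(corpus[42], corpus))  # first result is index 42
```

At 5,000 vectors the scan is instant; at a billion it is not, which is the gap the index structures close.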

The most important use case for vector databases right now is powering the retrieval step in RAG systems. When an AI needs to find the most relevant documents before answering your question, it searches a vector database. The quality of that search - how well the database finds genuinely relevant content - directly determines how accurate and useful the AI's answer will be.

Analogy

A library organised not by author or title, but by what the books are actually about. You walk in and describe the feeling you are looking for - "a slow-paced mystery set in a cold climate with an unreliable narrator" - and the librarian finds the closest match immediately, even if no book has all those words in its title. The library is organised by meaning, not by name.

Real-world example

Notion AI, which lets you ask questions across all your Notion pages, uses a vector database behind the scenes. Every page you write gets stored as a numerical representation of its meaning. When you ask a question, the system finds the pages whose meanings are most similar to your question and uses those to generate the answer.

Why it matters

Vector databases are the storage infrastructure behind AI-powered search, RAG systems, and recommendation engines. As more applications are built on top of AI, the vector database market is growing fast - it is the layer that makes it possible for AI to find relevant information at scale, which is the foundation of most practical AI applications.

