latentbrief
Deep Learning

Machine learning using neural networks with many layers - the approach behind almost every significant AI breakthrough of the past decade.

Added May 21, 2026 · 2 min read

The word "deep" in deep learning refers to the number of layers: neural networks with many layers stacked on top of each other. A shallow network might have one or two hidden layers between the input and output. A deep network might have dozens or hundreds, and experimental architectures have been trained with more than a thousand.

Why does depth matter? Depth enables hierarchical abstraction. Each layer in a deep network learns features built from the outputs of the layer before it, so the network assembles increasingly complex representations from simpler ones. In vision, early layers respond to edges and textures, while later layers respond to object parts and whole objects. In language, early layers capture individual words, while later layers capture meaning, context, and intent.
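The stacking idea can be sketched in a few lines of plain Python. This is a minimal illustration, not a trained model: the layer sizes, the random weights, and the helper names (`dense`, `make_layer`, `forward`) are all assumptions made for the example, and ReLU is applied after every layer for simplicity.

```python
import random

def relu(values):
    """Element-wise ReLU: negative entries become zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative values, not learned)."""
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def forward(inputs, layers):
    """Feed each layer the previous layer's output, keeping every intermediate result."""
    activations = [inputs]
    for weights, biases in layers:
        activations.append(relu(dense(activations[-1], weights, biases)))
    return activations

rng = random.Random(0)
sizes = [8, 6, 4, 2]  # hypothetical: 8 inputs followed by three layers
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]
activations = forward([rng.random() for _ in range(sizes[0])], layers)

for depth, act in enumerate(activations):
    print(f"depth {depth}: {len(act)} values")
```

Each entry in `activations` is computed only from the entry before it, which is the sense in which later layers build on earlier ones.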

Researchers had known for decades that deep networks could in principle be trained, but in practice training them rarely worked. Networks with many layers suffered from a problem called vanishing gradients: during training, the error signal that flows backward through the network to update the weights is scaled by a factor at each layer, and that factor was often well below one. The signal became weaker with every layer it passed through, leaving the earliest layers nearly unable to learn.
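A short derivation shows where the shrinkage comes from. It assumes a simplified chain with one unit per layer, sigmoid activation σ, and the symbols a_l, z_l, w_l, b_l introduced here for illustration only:

```latex
% Simplified scalar chain: a_l = \sigma(z_l), \quad z_l = w_l a_{l-1} + b_l, \quad l = 1, \dots, L
\frac{\partial \mathcal{L}}{\partial w_1}
  = \frac{\partial \mathcal{L}}{\partial a_L}
    \left( \prod_{l=2}^{L} w_l \, \sigma'(z_l) \right)
    \sigma'(z_1) \, a_0

% The sigmoid derivative satisfies \sigma'(z) = \sigma(z)\,(1 - \sigma(z)) \le \tfrac{1}{4}, so
\left| \frac{\partial \mathcal{L}}{\partial w_1} \right|
  \le \left| \frac{\partial \mathcal{L}}{\partial a_L} \right|
      \left( \tfrac{1}{4} \right)^{L}
      |a_0| \prod_{l=2}^{L} |w_l|
```

If the weights have magnitude at most one, the gradient reaching the first layer shrinks at least as fast as (1/4)^L. For L = 20 that factor is roughly 10^{-12}, which is why the earliest layers barely moved.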

Several breakthroughs unlocked deep learning in the early 2010s. Better activation functions (ReLU replaced sigmoid), better weight initialisation, and dramatic increases in compute and data were all important. So was the GPU: graphics processing units turned out to be extraordinarily efficient at the parallel matrix operations that neural networks require, accelerating training by orders of magnitude.
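The effect of swapping sigmoid for ReLU can be checked numerically. The sketch below is a simplification under stated assumptions: it tracks only the activation derivatives along a single path, ignores the weights, and uses a hypothetical pre-activation value of 0.5 in each of 20 layers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid; never larger than 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

def backprop_factor(grad_fn, pre_activations):
    """Product of activation derivatives along one path through the layers."""
    factor = 1.0
    for z in pre_activations:
        factor *= grad_fn(z)
    return factor

pre_activations = [0.5] * 20  # hypothetical: same positive value in all 20 layers
sigmoid_factor = backprop_factor(sigmoid_grad, pre_activations)
relu_factor = backprop_factor(relu_grad, pre_activations)

print(f"sigmoid factor after 20 layers: {sigmoid_factor:.3e}")
print(f"relu factor after 20 layers:    {relu_factor:.3e}")
```

With sigmoid the product collapses toward zero, while with ReLU it stays at one as long as the units are active. The trade-off is that a ReLU unit with a negative input passes no gradient at all.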

The defining moment came in 2012, when AlexNet - a deep convolutional neural network - won the ImageNet competition with a top-5 error rate of 15.3%, against 26.2% for the runner-up, a margin that shocked researchers. Computer vision transformed almost immediately. Language processing followed with the introduction of the transformer architecture in 2017. By the early 2020s, deep learning dominated essentially every area of machine learning research and application.

Analogy

Picture the difference between a one-level sorting warehouse and a multi-floor operation where each floor refines what the previous floor delivered. The first floor sorts packages by country. The next by state. The next by city. The final floor sorts by exact address. Each level builds on the precision achieved by the one before it. Deep learning does the same with data: each layer refines the representation built by the previous one.

Real-world example

GPT-4, Claude, and Gemini are all deep learning systems. They are built on transformers, which in large models stack dozens of layers or more. That depth, together with scale in data and parameters, helps them track context, maintain coherence across long conversations, and reason about complex topics. The same basic approach, at much smaller scale, powers the photo-tagging feature in your phone's gallery.

Why it matters

Deep learning is the technology that made the current AI era possible. Before it, AI systems were brittle, narrow, and required enormous human effort to design. Deep learning enabled AI to learn from raw data at scale - which is why progress has accelerated so dramatically since 2012. Nearly every AI capability that seems remarkable today is built on top of deep learning.
