Concept

Graph Convolutional Network

The foundational GNN architecture that adapts convolutional operations to graph-structured data by treating each node's new representation as a weighted sum of its own features and its neighbours' features - making deep learning on graphs tractable.

Added May 18, 2026

Graph Convolutional Networks (GCN), introduced by Kipf and Welling in 2016, are the most widely cited and influential GNN architecture. They provide an efficient approximation to spectral graph convolution that is both computationally tractable and empirically effective across many graph learning tasks.

The GCN layer update rule is elegantly simple: the new representation of each node is computed as a linear transformation of the normalised sum of the node's own features and its neighbours' features. In matrix form: H' = sigma(D^(-1/2) A_hat D^(-1/2) H W), where A_hat is the adjacency matrix with added self-loops, D is the degree matrix, H is the node feature matrix, W is the learned weight matrix, and sigma is a non-linear activation. The normalisation by degree prevents high-degree nodes from dominating the aggregation.

The intuition: each GCN layer mixes a node's features with its immediate neighbours. After one layer, each node knows about its 1-hop neighbourhood. After two layers, each node has integrated information from its 2-hop neighbourhood. With L layers, the receptive field extends L hops. This is the graph analogue of a convolution kernel's receptive field growing with layers in a CNN.

GCN was originally motivated from a spectral perspective - graph convolution can be defined as multiplication in the graph frequency domain (the eigenspace of the graph Laplacian). The GCN layer is a first-order approximation to this spectral convolution that avoids the expensive eigenvector computation. The connection to spectral graph theory gives GCN theoretical grounding and motivated subsequent spectral-based GNN research.

Limitations: GCN uses transductive learning - the entire graph (including test nodes) must be present during training to compute the normalised adjacency matrix. It cannot be directly applied to new, unseen graphs. GraphSAGE and other inductive methods address this by learning aggregation functions that can generalise to new nodes and graphs. GCN also uses uniform aggregation of all neighbours, treating every neighbour equally regardless of relevance - Graph Attention Networks address this by weighting neighbours by learned attention scores.

Despite these limitations, GCN remains the standard baseline for graph learning tasks and is often competitive with or superior to more complex architectures on benchmark datasets. Its simplicity makes it a reliable choice when graph structure is not too heterogeneous.

Analogy

Smoothing a surface by replacing each point's value with the average of itself and its neighbours - like Gaussian blur in image processing. A single GCN layer performs a similar smoothing on the graph: each node's features become a blend of its own features and its neighbours' features. With multiple layers, the smoothing extends further. Just as CNN layers detect increasingly global patterns in images through stacked local convolutions, GCN layers detect increasingly extended structural patterns in graphs through stacked neighbourhood aggregations.

Real-world example

Node classification on a citation network (Cora dataset): each node is a paper with a bag-of-words feature vector; edges represent citations; the task is to classify each paper into a topic category. A 2-layer GCN achieves ~81% accuracy on this task by propagating features through the citation graph. A paper cited by many AI papers will have its representation influenced by those AI papers' features after one GCN layer, making its own classification easier even if its own abstract is ambiguous.

Why it matters

GCN introduced the recipe that made deep learning on graphs practical: simple, efficient neighbourhood aggregation with a theoretical foundation. Understanding GCN is the entry point into the GNN literature, providing the baseline against which all subsequent architectures are evaluated and the framework for understanding the design choices that distinguish later methods.

In the news

No recent coverage - check back later.

Related concepts

Graph Attention Network Graph Neural Network Message Passing

← Back to concepts