VIII · Graph Neural Networks · Advanced

Node Classification

The task of predicting a label or category for each node in a graph, based on the node's own features and the structure of its connections to other nodes.

Added May 21, 2026 · 2 min read

Node classification is one of the three fundamental tasks in graph machine learning, alongside link prediction and graph classification. The goal is to assign each node a label - a category, a score, or a continuous value - by learning from both the node's own attributes and its position in the graph.

The key insight is that a node's label often depends not just on what it is, but on what it is connected to. In a social network, a person's political views correlate with those of their connections. In a citation network, a paper's research area correlates with the papers it cites and is cited by. In a fraud detection graph, a suspicious account is more likely to be connected to other suspicious accounts. These patterns are captured by the homophily assumption: connected nodes tend to be similar.
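
One common way to check how strongly a labelled graph follows the homophily assumption is to measure the fraction of edges whose endpoints share a label. The sketch below is illustrative only: the function name `edge_homophily` and the toy fraud graph are made up for this example, and the graph is assumed to be undirected with each edge listed once.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints carry the same label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical toy graph: five accounts, each labelled "fraud" or "ok".
labels = {0: "fraud", 1: "fraud", 2: "ok", 3: "ok", 4: "ok"}
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (2, 4)]

ratio = edge_homophily(edges, labels)
print(ratio)  # 4 of the 5 edges join same-label nodes, so this prints 0.8
```

A ratio near 1 suggests that neighbours are a strong signal for a node's label; a ratio near 0 points to heterophily, where connected nodes tend to differ.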

Graph Neural Networks exploit this by iteratively aggregating information from each node's neighbourhood. After several rounds of aggregation, a node's representation encodes not just its own features but the collective characteristics of its local neighbourhood. A classifier can then use this enriched representation to predict the node's label.
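
The aggregation idea can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not a real GNN layer: it uses unweighted mean aggregation with no learned weight matrices or nonlinearities, and the `predict` function is a stand-in argmax readout rather than a trained classifier. The graph, features, and class names are hypothetical.

```python
def aggregate(features, neighbours):
    """One round: replace each node's vector with the mean of its own
    vector and its neighbours' vectors (no learned weights)."""
    updated = {}
    for node, vec in features.items():
        group = [vec] + [features[n] for n in neighbours[node]]
        updated[node] = [sum(col) / len(group) for col in zip(*group)]
    return updated


def predict(vec, classes=("A", "B")):
    """Stand-in classifier: pick the class whose feature dimension is largest."""
    return classes[vec.index(max(vec))]


# Hypothetical toy graph: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
neighbours = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
              3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}

# Node 2 sits in the "A" triangle, but its own features lean towards "B".
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.4, 0.6],
            3: [0.0, 1.0], 4: [0.0, 1.0], 5: [0.0, 1.0]}

hidden = features
for _ in range(2):  # two rounds of neighbourhood aggregation
    hidden = aggregate(hidden, neighbours)

print(predict(features[2]))  # from raw features alone: B
print(predict(hidden[2]))    # after aggregation: A
```

Node 2 is classified differently once its neighbourhood is taken into account, which is the effect the paragraph above describes. Practical GNN layers add trainable transformations at each round so the model can learn which features and neighbours matter.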

Node classification has important practical applications. In biology, it predicts protein function based on protein-protein interaction networks. In e-commerce, it identifies fake product listings based on seller networks. In knowledge graphs, it assigns types to entities based on their relationships. In academic networks, it classifies researchers by field.

In practice, node classification on real graphs often has to cope with class imbalance (fraud is rare), heterophily (connected nodes are sometimes dissimilar, which breaks the homophily assumption), and missing labels (most graphs are only partially labelled).
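
Two of these challenges, class imbalance and missing labels, are often handled in the training loss. The sketch below shows one possible approach, assuming inverse-frequency class weights and a cross-entropy loss computed only over labelled nodes. The helper names, the toy labels, and the predicted probabilities are all hypothetical, and unlabelled nodes are marked with `None`.

```python
import math
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights, computed over labelled nodes only."""
    known = [y for y in labels.values() if y is not None]
    counts = Counter(known)
    return {c: len(known) / (len(counts) * n) for c, n in counts.items()}


def masked_weighted_loss(probs, labels, weights):
    """Weighted cross-entropy over labelled nodes; unlabelled nodes are skipped."""
    total, norm = 0.0, 0.0
    for node, y in labels.items():
        if y is None:  # missing label: contributes nothing to the loss
            continue
        w = weights[y]
        total += -w * math.log(probs[node][y])
        norm += w
    return total / norm


# Hypothetical graph: one fraud node, three ok nodes, two unlabelled nodes.
labels = {0: "fraud", 1: "ok", 2: "ok", 3: "ok", 4: None, 5: None}

# Hypothetical model outputs: a class probability distribution per node.
probs = {n: {"fraud": 0.3, "ok": 0.7} for n in labels}

weights = class_weights(labels)
loss = masked_weighted_loss(probs, labels, weights)
print(weights["fraud"], loss)
```

With these weights, the single fraud node counts as much as the three ok nodes combined, so a model cannot reach a low loss simply by predicting the majority class everywhere.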

Analogy

Predicting someone's profession without asking them directly, using only who they know. If most of a person's connections are doctors, they are likely a doctor themselves. Node classification is this intuition made formal and applied at scale across any graph structure.

Real-world example

Twitter/X uses a form of node classification to identify bot accounts. Each account (node) has features like posting frequency and follower ratios. The graph structure - who follows whom, who interacts with whom - provides additional signal. Accounts that cluster with known bots in the interaction graph are more likely to be bots themselves.

Why it matters

Node classification turns graphs from descriptive structures into predictive tools. Any domain where data naturally forms a network - social systems, biological systems, financial systems, knowledge systems - can use node classification to label the entities in that network based on collective patterns, not just individual features.

Related concepts

Graph Attention Network

A GNN variant that learns to assign different importance weights to different neighbours during aggregation - letting the model focus on the most relevant connections rather than treating all neighbours equally.

Graph Neural Network

A class of deep learning models designed to operate directly on graph-structured data - learning representations that capture both the features of individual nodes and the structural relationships between them.

Link Prediction

The graph learning task of predicting whether a connection should exist between two nodes - used to discover unknown relationships, recommend new connections, and complete incomplete knowledge graphs.