VIII · Graph Neural Networks · Advanced

Heterogeneous Graph

A graph with multiple types of nodes and multiple types of edges - more faithfully representing real-world networks where different entities and different relationship types carry fundamentally different meanings.

Added May 18, 2026 · 3 min read

Most benchmark GNN datasets use homogeneous graphs: all nodes are the same type (all papers, all users, all atoms) and all edges represent the same type of relationship (all citations, all friendships, all bonds). Real-world graphs are rarely so uniform. A healthcare knowledge graph contains patients, doctors, hospitals, diseases, drugs, and procedures as distinct node types, connected by admitted-to, prescribed, diagnosed-with, treats, and performed relationships - each with different semantics.

Heterogeneous graphs, also called heterogeneous information networks (HINs), formalise this with a schema: a set of node types and a set of relation types, each of which connects a specific pair of node types. A recommendation system graph might have User, Item, and Tag nodes, with User-Rated-Item, User-Followed-User, and Item-HasTag edges. Standard GNN message passing, designed for homogeneous graphs, applies the same transformation to every neighbour, so it cannot distinguish these types and would aggregate User features and Item features together indiscriminately.
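To make the idea of a schema concrete, here is a minimal sketch in plain Python. The class name HeteroGraph, its methods, and the example node ids are hypothetical and chosen only for illustration; graph libraries expose their own, richer interfaces. The sketch stores the recommendation schema from the paragraph above and rejects any edge whose endpoint types do not match the relation's declared types.

```python
from collections import defaultdict

# Schema: relation name -> (source node type, destination node type).
# Mirrors the recommendation example in the text.
SCHEMA = {
    "Rated": ("User", "Item"),
    "Followed": ("User", "User"),
    "HasTag": ("Item", "Tag"),
}


class HeteroGraph:
    """Toy heterogeneous graph (hypothetical class) that enforces a schema."""

    def __init__(self, schema):
        self.schema = dict(schema)
        self.node_type = {}             # node id -> node type
        self.edges = defaultdict(list)  # relation -> list of (src, dst)

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, src, relation, dst):
        if relation not in self.schema:
            raise ValueError(f"unknown relation: {relation}")
        expected = self.schema[relation]
        actual = (self.node_type[src], self.node_type[dst])
        if actual != expected:
            raise ValueError(f"{relation} expects {expected}, got {actual}")
        self.edges[relation].append((src, dst))

    def neighbours_by_relation(self, node_id):
        """Group the incoming neighbours of node_id by relation type."""
        grouped = defaultdict(list)
        for relation, pairs in self.edges.items():
            for src, dst in pairs:
                if dst == node_id:
                    grouped[relation].append(src)
        return dict(grouped)


g = HeteroGraph(SCHEMA)
g.add_node("alice", "User")
g.add_node("bob", "User")
g.add_node("book", "Item")
g.add_node("sci-fi", "Tag")
g.add_edge("alice", "Rated", "book")
g.add_edge("bob", "Rated", "book")
g.add_edge("alice", "Followed", "bob")
g.add_edge("book", "HasTag", "sci-fi")

print(g.neighbours_by_relation("book"))  # {'Rated': ['alice', 'bob']}
```

Grouping neighbours by relation, as neighbours_by_relation does, is the bookkeeping a heterogeneous GNN needs before it can treat each relation differently.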

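Heterogeneous GNN architectures handle this by applying type-specific transformations. R-GCN (Relational GCN) uses a separate weight matrix for each relation type: each neighbour's features are transformed by the matrix of the connecting relation, messages are normalised within each relation, and the results are summed together with a self-connection term before the non-linearity. Features arriving over different relations are therefore never passed through the same weights. HAN (Heterogeneous Attention Network) uses meta-paths - composite paths through the schema (User-Rated-Item-Rated-User defines a "co-rating" meta-path). It applies node-level attention over the neighbours reached along each meta-path, then semantic-level attention to weight the meta-paths against each other.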
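The R-GCN update can be sketched in a few lines of plain Python. This is an illustrative simplification, not a reference implementation: it assumes every node type shares the same feature dimension, normalises by the number of neighbours per relation, and uses hand-picked 2x2 weight matrices instead of learned ones. The function names rgcn_layer and matvec are hypothetical.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rgcn_layer(features, edges_by_relation, weights, self_weight):
    """One simplified R-GCN layer.

    features:          dict node -> feature vector
    edges_by_relation: dict relation -> list of (src, dst)
    weights:           dict relation -> weight matrix W_r
    self_weight:       weight matrix W_0 for the self-connection
    """
    out = {}
    for node, h in features.items():
        total = matvec(self_weight, h)  # self-connection term
        for relation, pairs in edges_by_relation.items():
            neighbours = [src for src, dst in pairs if dst == node]
            if not neighbours:
                continue
            norm = 1.0 / len(neighbours)  # per-relation normalisation
            for src in neighbours:
                # Each relation transforms messages with its own matrix.
                msg = matvec(weights[relation], features[src])
                total = [t + norm * m for t, m in zip(total, msg)]
        out[node] = [max(0.0, t) for t in total]  # ReLU
    return out


features = {"alice": [1, 0], "bob": [0, 1], "book": [1, 1]}
edges_by_relation = {
    "Rated": [("alice", "book"), ("bob", "book")],
    "Followed": [("alice", "bob")],
}
weights = {
    "Rated": [[1, 0], [0, 1]],     # identity, for illustration
    "Followed": [[0, 1], [1, 0]],  # swaps the two feature dimensions
}
self_weight = [[0.5, 0], [0, 0.5]]

updated = rgcn_layer(features, edges_by_relation, weights, self_weight)
print(updated["book"])  # [1.0, 1.0]
print(updated["bob"])   # [0.0, 1.5]
```

Note how the message from alice to bob is transformed by the Followed matrix, while her message to book is transformed by the Rated matrix: the same source features produce different messages depending on the relation.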

HGT (Heterogeneous Graph Transformer) applies Transformer-style attention with projections that depend on node type and edge type, allowing nodes of different types to attend to each other without sharing parameters. It achieves strong results on academic graph benchmarks, such as predicting paper venues from a graph of papers, authors, and institutions with typed edges.
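The following is a rough single-head sketch of type-aware attention in the spirit of HGT, written in plain Python. It is an assumption-laden simplification: it omits multi-head attention, the learned per-relation prior, the residual connection, and the output projection, and every name in it (typed_attention, the params layout, the node ids) is hypothetical. What it keeps is the core idea: key and value projections are chosen by source node type, the query projection by target node type, and an extra matrix per relation type sits between key and query.

```python
import math


def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def typed_attention(target, incoming, feats, node_type, params):
    """Aggregate messages for `target` from `incoming` = [(src, relation)]."""
    if not incoming:
        raise ValueError("target needs at least one incoming edge")
    q = matvec(params["query"][node_type[target]], feats[target])
    scores, messages = [], []
    for src, relation in incoming:
        src_type = node_type[src]
        k = matvec(params["key"][src_type], feats[src])    # per node type
        v = matvec(params["value"][src_type], feats[src])  # per node type
        k_rel = matvec(params["att"][relation], k)         # per relation
        scores.append(dot(k_rel, q) / math.sqrt(len(q)))
        messages.append(matvec(params["msg"][relation], v))
    weights = softmax(scores)
    out = [0.0] * len(messages[0])
    for w, m in zip(weights, messages):
        out = [o + w * x for o, x in zip(out, m)]
    return weights, out


I2 = [[1, 0], [0, 1]]
node_type = {"p": "Paper", "a": "Author", "v": "Venue"}
feats = {"p": [1, 0], "a": [1, 0], "v": [1, 0]}  # identical on purpose
params = {
    "query": {"Paper": I2},
    "key": {"Author": I2, "Venue": I2},
    "value": {"Author": I2, "Venue": I2},
    "att": {"writes": [[2, 0], [0, 2]], "publishes": I2},
    "msg": {"writes": I2, "publishes": I2},
}

weights, out = typed_attention(
    "p", [("a", "writes"), ("v", "publishes")], feats, node_type, params
)
```

Because the author and venue features are identical here, any difference in the two attention weights comes purely from the relation-specific matrices, which is exactly the information a homogeneous attention layer would not have.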

Heterogeneous graphs appear widely in industry. Knowledge graphs are inherently heterogeneous, with multiple entity and relation types. E-commerce recommendation graphs link users, products, categories, and brands through diverse interaction types. Fraud detection graphs connect accounts, devices, IP addresses, and transactions through multiple connection types. Biomedical graphs relate genes, diseases, drugs, proteins, and pathways.

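Handling heterogeneity also raises the question of how to encode relation type information. Options range from separate weight matrices per relation (R-GCN), through type embeddings concatenated to features, to full attention mechanisms with type-aware biases. The right choice depends on the number of relation types and the amount of training data available per type: separate matrices are the most expressive, but they add a full set of parameters for every relation, so relation types with few training edges are prone to overfitting.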
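A quick parameter count shows why the number of relation types matters. The sketch below compares three encodings for a single layer, ignoring bias terms. The figures (200 relations, 64-dimensional features, 10 bases, 16-dimensional type embeddings) are arbitrary assumptions chosen for illustration. The basis variant refers to the basis decomposition proposed alongside R-GCN, in which each relation's matrix is a learned combination of a small set of shared basis matrices.

```python
def params_separate(num_relations, d_in, d_out):
    """One full weight matrix per relation."""
    return num_relations * d_in * d_out


def params_basis(num_relations, num_bases, d_in, d_out):
    """Shared basis matrices plus one coefficient per (relation, basis)."""
    return num_bases * d_in * d_out + num_relations * num_bases


def params_type_embedding(num_relations, d_emb, d_in, d_out):
    """One embedding per relation, concatenated to the input of one shared matrix."""
    return num_relations * d_emb + (d_in + d_emb) * d_out


R, D = 200, 64
print(params_separate(R, D, D))            # 819200
print(params_basis(R, 10, D, D))           # 42960
print(params_type_embedding(R, 16, D, D))  # 8320
```

With many relation types and little data per type, the cheaper encodings share statistical strength across relations, at the cost of less freedom per relation.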

Analogy

A company's organisational structure is a heterogeneous graph: employees, departments, projects, and clients are distinct node types; reports-to, member-of, assigned-to, and contracted-with are distinct relationship types. A message-passing GNN that treats the company graph as homogeneous would aggregate an employee's features with a project's features as if they were the same kind of entity, which is clearly wrong. A heterogeneous GNN applies different aggregation logic to different edge types, treating an employee's relationship to their manager (reports-to) differently from their relationship to their assigned projects (assigned-to).

Real-world example

The Microsoft Academic Graph (MAG) contains papers, authors, institutions, venues, and field-of-study nodes connected by heterogeneous edges. An HGT trained on such a graph, for example to predict a paper's citation count, uses type-aware attention to aggregate information from co-authors (Author-Author links derived from paper co-authorship), publication venues (Paper-Venue edges), and topic areas (Paper-FieldOfStudy edges) separately, then combines them. Type-aware aggregation of this kind has been reported to outperform homogeneous GNNs that conflate these distinct relationship types.
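Co-author links of the kind mentioned above are typically derived by following a meta-path rather than stored directly: Author-writes-Paper, then Paper-written-by-Author. A minimal sketch follows, with hypothetical author and paper ids and a hypothetical helper named metapath_neighbours.

```python
def metapath_neighbours(start, relations, edges):
    """Follow a sequence of relations from `start`; return the end nodes reached."""
    frontier = {start}
    for relation in relations:
        frontier = {dst for src, dst in edges[relation] if src in frontier}
    return frontier


edges = {
    "writes": [("ann", "p1"), ("ben", "p1"), ("ben", "p2"), ("cat", "p2")],
}
# The reverse relation lets the path return from papers to authors.
edges["written_by"] = [(paper, author) for author, paper in edges["writes"]]

co_authors_of_ann = metapath_neighbours("ann", ["writes", "written_by"], edges) - {"ann"}
co_authors_of_ben = metapath_neighbours("ben", ["writes", "written_by"], edges) - {"ben"}

print(co_authors_of_ann)          # {'ben'}
print(sorted(co_authors_of_ben))  # ['ann', 'cat']
```

The same helper works for any meta-path in the schema, such as Paper-Venue-Paper for papers published at the same venue, provided the corresponding relations are present in the edge dictionary.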

Why it matters

Most real-world knowledge networks are heterogeneous - they contain multiple entity types and multiple relationship types with distinct semantics. Standard homogeneous GNNs model these systems poorly because they apply the same transformation to every node and edge, ignoring the fundamental differences between entity and relation types. Understanding heterogeneous graphs and the architectures designed for them is essential for applying GNNs to the rich, multi-relational data that organisations actually have - knowledge bases, recommendation systems, and biomedical networks.

Related concepts

Graph Attention Network

A GNN variant that learns to assign different importance weights to different neighbours during aggregation - letting the model focus on the most relevant connections rather than treating all neighbours equally.

Graph Neural Network

A class of deep learning models designed to operate directly on graph-structured data - learning representations that capture both the features of individual nodes and the structural relationships between them.

Knowledge Graph

A structured representation of real-world entities and their relationships as a directed graph - enabling machines to reason over factual knowledge, answer questions, and make inferences by traversing a web of interconnected facts.
