II

Training & Alignment (29 concepts)

Core

How models are built, fine-tuned, and taught to behave the way people actually want them to.

Bayesian Optimization

A sample-efficient method for tuning a model's hyperparameters - it builds a probabilistic surrogate model of how settings affect performance, then uses that model's predictions and uncertainty to choose which settings to try next.
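As a rough illustration of that loop, here is a minimal sketch that tunes one hyperparameter scaled to [0, 1]. Everything in it is an assumption made for illustration: the Gaussian-process surrogate with an RBF kernel, the upper-confidence-bound rule for picking the next point, the grid search over candidates, and the toy `validation_score` objective, which stands in for a real training run.

```python
import math

def rbf(a, b, length=0.2):
    """RBF kernel: similarity between two hyperparameter values."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Zero-mean GP posterior mean and standard deviation at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)   # K^-1 y
    v = solve(K, k)        # K^-1 k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(1.0 - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mean, math.sqrt(var)

def bayes_opt(objective, n_iter=10, kappa=2.0):
    """Maximise objective on [0, 1] with a GP surrogate and a UCB rule."""
    xs = [0.0, 0.5, 1.0]                    # initial settings to try
    ys = [objective(x) for x in xs]
    grid = [i / 200 for i in range(201)]    # candidate settings
    for _ in range(n_iter):
        def ucb(x):
            mean, std = gp_posterior(xs, ys, x)
            return mean + kappa * std       # favour high mean or high uncertainty
        x_next = max(grid, key=ucb)
        xs.append(x_next)
        ys.append(objective(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]

def validation_score(x):
    """Hypothetical objective: the score peaks at x = 0.3."""
    return -(x - 0.3) ** 2

best_x, best_y = bayes_opt(validation_score)
```

The `kappa` parameter sets the trade-off: larger values push the search toward settings the surrogate is unsure about, smaller values toward settings it already predicts to be good.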

Catastrophic Forgetting

The tendency of neural networks to lose previously learned capabilities when trained on new data - a fundamental challenge in continually updating AI systems.

Constitutional AI

Anthropic's approach to alignment where a model is given a set of principles and trained to critique and revise its own outputs to comply with them - reducing reliance on human labelling of harmful content.

All concepts

C

D

F

Fine-tuning
Taking a general-purpose AI model and continuing its training on data from a specific domain or task, so it performs better there than the original model does.

G

I

Instruction Datasets
Curated collections of instruction-response pairs used to fine-tune language models into helpful assistants - the training data that teaches models what being useful looks like.

K

Knowledge Distillation
A training technique where a small model learns to imitate a larger one - capturing most of the large model's capability at a fraction of its size and cost.
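One common way to set up that imitation is to have the student match the teacher's temperature-softened output distribution while also fitting the true label. The sketch below assumes that formulation; the function names, `temperature`, and the mixing weight `alpha` are illustrative choices, not taken from any particular library.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over logits; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the softened distributions
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy against the ground-truth class at temperature 1
    hard = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft term's gradient scale comparable
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard

teacher = [2.0, 1.0, 0.0]
loss_matching = distillation_loss([2.0, 1.0, 0.0], teacher, label=0)
loss_mismatched = distillation_loss([0.0, 1.0, 2.0], teacher, label=0)
```

A student whose logits agree with the teacher's incurs no soft-target penalty, so `loss_matching` comes out lower than `loss_mismatched`.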

L

M

N

Neural Tangent Kernel (NTK)
A mathematical framework that reveals how infinitely wide neural networks behave during training - and provides theoretical tools for understanding why neural networks generalise as well as they do.

P

Q

Quantization-Aware Training (QAT)
Training a model while simulating the numerical precision it will run at after deployment - producing compressed models that stay accurate even when their weights are stored in low-precision formats.
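The simulation is often done with "fake quantization": the forward pass rounds each weight to the low-precision grid, while the gradient is applied to a full-precision copy as if the rounding were the identity (the straight-through estimator). The sketch below assumes that scheme for a one-weight linear model; the symmetric 4-bit grid, the data, and the learning rate are all illustrative.

```python
def fake_quantize(w, num_bits=4, w_max=1.0):
    """Round w to the nearest value on a symmetric signed integer grid."""
    levels = 2 ** (num_bits - 1) - 1         # e.g. 7 positive levels for 4 bits
    scale = w_max / levels
    q = max(-levels, min(levels, round(w / scale)))
    return q * scale

def train_qat(xs, ys, num_bits=4, lr=0.05, steps=200):
    """Fit y = w * x, using the quantized weight in the forward pass."""
    w = 0.0                                  # full-precision "latent" weight
    for _ in range(steps):
        w_q = fake_quantize(w, num_bits)     # forward pass sees the low-precision weight
        # Gradient of mean squared error with respect to w_q ...
        grad = sum(2 * (w_q * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # ... applied directly to w (straight-through estimator)
        w -= lr * grad
    return w, fake_quantize(w, num_bits)

# Toy data generated from a true weight of 0.37 (not exactly representable in 4 bits)
xs = [0.5, 1.0, 1.5, 2.0]
ys = [0.37 * x for x in xs]
w_latent, w_deployed = train_qat(xs, ys)
```

Here `w_deployed` is the value that would actually be stored after deployment. Training settles the latent weight near a grid point adjacent to the target rather than at the target itself.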

R

S

T

W

Weak-to-Strong Generalization
A research finding that a strong AI model trained on labels from a weaker supervisor can end up outperforming that supervisor - and a framework for thinking about how to align AI systems that exceed human capability.