
AI Isn't Just Guessing: LLMs Do More Than Predict Next Tokens

LessWrong | May 17, 2026 | 1 min brief

In brief

AI researchers are pushing back against the idea that large language models (LLMs) are merely "next token predictors." Critics of the label argue that it oversimplifies what the models can do and wrongly implies that they lack true understanding or cognition.
Training is a more involved process than the label suggests: the model analyzes sequences of text and learns to predict what comes next at each position.
- The input text is broken down into short segments called tokens, and the model learns patterns in them across vast datasets.
For example, given "The cat sat on the mat," the model learns to predict each token from the tokens that precede it.
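The training setup described above can be sketched in a few lines. This is a toy illustration, not how any particular LLM is implemented: the whitespace `tokenize` function and the `next_token_pairs` helper are hypothetical names introduced here, and real models use subword tokenizers rather than splitting on spaces.

```python
def tokenize(text):
    """Toy tokenizer (an assumption): split on whitespace.

    Real LLM tokenizers split text into subword pieces instead.
    """
    return text.split()


def next_token_pairs(tokens):
    """Build (context, target) training pairs.

    Each target token is paired with all the tokens that precede it.
    """
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


tokens = tokenize("The cat sat on the mat")
pairs = next_token_pairs(tokens)

for context, target in pairs:
    print(context, "->", target)
```

Each printed line is one prediction task: the list on the left is the context the model sees, and the token on the right is what it is trained to predict.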
During generation, the user provides an initial text (a prompt), and the model produces a probability distribution over possible next tokens.
- It samples one token according to these probabilities, appends it to the text, and repeats, building the output step by step.
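The generation loop described above can be sketched as follows. Everything here is a simplified assumption for illustration: `TOY_LOGITS` is a made-up table of scores standing in for a trained model, and it looks only at the last token, whereas a real LLM conditions on the whole preceding context. The `softmax`, `sample_next`, and `generate` names are likewise hypothetical.

```python
import math
import random

# Hypothetical stand-in for a trained model: made-up scores for the next
# token, keyed by the most recent token only (a real LLM uses all prior tokens).
TOY_LOGITS = {
    "The": {"cat": 2.0, "mat": 1.0},
    "cat": {"sat": 3.0},
    "sat": {"on": 3.0},
    "on": {"the": 3.0},
    "the": {"mat": 2.0, "cat": 1.0},
}


def softmax(logits):
    """Turn raw scores into a probability distribution that sums to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def sample_next(probs, rng):
    """Pick one token at random, weighted by its probability."""
    candidates = list(probs)
    weights = [probs[tok] for tok in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def generate(prompt, max_new_tokens, seed=0):
    """Extend the prompt one sampled token at a time."""
    rng = random.Random(seed)
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        logits = TOY_LOGITS.get(tokens[-1])
        if logits is None:  # the toy table has no continuation for this token
            break
        tokens.append(sample_next(softmax(logits), rng))
    return " ".join(tokens)


print(generate("The cat", max_new_tokens=4))
```

Because the next token is sampled rather than always taken as the most likely one, different seeds can produce different continuations from the same prompt.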
Although this can still look like guessing, the scale and depth of training mean that LLMs capture meaningful patterns that go beyond simple word prediction.
They can generate coherent, contextually relevant text by drawing on the patterns learned from their extensive training data.
Looking ahead, a better understanding of how LLMs actually operate should help refine their abilities and address ethical concerns about how they arrive at their outputs.
Researchers are working to clarify these mechanisms, with the aim of keeping AI systems transparent and trustworthy.

Terms in this brief

Tokens: The building blocks of text for LLMs. Each token is a small piece of text, such as a word or part of a word, that the model processes to capture context and predict the next token in a sequence.

Read full story at LessWrong →
