General2w ago

AI Experts Debate Task Verification Categories

LessWrongApril 20, 2026

In brief

A group of AI researchers has sparked a lively discussion by challenging how we classify tasks into "easy-to-verify" and "hard-to-verify." They argue that labeling tasks this way is as vague as dividing birds into ravens and non-ravens, without clear natural boundaries.
Easy-to-verify tasks are those where a simple program can quickly check solutions without heavy resources or side effects.
Hard-to-verify tasks, however, vary widely in why they're hard-whether due to expensive AI inference, scarce human time, or the lack of a clear answer structure.
The researchers listed several types of hard tasks.
Some require costly computations, like comparing experiments that cost $100-$1000 each.
Others depend on rare expertise, such as evaluations by top mathematicians.
Some tasks are inherently ambiguous, lacking a definitive way to prove which solution is better.
In chess, for example, even with AI, deciding the best move remains tricky.
- This debate highlights the complexity of task verification in AI development.
As researchers refine these categories, it could lead to better tools and methods for evaluating AI systems.
Future discussions may uncover more nuances, helping developers design more reliable AI systems.

Terms in this brief

Task Verification Categories: A framework for categorizing AI tasks based on how difficult it is to verify their correctness. This helps in understanding which tasks require more resources or expertise to evaluate accurately.

Read full story at LessWrong →

More briefs