Launch2w ago

Anthropic Unveils 244-Page Guide to Cutting-Edge AI Model

LessWrongApril 14, 2026

In brief

A new AI model called "Claude Mythos Preview" has been described in a 244-page technical document published by Anthropic.
The model is said to have strong cybersecurity abilities but has not been made public.
The document comes after a data leak in March that revealed the model's existence.
The system card details how the model was developed and tested.
- It says the model can do things that are similar to previous models but with greater skill.
However, some parts of the model's training process are unclear.
For example, 8% of the model's training used a method that might affect how it learns.
- This method was also used in earlier models without any fixes being mentioned.
Experts will be watching how Anthropic handles the model's capabilities and risks.
The company has updated its policies but removed some details about how it plans to control the model's behavior.

Terms in this brief

Claude Mythos Preview: A new AI model developed by Anthropic, known for its cybersecurity capabilities. The model's details were shared in a comprehensive technical document, though it hasn't been publicly released yet. Experts are closely monitoring its development and the measures Anthropic takes to manage its potential risks.

Read full story at LessWrong →

More briefs