
DeepSeek Unveils V4 AI Models

Simon Willison

In brief

  • Chinese AI lab DeepSeek has launched two new models, DeepSeek-V4-Pro and DeepSeek-V4-Flash.
    • Both use a Mixture of Experts architecture and support a 1 million token context window.
  • The Pro version boasts 1.6 trillion total parameters, while the Flash model has 284 billion.
  • Both are available under the MIT license on platforms like Hugging Face.
  • The DeepSeek-V4-Pro is now the largest open-source AI model, surpassing competitors like Kimi K2 and GLM-5.1 in size.
  • The models target different use cases: Pro's weights total 865GB for maximum performance, while Flash is more lightweight at 160GB and could potentially run on a device like a 128GB MacBook Pro.
  • What makes this release stand out is its affordability.
  • DeepSeek charges $0.14 per million tokens for input and $0.28 for output with the Flash model, and $1.74 and $3.48 respectively for the Pro.
    • This pricing could make advanced AI capabilities more accessible to developers and researchers globally.
  • Looking ahead, users can expect even more improvements in model efficiency and cost-effectiveness as DeepSeek continues to innovate in AI development.
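The per-million-token prices quoted above translate into very small per-request costs. A minimal sketch of that arithmetic (the prices come from the brief; the token counts in the example are made-up illustration values):

```python
# USD per 1M tokens: (input, output), as quoted in the brief.
PRICES = {
    "flash": (0.14, 0.28),
    "pro": (1.74, 3.48),
}

def cost_usd(model, input_tokens, output_tokens):
    """Estimate the cost of one request in US dollars."""
    p_in, p_out = PRICES[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

# Example: a 100k-token prompt with a 10k-token reply.
print(f"Flash: ${cost_usd('flash', 100_000, 10_000):.4f}")  # about $0.0168
print(f"Pro:   ${cost_usd('pro', 100_000, 10_000):.4f}")    # about $0.2088
```

Even the Pro model comes in at roughly 21 cents for a context-heavy request at these rates, which is the affordability point the brief is making.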

Terms in this brief

Mixture of Experts
A technique where a large model is broken into smaller, specialized models (experts) that work together. This allows for efficient processing and better performance on specific tasks, while using less memory.
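The routing idea behind Mixture of Experts can be sketched in a few lines. This is a toy illustration of top-k gating, not DeepSeek's actual architecture: the gate function, the expert definitions, and all numbers here are invented for the example.

```python
def gate_scores(x, num_experts):
    # Toy gate: deterministic scores derived from the input.
    # A real model learns these scores with a small neural network.
    return [(x * (i + 1)) % 7 for i in range(num_experts)]

def moe_forward(x, experts, k=2):
    scores = gate_scores(x, len(experts))
    # Top-k routing: only the k highest-scoring experts run,
    # so most parameters stay idle for any given token.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    total = sum(scores[i] for i in top)
    # Weighted sum of only the selected experts' outputs.
    return sum(scores[i] / total * experts[i](x) for i in top)

experts = [lambda x, m=m: x + m for m in range(8)]  # 8 toy "experts"
print(moe_forward(5, experts))
```

This is why a model like DeepSeek-V4-Pro can have 1.6 trillion total parameters while activating only a fraction of them per token.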
MIT license
A type of software license that gives users permission to use, modify, and share the software freely, as long as they credit the original authors and include a copy of the license. It's known for its simplicity and permissive nature.

Read the full story at Simon Willison's blog.
