DeepSeek · DeepSeek

DeepSeek R1

The open-weights reasoning model that reset the cost curve.

DeepSeek R1 is an advanced AI language model created by DeepSeek with a large context window of 163,840 tokens, enabling it to process and generate extended sequences of text efficiently. The model's architecture is optimized for tasks requiring continuity and understanding across long inputs, such as document-level analysis, multi-turn dialogues, and dense technical data processing.

DeepSeek R1 is positioned as a high-capacity solution aimed at developers and product teams working on complex text-based workflows. Combining cost efficiency with scalable performance, it balances computational requirements with the ability to handle extensive textual inputs.

DeepSeek R1 is a core model in the DeepSeek lineup, offering significant advancements in context window capacity and performance compared to its predecessors. It targets high-demand tasks with an emphasis on large-scale input handling and versatility.

Specs

Context window: 164K tokens
Max output: 16K tokens
Input ($/1M tokens): $0.700
Output ($/1M tokens): $2.50
Modalities: Text
Released: Jan 20, 2025
Weights: Open

Pricing last synced May 18, 2026 via OpenRouter. Confirm against official docs before committing.

Capabilities

Tool use
Vision
Extended thinking
Prompt caching
Open weights

What it excels at

Large context window
The 163,840-token capacity allows processing of extensive text sequences efficiently.
Scalable performance
Handles large workflows without notable performance degradation.
Cost efficiency
Provides competitive pricing for input and output token processing.
Long-range coherence
Maintains logical and contextual continuity across lengthy inputs.
Versatile applicability
Supports a broad variety of text-based tasks and domains effectively.

When to use this model

→Document summarization and analysis - Its large context window enables effective comprehension and condensation of extended documents.
→Technical writing and content generation - Capable of producing coherent and contextually precise technical content.
→Conversational AI - Supports coherent dialogue generation across long or multi-turn interactions.
→Research data processing - Facilitates the analysis of large, text-heavy datasets in research workflows.
→Interactive storytelling - Allows the creation of deep narratives due to its ability to retain extensive context.

Analysis synthesized from gpt-4o, llama-4-maverick, phi-4, etc.

What people are saying

An ARM Homelab Server, or a Minisforum MS-R1 Review

HN↑ 12698 comments

GPT-IMAGE-2 is back on LMarena

r/singularity↑ 425113 comments

One year ago DeepSeek R1 was 25 times bigger than Gemma 4

r/LocalLLaMA↑ 40778 comments

I Let a Small Model Train on Its Own Mistakes. It Reached 80% on HumanEval and Beat GPT-3.5 on Math

r/LocalLLaMA↑ 24359 comments

API model id

deepseek/deepseek-r1-0528

Vendor docs: api-docs.deepseek.com

Compare DeepSeek R1 with