OpenAI · GPT-5
GPT-5.4
OpenAI's flagship — broadest modality and ecosystem coverage.
GPT-5.4 is a multimodal model in OpenAI's GPT-5 series, released in 2025. It supports text, image, and file inputs, balancing cost and functionality for developers, businesses, and technical teams. The model features a 1,050,000-token context window, letting it process and maintain a coherent understanding of very long content, which makes it well suited to complex workflows.
Technically, GPT-5.4 is built on OpenAI's enhanced transformer architecture. It employs optimizations for long-context reasoning and multimodal comprehension, enabling smooth handling of mixed-media tasks, long-form texts, and high-accuracy outputs. This positions it as a reliable choice for a variety of advanced applications across industries, bridging high performance and computational efficiency.
GPT-5.4 is a mid-tier 'workhorse' model in the GPT-5 family, positioned between cost-conscious variants and high-end flagship options. It offers a significant improvement in multimodal processing and context window size compared to its predecessors, allowing for superior performance and broader task handling at a balanced price point.
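To make the 1,050,000-token window concrete, here is a minimal budget-check sketch. The helper names and the 4-characters-per-token heuristic are our own rough assumptions, not part of any OpenAI SDK; the window and output figures come from the specs on this page.

```python
# Rough context-budget check using the figures listed on this page.
CONTEXT_WINDOW = 1_050_000   # total tokens the model can attend to
MAX_OUTPUT = 128_000         # maximum completion tokens

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, reserved_output: int = MAX_OUTPUT) -> bool:
    """True if the prompt leaves room for `reserved_output` completion tokens."""
    return estimate_tokens(prompt) + reserved_output <= CONTEXT_WINDOW

# A ~2M-character document (~500K estimated tokens) still fits
# with the full 128K output budget reserved.
doc = "x" * 2_000_000
print(fits_in_context(doc))  # True
```

For real prompt sizing, an actual tokenizer should replace the character heuristic, which can be off by a wide margin for code or non-English text.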
Background
GPT-5 is a multimodal large language model developed by OpenAI and the fifth in its series of generative pre-trained transformer (GPT) foundation models. Preceded in the series by GPT-4, it was launched on August 7, 2025. It is publicly accessible to users of the chatbot products ChatGPT and Microsoft Copilot as well as to developers through the OpenAI API.
— Wikipedia

Specs
- Context window: 1.05M tokens (1,050,000)
- Max output: 128K tokens
- Input ($/1M tokens): $2.50
- Output ($/1M tokens): $15.00
- Modalities: Text · Image · File
- Weights: Closed
Pricing last synced Apr 27, 2026 via OpenRouter. Confirm against official docs before committing.
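At the listed rates ($2.50 per 1M input tokens, $15.00 per 1M output tokens), per-request cost can be estimated with a small helper. This is our own sketch using the prices as synced above, not an official calculator.

```python
# Cost estimate from the per-1M-token prices listed above.
INPUT_PER_M = 2.50    # USD per 1M input tokens
OUTPUT_PER_M = 15.00  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PER_M

# Example: a full-window request (1.05M tokens in) with a 4K-token answer.
print(round(estimate_cost(1_050_000, 4_000), 3))  # 2.685
```

Note that a single maxed-out context request already costs over $2.60 in input tokens alone, which is worth factoring into long-document pipelines.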
Capabilities
- Tool use
- Vision
- Extended thinking
- Prompt caching
What it excels at
Large context window
Supports inputs up to 1,050,000 tokens for extended coherence and long-form processing.
Multimodal capabilities
Handles text, image, and file inputs seamlessly, enabling cross-media tasks.
Advanced coherence management
Maintains high narrative and logical coherence across extended content or dialogue.
Versatile performance
Proficient across diverse workflows, from summarization to creative generation.
When to use this model
- Long-form content generation — Processes extensive narratives or technical documents with sustained coherence.
- Multimodal workflows — Integrates and analyzes text, image, and file inputs for seamless cross-media functionality.
- Comprehensive document summarization — Efficiently condenses and synthesizes large-scale, complex documents.
- Dialogue systems in complex domains — Enables sophisticated virtual assistants with nuanced understanding and context management.
Analysis synthesized from gpt-4o, llama-4-maverick, phi-4, etc.
Recent coverage
AI Access Controls Tightened as GPT-5.4-Cyber's Release is Delayed
Analytics Vidhya · 2w ago
AI Revolution Accelerates with Breakthroughs and Challenges
NeuralPulse Daily · 2w ago
OpenAI Unveils Cutting-Edge AI to Outsmart Hackers
The Decoder · 2w ago
GitHub releases AI coding tool for terminal use
InfoQ AI · 3w ago
GPT-5.4’s Recursive Design Evolution Shows AI’s Untapped Potential
r/OpenAI · 4w ago
API model id
gpt-5
Vendor docs: platform.openai.com/docs
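As a sketch of calling the model through the API with the `gpt-5` model id, the request below mixes text and image input in the Chat Completions message format. The URL, prompt, and `max_tokens` value are placeholders of our own; confirm the exact multimodal payload shape and parameter names against the vendor docs above.

```python
# Build a mixed text + image request for the `gpt-5` model id.
# The image URL and prompt are illustrative placeholders.
request = {
    "model": "gpt-5",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Summarize the chart in one sentence."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    "max_tokens": 1_000,
}

# With the official SDK the call would look like (requires OPENAI_API_KEY):
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.chat.completions.create(**request)
#   print(response.choices[0].message.content)

print(request["model"])  # gpt-5
```

Keeping the payload as a plain dict like this makes it easy to log, cost-estimate, or validate before the request is actually sent.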
Compare GPT-5.4 with
GPT-5.4 vs Claude Opus 4.7
Anthropic's heavyweight for hard reasoning and agentic work.
GPT-5.4 vs Claude Sonnet 4.6
The pragmatic default — Claude quality without Opus pricing.
GPT-5.4 vs Claude Haiku 4.5
Fast, cheap, surprisingly capable for high-volume jobs.
GPT-5.4 vs GPT-5.4 Mini
GPT-5 economics for high-volume routine tasks.
GPT-5.4 vs Gemini 3.1 Pro
Google's latest frontier model with expanded reasoning.
GPT-5.4 vs Gemini 2.5 Pro
Google's bet on massive context and native multimodality.