San Francisco, CA
OpenAI
The lab that launched the current LLM era. GPT and o-series models anchor the widest developer ecosystem in the field - most tutorials, integrations, and third-party tooling start here.
Models
GPT-5.4
1.1M ctx · OpenAI's flagship - broadest modality and ecosystem coverage.
GPT-5.4 is the safest pick when you want one model to handle reasoning, vision and voice without juggling three APIs.
$2.50 in · $15.00 out / 1M tokens
GPT-5.4 Mini
400K ctx · GPT-5 economics for high-volume routine tasks.
GPT-5.4 Mini is OpenAI's answer to cost-conscious workloads that don't justify the flagship.
$0.75 in · $4.50 out / 1M tokens
o3
200K ctx · OpenAI's mainstream reasoning model - production-viable thinking.
o3 is what o1 was trying to be: a reasoning model you can actually afford to use at scale.
$2.00 in · $8.00 out / 1M tokens
o4 Mini
200K ctx · Fast, cheap reasoning for high-volume intelligent tasks.
o4 Mini is the fast-lane option in OpenAI's reasoning stack.
$1.10 in · $4.40 out / 1M tokens
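The per-token prices above can be turned into a rough workload estimate. The sketch below is illustrative only: the `PRICES` table copies the list prices from this page, while the `estimate_cost` helper and the 10M-in / 2M-out workload are hypothetical, and it ignores any discounts or extra billed tokens a provider may apply.

```python
# Per-1M-token list prices (USD) copied from the model cards above:
# (input price, output price).
PRICES = {
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.4 Mini": (0.75, 4.50),
    "o3": (2.00, 8.00),
    "o4 Mini": (1.10, 4.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a workload at list prices.

    Hypothetical helper for illustration; not part of any SDK.
    """
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Assumed workload: 10M input tokens and 2M output tokens.
    for name in PRICES:
        cost = estimate_cost(name, 10_000_000, 2_000_000)
        print(f"{name}: ${cost:.2f}")
```

At these list prices, output tokens cost four to six times as much as input tokens, so output-heavy workloads shift the comparison more than the input price alone suggests.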
Recent news
Articles mentioning OpenAI models
AI Monitoring Fails Under Long-Term Scrutiny
New research reveals that advanced AI models struggle to detect dangerous behavior when monitoring long sequences of code. Tests show current systems miss red flags 2x to 30x more often in transcripts over 800K tokens compared to shorter ones. This gap highlights critical flaws in existing monitoring benchmarks, which often overlook the impact of long-context degradation. The study uses "Needle Insertion" and "Padded MonitorBench" methods to evaluate model performance. In both cases, models like GPT-5.4 and Opus 4.6 fail to catch malicious actions when preceded by benign activity. This suggests that current monitoring tools may be overly optimistic in their effectiveness. Moving forward, researchers recommend using prompting techniques and post-training improvements to enhance detection accuracy. As AI systems grow more complex, better monitoring will be essential to ensure safety and reliability.
LessWrong · 1mo ago
Moonshot AI Unveils Kimi K2.6 Open-Weight Model
Moonshot AI has released Kimi K2.6, an open-weight model designed to rival GPT-5.4 and Claude Opus 4.6 in coding tasks. The new model can manage up to 300 agents simultaneously, enabling it to handle complex, multi-threaded operations. This development is significant because it offers developers and researchers a powerful tool for building intelligent systems that can process information more efficiently. By providing an open-weight model, Moonshot AI is making advanced AI capabilities accessible to a broader audience, fostering innovation across industries. Looking ahead, the ability to scale agent swarms could lead to breakthroughs in areas like real-time decision-making, automated systems, and large-scale data processing. This release sets the stage for further advancements in AI technology.
The Decoder · 2mo ago
AI Access Controls Tightened as GPT-5.4-Cyber's Release is Delayed
OpenAI has decided to keep its most powerful AI model, GPT-5.4-Cyber, under wraps for now. This model, designed for advanced reasoning and problem-solving, was expected to be released soon but faces delays due to concerns about misuse. Why does this matter? The focus is shifting from what AI can do to who gets to use it. As seen with Anthropic's Claude Mythos Preview, companies are becoming more cautious about access. OpenAI's delay reflects a growing trend in the industry to balance innovation with ethical considerations. Developers and researchers may have to wait longer for cutting-edge tools, but this move aims to ensure responsible deployment. Looking ahead, expect stricter controls on AI models as ethical guidelines evolve. The race to harness AI's power will likely involve more deliberate decision-making about access and usage.
Analytics Vidhya · 2mo ago
AI Revolution Accelerates with Breakthroughs and Challenges
1. Prince William County Dumps Big Data Center Plan Amid Public Pushback: Prince William County has scrapped plans for a massive data center due to strong public opposition over environmental concerns and traffic impacts. The decision highlights the growing public awareness of the potential downsides of big tech projects.
2. UK Uses AI to Assign Age Ratings to Popular HBO Max Series: The British Board of Film Classification used an AI tool to analyze content and assign age ratings to popular TV shows like Game of Thrones and Euphoria. This is the first time these shows have received ratings in the UK, marking a new application of AI in content regulation.
3. Type "Make the sky bluer" and watch your design transform: Adobe's new AI tool, Firefly AI Assistant, allows designers and artists to change their work by typing simple descriptions, making design more accessible to non-experts. This tool has the potential to revolutionize the design industry by simplifying complex tasks.
4. Silent Flaw Lets Hackers Bypass AI Security Measures: Security researchers discovered a major flaw in three popular AI agents that connect with GitHub Actions, allowing hackers to steal API keys and access tokens using a new type of attack called prompt injection. The flaw highlights a growing risk in AI systems that handle sensitive data.
5. Google's Findings Shrink Quantum Threat Timeline by 80%: A new study by Google Quantum AI found that quantum computers could break today's online security much sooner than expected, with the number of quantum bits needed to crack current encryption being twenty times smaller than previously thought. This discovery has significant implications for the timeline of switching to quantum-safe encryption methods.
6. Adobe’s New AI Tool Transforms Complex Tasks into One-Click Solutions: Adobe's new AI assistant, Firefly, helps users create complex designs more easily by describing what they want and having the AI handle the details. This tool is a major step for Adobe in adding AI capabilities to its software, aiming to make design more accessible.
7. Waymo Unleashes Driverless Rides on London Streets: Waymo has begun testing fully driverless ride-hailing services in London, with no human required in the vehicle during these tests. This move is part of a larger plan to expand driverless transportation, marking a significant milestone in the development of autonomous vehicles.
8. $800 Billion Backed AI Set to Disrupt Creative Software: A new AI tool from Anthropic, backed by up to $800 billion in funding, could challenge popular design software like Adobe and Figma. The massive financial support indicates the high value investors place on AI companies and their potential to disrupt traditional industries.
9. Claude Shines in Lab Tests, Falls Short in the Real World: An AI system called Claude outperformed human researchers on a complex alignment task in lab tests but failed to replicate this success when applied to real-world models. This discrepancy highlights the challenges of translating AI performance from controlled environments to practical applications.
10. AI Cracks Century-Old Math Puzzle in Record Time: A new AI system, GPT-5.4 Pro, solved a long-standing math problem in less than two hours, demonstrating a major step in how artificial intelligence can aid scientific research. The solution, verified by experts, showcases the potential of AI in advancing mathematical knowledge.
NeuralPulse Daily · 2mo ago
AI Breaks Code Faster Than Hackers Can Exploit It
A new AI tool called GPT-5.4-Cyber has been released in limited form. Developed by OpenAI, the company behind ChatGPT, it is designed to find security holes in software by analyzing code and identifying patterns that hackers might exploit. Early tests suggest it can find flaws faster than traditional methods. For developers, this means they can fix issues before they become serious problems. For users, it means software can be more secure. Watch for how this tool is used in the coming months. More companies may adopt it to improve their security practices.
NY Times Tech, IEEE Spectrum · 2mo ago
GitHub releases AI coding tool for terminal use
GitHub has launched Copilot CLI, a new tool that brings generative AI directly into the terminal. This allows developers to get code suggestions and explanations using natural language commands. The tool is now available for general use and works with the GitHub CLI. Copilot CLI includes new features like Autopilot mode, which helps automate repetitive coding tasks. It also supports GPT-5.4, a more advanced version of the AI model. These updates help developers write code faster and more accurately. Enterprise teams can now track how the tool is used across their organizations with new telemetry features. Watch for how developers adopt these new AI tools and how they might change the way code is written in the future.
InfoQ AI · 2mo ago
Meta’s Muse Spark Signals a New Era in AI Consumer Models
After a lukewarm reception for its Llama 4 AI model, Meta is making a bold move with the launch of Muse Spark, the first product from its Superintelligence team. This lightweight AI system is designed to bring advanced capabilities directly to consumers. A standout feature is its multi-agent coordination, allowing users to tackle complex tasks like family trip planning by assigning different agents to specific roles - like itinerary creation or activity suggestions. While similar models have offered basic reasoning modes, Spark is set to add a "Contemplating" mode in the future, promising deeper analytical power. Spark’s multimodal approach lets users process images, video, and audio, mirroring tools like Google Lens. It also includes a built-in shopping assistant that compares products and provides purchase links - a feature already seen in ChatGPT. Currently available on Meta’s AI app and website, Spark operates in "Instant" mode for quick responses or "Thinking" mode for more deliberate answers. While it trails behind leading models like OpenAI’s GPT-5.4 Pro in some benchmarks, Meta aims to close the gap with further investments in long-term reasoning and coding capabilities. This release signals a shift toward more consumer-focused AI tools while hinting at Meta’s potential dominance in this space. With plans for more powerful models ahead, Spark sets the stage for broader adoption of advanced AI in everyday life. Stay tuned as Meta continues to refine its offerings, promising a future where AI assistants are smarter and more capable than ever before.
Engadget, The Decoder, Simon Willison · 2mo ago
GPT-5.4’s Recursive Design Evolution Shows AI’s Untapped Potential
In a fascinating real-time experiment, a developer recently put GPT-5.4 in a loop, letting it continuously refine and improve the design of a website - without any human intervention. The result? A showcase of machine learning’s ability to iterate, adapt, and evolve creative solutions on its own. This isn’t just about designing websites; it’s about AI’s capacity for self-improvement, a glimpse into a future where machines can autonomously innovate in ways we’re only beginning to imagine. The experiment involved feeding GPT-5.4 a simple starting point - a basic website design - and then letting the AI run free. The AI wasn’t given any specific instructions beyond “keep improving the design.” What unfolded was a series of incremental changes, each one slightly better than the last, as the model tweaked colors, layouts, and overall aesthetics. At its peak, GPT-5.4 even generated cards that seemed to demonstrate recursive self-improvement - a hint at how AI could eventually master complex creative tasks through trial and error. This isn’t just a novelty; it has real implications for developers and designers. Imagine an AI that doesn’t just generate static designs but evolves them over time, learning from its own mistakes and successes. Such a system could drastically speed up the design process, especially for projects that require iterative refinement. For industries like web development, where deadlines are tight and competition is fierce, this kind of autonomous improvement could be a game-changer. But here’s the kicker: GPT-5.4 isn’t just good at one thing. It demonstrated versatility by adapting to feedback in real time, a capability that hints at broader applications beyond design. Think about how this technology might extend to areas like product development, user experience optimization, or even art creation. The line between human creativity and machine-generated innovation is getting blurrier every day - and this experiment is a bold step in that direction.
Looking ahead, the most exciting part of this breakthrough isn’t what GPT-5.4 did, but what it signals about AI’s potential. As models like these continue to evolve, we’re likely to see more examples of machines not just following instructions but actively improving upon them. The next step? Watch for AI systems that can teach themselves new skills on the fly, without needing human intervention to guide every improvement. This isn’t science fiction - it’s the future of technology, and it’s arriving faster than we think.
r/OpenAI · 2mo ago