AI Can Now Rebuild Complex Software Without Original Code
In brief
- AI has reached a new milestone in coding.
- Epoch AI's MirrorCode benchmark challenges models to recreate entire programs without seeing the original code.
- Claude Opus 4.7 leads the pack, solving 56% of tasks, including rebuilding a 16,000-line toolkit in just 14 hours.
- While this shows progress, all tested models still struggle with complex tasks.
- This breakthrough matters because it could change how software is developed.
- If AI can reliably recreate code, it might speed up development and reduce costs.
- However, the fact that even top models fail on tough problems highlights the challenges ahead.
- The tech industry should watch for improvements in this area to see if AI can truly become a reliable coding partner.
- Next steps will focus on enhancing AI's ability to handle complexity and accuracy.
- Developers and researchers are likely to pay close attention to how these tools evolve, as they could revolutionize software development.
Terms in this brief
- MirrorCode
- A benchmark created by Epoch AI that tests whether an AI can recreate entire software programs without seeing the original code. It measures how well AI models understand and reproduce complex software. Reliable success on it could change how software is developed.
Read full story at The Decoder →
More briefs
News Outlets Sue OpenAI for Copyright Infringement
News outlets updated their lawsuit against OpenAI, saying Microsoft encouraged users to plagiarize their work. The news outlets claim OpenAI's chatbots distort their work by providing incomplete or inaccurate summaries. This hurts their ability to sell original content. Over 10 news outlets are involved in the lawsuit. The lawsuit could cost OpenAI billions of dollars in damages. The case will continue in court.
Hartford HealthCare Launches AI-Powered Chatbot
Hartford HealthCare has launched an AI-powered chatbot that interprets lab results and answers patients' questions based on their medical records. The tool is built directly into the patient portal, giving more personalized responses. The chatbot is significant because around 32% of adults nationwide use AI for health information or advice. This tool provides a seamless and real-time conversation with a patient's medical record. The new tool will help patients understand their health information better. It will be available 24/7 to help users interpret lab results and answer health questions.
Qualcomm Takes Aim at Nvidia in AI Chip Market
Qualcomm plans to challenge Nvidia's dominance in the AI chip market. CEO Cristiano Amon presented a five-year plan to investors to increase sales of AI components for data centers. Qualcomm's goal is to make over $15 billion from AI components by 2029. The company also expects to make $40 billion from businesses outside of handsets by 2029, double what was forecast two years ago. Qualcomm will offer power-efficient CPUs to stand out in the market, and it is also expanding into other areas like automotive and PC chips. The company just bought AI software company Modular for $3.9 billion. Its shares went up 15% after the announcement.
AI Agents Now Remember Conversations Across Days
AI agents can now remember details from previous interactions across days, a significant step up in capability. Agents were previously limited to single questions or short exchanges; the new memory features let agents such as NVIDIA's maintain context over extended periods. For instance, an agent can recall past user preferences and tailor responses accordingly. This matters to developers and researchers building AI systems that interact more naturally. By retaining information across multiple exchanges, these agents can provide more coherent and personalized assistance in applications like customer support or personal assistants. Looking ahead, expect further developments in memory retention and contextual understanding, potentially enabling even longer-term recall and more sophisticated conversational flows.
Google's New AI Tech Speeds Up Features on Pixel Phones
Google has found a way to make AI features like notification summaries and message proofreading faster and more efficient on its Pixel phones. The company retrofitted Multi-Token Prediction (MTP) onto existing Gemini Nano v3 models, which are "frozen" and optimized for mobile devices. This new architecture allows the phone's AI to generate multiple tokens of text at once, significantly reducing the time and energy it takes to perform these tasks. This advancement is particularly important because mobile devices have limited processing power and battery life compared to servers. Traditional language models generate one token at a time, creating a bottleneck that slows down performance and drains battery. By using MTP, Google claims that features like AI Notification Summaries and Proofread now work faster and consume less energy. For developers, this means they can build high-speed on-device AI features without needing to create separate, memory-heavy models for each task. The new approach is already available on the Pixel 9 and 10 series. Google says this marks a major step forward in making AI more accessible and efficient for everyday users.
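To see why emitting several tokens per step helps, here is a rough back-of-the-envelope sketch in Python. It is not Google's implementation: the function name, the 60-token summary length, and the figure of 3 tokens per step are all illustrative assumptions, and it assumes the ideal case in which every token predicted in a step is kept.

```python
import math


def decode_steps(num_tokens: int, tokens_per_step: int = 1) -> int:
    """Model passes needed to emit num_tokens when each pass yields tokens_per_step.

    Hypothetical helper for illustration only; assumes every predicted
    token is accepted, which real systems do not guarantee.
    """
    if num_tokens < 0 or tokens_per_step < 1:
        raise ValueError("num_tokens must be >= 0 and tokens_per_step >= 1")
    return math.ceil(num_tokens / tokens_per_step)


summary_length = 60  # assumed length of a notification summary, in tokens

baseline = decode_steps(summary_length, tokens_per_step=1)  # one token per pass
multi = decode_steps(summary_length, tokens_per_step=3)     # assumed 3 tokens per pass

print(f"one token per step:    {baseline} passes")
print(f"three tokens per step: {multi} passes")
```

Fewer passes through the model is where the time and energy savings would come from. In practice the gain depends on how many of the extra predicted tokens turn out to be usable.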