AI Agents Learn to Self-Heal Without Forgetting Old Skills
In brief
- AI researchers have developed two new systems that tackle a major issue in machine learning: forgetting old tasks when adapting to new ones.
- One system, called SOLAR, acts as an autonomous agent that improves itself by treating its model weights like an environment for exploration.
- It starts with a strong foundation of common-sense knowledge and uses multi-level reinforcement learning to adapt efficiently.
- The other system, CP-MoE, focuses on reducing forgetting by using a "transient expert" to guide updates into stable experts while preserving cross-task knowledge.
- These advancements are crucial for real-world applications where AI models must handle dynamic environments without losing previously learned skills.
- SOLAR excels in various reasoning tasks, including common-sense and medical problems, while CP-MoE shows promise in both text and visual understanding.
- Together, these systems mark a significant step toward creating AI that can learn continuously and adapt over time.
- Researchers will likely continue refining both approaches to handle more complex real-world scenarios, moving AI closer to true lifelong learning.
Terms in this brief
- SOLAR
- A system that enables AI agents to improve themselves by treating their model weights as an environment for exploration. It starts with a strong foundation of common-sense knowledge and uses multi-level reinforcement learning to adapt efficiently without forgetting old tasks.
- CP-MoE
- Stands for 'Curriculum Progressive Multi-Expert'. This system reduces forgetting by using a 'transient expert' to guide updates into stable experts while preserving cross-task knowledge, so the model retains earlier skills as it adapts to new tasks.
Read full story at arXiv CS.AI →, arXiv CS.LG →
More briefs
AI Risk Analysis Faces Flaws: A Mathematical Perspective
Recent analysis highlights a critical flaw in how the risks of advanced AI systems are assessed. The problem stems from "counting arguments," a style of mathematical reasoning often used to estimate the likelihood of dangerous outcomes. These arguments compare the number of possible "bad" scenarios against the "good" ones and conclude that harmful AI behaviors are more probable because there are more of them. The approach is flawed because it does not account for how AI systems are actually trained and deployed. For example, when calculating the probability of an AI turning hostile, the method assumes a uniform distribution over possible objectives, which is not realistic. This oversimplification can exaggerate fears about AI risk by ignoring the specific constraints and biases inherent in real-world training processes. Looking ahead, experts suggest that more nuanced models are needed to assess AI behavior accurately. Instead of relying on simplistic counting, researchers should focus on understanding how AI goals align with human values during development. This shift could provide a more balanced view of AI's potential risks and benefits.
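To make the objection concrete, the toy sketch below contrasts a counting estimate with a probability-weighted one. The objective labels and the weights are invented purely for illustration and do not come from the analysis itself; the point is only that the two estimates diverge once the distribution over objectives is not uniform.

```python
from fractions import Fraction

# Hypothetical toy space of objectives: 9 labelled "bad", 1 labelled "good".
objectives = ["bad"] * 9 + ["good"]


def counting_estimate(labels):
    """Share of 'bad' outcomes when every objective is assumed equally likely."""
    return Fraction(labels.count("bad"), len(labels))


def weighted_estimate(labels, weights):
    """Probability of 'bad' when objectives carry unequal weight."""
    total = sum(weights)
    bad = sum(w for label, w in zip(labels, weights) if label == "bad")
    return Fraction(bad, total)


# Assumed weights: training places 91 of 100 units of mass on the good objective.
weights = [1] * 9 + [91]

uniform_risk = counting_estimate(objectives)
weighted_risk = weighted_estimate(objectives, weights)
print(uniform_risk, weighted_risk)  # 9/10 9/100
```

Under the uniform assumption the "bad" share is 9/10, while the same outcome space with the assumed non-uniform weights gives 9/100. Nothing about the count of scenarios changed, only the distribution placed over them.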
AI System Fails at Graduation Ceremony
Glendale Community College in Arizona used an artificial intelligence system to read names at its graduation ceremony. The system skipped some graduates, prompting loud booing from students. The college president apologized for the disruption and explained that the students who were not called would be able to come onstage for their photo. The incident comes at a time when AI is a sensitive topic among students, with some commencement speakers being booed for mentioning it in their speeches. The college will review its use of AI technology and will likely consider alternative methods for future ceremonies.
AI Powers Transnational Repression
China and other states use AI to silence critics abroad, monitoring and intimidating people across borders. This affects about 150 million people worldwide, many of them dissidents or human rights defenders. AI makes it easier to track and target them. AI-powered repression will likely grow and become more complex.
AI Governance Gap Exposed
New autonomous artificial intelligence systems are making real-time decisions in defense, healthcare and other fields. These systems need a runtime governance layer to ensure they follow the rules. Traditional governance tools are not effective here because AI systems are stochastic and context-sensitive. The EU AI Act and other regulations require ongoing oversight of actual system behavior, and a runtime governance layer can provide this. Next, developers will focus on creating such a layer to ensure AI systems operate within established boundaries.
AI Safety Breakthrough: Early Results Show Dramatic Improvement in Model Behavior
AI researchers have achieved a significant milestone in improving the safety of large language models. By introducing a new pretraining method called Synthetic Persona Pretraining (SPP), they've reduced the mean attack success rate across five adversarial benchmarks by 63%. This approach involves adding value-laden reflections to 10% of training documents, effectively instilling desired behaviors during pretraining rather than relying on models to learn them post-training. The innovation lies in "persona binding," where models generalize their learned values even when faced with unseen scenarios. Initial tests show remarkable consistency, suggesting that this method could lead to safer AI systems capable of handling a broader range of ethical dilemmas. The team is scaling up the research to larger models with 3B parameters and 500B tokens, aiming to further refine these findings. This development marks an important step toward more reliable AI systems, offering a promising direction for future research in AI safety.
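The data-preparation step described above, attaching reflections to 10% of training documents, might look roughly like the sketch below. This is an illustration under stated assumptions, not the researchers' pipeline: the function names, the random selection rule, and the placeholder reflection text are all hypothetical, since the brief does not say how documents are chosen or how reflections are written.

```python
import random


def augment_corpus(documents, reflection_for, fraction=0.10, seed=0):
    """Append a reflection to a random `fraction` of documents; leave the rest unchanged."""
    rng = random.Random(seed)
    n_augment = round(len(documents) * fraction)
    chosen = set(rng.sample(range(len(documents)), n_augment))
    return [
        doc + "\n\n" + reflection_for(doc) if i in chosen else doc
        for i, doc in enumerate(documents)
    ]


def placeholder_reflection(doc):
    # Hypothetical stand-in: the brief does not describe how reflections are generated.
    return "[Reflection] A note on the values at stake in the text above."


corpus = [f"document {i}" for i in range(100)]
augmented = augment_corpus(corpus, placeholder_reflection)
changed = sum(a != b for a, b in zip(augmented, corpus))
print(changed)  # 10
```

With a 100-document toy corpus, exactly 10 documents gain a reflection and the other 90 pass through untouched. A fixed seed keeps the selection reproducible between runs.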