Editorial · AI Safety

The Hidden Cost of AI Security: Why Your Model Might Be More Vulnerable Than You Think

June 8, 2026 · 3 min brief

The AI revolution has brought unprecedented capabilities to businesses and individuals alike. From automating routine tasks to solving complex problems, AI systems have become integral to modern life. However, as these systems grow more powerful, so do the risks they pose. Recent findings from Cisco's research on AI security reveal a concerning truth: even the most advanced models are far more vulnerable than their creators claim.

In a groundbreaking study, Cisco tested 15 leading AI models across various scenarios. The results were startling. Single-turn attack success rates ranged from 2.7% to 64.9%, but multi-turn attacks pushed rates as high as 88.3%. This disparity underscores a critical flaw in how we assess AI security: models that pass initial tests with flying colors often crumble under sustained pressure. For instance, OpenAI's GPT-5.4 showed a nine-fold increase in vulnerability when attacked over multiple turns. Similarly, the attack success rate against Google's Gemini 3 Pro jumped from 18.1% to 73.4%, roughly a fourfold rise. These numbers suggest that single-turn metrics alone understate how exposed a model really is.
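To make the size of that gap concrete, here is a minimal Python sketch of the arithmetic, using only the Gemini 3 Pro figures quoted above. The function names `fold_increase` and `point_gap` are illustrative choices for this editorial, not terms from Cisco's report.

```python
def fold_increase(single_turn_asr: float, multi_turn_asr: float) -> float:
    """Ratio of multi-turn to single-turn attack success rate (both in percent)."""
    if single_turn_asr <= 0:
        raise ValueError("single-turn rate must be positive to form a ratio")
    return multi_turn_asr / single_turn_asr


def point_gap(single_turn_asr: float, multi_turn_asr: float) -> float:
    """Absolute difference between the two rates, in percentage points."""
    return multi_turn_asr - single_turn_asr


# Figures quoted in the text for Gemini 3 Pro (percent).
gemini_single, gemini_multi = 18.1, 73.4

gemini_ratio = round(fold_increase(gemini_single, gemini_multi), 2)
gemini_gap = round(point_gap(gemini_single, gemini_multi), 1)

print(f"fold increase: {gemini_ratio}x")   # about 4.06x
print(f"gap: {gemini_gap} points")         # 55.3 points
```

Reporting both numbers matters: a ratio alone can make a model with a tiny single-turn rate look catastrophic, while a point gap alone hides how misleading the single-turn figure was.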

The issue lies not just in the models themselves but in how we test them. Traditional benchmarks focus on isolated interactions, ignoring the reality of persistent attacks. Cisco's research highlights this gap, showing that many models perform significantly worse when faced with multi-turn attacks. This shift in perspective is crucial for understanding the true state of AI security.

Moreover, the report exposes another layer of complexity: deployment-time configurations. When xAI's Grok 4.1 Fast was tested with reasoning mode enabled, the attack success rate against it dropped from 88.3% to 43.5%. The same model, in other words, can be roughly twice as exposed depending on a single setting. This points to a broader problem in how models are configured and managed once deployed. The current lack of transparency around these settings leaves organizations vulnerable to exploitation.

The implications of this research are profound. Businesses relying on AI must reevaluate their security strategies. Single-turn metrics, while useful for initial assessments, provide an incomplete picture of a model's resilience. Multi-turn attacks represent a real-world scenario that cannot be ignored. The same models trusted with sensitive tasks could be manipulated to produce harmful outputs through persistent interaction.

Looking ahead, the future of AI security requires a fundamental shift in how we approach testing and deployment. Organizations must demand more comprehensive reporting from vendors, including attack success rates across different strategies. Additionally, they should implement stricter thresholds for model deployment, ensuring that only those with proven resilience are put into production.
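What such a threshold might look like in practice can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `AttackReport` structure, the `passes_gate` function, the threshold values, and the two example models are hypothetical and do not come from Cisco's report or from any vendor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackReport:
    """Vendor-reported attack success rates for one model, in percent (hypothetical schema)."""
    model: str
    single_turn_asr: float
    multi_turn_asr: float


def passes_gate(report: AttackReport,
                max_single_turn: float = 10.0,
                max_multi_turn: float = 20.0) -> bool:
    """Approve a model only if BOTH rates are at or below their limits.

    The default limits are placeholders; a real policy would set them
    according to the sensitivity of the task.
    """
    return (report.single_turn_asr <= max_single_turn
            and report.multi_turn_asr <= max_multi_turn)


# Made-up reports, not measured results.
looks_safe = AttackReport("example-model-a", single_turn_asr=5.0, multi_turn_asr=60.0)
resilient = AttackReport("example-model-b", single_turn_asr=4.0, multi_turn_asr=15.0)

for report in (looks_safe, resilient):
    verdict = "deploy" if passes_gate(report) else "hold"
    print(f"{report.model}: {verdict}")
```

The first example model would clear a single-turn-only check yet is held here, which is exactly the failure mode that multi-turn reporting is meant to catch.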

The road to secure AI is fraught with challenges, but the rewards are immense. By acknowledging these vulnerabilities and taking proactive steps, we can build a future where AI enhances security rather than undermines it. The clock is ticking, and the stakes could not be higher.

Editorial perspective - synthesised analysis, not factual reporting.
