latentbrief
Research · 1w ago

AI in Science Faces Big Challenges

arXiv CS.AI

In brief

  • AI systems based on large language models (LLMs) are increasingly being used for scientific research, but a new study reveals major flaws in how they approach scientific reasoning.
  • The research tested these AI agents across eight different areas of science through over 25,000 experiments and found that the base model (the core AI underlying them) accounts for most of their performance and behavior.
  • The study highlights worrying trends: the agents ignore evidence in 68% of cases, and they revise their beliefs after encountering a refutation only 26% of the time.
  • Even when given clear examples of successful reasoning as context, the agents still struggle to improve their methods.
    • This means that while AI can execute scientific workflows, it doesn’t yet mimic the self-correcting nature of human scientific inquiry.
  • For now, outcomes alone aren’t enough to spot these failures, and simply tweaking an agent’s architecture won’t fix them.
  • The real issue lies in how the AI learns to reason.
  • Until training focuses specifically on improving reasoning skills, the reliability of AI-generated scientific knowledge remains questionable.
  • Watch for future developments in AI training methods that directly target scientific thinking.

Read full story at arXiv CS.AI
