latentbrief
Research · 1d ago

AI Models Show Unexpected Inconsistencies in Decision-Making

LessWrong

In brief

  • AI models like Claude Opus, DeepSeek V4-Pro, Google Gemini, and OpenAI GPT have shown surprising inconsistencies when making decisions.
  • In a study spanning over 25,000 API calls across four models, researchers found that the same model could pick one option when asked to choose, yet rank those same options differently when asked to value them.
  • For example, when asked which sales lead to pursue first, models typically chose the safer option, yet when asked to estimate potential earnings, they assigned higher value to the riskier but potentially more rewarding lead.
    • This mirrors classic human decision-making biases observed decades ago.
  • The study tested various prompt formats and reasoning settings, revealing that even at their most advanced, AI models still struggle with consistent judgment.
  • In one prompt format, the inconsistency rate dropped from 48.4% to 30.7% when reasoning effort was set to its highest level.
  • Even so, the reversal pattern persisted: models chose the safer bet when deciding, while rating the riskier, higher-reward option as more valuable.
  • Looking ahead, researchers suggest that these inconsistencies could impact how AI is used in real-world applications like business decisions or financial advice.
  • As AI becomes more integrated into daily life, understanding and addressing these biases will be crucial for ensuring reliable and ethical outcomes.
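The measurement the study describes, asking the same model to choose between two options in one prompt and to value them in another, then counting disagreements, can be sketched as follows. This is a minimal illustration, not the researchers' actual harness: `ask_model` is a hypothetical stand-in that simulates the reported bias rather than calling a real API.

```python
import random

def ask_model(prompt_kind, rng):
    # Hypothetical stand-in for a model API call. It simulates the
    # reported bias: pick the safe lead when choosing, but rate the
    # risky lead higher when estimating expected earnings.
    if prompt_kind == "choice":
        return "safe" if rng.random() < 0.7 else "risky"
    else:  # "valuation": which lead has higher expected earnings?
        return "risky" if rng.random() < 0.7 else "safe"

def inconsistency_rate(n_trials, seed=0):
    # Run paired prompts and count how often the chosen option
    # disagrees with the option the model values more highly.
    rng = random.Random(seed)
    reversals = 0
    for _ in range(n_trials):
        chosen = ask_model("choice", rng)
        valued = ask_model("valuation", rng)
        if chosen != valued:  # preference reversal between formats
            reversals += 1
    return reversals / n_trials

rate = inconsistency_rate(10_000)
print(f"simulated reversal rate: {rate:.1%}")
```

With the simulated 70/30 split in each direction, independent draws give a reversal probability of 0.7 × 0.7 + 0.3 × 0.3 = 0.58, so the printed rate lands near 58%; the real study's rates came from actual model responses, not a simulation like this.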

Terms in this brief

Claude Opus
A specific version or iteration of the Claude AI model developed by Anthropic, known for its advanced capabilities in decision-making and reasoning. This study highlights that even advanced models like Claude Opus can exhibit unexpected inconsistencies in their judgments.
DeepSeek V4-Pro
A sophisticated AI model created by the Chinese company DeepSeek, noted for its performance in various tasks including problem-solving and decision-making. The research indicates that this model also shows surprising inconsistencies in its decision-making processes.
Gemini
A state-of-the-art AI model developed by Google, known for its multi-task capabilities and advanced reasoning skills. The study reveals that Gemini, like other models, struggles with consistent judgment across different scenarios.

Read full story at LessWrong
