latentbrief

AI Struggles to Pick a Random City: Language Models Show Surprising Biases

LessWrong, arXiv CS.LG · 2 min brief

In brief

  • AI language models, while advanced, have a surprising flaw: they struggle to generate truly random outputs.
  • For instance, when asked to name a random weekday, Qwen3 chooses Wednesday 80% of the time.
  • Similarly, Gemma-3 names just four cities when asked for a random city, and model-generated multiple-choice questions often place the correct answer as option C.
  • This bias isn't just an amusing quirk: it has serious implications for tasks like synthetic data generation and creative problem-solving, where diversity is crucial.
  • Recent studies reveal that these biases stem from how models are trained.
  • They lack incentives to spread probability across diverse options, leading them to "collapse" onto narrow modes even when broader diversity is needed.
  • One study found that models' sampling from known distributions is heavily skewed, while Gu et al. highlighted the consistent positioning of correct answers in multiple-choice questions.
  • Efforts to address this issue have had mixed results.
  • One method involves having models first generate a random string and then manipulate it, which works in simple cases but struggles with complexity.
  • Another approach focuses on training models explicitly against known distributions to improve their stochastic behavior.
  • Early evaluations show promise in distributional fidelity and transfer, suggesting that better randomness could soon be on the horizon.
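
The skew and the "random string first" workaround described above can be illustrated with a minimal Python sketch. This is not the method from the cited studies. It assumes "weekday" means Monday through Friday, uses total variation distance from the uniform distribution as one possible skew measure, and uses `secrets.token_hex` as a stand-in for a model-generated random string. The helper names `total_variation_from_uniform` and `pick_from_string` are hypothetical.

```python
import hashlib
import secrets
from collections import Counter

# Assumption: "weekday" is read as Monday through Friday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]


def total_variation_from_uniform(samples, options):
    """Total variation distance between the empirical distribution of
    `samples` and the uniform distribution over `options`.
    0.0 means perfectly uniform; larger values mean more skew."""
    counts = Counter(samples)
    n = len(samples)
    uniform = 1.0 / len(options)
    return 0.5 * sum(abs(counts[o] / n - uniform) for o in options)


def pick_from_string(random_string, options):
    """Hypothetical helper: turn an arbitrary string into a choice by
    hashing it and reducing the digest modulo the number of options."""
    digest = hashlib.sha256(random_string.encode("utf-8")).digest()
    return options[int.from_bytes(digest, "big") % len(options)]


# Illustrative skewed sample (made-up counts): Wednesday in 80 of 100 draws.
skewed = ["Wednesday"] * 80 + [
    day for day in WEEKDAYS if day != "Wednesday" for _ in range(5)
]
skew_score = total_variation_from_uniform(skewed, WEEKDAYS)

# Stand-in for the workaround: draw a random string, then derive the
# answer from it instead of naming a day directly.
derived = [pick_from_string(secrets.token_hex(16), WEEKDAYS) for _ in range(5000)]
derived_score = total_variation_from_uniform(derived, WEEKDAYS)

print(f"skewed sample:  {skew_score:.3f}")
print(f"derived sample: {derived_score:.3f}")
```

In this sketch the skewed sample scores about 0.6, while the hash-derived sample should land close to 0. A real model would have to perform both the string generation and the mapping itself, which is where, per the brief, the approach struggles as tasks get more complex.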

Terms in this brief

Qwen3
A language model that has shown a tendency to select 'Wednesday' with high frequency when asked for a random weekday.
Gemma-3
A language model that names only four cities when asked for a random city.
