latentbrief
Research · 2w ago

AI Models Show Promise in Creative Tasks

Simon Willison

In brief

  • Two AI models, Qwen3.6-35B-A3B from Alibaba and Claude Opus 4.7 from Anthropic, were tested on a creative benchmark: generating an image of a pelican riding a bicycle.
  • Qwen3.6-35B-A3B rendered the scene accurately, while Claude Opus 4.7 got the shape of the bicycle frame wrong on its first attempt and improved only slightly when prompted to think at the maximum level.
    • This highlights the growing capabilities of AI models in creative tasks despite their limitations.
  • Developers and researchers should continue monitoring advancements as AI pushes boundaries in image generation and beyond.

Terms in this brief

Qwen3.6-35B-A3B
A large language model developed by Alibaba. In this test it produced a more accurate image of a pelican riding a bicycle than Claude Opus 4.7.
Claude Opus 4.7
An AI model from Anthropic that was tested alongside Qwen3.6-35B-A3B. Its first attempt got the shape of the bicycle frame wrong, and the result improved slightly when the model was prompted to think at the maximum level.

Read full story at Simon Willison
