latentbrief
Research, 4h ago

AI Tools Fail to Impress in Medical Practice Tests

Nature, 1 min brief

In brief

  • Clinical AI tools are being used in medical practice despite a lack of independent evaluation.
    • Three tests compared these tools with general-purpose language models.
  • Both kinds of system were given 500 medical questions and 500 items assessing their agreement with expert clinicians.
  • Both also received 100 real clinical queries from physicians.
  • The general-purpose language models performed better in all three tests.
    • This suggests that clinical AI tools may be less effective than claimed, and that independent evaluation is needed before they are used in medical practice.
  • Further evaluations of these tools are planned.

Terms in this brief

independent evaluation
Independent evaluation is the testing or assessment of AI tools by parties that have no stake in the tools' success. This helps keep the assessment of a tool's performance and effectiveness unbiased and objective.

Read full story at Nature
