ChatGPT versus Gemini on radiology training exams

ChatGPT slightly outperformed Gemini when tasked with completing radiology training exams; however, both large language models struggled with accuracy on image interpretation, according to a study published March 20 in Cureus.

Researchers from Tampa, Fla.-based USF Health Morsani College of Medicine analyzed how OpenAI’s ChatGPT-4o and Google DeepMind’s Gemini Advanced performed on the American College of Radiology’s 2022 Diagnostic Radiology In-Training (DXIT) exam.

Here are five things to know from their findings:

  1. The DXIT exam required analysis of both written and image-based content spanning several radiological subspecialties. The exam was administered to the LLMs as 106 multiple-choice questions.

  2. Though ChatGPT-4o exhibited a higher overall accuracy of 69.8%, compared to Gemini Advanced’s 60.4%, the difference was “not statistically significant,” researchers said.

    ChatGPT-4o was more accurate in the cardiac and nuclear radiology subspecialties.

  3. Both ChatGPT-4o and Gemini Advanced showed similar accuracy on text-based questions, at 88.1% and 85.7%, respectively.

  4. ChatGPT-4o’s accuracy on image-based questions was 57.8%, higher than Gemini Advanced’s 43.8%.

  5. Both models were more accurate on written questions than on image-based ones, highlighting the need for further training on image interpretation, the study authors said.
