Children represent less than 1% of publicly available medical imaging data, according to a preprint study published June 10 on medRxiv.
To analyze the inclusion of children’s imaging data, researchers reviewed 181 public medical imaging datasets from The Cancer Imaging Archive, Stanford AI in Medicine & Imaging, the UK Biobank, the Medical Imaging and Data Resource Center and OpenNeuro.
They also reviewed 46 studies accepted to the Medical Imaging with Deep Learning conference from 2023 and 2024 to understand how pediatric data is represented in medical AI research.
Here are five notes on the analysis:
- Of the 181 datasets, 3.3% were pediatric only and 14.4% contained both adult and pediatric data.
- Among datasets with “sufficient” patient age information, children represented less than 1% of patients.
Children represented between 1% and 2% of patients within the Medical Imaging and Data Resource Center, The Cancer Imaging Archive and the Stanford AI in Medicine & Imaging datasets.
Among the 30 most-cited datasets, children represented 0.8% of patients.
- Pediatric imaging data gaps varied by modality, with one pediatric ultrasound image for every six adult images, one pediatric CT scan for 302 adult scans and one pediatric MRI for 295 adult MRIs.
- Only one of the 46 studies accepted by the Medical Imaging with Deep Learning conference in 2023 and 2024 targeted pediatrics.
“These findings suggest a lack of pediatric AI applications being actively worked on in the research community, potentially driven by the pediatric data gap,” the study authors wrote. “In the absence of pediatric AI models, practitioners may opt for the off-label use of adult AI models.”
- The “glaring underrepresentation” of pediatric patients in publicly available medical imaging datasets limits the development of safe AI for pediatric use, the study authors wrote.
Read the full preprint study here.