Navigating prevalence shifts in image analysis algorithm deployment

Bibliographic Details
Main Authors: Godau, Patrick (Author), Kalinowski, Piotr (Author), Christodoulou, Evangelia (Author), Reinke, Annika (Author), Tizabi, Minu (Author), Ferrer, Luciana (Author), Jäger, Paul (Author), Maier-Hein, Lena (Author)
Format: Article (Journal)
Language: English
Published: May 2025
In: Medical image analysis
Year: 2025, Volume: 102, Pages: 1-14
ISSN: 1361-8423
DOI: 10.1016/j.media.2025.103504
Online Access: Publisher, free of charge, full text: https://doi.org/10.1016/j.media.2025.103504
Publisher, free of charge, full text: https://www.sciencedirect.com/science/article/pii/S1361841525000520
Author Notes: Patrick Godau, Piotr Kalinowski, Evangelia Christodoulou, Annika Reinke, Minu Tizabi, Luciana Ferrer, Paul Jäger, Lena Maier-Hein
Description
Summary:Domain gaps are significant obstacles to the clinical implementation of machine learning (ML) solutions for medical image analysis. Although current research emphasizes new training methods and network architectures, the specific impact of prevalence shifts on algorithms in real-world applications is often overlooked. Differences in class frequencies between development and deployment data are crucial, particularly for the widespread adoption of artificial intelligence (AI), as disease prevalence can vary greatly across different times and locations. Our contribution is threefold. Based on a diverse set of 30 medical classification tasks (1) we demonstrate that lack of prevalence shift handling can have severe consequences on the quality of calibration, decision threshold, and performance assessment. Furthermore, (2) we show that prevalences can be accurately and reliably estimated in a data-driven manner. Finally, (3) we propose a new workflow for prevalence-aware image classification that uses estimated deployment prevalences to adjust a trained classifier to a new environment, without requiring additional annotated deployment data. Comprehensive experiments indicate that our proposed approach could contribute to generating better classifier decisions and more reliable performance estimates compared to current practice.
Item Description: Available online 19 February 2025, version of the article 27 February 2025
Viewed on 28 July 2025
Physical Description: Online Resource
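
Illustration: The summary describes estimating deployment prevalences from data and using them to adjust a trained classifier without annotated deployment data. This record does not say which estimator or adjustment the article uses, so the following is only a minimal sketch of one common approach under a pure prevalence-shift assumption (class-conditional image distributions unchanged, posteriors reasonably calibrated). It re-weights posteriors by the ratio of deployment to development prevalence and estimates the deployment prevalence with a standard expectation-maximization loop. The function names `adjust_posteriors` and `estimate_prevalence_em` are hypothetical and chosen for this example only.

```python
from typing import List, Sequence


def adjust_posteriors(
    posteriors: Sequence[float],
    train_prev: Sequence[float],
    deploy_prev: Sequence[float],
) -> List[float]:
    """Re-weight one sample's class posteriors for a new prevalence.

    Assumes only the class frequencies changed between development
    and deployment (prior shift), not the appearance of each class.
    """
    weighted = [p * d / t for p, t, d in zip(posteriors, train_prev, deploy_prev)]
    total = sum(weighted)
    return [w / total for w in weighted]


def estimate_prevalence_em(
    posteriors: Sequence[Sequence[float]],
    train_prev: Sequence[float],
    n_iter: int = 100,
    tol: float = 1e-8,
) -> List[float]:
    """Estimate deployment prevalences from unlabeled posteriors via EM.

    Illustrative assumption: this is a generic EM scheme, not
    necessarily the estimator evaluated in the article.
    """
    prev = list(train_prev)  # start from the development prevalence
    for _ in range(n_iter):
        # E-step: adjust every posterior to the current prevalence estimate.
        adjusted = [adjust_posteriors(p, train_prev, prev) for p in posteriors]
        # M-step: the new estimate is the mean adjusted posterior per class.
        new_prev = [sum(col) / len(adjusted) for col in zip(*adjusted)]
        converged = max(abs(a - b) for a, b in zip(new_prev, prev)) < tol
        prev = new_prev
        if converged:
            break
    return prev


if __name__ == "__main__":
    # Hypothetical posteriors from a classifier developed on balanced data.
    unlabeled = [[0.9, 0.1], [0.8, 0.2], [0.9, 0.1], [0.2, 0.8]]
    dev_prev = [0.5, 0.5]
    est = estimate_prevalence_em(unlabeled, dev_prev)
    print("estimated deployment prevalence:", est)
    print("adjusted posterior:", adjust_posteriors(unlabeled[3], dev_prev, est))
```

The adjusted posteriors can then be thresholded as usual. The sketch needs no deployment labels, but it relies on the stated assumptions holding.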