Navigating prevalence shifts in image analysis algorithm deployment
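| Main Authors: | Patrick Godau, Piotr Kalinowski, Evangelia Christodoulou, Annika Reinke, Minu Tizabi, Luciana Ferrer, Paul Jäger, Lena Maier-Hein |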
|---|---|
| Format: | Article (Journal) |
| Language: | English |
| Published: | May 2025 |
| In: | Medical image analysis, Year: 2025, Volume: 102, Pages: 1-14 |
| ISSN: | 1361-8423 |
| DOI: | 10.1016/j.media.2025.103504 |
| Online Access: | Publisher, free of charge, full text: https://doi.org/10.1016/j.media.2025.103504; Publisher, free of charge, full text: https://www.sciencedirect.com/science/article/pii/S1361841525000520 |
| Author Notes: | Patrick Godau, Piotr Kalinowski, Evangelia Christodoulou, Annika Reinke, Minu Tizabi, Luciana Ferrer, Paul Jäger, Lena Maier-Hein |
| Summary: | Domain gaps are significant obstacles to the clinical implementation of machine learning (ML) solutions for medical image analysis. Although current research emphasizes new training methods and network architectures, the specific impact of prevalence shifts on algorithms in real-world applications is often overlooked. Differences in class frequencies between development and deployment data are crucial, particularly for the widespread adoption of artificial intelligence (AI), as disease prevalence can vary greatly across different times and locations. Our contribution is threefold. Based on a diverse set of 30 medical classification tasks (1) we demonstrate that lack of prevalence shift handling can have severe consequences on the quality of calibration, decision threshold, and performance assessment. Furthermore, (2) we show that prevalences can be accurately and reliably estimated in a data-driven manner. Finally, (3) we propose a new workflow for prevalence-aware image classification that uses estimated deployment prevalences to adjust a trained classifier to a new environment, without requiring additional annotated deployment data. Comprehensive experiments indicate that our proposed approach could contribute to generating better classifier decisions and more reliable performance estimates compared to current practice. |
| Item Description: | Available online 19 February 2025; article version 27 February 2025; viewed on 28 July 2025 |
| Physical Description: | Online Resource |
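
The summary describes adjusting a trained classifier to a new environment using estimated deployment prevalences, without annotated deployment data. The record does not give the authors' procedure, so the following is only a minimal sketch of one common way such an adjustment can work, not the method from the article. It assumes calibrated class probabilities, prevalence shift only (class-conditional image distributions unchanged), and strictly positive training prevalences. The function names `adjust_posteriors` and `estimate_prevalence_em` are hypothetical, and the prevalence estimate uses a standard expectation-maximization scheme for prior shift.

```python
def adjust_posteriors(probs, train_prev, deploy_prev):
    """Re-weight one sample's class probabilities for a change in class prevalence.

    Assumes only the class frequencies differ between development and
    deployment data, and that every training prevalence is > 0.
    """
    weights = [d / t for d, t in zip(deploy_prev, train_prev)]
    scaled = [p * w for p, w in zip(probs, weights)]
    total = sum(scaled)
    return [s / total for s in scaled]


def estimate_prevalence_em(prob_rows, train_prev, n_iter=100, tol=1e-8):
    """Estimate deployment prevalences from unlabeled classifier outputs.

    E-step: adjust each sample's probabilities with the current estimate.
    M-step: average the adjusted probabilities per class.
    """
    prev = list(train_prev)
    for _ in range(n_iter):
        adjusted = [adjust_posteriors(row, train_prev, prev) for row in prob_rows]
        new_prev = [sum(col) / len(adjusted) for col in zip(*adjusted)]
        if max(abs(a - b) for a, b in zip(new_prev, prev)) < tol:
            return new_prev
        prev = new_prev
    return prev


if __name__ == "__main__":
    # Hypothetical outputs of a binary classifier trained on balanced data.
    rows = [[0.1, 0.9]] * 8 + [[0.9, 0.1]] * 2
    deploy_prev = estimate_prevalence_em(rows, train_prev=[0.5, 0.5])
    print("estimated deployment prevalences:", deploy_prev)
    print("adjusted prediction:", adjust_posteriors([0.5, 0.5], [0.5, 0.5], deploy_prev))
```

In this sketch the adjusted probabilities, rather than the raw classifier outputs, would feed any decision threshold applied in the deployment setting.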