Benchmarking vision-language models for diagnostics in emergency and critical care settings
The applicability of vision-language models (VLMs) for acute care in emergency and intensive care units remains underexplored. Using a multimodal dataset of diagnostic questions involving medical images and clinical context, we benchmarked several small open-source VLMs against GPT-4o. While open models demonstrated limited diagnostic accuracy (up to 40.4%), GPT-4o significantly outperformed them (68.1%). Findings highlight the need for specialized training and optimization to improve open-source VLMs for acute care applications.
Saved in:
| Main authors: | Christoph F. Kurz, Tatiana Merzhevich, Bjoern M. Eskofier, Jakob Nikolas Kather, Benjamin Gmeiner |
|---|---|
| Document type: | Journal article (Editorial) |
| Language: | English |
| Published: | 10 July 2025 |
| In: | npj digital medicine, Year: 2025, Volume: 8, Pages: 1-6 |
| ISSN: | 2398-6352 |
| DOI: | 10.1038/s41746-025-01837-2 |
| Online access: | Publisher, free access, full text: https://doi.org/10.1038/s41746-025-01837-2; Publisher, free access, full text: https://www.nature.com/articles/s41746-025-01837-2 |
| Statement of authorship: | Christoph F. Kurz, Tatiana Merzhevich, Bjoern M. Eskofier, Jakob Nikolas Kather & Benjamin Gmeiner |
| Abstract: | The applicability of vision-language models (VLMs) for acute care in emergency and intensive care units remains underexplored. Using a multimodal dataset of diagnostic questions involving medical images and clinical context, we benchmarked several small open-source VLMs against GPT-4o. While open models demonstrated limited diagnostic accuracy (up to 40.4%), GPT-4o significantly outperformed them (68.1%). Findings highlight the need for specialized training and optimization to improve open-source VLMs for acute care applications. |
| Description: | Viewed on 21 November 2025 |
| Description: | Online resource |