Benchmarking vision-language models for diagnostics in emergency and critical care settings
The applicability of vision-language models (VLMs) for acute care in emergency and intensive care units remains underexplored. Using a multimodal dataset of diagnostic questions involving medical images and clinical context, we benchmarked several small open-source VLMs against GPT-4o. While open mo...
| Main authors: | Christoph F. Kurz, Tatiana Merzhevich, Bjoern M. Eskofier, Jakob Nikolas Kather, Benjamin Gmeiner |
|---|---|
| Document type: | Article (Journal) Editorial |
| Language: | English |
| Published: | 10 July 2025 |
| In: | npj digital medicine, Year: 2025, Volume: 8, Pages: 1-6 |
| ISSN: | 2398-6352 |
| DOI: | 10.1038/s41746-025-01837-2 |
| Online access: | Publisher, free of charge, full text: https://doi.org/10.1038/s41746-025-01837-2 Publisher, free of charge, full text: https://www.nature.com/articles/s41746-025-01837-2 |
| Statement of responsibility: | Christoph F. Kurz, Tatiana Merzhevich, Bjoern M. Eskofier, Jakob Nikolas Kather & Benjamin Gmeiner |
MARC
| LEADER | 00000naa a2200000 c 4500 | ||
|---|---|---|---|
| 001 | 1941747612 | ||
| 003 | DE-627 | ||
| 005 | 20251121084144.0 | ||
| 007 | cr uuu---uuuuu | ||
| 008 | 251121s2025 xx |||||o 00| ||eng c | ||
| 024 | 7 | |a 10.1038/s41746-025-01837-2 |2 doi | |
| 035 | |a (DE-627)1941747612 | ||
| 035 | |a (DE-599)KXP1941747612 | ||
| 040 | |a DE-627 |b ger |c DE-627 |e rda | ||
| 041 | |a eng | ||
| 084 | |a 33 |2 sdnb | ||
| 100 | 1 | |a Kurz, Christoph |d 1981- |e VerfasserIn |0 (DE-588)1204607125 |0 (DE-627)1690006889 |4 aut | |
| 245 | 1 | 0 | |a Benchmarking vision-language models for diagnostics in emergency and critical care settings |c Christoph F. Kurz, Tatiana Merzhevich, Bjoern M. Eskofier, Jakob Nikolas Kather & Benjamin Gmeiner |
| 264 | 1 | |c 10 July 2025 | |
| 300 | |b Illustrationen, Diagramme | ||
| 300 | |a 6 | ||
| 336 | |a Text |b txt |2 rdacontent | ||
| 337 | |a Computermedien |b c |2 rdamedia | ||
| 338 | |a Online-Ressource |b cr |2 rdacarrier | ||
| 500 | |a Gesehen am 21.11.2025 | ||
| 520 | |a The applicability of vision-language models (VLMs) for acute care in emergency and intensive care units remains underexplored. Using a multimodal dataset of diagnostic questions involving medical images and clinical context, we benchmarked several small open-source VLMs against GPT-4o. While open models demonstrated limited diagnostic accuracy (up to 40.4%), GPT-4o significantly outperformed them (68.1%). Findings highlight the need for specialized training and optimization to improve open-source VLMs for acute care applications. | ||
| 650 | 4 | |a Computational biology and bioinformatics | |
| 650 | 4 | |a Health care | |
| 650 | 4 | |a Medical research | |
| 700 | 1 | |a Merzhevich, Tatiana |e VerfasserIn |4 aut | |
| 700 | 1 | |a Eskofier, Bjoern M. |e VerfasserIn |4 aut | |
| 700 | 1 | |a Kather, Jakob Nikolas |d 1989- |e VerfasserIn |0 (DE-588)1064064914 |0 (DE-627)812897587 |0 (DE-576)423589091 |4 aut | |
| 700 | 1 | |a Gmeiner, Benjamin |e VerfasserIn |4 aut | |
| 773 | 0 | 8 | |i Enthalten in |t npj digital medicine |d [Basingstoke] : Macmillan Publishers Limited, 2016 |g 8(2025), Artikel-ID 423, Seite 1-6 |h Online-Ressource |w (DE-627)1016587104 |w (DE-600)2925182-5 |w (DE-576)501513582 |x 2398-6352 |7 nnas |a Benchmarking vision-language models for diagnostics in emergency and critical care settings |
| 773 | 1 | 8 | |g volume:8 |g year:2025 |g elocationid:423 |g pages:1-6 |g extent:6 |a Benchmarking vision-language models for diagnostics in emergency and critical care settings |
| 856 | 4 | 0 | |u https://doi.org/10.1038/s41746-025-01837-2 |x Verlag |x Resolving-System |z kostenfrei |3 Volltext |
| 856 | 4 | 0 | |u https://www.nature.com/articles/s41746-025-01837-2 |x Verlag |z kostenfrei |3 Volltext |
| 951 | |a AR | ||
| 992 | |a 20251121 | ||
| 993 | |a Editorial | ||
| 994 | |a 2025 | ||
| 998 | |g 1064064914 |a Kather, Jakob Nikolas |m 1064064914:Kather, Jakob Nikolas |d 910000 |d 910100 |e 910000PK1064064914 |e 910100PK1064064914 |k 0/910000/ |k 1/910000/910100/ |p 4 | ||
| 999 | |a KXP-PPN1941747612 |e 4809768066 | ||
| BIB | |a Y | ||
| SER | |a journal | ||
| JSO | |a {"recId":"1941747612","physDesc":[{"noteIll":"Illustrationen, Diagramme","extent":"6 S."}],"title":[{"title":"Benchmarking vision-language models for diagnostics in emergency and critical care settings","title_sort":"Benchmarking vision-language models for diagnostics in emergency and critical care settings"}],"note":["Gesehen am 21.11.2025"],"type":{"bibl":"article-journal","media":"Online-Ressource"},"language":["eng"],"relHost":[{"origin":[{"publisherPlace":"[Basingstoke]","dateIssuedDisp":"[2016]-","publisher":"Macmillan Publishers Limited"}],"pubHistory":["2016-"],"type":{"bibl":"periodical","media":"Online-Ressource"},"language":["eng"],"note":["Gesehen am 06. September 2019"],"id":{"issn":["2398-6352"],"eki":["1016587104"],"zdb":["2925182-5"]},"title":[{"title":"npj digital medicine","title_sort":"npj digital medicine"}],"part":{"pages":"1-6","year":"2025","text":"8(2025), Artikel-ID 423, Seite 1-6","volume":"8","extent":"6"},"physDesc":[{"extent":"Online-Ressource"}],"recId":"1016587104","disp":"Benchmarking vision-language models for diagnostics in emergency and critical care settingsnpj digital medicine"}],"person":[{"family":"Kurz","role":"aut","given":"Christoph","display":"Kurz, Christoph"},{"given":"Tatiana","role":"aut","family":"Merzhevich","display":"Merzhevich, Tatiana"},{"display":"Eskofier, Bjoern M.","family":"Eskofier","given":"Bjoern M.","role":"aut"},{"display":"Kather, Jakob Nikolas","family":"Kather","role":"aut","given":"Jakob Nikolas"},{"family":"Gmeiner","role":"aut","given":"Benjamin","display":"Gmeiner, Benjamin"}],"origin":[{"dateIssuedDisp":"10 July 2025","dateIssuedKey":"2025"}],"id":{"eki":["1941747612"],"doi":["10.1038/s41746-025-01837-2"]},"name":{"displayForm":["Christoph F. Kurz, Tatiana Merzhevich, Bjoern M. Eskofier, Jakob Nikolas Kather & Benjamin Gmeiner"]}} | ||
| SRT | |a KURZCHRISTBENCHMARKI1020 | ||