Comparing ChatGPT-3.5 and ChatGPT-4’s alignments with the German evidence-based S3 guideline for adult soft tissue sarcoma

Clinical reliability assessment of large language models is necessary due to their increasing use in healthcare. This study assessed the performance of ChatGPT-3.5 and ChatGPT-4 in answering questions derived from the German evidence-based S3 guideline for adult soft tissue sarcoma (STS). Responses...

Bibliographic details
Main authors:	Li, Cheng-Peng (author); Jakob, Jens (author); Menge, Franka (author); Reißfelder, Christoph (author); Hohenberger, Peter (author); Yang, Cui (author)
Document type:	Article (Journal)
Language:	English
Published:	December 20, 2024
In:	iScience Year: 2024, Volume: 27, Issue: 12, Pages: 1-9
ISSN:	2589-0042
DOI:	10.1016/j.isci.2024.111493
Online access:	Publisher, free of charge, full text: https://doi.org/10.1016/j.isci.2024.111493 Publisher, free of charge, full text: https://www.sciencedirect.com/science/article/pii/S2589004224027202
Statement of responsibility:	Cheng-Peng Li, Jens Jakob, Franka Menge, Christoph Reißfelder, Peter Hohenberger, and Cui Yang

MARC


LEADER	00000caa a2200000 c 4500
001	1922901814
003	DE-627
005	20250717010121.0
007	cr uuu---uuuuu
008	250415s2024 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1016/j.isci.2024.111493 \|2 doi
035			\|a (DE-627)1922901814
035			\|a (DE-599)KXP1922901814
035			\|a (OCoLC)1528044527
040			\|a DE-627 \|b ger \|c DE-627 \|e rda
041			\|a eng
084			\|a 33 \|2 sdnb
100	1		\|a Li, Cheng-Peng \|e VerfasserIn \|0 (DE-588)1363080296 \|0 (DE-627)1922902276 \|4 aut
245	1	0	\|a Comparing ChatGPT-3.5 and ChatGPT-4’s alignments with the German evidence-based S3 guideline for adult soft tissue sarcoma \|c Cheng-Peng Li, Jens Jakob, Franka Menge, Christoph Reißfelder, Peter Hohenberger, and Cui Yang
264		1	\|c December 20, 2024
300			\|a 9
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Online verfügbar: 28. November 2024, Artikelversion: 11. December 2024
500			\|a Gesehen am 15.04.2025
520			\|a Clinical reliability assessment of large language models is necessary due to their increasing use in healthcare. This study assessed the performance of ChatGPT-3.5 and ChatGPT-4 in answering questions derived from the German evidence-based S3 guideline for adult soft tissue sarcoma (STS). Responses to 80 complex clinical questions covering diagnosis, treatment, and surveillance aspects were independently scored by two sarcoma experts for accuracy and adequacy. ChatGPT-4 outperformed ChatGPT-3.5 overall, with higher median scores in both accuracy (5.5 vs. 5.0) and adequacy (5.0 vs. 4.0). While both versions performed similarly on questions about retroperitoneal/visceral sarcoma and gastrointestinal stromal tumor (GIST)-specific treatment as well as questions about surveillance, ChatGPT-4 performed better on questions about general STS treatment and extremity/trunk sarcomas. Despite their potential as a supportive tool, both models occasionally offered misleading and potentially life-threatening information. This underscores the significance of cautious adoption and human monitoring in clinical settings.
650		4	\|a Artificial intelligence
650		4	\|a Oncology
700	1		\|a Jakob, Jens \|d 1974- \|e VerfasserIn \|0 (DE-588)129491128 \|0 (DE-627)470196335 \|0 (DE-576)297687794 \|4 aut
700	1		\|a Menge, Franka \|d 1970- \|e VerfasserIn \|0 (DE-588)118076744 \|0 (DE-627)694650471 \|0 (DE-576)291702260 \|4 aut
700	1		\|a Reißfelder, Christoph \|d 1975- \|e VerfasserIn \|0 (DE-588)1025566211 \|0 (DE-627)722940297 \|0 (DE-576)370516044 \|4 aut
700	1		\|a Hohenberger, Peter \|d 1953- \|e VerfasserIn \|0 (DE-588)1025311469 \|0 (DE-627)72202875X \|0 (DE-576)370195574 \|4 aut
700	1		\|a Yang, Cui \|d 1984- \|e VerfasserIn \|0 (DE-588)1136151982 \|0 (DE-627)891949968 \|0 (DE-576)490363180 \|4 aut
773	0	8	\|i Enthalten in \|t iScience \|d Amsterdam : Elsevier, 2018 \|g 27(2024), 12 vom: Dez., Artikel-ID 111493, Seite 1-9 \|h Online-Ressource \|w (DE-627)1019532106 \|w (DE-600)2927064-9 \|w (DE-576)502115858 \|x 2589-0042 \|7 nnas \|a Comparing ChatGPT-3.5 and ChatGPT-4’s alignments with the German evidence-based S3 guideline for adult soft tissue sarcoma
773	1	8	\|g volume:27 \|g year:2024 \|g number:12 \|g month:12 \|g elocationid:111493 \|g pages:1-9 \|g extent:9 \|a Comparing ChatGPT-3.5 and ChatGPT-4’s alignments with the German evidence-based S3 guideline for adult soft tissue sarcoma
856	4	0	\|u https://doi.org/10.1016/j.isci.2024.111493 \|x Verlag \|x Resolving-System \|z kostenfrei \|3 Volltext
856	4	0	\|u https://www.sciencedirect.com/science/article/pii/S2589004224027202 \|x Verlag \|z kostenfrei \|3 Volltext
951			\|a AR
992			\|a 20250415
993			\|a Article
994			\|a 2024
998			\|g 1136151982 \|a Yang, Cui \|m 1136151982:Yang, Cui \|d 60000 \|d 61800 \|e 60000PY1136151982 \|e 61800PY1136151982 \|k 0/60000/ \|k 1/60000/61800/ \|p 6 \|y j
998			\|g 1025311469 \|a Hohenberger, Peter \|m 1025311469:Hohenberger, Peter \|d 60000 \|d 61800 \|e 60000PH1025311469 \|e 61800PH1025311469 \|k 0/60000/ \|k 1/60000/61800/ \|p 5
998			\|g 1025566211 \|a Reißfelder, Christoph \|m 1025566211:Reißfelder, Christoph \|d 60000 \|d 61800 \|d 50000 \|e 60000PR1025566211 \|e 61800PR1025566211 \|e 50000PR1025566211 \|k 0/60000/ \|k 1/60000/61800/ \|k 0/50000/ \|p 4
998			\|g 118076744 \|a Menge, Franka \|m 118076744:Menge, Franka \|p 3
998			\|g 129491128 \|a Jakob, Jens \|m 129491128:Jakob, Jens \|d 60000 \|d 61800 \|e 60000PJ129491128 \|e 61800PJ129491128 \|k 0/60000/ \|k 1/60000/61800/ \|p 2
998			\|g 1363080296 \|a Li, Cheng-Peng \|m 1363080296:Li, Cheng-Peng \|p 1 \|x j
999			\|a KXP-PPN1922901814 \|e 4705518772
BIB			\|a Y
SER			\|a journal
JSO			\|a {"physDesc":[{"extent":"9 S."}],"recId":"1922901814","origin":[{"dateIssuedKey":"2024","dateIssuedDisp":"December 20, 2024"}],"person":[{"given":"Cheng-Peng","role":"aut","family":"Li","display":"Li, Cheng-Peng"},{"display":"Jakob, Jens","given":"Jens","role":"aut","family":"Jakob"},{"display":"Menge, Franka","role":"aut","given":"Franka","family":"Menge"},{"display":"Reißfelder, Christoph","family":"Reißfelder","role":"aut","given":"Christoph"},{"family":"Hohenberger","given":"Peter","role":"aut","display":"Hohenberger, Peter"},{"family":"Yang","given":"Cui","role":"aut","display":"Yang, Cui"}],"relHost":[{"physDesc":[{"extent":"Online-Ressource"}],"recId":"1019532106","disp":"Comparing ChatGPT-3.5 and ChatGPT-4’s alignments with the German evidence-based S3 guideline for adult soft tissue sarcomaiScience","origin":[{"dateIssuedDisp":"[2018]-","publisherPlace":"Amsterdam ; Boston ; London ; New York ; Oxford ; Paris ; Philadelphia ; San Diego ; St. Louis","publisher":"Elsevier"}],"pubHistory":["Volume 1 (March 23, 2018)-"],"language":["eng"],"type":{"bibl":"periodical","media":"Online-Ressource"},"note":["Gesehen am 11.09.2018"],"id":{"issn":["2589-0042"],"eki":["1019532106"],"zdb":["2927064-9"]},"title":[{"title_sort":"iScience","title":"iScience"}],"part":{"issue":"12","extent":"9","text":"27(2024), 12 vom: Dez., Artikel-ID 111493, Seite 1-9","pages":"1-9","volume":"27","year":"2024"}}],"language":["eng"],"type":{"bibl":"article-journal","media":"Online-Ressource"},"note":["Online verfügbar: 28. November 2024, Artikelversion: 11. December 2024","Gesehen am 15.04.2025"],"title":[{"title":"Comparing ChatGPT-3.5 and ChatGPT-4’s alignments with the German evidence-based S3 guideline for adult soft tissue sarcoma","title_sort":"Comparing ChatGPT-3.5 and ChatGPT-4’s alignments with the German evidence-based S3 guideline for adult soft tissue sarcoma"}],"name":{"displayForm":["Cheng-Peng Li, Jens Jakob, Franka Menge, Christoph Reißfelder, Peter Hohenberger, and Cui Yang"]},"id":{"doi":["10.1016/j.isci.2024.111493"],"eki":["1922901814"]}}
SRT			\|a LICHENGPENCOMPARINGC2020
