The impact of access to clinical guidelines on LLM-based treatment recommendations for chronic hepatitis B

Bibliographic Details
Main Authors: Siepmann, Robert (Author); Schneider, Carolin Victoria (Author); von der Stueck, Marc Sebastian (Author); Amygdalos, Iakovos (Author); Große, Karsten (Author); Schneider, Kai Markus (Author); Pollmanns, Maike (Author); Murad, Mohamad (Author); Joy, Joel (Author); Kabak, Elena (Author); May, Marcella Ricardis (Author); Clusmann, Jan Niklas (Author); Kuhl, Christiane (Author); Nebelung, Sven (Author); Kather, Jakob Nikolas (Author); Truhn, Daniel (Author)
Format: Article (Journal)
Language: English
Published: 2 September 2025
In: Liver international
Year: 2025, Volume: 45, Issue: 10, Pages: 1-8
ISSN: 1478-3231
DOI: 10.1111/liv.70324
Online Access: Resolving system, free of charge, full text: https://doi.org/10.1111/liv.70324
Publisher, free of charge, full text: https://onlinelibrary.wiley.com/doi/abs/10.1111/liv.70324
Author Notes: Robert Siepmann, Carolin Victoria Schneider, Marc Sebastian von der Stueck, Iakovos Amygdalos, Karsten Große, Kai Markus Schneider, Maike Rebecca Pollmanns, Mohamad Murad, Joel Joy, Elena Kabak, Marcella Ricardis May, Jan Clusmann, Christiane Kuhl, Sven Nebelung, Jakob Nikolas Kather, Daniel Truhn
Description
Summary: Background and Aims: Large language models (LLMs) can potentially support clinicians in their daily routine by providing easy access to information. Yet they are prone to stating incorrect facts and hallucinating when queried. Increasing the context by providing external databases when prompting LLMs may decrease the risk of misinformation. This study examines how increased context influences the coherence of LLM-based treatment recommendations with the recently updated WHO guidelines for the treatment of chronic hepatitis B (CHB). Methods: GPT-4 was queried with five clinical case vignettes in two configurations: with and without additional context. The clinical vignettes were explicitly constructed so that treatment recommendations differed between the formerly applicable 2015 WHO guidelines and the updated 2024 guidelines. GPT-4 with context was given access to the updated guidelines, while GPT-4 without context had to rely on its internal knowledge. GPT-4 was accessed only a few days after the release of the new WHO guidelines. Seven physicians compared the treatment recommendations with regard to guideline coherence, information inclusion, textual errors, and clarity and preciseness of wording. Results: Compared to GPT-4 without context, GPT-4 with context increased the coherence of the treatment recommendations with the new 2024 guidelines from 51% to 91%. Similar trends were observed for all other categories, with increases of 54% in preciseness and clarity, 24% in completeness of incorporating the case vignette information, and 12% in textual correctness. Conclusions: If clinicians consult LLMs for medical advice, the models should be given access to external data sources to increase the chance of providing factually correct advice.
Item Description: Viewed on 12 March 2026
Physical Description: Online Resource