Medication information extraction using local large language models

Objective - Medication information is crucial for clinical routine and research. However, a vast amount is stored in unstructured text, such as doctor’s letters, requiring manual extraction - a resource-intensive, error-prone task. Automating this process comes with significant constraints in a clin...

Bibliographic details
Main authors: Richter-Pechanski, Phillip (Author), Seiferling, Marvin (Author), Kiriakou, Christina (Author), Schwab, Dominic Mathias (Author), Geis, Nicolas (Author), Dieterich, Christoph (Author), Frank, Anette (Author)
Document type: Article (Journal)
Language: English
Published: September 2025
In: Journal of biomedical informatics
Year: 2025, Volume: 169, Pages: 1-32
ISSN: 1532-0480
DOI: 10.1016/j.jbi.2025.104898
Online access: Publisher, free of charge, full text: https://doi.org/10.1016/j.jbi.2025.104898
Publisher, free of charge, full text: https://www.sciencedirect.com/science/article/pii/S1532046425001273
Statement of responsibility: Phillip Richter-Pechanski, Marvin Seiferling, Christina Kiriakou, Dominic M. Schwab, Nicolas A. Geis, Christoph Dieterich, Anette Frank

MARC

LEADER 00000caa a2200000 c 4500
001 1940998794
003 DE-627
005 20251120193729.0
007 cr uuu---uuuuu
008 251112s2025 xx |||||o 00| ||eng c
024 7 |a 10.1016/j.jbi.2025.104898  |2 doi 
035 |a (DE-627)1940998794 
035 |a (DE-599)KXP1940998794 
040 |a DE-627  |b ger  |c DE-627  |e rda 
041 |a eng 
084 |a 33  |2 sdnb 
100 1 |a Richter-Pechanski, Phillip  |e VerfasserIn  |0 (DE-588)1204395470  |0 (DE-627)1689724056  |4 aut 
245 1 0 |a Medication information extraction using local large language models  |c Phillip Richter-Pechanski, Marvin Seiferling, Christina Kiriakou, Dominic M. Schwab, Nicolas A. Geis, Christoph Dieterich, Anette Frank 
264 1 |c September 2025 
300 |b Illustrationen 
300 |a 32 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Online verfügbar: 21. August 2025, Artikelversion: 23. August 2025 
500 |a Gesehen am 12.11.2025 
520 |a Objective - Medication information is crucial for clinical routine and research. However, a vast amount is stored in unstructured text, such as doctor’s letters, requiring manual extraction - a resource-intensive, error-prone task. Automating this process comes with significant constraints in a clinical setup, including the demand for clinical expertise, limited time-resources, restricted IT infrastructure, and the demand for transparent predictions. Recent advances in generative large language models (LLMs) and parameter-efficient fine-tuning methods show potential to address these challenges. - Methods - We evaluated local LLMs for end-to-end extraction of medication information, combining named entity recognition and relation extraction. We used format-restricting instructions and developed an innovative feedback pipeline to facilitate automated evaluation. We applied token-level Shapley values to visualize and quantify token contributions, to improve transparency of model predictions. - Results - Two open-source LLMs - one general (Llama) and one domain-specific (OpenBioLLM) - were evaluated on the English n2c2 2018 corpus and the German CARDIO:DE corpus. OpenBioLLM frequently struggled with structured outputs and hallucinations. Fine-tuned Llama models achieved new state-of-the-art results, improving F1-score by up to 10 percentage points (pp.) for adverse drug events and 6 pp. for medication reasons on English data. On the German dataset, Llama established a new benchmark, outperforming traditional machine learning methods by up to 16 pp. micro average F1-score. - Conclusion - Our findings show that fine-tuned local open-source generative LLMs outperform SOTA methods for medication information extraction, delivering high performance with limited time and IT resources in a real-world clinical setup, and demonstrate their effectiveness on both English and German data. Applying Shapley values improved prediction transparency, supporting informed clinical decision-making. 
650 4 |a Clinical NLP 
650 4 |a Fine-tuning 
650 4 |a Interpretability 
650 4 |a Large language models 
650 4 |a Llama 
650 4 |a Medication information extraction 
700 1 |a Seiferling, Marvin  |e VerfasserIn  |0 (DE-588)1381404634  |0 (DE-627)1941001599  |4 aut 
700 1 |a Kiriakou, Christina  |e VerfasserIn  |0 (DE-588)1156805929  |0 (DE-627)1019739959  |0 (DE-576)502445238  |4 aut 
700 1 |a Schwab, Dominic Mathias  |d 1987-  |e VerfasserIn  |0 (DE-588)1163244422  |0 (DE-627)1027454801  |0 (DE-576)507847423  |4 aut 
700 1 |a Geis, Nicolas  |d 1980-  |e VerfasserIn  |0 (DE-588)138513988  |0 (DE-627)696681854  |0 (DE-576)307885313  |4 aut 
700 1 |a Dieterich, Christoph  |d 1975-  |e VerfasserIn  |0 (DE-588)130054844  |0 (DE-627)494359269  |0 (DE-576)297972448  |4 aut 
700 1 |a Frank, Anette  |e VerfasserIn  |0 (DE-588)1020288108  |0 (DE-627)691172161  |0 (DE-576)36005689X  |4 aut 
773 0 8 |i Enthalten in  |t Journal of biomedical informatics  |d San Diego, Calif. : Academic Press, 2001  |g 169(2025), Artikel-ID 104898, Seite 1-32  |h Online-Ressource  |w (DE-627)334290791  |w (DE-600)2057141-0  |w (DE-576)104194502  |x 1532-0480  |7 nnas  |a Medication information extraction using local large language models 
773 1 8 |g volume:169  |g year:2025  |g elocationid:104898  |g pages:1-32  |g extent:32  |a Medication information extraction using local large language models 
856 4 0 |u https://doi.org/10.1016/j.jbi.2025.104898  |x Verlag  |x Resolving-System  |z kostenfrei  |3 Volltext 
856 4 0 |u https://www.sciencedirect.com/science/article/pii/S1532046425001273  |x Verlag  |z kostenfrei  |3 Volltext 
951 |a AR 
992 |a 20251112 
993 |a Article 
994 |a 2025 
998 |g 1020288108  |a Frank, Anette  |m 1020288108:Frank, Anette  |d 90000  |d 90500  |e 90000PF1020288108  |e 90500PF1020288108  |k 0/90000/  |k 1/90000/90500/  |p 7  |y j 
998 |g 130054844  |a Dieterich, Christoph  |m 130054844:Dieterich, Christoph  |d 910000  |d 910100  |e 910000PD130054844  |e 910100PD130054844  |k 0/910000/  |k 1/910000/910100/  |p 6 
998 |g 138513988  |a Geis, Nicolas  |m 138513988:Geis, Nicolas  |d 910000  |d 910100  |d 50000  |e 910000PG138513988  |e 910100PG138513988  |e 50000PG138513988  |k 0/910000/  |k 1/910000/910100/  |k 0/50000/  |p 5 
998 |g 1163244422  |a Schwab, Dominic Mathias  |m 1163244422:Schwab, Dominic Mathias  |d 50000  |e 50000PS1163244422  |k 0/50000/  |p 4 
998 |g 1156805929  |a Kiriakou, Christina  |m 1156805929:Kiriakou, Christina  |d 910000  |d 910100  |e 910000PK1156805929  |e 910100PK1156805929  |k 0/910000/  |k 1/910000/910100/  |p 3 
998 |g 1381404634  |a Seiferling, Marvin  |m 1381404634:Seiferling, Marvin  |d 910000  |d 910100  |e 910000PS1381404634  |e 910100PS1381404634  |k 0/910000/  |k 1/910000/910100/  |p 2 
998 |g 1204395470  |a Richter-Pechanski, Phillip  |m 1204395470:Richter-Pechanski, Phillip  |d 910000  |d 910100  |e 910000PR1204395470  |e 910100PR1204395470  |k 0/910000/  |k 1/910000/910100/  |p 1  |x j 
999 |a KXP-PPN1940998794  |e 4803584362 
BIB |a Y 
SER |a journal 
JSO |a {"physDesc":[{"noteIll":"Illustrationen","extent":"32 S."}],"recId":"1940998794","origin":[{"dateIssuedDisp":"September 2025","dateIssuedKey":"2025"}],"person":[{"display":"Richter-Pechanski, Phillip","family":"Richter-Pechanski","given":"Phillip","role":"aut"},{"family":"Seiferling","given":"Marvin","role":"aut","display":"Seiferling, Marvin"},{"given":"Christina","role":"aut","family":"Kiriakou","display":"Kiriakou, Christina"},{"family":"Schwab","given":"Dominic Mathias","role":"aut","display":"Schwab, Dominic Mathias"},{"display":"Geis, Nicolas","family":"Geis","given":"Nicolas","role":"aut"},{"display":"Dieterich, Christoph","family":"Dieterich","role":"aut","given":"Christoph"},{"display":"Frank, Anette","family":"Frank","role":"aut","given":"Anette"}],"relHost":[{"title":[{"title":"Journal of biomedical informatics","title_sort":"Journal of biomedical informatics"}],"part":{"year":"2025","pages":"1-32","extent":"32","text":"169(2025), Artikel-ID 104898, Seite 1-32","volume":"169"},"id":{"zdb":["2057141-0"],"eki":["334290791"],"issn":["1532-0480"]},"note":["Fortsetzung der Druck-Ausgabe","Gesehen am 13.06.23"],"type":{"bibl":"periodical","media":"Online-Ressource"},"language":["eng"],"pubHistory":["34.2001 - 46.2013; Vol. 47.2014 -"],"origin":[{"publisher":"Academic Press","publisherPlace":"San Diego, Calif.","dateIssuedKey":"2001","dateIssuedDisp":"2001-"}],"disp":"Medication information extraction using local large language modelsJournal of biomedical informatics","recId":"334290791","physDesc":[{"extent":"Online-Ressource"}]}],"language":["eng"],"type":{"bibl":"article-journal","media":"Online-Ressource"},"note":["Online verfügbar: 21. August 2025, Artikelversion: 23. August 2025","Gesehen am 12.11.2025"],"title":[{"title_sort":"Medication information extraction using local large language models","title":"Medication information extraction using local large language models"}],"name":{"displayForm":["Phillip Richter-Pechanski, Marvin Seiferling, Christina Kiriakou, Dominic M. Schwab, Nicolas A. Geis, Christoph Dieterich, Anette Frank"]},"id":{"doi":["10.1016/j.jbi.2025.104898"],"eki":["1940998794"]}} 
SRT |a RICHTERPECMEDICATION2025