Medication information extraction using local large language models
Objective - Medication information is crucial for clinical routine and research. However, a vast amount is stored in unstructured text, such as doctor’s letters, requiring manual extraction - a resource-intensive, error-prone task. Automating this process comes with significant constraints in a clin...
| Main authors: | Phillip Richter-Pechanski, Marvin Seiferling, Christina Kiriakou, Dominic M. Schwab, Nicolas A. Geis, Christoph Dieterich, Anette Frank |
|---|---|
| Document type: | Article (Journal) |
| Language: | English |
| Published: | September 2025 |
| In: | Journal of biomedical informatics, Year: 2025, Volume: 169, Pages: 1-32 |
| ISSN: | 1532-0480 |
| DOI: | 10.1016/j.jbi.2025.104898 |
| Online access: | Publisher, free of charge, full text: https://doi.org/10.1016/j.jbi.2025.104898; Publisher, free of charge, full text: https://www.sciencedirect.com/science/article/pii/S1532046425001273 |
| Statement of responsibility: | Phillip Richter-Pechanski, Marvin Seiferling, Christina Kiriakou, Dominic M. Schwab, Nicolas A. Geis, Christoph Dieterich, Anette Frank |
MARC
| LEADER | 00000caa a2200000 c 4500 | ||
|---|---|---|---|
| 001 | 1940998794 | ||
| 003 | DE-627 | ||
| 005 | 20251120193729.0 | ||
| 007 | cr uuu---uuuuu | ||
| 008 | 251112s2025 xx |||||o 00| ||eng c | ||
| 024 | 7 | |a 10.1016/j.jbi.2025.104898 |2 doi | |
| 035 | |a (DE-627)1940998794 | ||
| 035 | |a (DE-599)KXP1940998794 | ||
| 040 | |a DE-627 |b ger |c DE-627 |e rda | ||
| 041 | |a eng | ||
| 084 | |a 33 |2 sdnb | ||
| 100 | 1 | |a Richter-Pechanski, Phillip |e VerfasserIn |0 (DE-588)1204395470 |0 (DE-627)1689724056 |4 aut | |
| 245 | 1 | 0 | |a Medication information extraction using local large language models |c Phillip Richter-Pechanski, Marvin Seiferling, Christina Kiriakou, Dominic M. Schwab, Nicolas A. Geis, Christoph Dieterich, Anette Frank |
| 264 | 1 | |c September 2025 | |
| 300 | |b Illustrationen | ||
| 300 | |a 32 | ||
| 336 | |a Text |b txt |2 rdacontent | ||
| 337 | |a Computermedien |b c |2 rdamedia | ||
| 338 | |a Online-Ressource |b cr |2 rdacarrier | ||
| 500 | |a Online verfügbar: 21. August 2025, Artikelversion: 23. August 2025 | ||
| 500 | |a Gesehen am 12.11.2025 | ||
| 520 | |a Objective - Medication information is crucial for clinical routine and research. However, a vast amount is stored in unstructured text, such as doctor’s letters, requiring manual extraction - a resource-intensive, error-prone task. Automating this process comes with significant constraints in a clinical setup, including the demand for clinical expertise, limited time-resources, restricted IT infrastructure, and the demand for transparent predictions. Recent advances in generative large language models (LLMs) and parameter-efficient fine-tuning methods show potential to address these challenges. - Methods - We evaluated local LLMs for end-to-end extraction of medication information, combining named entity recognition and relation extraction. We used format-restricting instructions and developed an innovative feedback pipeline to facilitate automated evaluation. We applied token-level Shapley values to visualize and quantify token contributions, to improve transparency of model predictions. - Results - Two open-source LLMs - one general (Llama) and one domain-specific (OpenBioLLM) - were evaluated on the English n2c2 2018 corpus and the German CARDIO:DE corpus. OpenBioLLM frequently struggled with structured outputs and hallucinations. Fine-tuned Llama models achieved new state-of-the-art results, improving F1-score by up to 10 percentage points (pp.) for adverse drug events and 6 pp. for medication reasons on English data. On the German dataset, Llama established a new benchmark, outperforming traditional machine learning methods by up to 16 pp. micro average F1-score. - Conclusion - Our findings show that fine-tuned local open-source generative LLMs outperform SOTA methods for medication information extraction, delivering high performance with limited time and IT resources in a real-world clinical setup, and demonstrate their effectiveness on both English and German data. Applying Shapley values improved prediction transparency, supporting informed clinical decision-making. | ||
| 650 | 4 | |a Clinical NLP | |
| 650 | 4 | |a Fine-tuning | |
| 650 | 4 | |a Interpretability | |
| 650 | 4 | |a Large language models | |
| 650 | 4 | |a Llama | |
| 650 | 4 | |a Medication information extraction | |
| 700 | 1 | |a Seiferling, Marvin |e VerfasserIn |0 (DE-588)1381404634 |0 (DE-627)1941001599 |4 aut | |
| 700 | 1 | |a Kiriakou, Christina |e VerfasserIn |0 (DE-588)1156805929 |0 (DE-627)1019739959 |0 (DE-576)502445238 |4 aut | |
| 700 | 1 | |a Schwab, Dominic Mathias |d 1987- |e VerfasserIn |0 (DE-588)1163244422 |0 (DE-627)1027454801 |0 (DE-576)507847423 |4 aut | |
| 700 | 1 | |a Geis, Nicolas |d 1980- |e VerfasserIn |0 (DE-588)138513988 |0 (DE-627)696681854 |0 (DE-576)307885313 |4 aut | |
| 700 | 1 | |a Dieterich, Christoph |d 1975- |e VerfasserIn |0 (DE-588)130054844 |0 (DE-627)494359269 |0 (DE-576)297972448 |4 aut | |
| 700 | 1 | |a Frank, Anette |e VerfasserIn |0 (DE-588)1020288108 |0 (DE-627)691172161 |0 (DE-576)36005689X |4 aut | |
| 773 | 0 | 8 | |i Enthalten in |t Journal of biomedical informatics |d San Diego, Calif. : Academic Press, 2001 |g 169(2025), Artikel-ID 104898, Seite 1-32 |h Online-Ressource |w (DE-627)334290791 |w (DE-600)2057141-0 |w (DE-576)104194502 |x 1532-0480 |7 nnas |a Medication information extraction using local large language models |
| 773 | 1 | 8 | |g volume:169 |g year:2025 |g elocationid:104898 |g pages:1-32 |g extent:32 |a Medication information extraction using local large language models |
| 856 | 4 | 0 | |u https://doi.org/10.1016/j.jbi.2025.104898 |x Verlag |x Resolving-System |z kostenfrei |3 Volltext |
| 856 | 4 | 0 | |u https://www.sciencedirect.com/science/article/pii/S1532046425001273 |x Verlag |z kostenfrei |3 Volltext |
| 951 | |a AR | ||
| 992 | |a 20251112 | ||
| 993 | |a Article | ||
| 994 | |a 2025 | ||
| 998 | |g 1020288108 |a Frank, Anette |m 1020288108:Frank, Anette |d 90000 |d 90500 |e 90000PF1020288108 |e 90500PF1020288108 |k 0/90000/ |k 1/90000/90500/ |p 7 |y j | ||
| 998 | |g 130054844 |a Dieterich, Christoph |m 130054844:Dieterich, Christoph |d 910000 |d 910100 |e 910000PD130054844 |e 910100PD130054844 |k 0/910000/ |k 1/910000/910100/ |p 6 | ||
| 998 | |g 138513988 |a Geis, Nicolas |m 138513988:Geis, Nicolas |d 910000 |d 910100 |d 50000 |e 910000PG138513988 |e 910100PG138513988 |e 50000PG138513988 |k 0/910000/ |k 1/910000/910100/ |k 0/50000/ |p 5 | ||
| 998 | |g 1163244422 |a Schwab, Dominic Mathias |m 1163244422:Schwab, Dominic Mathias |d 50000 |e 50000PS1163244422 |k 0/50000/ |p 4 | ||
| 998 | |g 1156805929 |a Kiriakou, Christina |m 1156805929:Kiriakou, Christina |d 910000 |d 910100 |e 910000PK1156805929 |e 910100PK1156805929 |k 0/910000/ |k 1/910000/910100/ |p 3 | ||
| 998 | |g 1381404634 |a Seiferling, Marvin |m 1381404634:Seiferling, Marvin |d 910000 |d 910100 |e 910000PS1381404634 |e 910100PS1381404634 |k 0/910000/ |k 1/910000/910100/ |p 2 | ||
| 998 | |g 1204395470 |a Richter-Pechanski, Phillip |m 1204395470:Richter-Pechanski, Phillip |d 910000 |d 910100 |e 910000PR1204395470 |e 910100PR1204395470 |k 0/910000/ |k 1/910000/910100/ |p 1 |x j | ||
| 999 | |a KXP-PPN1940998794 |e 4803584362 | ||
| BIB | |a Y | ||
| SER | |a journal | ||
| JSO | |a {"physDesc":[{"noteIll":"Illustrationen","extent":"32 S."}],"recId":"1940998794","origin":[{"dateIssuedDisp":"September 2025","dateIssuedKey":"2025"}],"person":[{"display":"Richter-Pechanski, Phillip","family":"Richter-Pechanski","given":"Phillip","role":"aut"},{"family":"Seiferling","given":"Marvin","role":"aut","display":"Seiferling, Marvin"},{"given":"Christina","role":"aut","family":"Kiriakou","display":"Kiriakou, Christina"},{"family":"Schwab","given":"Dominic Mathias","role":"aut","display":"Schwab, Dominic Mathias"},{"display":"Geis, Nicolas","family":"Geis","given":"Nicolas","role":"aut"},{"display":"Dieterich, Christoph","family":"Dieterich","role":"aut","given":"Christoph"},{"display":"Frank, Anette","family":"Frank","role":"aut","given":"Anette"}],"relHost":[{"title":[{"title":"Journal of biomedical informatics","title_sort":"Journal of biomedical informatics"}],"part":{"year":"2025","pages":"1-32","extent":"32","text":"169(2025), Artikel-ID 104898, Seite 1-32","volume":"169"},"id":{"zdb":["2057141-0"],"eki":["334290791"],"issn":["1532-0480"]},"note":["Fortsetzung der Druck-Ausgabe","Gesehen am 13.06.23"],"type":{"bibl":"periodical","media":"Online-Ressource"},"language":["eng"],"pubHistory":["34.2001 - 46.2013; Vol. 47.2014 -"],"origin":[{"publisher":"Academic Press","publisherPlace":"San Diego, Calif.","dateIssuedKey":"2001","dateIssuedDisp":"2001-"}],"disp":"Medication information extraction using local large language modelsJournal of biomedical informatics","recId":"334290791","physDesc":[{"extent":"Online-Ressource"}]}],"language":["eng"],"type":{"bibl":"article-journal","media":"Online-Ressource"},"note":["Online verfügbar: 21. August 2025, Artikelversion: 23. August 2025","Gesehen am 12.11.2025"],"title":[{"title_sort":"Medication information extraction using local large language models","title":"Medication information extraction using local large language models"}],"name":{"displayForm":["Phillip Richter-Pechanski, Marvin Seiferling, Christina Kiriakou, Dominic M. Schwab, Nicolas A. Geis, Christoph Dieterich, Anette Frank"]},"id":{"doi":["10.1016/j.jbi.2025.104898"],"eki":["1940998794"]}} | ||
| SRT | |a RICHTERPECMEDICATION2025 | ||