An open-source framework for end-to-end analysis of electronic health record data

With progressive digitalization of healthcare systems worldwide, large-scale collection of electronic health records (EHRs) has become commonplace. However, an extensible framework for comprehensive exploratory analysis that accounts for data heterogeneity is missing. Here we introduce ehrapy, a mod...

Detailed description

Bibliographic details
Main authors: Heumos, Lukas (author), Ehmele, Philipp (author), Treis, Tim (author), Upmeier zu Belzen, Julius (author), Roellin, Eljas (author), May, Lilly (author), Namsaraeva, Altana (author), Horlava, Nastassya (author), Shitov, Vladimir A. (author), Zhang, Xinyue (author), Zappia, Luke (author), Knoll, Rainer (author), Lang, Niklas J. (author), Hetzel, Leon (author), Virshup, Isaac (author), Sikkema, Lisa (author), Curion, Fabiola (author), Eils, Roland (author), Schiller, Herbert B. (author), Hilgendorff, Anne (author), Theis, Fabian J. (author)
Document type: Article (Journal)
Language: English
Published: 12 September 2024
In: Nature medicine
Year: 2024, Volume: 30, Issue: 11, Pages: 3369-3380
ISSN: 1546-170X
DOI: 10.1038/s41591-024-03214-0
Online access: Publisher, free of charge, full text: https://doi.org/10.1038/s41591-024-03214-0
Publisher, free of charge, full text: https://www.nature.com/articles/s41591-024-03214-0
Statement of responsibility: Lukas Heumos, Philipp Ehmele, Tim Treis, Julius Upmeier zu Belzen, Eljas Roellin, Lilly May, Altana Namsaraeva, Nastassya Horlava, Vladimir A. Shitov, Xinyue Zhang, Luke Zappia, Rainer Knoll, Niklas J. Lang, Leon Hetzel, Isaac Virshup, Lisa Sikkema, Fabiola Curion, Roland Eils, Herbert B. Schiller, Anne Hilgendorff, Fabian J. Theis

MARC

LEADER 00000caa a2200000 c 4500
001 1919312447
003 DE-627
005 20250716235728.0
007 cr uuu---uuuuu
008 250307s2024 xx |||||o 00| ||eng c
024 7 |a 10.1038/s41591-024-03214-0  |2 doi 
035 |a (DE-627)1919312447 
035 |a (DE-599)KXP1919312447 
035 |a (OCoLC)1528020257 
040 |a DE-627  |b ger  |c DE-627  |e rda 
041 |a eng 
084 |a 32  |2 sdnb 
100 1 |a Heumos, Lukas  |e VerfasserIn  |0 (DE-588)1358985529  |0 (DE-627)1919313168  |4 aut 
245 1 3 |a An open-source framework for end-to-end analysis of electronic health record data  |c Lukas Heumos, Philipp Ehmele, Tim Treis, Julius Upmeier zu Belzen, Eljas Roellin, Lilly May, Altana Namsaraeva, Nastassya Horlava, Vladimir A. Shitov, Xinyue Zhang, Luke Zappia, Rainer Knoll, Niklas J. Lang, Leon Hetzel, Isaac Virshup, Lisa Sikkema, Fabiola Curion, Roland Eils, Herbert B. Schiller, Anne Hilgendorff, Fabian J. Theis 
264 1 |c 12 September 2024 
300 |b Illustrationen 
300 |a 12 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Gesehen am 07.03.2025 
520 |a With progressive digitalization of healthcare systems worldwide, large-scale collection of electronic health records (EHRs) has become commonplace. However, an extensible framework for comprehensive exploratory analysis that accounts for data heterogeneity is missing. Here we introduce ehrapy, a modular open-source Python framework designed for exploratory analysis of heterogeneous epidemiology and EHR data. ehrapy incorporates a series of analytical steps, from data extraction and quality control to the generation of low-dimensional representations. Complemented by rich statistical modules, ehrapy facilitates associating patients with disease states, differential comparison between patient clusters, survival analysis, trajectory inference, causal inference and more. Leveraging ontologies, ehrapy further enables data sharing and training EHR deep learning models, paving the way for foundational models in biomedical research. We demonstrate ehrapy’s features in six distinct examples. We applied ehrapy to stratify patients affected by unspecified pneumonia into finer-grained phenotypes. Furthermore, we reveal biomarkers for significant differences in survival among these groups. Additionally, we quantify medication-class effects of pneumonia medications on length of stay. We further leveraged ehrapy to analyze cardiovascular risks across different data modalities. We reconstructed disease state trajectories in patients with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) based on imaging data. Finally, we conducted a case study to demonstrate how ehrapy can detect and mitigate biases in EHR data. ehrapy, thus, provides a framework that we envision will standardize analysis pipelines on EHR data and serve as a cornerstone for the community. 
650 4 |a Epidemiology 
650 4 |a Translational research 
700 1 |a Ehmele, Philipp  |e VerfasserIn  |4 aut 
700 1 |a Treis, Tim  |e VerfasserIn  |0 (DE-588)1212049853  |0 (DE-627)1700646877  |4 aut 
700 1 |a Upmeier zu Belzen, Julius  |e VerfasserIn  |4 aut 
700 1 |a Roellin, Eljas  |e VerfasserIn  |4 aut 
700 1 |a May, Lilly  |e VerfasserIn  |4 aut 
700 1 |a Namsaraeva, Altana  |e VerfasserIn  |4 aut 
700 1 |a Horlava, Nastassya  |e VerfasserIn  |4 aut 
700 1 |a Shitov, Vladimir A.  |e VerfasserIn  |4 aut 
700 1 |a Zhang, Xinyue  |e VerfasserIn  |4 aut 
700 1 |a Zappia, Luke  |e VerfasserIn  |4 aut 
700 1 |a Knoll, Rainer  |e VerfasserIn  |4 aut 
700 1 |a Lang, Niklas J.  |e VerfasserIn  |4 aut 
700 1 |a Hetzel, Leon  |e VerfasserIn  |4 aut 
700 1 |a Virshup, Isaac  |e VerfasserIn  |4 aut 
700 1 |a Sikkema, Lisa  |e VerfasserIn  |4 aut 
700 1 |a Curion, Fabiola  |e VerfasserIn  |4 aut 
700 1 |a Eils, Roland  |d 1965-  |e VerfasserIn  |0 (DE-588)1020648287  |0 (DE-627)691291705  |0 (DE-576)361718195  |4 aut 
700 1 |a Schiller, Herbert B.  |e VerfasserIn  |4 aut 
700 1 |a Hilgendorff, Anne  |d 1974-  |e VerfasserIn  |0 (DE-588)129877166  |0 (DE-627)482373075  |0 (DE-576)297882465  |4 aut 
700 1 |a Theis, Fabian J.  |e VerfasserIn  |4 aut 
773 0 8 |i Enthalten in  |t Nature medicine  |d [New York, NY] : Springer Nature, 1995  |g 30(2024), 11, Seite 3369-3380  |h Online-Ressource  |w (DE-627)301511640  |w (DE-600)1484517-9  |w (DE-576)288571274  |x 1546-170X  |7 nnas  |a An open-source framework for end-to-end analysis of electronic health record data 
773 1 8 |g volume:30  |g year:2024  |g number:11  |g pages:3369-3380  |g extent:12  |a An open-source framework for end-to-end analysis of electronic health record data 
856 4 0 |u https://doi.org/10.1038/s41591-024-03214-0  |x Verlag  |x Resolving-System  |z kostenfrei  |3 Volltext 
856 4 0 |u https://www.nature.com/articles/s41591-024-03214-0  |x Verlag  |z kostenfrei  |3 Volltext 
951 |a AR 
992 |a 20250307 
993 |a Article 
994 |a 2024 
998 |g 1020648287  |a Eils, Roland  |m 1020648287:Eils, Roland  |d 910000  |d 911810  |d 50000  |e 910000PE1020648287  |e 911810PE1020648287  |e 50000PE1020648287  |k 0/910000/  |k 1/910000/911810/  |k 0/50000/  |p 18 
999 |a KXP-PPN1919312447  |e 4683842378 
BIB |a Y 
SER |a journal 
JSO |a {"id":{"doi":["10.1038/s41591-024-03214-0"],"eki":["1919312447"]},"origin":[{"dateIssuedKey":"2024","dateIssuedDisp":"12 September 2024"}],"name":{"displayForm":["Lukas Heumos, Philipp Ehmele, Tim Treis, Julius Upmeier zu Belzen, Eljas Roellin, Lilly May, Altana Namsaraeva, Nastassya Horlava, Vladimir A. Shitov, Xinyue Zhang, Luke Zappia, Rainer Knoll, Niklas J. Lang, Leon Hetzel, Isaac Virshup, Lisa Sikkema, Fabiola Curion, Roland Eils, Herbert B. Schiller, Anne Hilgendorff, Fabian J. Theis"]},"recId":"1919312447","person":[{"family":"Heumos","given":"Lukas","role":"aut","display":"Heumos, Lukas"},{"family":"Ehmele","role":"aut","given":"Philipp","display":"Ehmele, Philipp"},{"display":"Treis, Tim","given":"Tim","role":"aut","family":"Treis"},{"given":"Julius","role":"aut","family":"Upmeier zu Belzen","display":"Upmeier zu Belzen, Julius"},{"display":"Roellin, Eljas","given":"Eljas","role":"aut","family":"Roellin"},{"family":"May","given":"Lilly","role":"aut","display":"May, Lilly"},{"display":"Namsaraeva, Altana","role":"aut","given":"Altana","family":"Namsaraeva"},{"display":"Horlava, Nastassya","given":"Nastassya","role":"aut","family":"Horlava"},{"family":"Shitov","role":"aut","given":"Vladimir A.","display":"Shitov, Vladimir A."},{"family":"Zhang","given":"Xinyue","role":"aut","display":"Zhang, Xinyue"},{"role":"aut","given":"Luke","family":"Zappia","display":"Zappia, Luke"},{"display":"Knoll, Rainer","given":"Rainer","role":"aut","family":"Knoll"},{"display":"Lang, Niklas J.","family":"Lang","given":"Niklas J.","role":"aut"},{"given":"Leon","role":"aut","family":"Hetzel","display":"Hetzel, Leon"},{"display":"Virshup, Isaac","given":"Isaac","role":"aut","family":"Virshup"},{"family":"Sikkema","role":"aut","given":"Lisa","display":"Sikkema, Lisa"},{"family":"Curion","given":"Fabiola","role":"aut","display":"Curion, Fabiola"},{"display":"Eils, Roland","role":"aut","given":"Roland","family":"Eils"},{"family":"Schiller","role":"aut","given":"Herbert B.","display":"Schiller, Herbert B."},{"display":"Hilgendorff, Anne","family":"Hilgendorff","role":"aut","given":"Anne"},{"role":"aut","given":"Fabian J.","family":"Theis","display":"Theis, Fabian J."}],"title":[{"title_sort":"open-source framework for end-to-end analysis of electronic health record data","title":"An open-source framework for end-to-end analysis of electronic health record data"}],"type":{"media":"Online-Ressource","bibl":"article-journal"},"language":["eng"],"physDesc":[{"extent":"12 S.","noteIll":"Illustrationen"}],"relHost":[{"title":[{"title_sort":"Nature medicine","title":"Nature medicine"}],"physDesc":[{"extent":"Online-Ressource"}],"pubHistory":["1.1995 -"],"language":["eng"],"type":{"media":"Online-Ressource","bibl":"periodical"},"note":["Gesehen am 24. Juli 2018"],"part":{"issue":"11","year":"2024","pages":"3369-3380","volume":"30","text":"30(2024), 11, Seite 3369-3380","extent":"12"},"id":{"issn":["1546-170X"],"zdb":["1484517-9"],"eki":["301511640"]},"origin":[{"dateIssuedDisp":"2017-","dateIssuedKey":"2017","publisher":"Springer Nature ; Nature America Inc.","publisherPlace":"[New York, NY] ; New York, NY"}],"disp":"An open-source framework for end-to-end analysis of electronic health record dataNature medicine","recId":"301511640"}],"note":["Gesehen am 07.03.2025"]} 
SRT |a HEUMOSLUKAOPENSOURCE1220