De-Identification of German medical admission notes
Medical texts are a vast resource for medical and computational research. In contrast to newswire or Wikipedia texts, medical texts need to be de-identified before they can be made accessible to a wider NLP research community. We created a prototype for German medical text de-identification and named enti...
| Main authors: | Phillip Richter-Pechanski, Stefan Riezler, Christoph Dieterich |
|---|---|
| Document type: | Chapter/article, conference publication |
| Language: | English |
| Published: | [2018] |
| In: | German medical data sciences, Year: 2018, Volume: 253, Pages: 165-169 |
| DOI: | 10.3233/978-1-61499-896-9-165 |
| Online access: | Resolving system: https://doi.org/10.3233/978-1-61499-896-9-165 |
| Statement of responsibility: | Phillip Richter-Pechanski, Stefan Riezler and Christoph Dieterich |
MARC
| LEADER | 00000caa a2200000 c 4500 | ||
|---|---|---|---|
| 001 | 1689724153 | ||
| 003 | DE-627 | ||
| 005 | 20250114090620.0 | ||
| 007 | cr uuu---uuuuu | ||
| 008 | 200210s2018 xx |||||o 00| ||eng c | ||
| 024 | 7 | |a 10.3233/978-1-61499-896-9-165 |2 doi | |
| 035 | |a (DE-627)1689724153 | ||
| 035 | |a (DE-599)KXP1689724153 | ||
| 035 | |a (OCoLC)1341304218 | ||
| 040 | |a DE-627 |b ger |c DE-627 |e rda | ||
| 041 | |a eng | ||
| 084 | |a 33 |2 sdnb | ||
| 100 | 1 | |a Richter-Pechanski, Phillip |e VerfasserIn |0 (DE-588)1204395470 |0 (DE-627)1689724056 |4 aut | |
| 245 | 1 | 0 | |a De-Identification of German medical admission notes |c Phillip Richter-Pechanski, Stefan Riezler and Christoph Dieterich |
| 264 | 1 | |c [2018] | |
| 300 | |a 5 | ||
| 336 | |a Text |b txt |2 rdacontent | ||
| 337 | |a Computermedien |b c |2 rdamedia | ||
| 338 | |a Online-Ressource |b cr |2 rdacarrier | ||
| 500 | |a Gesehen am 10.02.2020 | ||
| 520 | |a Medical texts are a vast resource for medical and computational research. In contrast to newswire or Wikipedia texts, medical texts need to be de-identified before they can be made accessible to a wider NLP research community. We created a prototype for German medical text de-identification and named entity recognition using a three-step approach. First, we used well-known rule-based models based on regular expressions and gazetteers. Second, we used a spelling variant detector based on Levenshtein distance, exploiting the fact that the medical texts contain semi-structured headers that include sensitive personal data. Third, we trained a named entity recognition model on out-of-domain data to add statistical capabilities to our prototype. Compared with a baseline based on regular expressions and gazetteers, we improved the F2-score for de-identification from 78% to 85%. Our prototype is a first step for further research on German medical text de-identification and shows that spelling variant detection and statistical models trained on out-of-domain data can improve de-identification performance significantly. | ||
| 650 | 4 | |a anonymization | |
| 650 | 4 | |a Data Anonymization | |
| 650 | 4 | |a De-identification | |
| 650 | 4 | |a Electronic Health Records | |
| 650 | 4 | |a Germany | |
| 650 | 4 | |a medical admission notes | |
| 650 | 4 | |a named entity recognition | |
| 650 | 4 | |a Natural Language Processing | |
| 650 | 4 | |a Patient Admission | |
| 650 | 4 | |a personal health information | |
| 700 | 1 | |a Riezler, Stefan |e VerfasserIn |0 (DE-588)1033925454 |0 (DE-627)743677528 |0 (DE-576)381607615 |4 aut | |
| 700 | 1 | |a Dieterich, Christoph |d 1975- |e VerfasserIn |0 (DE-588)130054844 |0 (DE-627)494359269 |0 (DE-576)297972448 |4 aut | |
| 773 | 0 | 8 | |i Enthalten in |a Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (63. : 2018 : Osnabrück) |t German medical data sciences |d Amsterdam [u.a.] : IOS Press [u.a.], 2018 |g 253(2018), Seite 165-169 |h 1 Online-Ressource |w (DE-627)1655133330 |w (DE-576)517787385 |z 9781614998969 |7 nnam |
| 773 | 1 | 8 | |g volume:253 |g year:2018 |g pages:165-169 |g extent:5 |a De-Identification of German medical admission notes |
| 856 | 4 | 0 | |u https://doi.org/10.3233/978-1-61499-896-9-165 |x Resolving-System |
| 951 | |a AR | ||
| 992 | |a 20200210 | ||
| 993 | |a ConferencePaper | ||
| 994 | |a 2018 | ||
| 998 | |g 130054844 |a Dieterich, Christoph |m 130054844:Dieterich, Christoph |d 910000 |d 910100 |e 910000PD130054844 |e 910100PD130054844 |k 0/910000/ |k 1/910000/910100/ |p 3 |y j | ||
| 998 | |g 1033925454 |a Riezler, Stefan |m 1033925454:Riezler, Stefan |d 90000 |d 90500 |e 90000PR1033925454 |e 90500PR1033925454 |k 0/90000/ |k 1/90000/90500/ |p 2 | ||
| 998 | |g 1204395470 |a Richter-Pechanski, Phillip |m 1204395470:Richter-Pechanski, Phillip |p 1 |x j | ||
| 999 | |a KXP-PPN1689724153 |e 3591798568 | ||
| BIB | |a Y | ||
| JSO | |a {"id":{"eki":["1689724153"],"doi":["10.3233/978-1-61499-896-9-165"]},"name":{"displayForm":["Phillip Richter-Pechanski, Stefan Riezler and Christoph Dieterich"]},"title":[{"title_sort":"De-Identification of German medical admission notes","title":"De-Identification of German medical admission notes"}],"note":["Gesehen am 10.02.2020"],"language":["eng"],"type":{"bibl":"chapter","media":"Online-Ressource"},"person":[{"display":"Richter-Pechanski, Phillip","family":"Richter-Pechanski","given":"Phillip","role":"aut"},{"display":"Riezler, Stefan","family":"Riezler","role":"aut","given":"Stefan"},{"given":"Christoph","role":"aut","family":"Dieterich","display":"Dieterich, Christoph"}],"relHost":[{"corporate":[{"role":"aut","display":"Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie (63., 2018, Osnabrück)"}],"name":{"displayForm":["edited by Ursula Hübner, Ulrich Sax, Hans-Ulrich Prokosch, Bernhard Breil, Harald Binder, Antonia Zapf, Brigitte Strahwald, Tim Beißbarth, Niels Grabe, Anke Schöler"]},"id":{"isbn":["9781614998969"],"eki":["1655133330"]},"part":{"extent":"5","text":"253(2018), Seite 165-169","volume":"253","year":"2018","pages":"165-169"},"physDesc":[{"extent":"1 Online-Ressource"}],"relMultPart":[{"origin":[{"dateIssuedDisp":"1991-","dateIssuedKey":"1991","publisherPlace":"Amsterdam [u.a.]","publisher":"IOS Press [u.a.]"}],"pubHistory":["1.1991 -"],"type":{"bibl":"serial","media":"Online-Ressource"},"language":["eng"],"id":{"zdb":["2708884-4"],"issn":["1879-8365"],"eki":["739899465"]},"title":[{"title_sort":"Studies in health technology and informatics","title":"Studies in health technology and informatics"}],"part":{"number_sort":["253"],"number":["volume 253"]},"physDesc":[{"extent":"Online-Ressource"}],"dispAlt":"Studies in health technology and informatics","recId":"739899465","disp":"Studies in health technology and informatics"}],"recId":"1655133330","disp":"Deutsche Gesellschaft für Medizinische Informatik, 
Biometrie und Epidemiologie (63. : 2018 : Osnabrück)German medical data sciences","origin":[{"publisher":"IOS Press [u.a.]","publisherPlace":"Amsterdam [u.a.]","dateIssuedDisp":"2018","dateIssuedKey":"2018"}],"person":[{"role":"edt","given":"Ursula","family":"Hübner","display":"Hübner, Ursula"}],"language":["eng"],"type":{"media":"Online-Ressource","bibl":"book"},"note":["Gesehen am 19.02.2019"],"title":[{"title_sort":"German medical data sciences","title":"German medical data sciences","subtitle":"a learning healthcare system : proceedings of the 63rd annual meeting of the German Association of Medical Informatics, Biometry and Epidemiology (gmds e.V.) 2018 in Osnabrück, Germany - GMDS 2018"}]}],"origin":[{"dateIssuedDisp":"[2018]","dateIssuedKey":"2018"}],"recId":"1689724153","physDesc":[{"extent":"5 S."}]} | ||
| SRT | |a RICHTERPECDEIDENTIFI2018 | ||
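
The abstract in field 520 mentions a spelling variant detector based on Levenshtein distance that exploits personal data found in semi-structured headers. The record gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. The function names, the `max_distance` threshold, and the example name are all hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def find_spelling_variants(header_terms, tokens, max_distance=1):
    """Return (index, token, header_term) for body tokens that are within
    max_distance edits of a term taken from the note's header.

    Assumption: header_terms were already extracted from the header,
    e.g. the patient's surname. The threshold of 1 is illustrative only.
    """
    hits = []
    for index, token in enumerate(tokens):
        for term in header_terms:
            if levenshtein(token.lower(), term.lower()) <= max_distance:
                hits.append((index, token, term))
                break  # one match per token is enough to flag it
    return hits


# Hypothetical usage: the header contains "Mustermann", the body a typo.
body = "Herr Mustermnn wurde heute aufgenommen".split()
print(find_spelling_variants(["Mustermann"], body))
```

In this sketch, tokens flagged as near matches of a header term would be treated as candidates for de-identification alongside the exact matches found by regular expressions and gazetteers. A real system would likely need a length-dependent threshold to avoid flagging short common words.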