Source code and data for the PhD Thesis "On-premise medical information extraction from German doctor’s letters under clinical constraints"
| Main Author: | |
|---|---|
| Format: | Database Research Data |
| Language: | English |
| Published: | Heidelberg: Universität, 2026-04-21 |
| DOI: | 10.11588/DATA/USQLMB |
| Subjects: | |
| Online Access: | Publisher, license required, full text: https://doi.org/10.11588/DATA/USQLMB Publisher, license required, full text: https://heidata.uni-heidelberg.de/dataset.xhtml?persistentId=doi:10.11588/DATA/USQLMB |
| Author Notes: | Phillip Richter-Pechanski |
| Summary: | Dataset overview: This dataset contains the source code and annotation guidelines used in the PhD thesis “On-Premise Medical Information Extraction from German Doctor’s Letters under Clinical Constraints”. Repository structure: The dataset is split into five repositories: (1) source code for Chapter 2.6, De-identification of German doctor’s letters; (2) source code for Chapter 5, Clinical Section Classification using Pretrained Language Models and Prompting; (3) source code for Chapter 6, Medication Information Extraction using Local Large Language Models; (4) source code for Chapter 7, Clinical Application: Medication Trends and Polypharmacy; (5) annotation guidelines for Chapters 2.6, 4, 5, and 7. Main dataset: CARDIO:DE, used for the experiments in Chapters 5, 6, and 7 (https://doi.org/10.11588/DATA/AFYQDY). Additional datasets (not included here): n2c2 2018 Track 2, used in Chapter 6 (https://doi.org/10.1093/jamia/ocz166). Notes on additional data and model availability: Doctor’s letters from the cardiology domain used in Chapters 2, 5, 6, and 7 (except for CARDIO:DE) and all further-pretrained and fine-tuned models cannot be distributed due to data protection regulations. |
|---|---|
| Item Description: | Viewed on 27.04.2026 |
| Physical Description: | Online Resource |