Pan-European data harmonization for biobanks in ADOPT BBMRI-ERIC

Bibliographic details
Main authors: Mate, Sebastian (author), Kampf, Marvin (author), Rödle, Wolfgang (author), Kraus, Stefan (author), Proynova, Rumyana (author), Silander, Kaisa (author), Ebert, Lars (author), Lablans, Martin (author), Schüttler, Christina (author), Knell, Christian (author), Eklund, Niina (author), Hummel, Michael (author), Holub, Petr (author), Prokosch, Hans-Ulrich (author)
Document type: Article (Journal)
Language: English
Published: 2019-11-09
In: Applied clinical informatics
Year: 2019, Volume: 10, Issue: 04, Pages: 679-692
ISSN: 1869-0327
DOI: 10.1055/s-0039-1695793
Online access: Publisher, license required, full text: https://doi.org/10.1055/s-0039-1695793
Publisher, license required, full text: http://www.thieme-connect.de/DOI/DOI?10.1055/s-0039-1695793
Statement of responsibility: Sebastian Mate, Marvin Kampf, Wolfgang Rödle, Stefan Kraus, Rumyana Proynova, Kaisa Silander, Lars Ebert, Martin Lablans, Christina Schüttler, Christian Knell, Niina Eklund, Michael Hummel, Petr Holub, Hans-Ulrich Prokosch
Description
Abstract:
Background: High-quality clinical data and biological specimens are key for medical research and personalized medicine. The Biobanking and Biomolecular Resources Research Infrastructure-European Research Infrastructure Consortium (BBMRI-ERIC) aims to facilitate access to such biological resources. The accompanying ADOPT BBMRI-ERIC project kick-started BBMRI-ERIC by collecting colorectal cancer data from European biobanks.
Objectives: To transform these data into a common representation, a uniform approach for data integration and harmonization had to be developed. This article describes the design and the implementation of a toolset for this task.
Methods: Based on the semantics of a metadata repository, we developed a lexical bag-of-words matcher, capable of semiautomatically mapping local biobank terms to the central ADOPT BBMRI-ERIC terminology. Its algorithm supports fuzzy matching, utilization of synonyms, and sentiment tagging. To process the anonymized instance data based on these mappings, we also developed a data transformation application.
Results: The implementation was used to process the data from 10 European biobanks. The lexical matcher automatically and correctly mapped 78.48% of the 1,492 local biobank terms, and human experts were able to complete the remaining mappings. We used the expert-curated mappings to successfully process 147,608 data records from 3,415 patients.
Conclusion: A generic harmonization approach was created and successfully used for cross-institutional data harmonization across 10 European biobanks. The software tools were made available as open source.
Description: Viewed on 25 March 2020
Description: Online resource
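
Illustrative sketch: The abstract mentions a lexical bag-of-words matcher with fuzzy matching and synonym support. The following Python sketch shows one way such a matcher could work; it is not the published toolset. The synonym table, the scoring formula, the 0.8 threshold, and all function names are assumptions made for illustration, and the sentiment tagging mentioned in the abstract is not modeled.

```python
from difflib import SequenceMatcher

# Hypothetical synonym table; the source does not describe the real one.
SYNONYMS = {"sex": "gender", "tumour": "tumor"}


def tokenize(term):
    """Lowercase a term, split it into words, and normalize known synonyms."""
    cleaned = "".join(c if c.isalnum() else " " for c in term.lower())
    return [SYNONYMS.get(word, word) for word in cleaned.split()]


def token_similarity(a, b):
    """Fuzzy similarity of two words in [0, 1], tolerant of small typos."""
    return SequenceMatcher(None, a, b).ratio()


def directed_score(source_tokens, target_tokens):
    """Average, over source tokens, of the best fuzzy match in the target bag."""
    best = [max(token_similarity(s, t) for t in target_tokens)
            for s in source_tokens]
    return sum(best) / len(best)


def bag_similarity(local_term, central_term):
    """Order-independent score: mean of the two directed bag scores."""
    left, right = tokenize(local_term), tokenize(central_term)
    if not left or not right:
        return 0.0
    return (directed_score(left, right) + directed_score(right, left)) / 2


def best_match(local_term, central_terms, threshold=0.8):
    """Return (central term or None, score); None means 'leave to an expert'."""
    score, match = max((bag_similarity(local_term, c), c) for c in central_terms)
    return (match, score) if score >= threshold else (None, score)


if __name__ == "__main__":
    central = ["Patient gender", "Tumor stage", "Date of diagnosis"]
    for local in ["Sex of patient", "Pateint gendr", "Freezer location"]:
        print(local, "->", best_match(local, central))
```

Terms that score below the threshold are returned without a mapping, which mirrors the semiautomatic workflow in the abstract, where human experts completed the mappings the matcher could not resolve.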