The impact of site-specific digital histology signatures on deep learning model accuracy and bias

The Cancer Genome Atlas (TCGA) is one of the largest biorepositories of digital histology. Deep learning (DL) models have been trained on TCGA to predict numerous features directly from histology, including survival, gene expression patterns, and driver mutations. However, we demonstrate that these...

Bibliographic details
Main authors: Howard, Frederick (author), Dolezal, James (author), Kochanny, Sara (author), Schulte, Jefree (author), Chen, Heather (author), Heij, Lara (author), Huo, Dezheng (author), Nanda, Rita (author), Olopade, Olufunmilayo I. (author), Kather, Jakob Nikolas (author), Cipriani, Nicole (author), Grossman, Robert L. (author), Pearson, Alexander T. (author)
Document type: Article (Journal)
Language: English
Published: 20 July 2021
In: Nature Communications
Year: 2021, Volume: 12, Pages: 1-13
ISSN: 2041-1723
DOI: 10.1038/s41467-021-24698-1
Online access: Publisher, license required, full text: https://doi.org/10.1038/s41467-021-24698-1
Publisher, license required, full text: https://www.nature.com/articles/s41467-021-24698-1
Statement of responsibility: Frederick M. Howard, James Dolezal, Sara Kochanny, Jefree Schulte, Heather Chen, Lara Heij, Dezheng Huo, Rita Nanda, Olufunmilayo I. Olopade, Jakob N. Kather, Nicole Cipriani, Robert L. Grossman & Alexander T. Pearson

MARC

LEADER 00000caa a2200000 c 4500
001 1772743518
003 DE-627
005 20220820053701.0
007 cr uuu---uuuuu
008 211007s2021 xx |||||o 00| ||eng c
024 7 |a 10.1038/s41467-021-24698-1  |2 doi 
035 |a (DE-627)1772743518 
035 |a (DE-599)KXP1772743518 
035 |a (OCoLC)1341421777 
040 |a DE-627  |b ger  |c DE-627  |e rda 
041 |a eng 
084 |a 33  |2 sdnb 
100 1 |a Howard, Frederick  |e VerfasserIn  |0 (DE-588)1242646175  |0 (DE-627)1772744050  |4 aut 
245 1 4 |a The impact of site-specific digital histology signatures on deep learning model accuracy and bias  |c Frederick M. Howard, James Dolezal, Sara Kochanny, Jefree Schulte, Heather Chen, Lara Heij, Dezheng Huo, Rita Nanda, Olufunmilayo I. Olopade, Jakob N. Kather, Nicole Cipriani, Robert L. Grossman & Alexander T. Pearson 
264 1 |c 20 July 2021 
300 |a 13 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Gesehen am 07.10.2021 
520 |a The Cancer Genome Atlas (TCGA) is one of the largest biorepositories of digital histology. Deep learning (DL) models have been trained on TCGA to predict numerous features directly from histology, including survival, gene expression patterns, and driver mutations. However, we demonstrate that these features vary substantially across tissue submitting sites in TCGA for over 3,000 patients with six cancer subtypes. Additionally, we show that histologic image differences between submitting sites can easily be identified with DL. Site detection remains possible despite commonly used color normalization and augmentation methods, and we quantify the image characteristics constituting this site-specific digital histology signature. We demonstrate that these site-specific signatures lead to biased accuracy for prediction of features including survival, genomic mutations, and tumor stage. Furthermore, ethnicity can also be inferred from site-specific signatures, which must be accounted for to ensure equitable application of DL. These site-specific signatures can lead to overoptimistic estimates of model performance, and we propose a quadratic programming method that abrogates this bias by ensuring models are not trained and validated on samples from the same site. 
700 1 |a Dolezal, James  |e VerfasserIn  |4 aut 
700 1 |a Kochanny, Sara  |e VerfasserIn  |4 aut 
700 1 |a Schulte, Jefree  |e VerfasserIn  |4 aut 
700 1 |a Chen, Heather  |e VerfasserIn  |4 aut 
700 1 |a Heij, Lara  |e VerfasserIn  |4 aut 
700 1 |a Huo, Dezheng  |e VerfasserIn  |4 aut 
700 1 |a Nanda, Rita  |e VerfasserIn  |4 aut 
700 1 |a Olopade, Olufunmilayo I.  |e VerfasserIn  |4 aut 
700 1 |a Kather, Jakob Nikolas  |d 1989-  |e VerfasserIn  |0 (DE-588)1064064914  |0 (DE-627)812897587  |0 (DE-576)423589091  |4 aut 
700 1 |a Cipriani, Nicole  |e VerfasserIn  |4 aut 
700 1 |a Grossman, Robert L.  |e VerfasserIn  |4 aut 
700 1 |a Pearson, Alexander T.  |e VerfasserIn  |4 aut 
773 0 8 |i Enthalten in  |t Nature Communications  |d [London] : Springer Nature, 2010  |g 12(2021), Artikel-ID 4423, Seite 1-13  |h Online-Ressource  |w (DE-627)626457688  |w (DE-600)2553671-0  |w (DE-576)331555905  |x 2041-1723  |7 nnas  |a The impact of site-specific digital histology signatures on deep learning model accuracy and bias 
773 1 8 |g volume:12  |g year:2021  |g elocationid:4423  |g pages:1-13  |g extent:13  |a The impact of site-specific digital histology signatures on deep learning model accuracy and bias 
856 4 0 |u https://doi.org/10.1038/s41467-021-24698-1  |x Verlag  |x Resolving-System  |z lizenzpflichtig  |3 Volltext 
856 4 0 |u https://www.nature.com/articles/s41467-021-24698-1  |x Verlag  |z lizenzpflichtig  |3 Volltext 
951 |a AR 
992 |a 20211007 
993 |a Article 
994 |a 2021 
998 |g 1064064914  |a Kather, Jakob Nikolas  |m 1064064914:Kather, Jakob Nikolas  |d 910000  |d 910100  |e 910000PK1064064914  |e 910100PK1064064914  |k 0/910000/  |k 1/910000/910100/  |p 10 
999 |a KXP-PPN1772743518  |e 3985104050 
BIB |a Y 
SER |a journal 
JSO |a {"note":["Gesehen am 07.10.2021"],"language":["eng"],"type":{"bibl":"article-journal","media":"Online-Ressource"},"title":[{"title":"The impact of site-specific digital histology signatures on deep learning model accuracy and bias","title_sort":"impact of site-specific digital histology signatures on deep learning model accuracy and bias"}],"relHost":[{"pubHistory":["2010-"],"origin":[{"dateIssuedDisp":"[2010]-","publisherPlace":"[London] ; [London]","publisher":"Springer Nature ; Nature Publishing Group UK"}],"note":["Gesehen am 13.06.24"],"language":["eng"],"type":{"media":"Online-Ressource","bibl":"periodical"},"title":[{"title_sort":"Nature Communications","title":"Nature Communications"}],"part":{"pages":"1-13","year":"2021","volume":"12","text":"12(2021), Artikel-ID 4423, Seite 1-13","extent":"13"},"id":{"issn":["2041-1723"],"eki":["626457688"],"zdb":["2553671-0"]},"physDesc":[{"extent":"Online-Ressource"}],"recId":"626457688","disp":"The impact of site-specific digital histology signatures on deep learning model accuracy and biasNature Communications"}],"person":[{"display":"Howard, Frederick","given":"Frederick","role":"aut","family":"Howard"},{"display":"Dolezal, James","role":"aut","given":"James","family":"Dolezal"},{"display":"Kochanny, Sara","given":"Sara","role":"aut","family":"Kochanny"},{"family":"Schulte","role":"aut","given":"Jefree","display":"Schulte, Jefree"},{"given":"Heather","role":"aut","family":"Chen","display":"Chen, Heather"},{"display":"Heij, Lara","given":"Lara","role":"aut","family":"Heij"},{"display":"Huo, Dezheng","given":"Dezheng","role":"aut","family":"Huo"},{"display":"Nanda, Rita","family":"Nanda","role":"aut","given":"Rita"},{"role":"aut","given":"Olufunmilayo I.","family":"Olopade","display":"Olopade, Olufunmilayo I."},{"display":"Kather, Jakob Nikolas","given":"Jakob Nikolas","role":"aut","family":"Kather"},{"display":"Cipriani, Nicole","family":"Cipriani","given":"Nicole","role":"aut"},{"family":"Grossman","role":"aut","given":"Robert L.","display":"Grossman, Robert L."},{"display":"Pearson, Alexander T.","role":"aut","given":"Alexander T.","family":"Pearson"}],"origin":[{"dateIssuedKey":"2021","dateIssuedDisp":"20 July 2021"}],"physDesc":[{"extent":"13 S."}],"recId":"1772743518","id":{"eki":["1772743518"],"doi":["10.1038/s41467-021-24698-1"]},"name":{"displayForm":["Frederick M. Howard, James Dolezal, Sara Kochanny, Jefree Schulte, Heather Chen, Lara Heij, Dezheng Huo, Rita Nanda, Olufunmilayo I. Olopade, Jakob N. Kather, Nicole Cipriani, Robert L. Grossman & Alexander T. Pearson"]}} 
SRT |a HOWARDFREDIMPACTOFSI2020