Does sex matter? Analysis of sex-related differences in the diagnostic performance of a market-approved convolutional neural network for skin cancer detection

Bibliographic details
Main authors: Kommoss, Katharina (author), Winkler, Julia K. (author), Müller-Christmann, Christine (author), Niedermair, Felicitas (author), Toberer, Ferdinand (author), Buhl, Timo (author), Enk, Alexander (author), Blum, Andreas (author), Stolz, Wilhelm (author), Rosenberger, Albert (author), Hänßle, Holger (author)
Document type: Article (Journal)
Language: English
Published: 16 February 2022
In: European journal of cancer
Year: 2022, Volume: 164, Pages: 88-94
ISSN:1879-0852
DOI:10.1016/j.ejca.2021.12.034
Online access: Publisher, license required, full text: https://doi.org/10.1016/j.ejca.2021.12.034
Publisher, license required, full text: https://www.sciencedirect.com/science/article/pii/S095980492200020X
Statement of responsibility: Katharina Sies, Julia K. Winkler, Christine Fink, Felicitas Bardehle, Ferdinand Toberer, Timo Buhl, Alexander Enk, Andreas Blum, Wilhelm Stolz, Albert Rosenberger, Holger A. Haenssle
Description
Summary:
Background: Advances in biomedical artificial intelligence may introduce or perpetuate sex and gender discrimination. Convolutional neural networks (CNNs) have demonstrated dermatologist-level performance in image classification tasks but have not been assessed for sex and gender biases that may affect training data and diagnostic performance. In this study, we investigated sex-related imbalances in training data and diagnostic performance of a market-approved CNN for skin cancer classification (Moleanalyzer Pro®, Fotofinder Systems GmbH, Bad Birnbach, Germany).
Methods: We screened open-access dermoscopic image repositories widely used for CNN training for their distribution of sex. Moreover, the sex-related diagnostic performance of the market-approved CNN was tested on 1549 dermoscopic images stratified by sex (female n = 773; male n = 776).
Results: Most open-access repositories showed a marked under-representation of images originating from female (40%) versus male (60%) patients. Despite these imbalances and well-known sex-related differences in skin anatomy or skin-directed behaviour, the tested CNN achieved a comparable sensitivity of 87.0% [80.9%-91.3%] versus 87.1% [81.1%-91.4%], specificity of 98.7% [97.4%-99.3%] versus 96.9% [95.2%-98.0%] and ROC-AUC of 0.984 [0.975-0.993] versus 0.979 [0.969-0.988] in dermoscopic images of female versus male origin, respectively. In the sample at hand, sex-related differences in ROC-AUCs were not statistically significant in either the per-image analysis or an additional per-individual analysis (p ≥ 0.59).
Conclusion: Design and training of artificial intelligence algorithms for medical applications should generally acknowledge sex and gender dimensions. Despite sex-related imbalances in open-access training data, the diagnostic performance of the tested CNN showed no sex-related bias in the classification of skin lesions.
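Note: The summary reports sensitivity and specificity with bracketed intervals but states neither how those intervals were computed nor the underlying per-class counts. As a minimal sketch of how such figures are commonly derived, the Python below computes both metrics from a confusion matrix and attaches a Wilson score interval. The choice of the Wilson method, the function names, and the example counts are all assumptions made for illustration; they are not taken from the article and will not reproduce its numbers.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95% level).

    Assumption: the catalogued article does not name its interval method;
    Wilson is used here only as a common, well-behaved choice.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half_width, centre + half_width


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (estimate, lower, upper) for sensitivity and for specificity."""
    sens = tp / (tp + fn)   # share of malignant lesions correctly flagged
    spec = tn / (tn + fp)   # share of benign lesions correctly cleared
    return {
        "sensitivity": (sens, *wilson_interval(tp, tp + fn)),
        "specificity": (spec, *wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts for one subgroup (NOT data from the study).
example = sensitivity_specificity(tp=87, fn=13, tn=95, fp=5)
for name, (estimate, lower, upper) in example.items():
    print(f"{name}: {estimate:.3f} [{lower:.3f}-{upper:.3f}]")
```

Running the same function once per subgroup (for example, once on the female and once on the male image set) yields the kind of side-by-side comparison given in the summary. Comparing ROC-AUCs between groups requires a separate test that is not sketched here.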
Description: Viewed on 24 June 2022
Description: Online resource