Reducing the impact of confounding factors on skin cancer classification via image segmentation: technical model study

Bibliographic Details
Main Authors: Maron, Roman C. (Author); Hekler, Achim (Author); Krieghoff-Henning, Eva (Author); Schmitt, Max (Author); Schlager, Justin Gabriel (Author); Utikal, Jochen (Author); Brinker, Titus Josef (Author)
Format: Article (Journal)
Language: English
Published: 2021
In: Journal of medical internet research
Year: 2021, Volume: 23, Issue: 3, Pages: 1-10
ISSN: 1438-8871
DOI: 10.2196/21695
Online Access: Publisher, license required, full text: https://doi.org/10.2196/21695
Publisher, license required, full text: https://www.jmir.org/2021/3/e21695
Author Notes: Roman C Maron, MSc; Achim Hekler, MSc; Eva Krieghoff-Henning, PhD; Max Schmitt, MSc; Justin G Schlager, MD; Jochen S Utikal, MD; Titus J Brinker, MD
Description
Summary:
Background: Studies have shown that artificial intelligence achieves similar or better performance than dermatologists in specific dermoscopic image classification tasks. However, artificial intelligence is susceptible to the influence of confounding factors within images (eg, skin markings), which can lead to false diagnoses of cancerous skin lesions. Image segmentation can remove lesion-adjacent confounding factors but greatly changes the image representation.
Objective: The aim of this study was to compare the performance of 2 image classification workflows in which images were either segmented or left unprocessed before the subsequent training and evaluation of a binary skin lesion classifier.
Methods: Separate binary skin lesion classifiers (nevus vs melanoma) were trained and evaluated on segmented and unsegmented dermoscopic images. For a more informative result, separate classifiers were trained on 2 distinct training data sets (human against machine [HAM] and International Skin Imaging Collaboration [ISIC]). Each training run was repeated 5 times. The mean performance of the 5 runs was evaluated on a multi-source test set (n=688) consisting of a holdout and an external component.
Results: When trained on HAM, the segmented classifiers showed a higher overall balanced accuracy (75.6% [SD 1.1%]) than the unsegmented classifiers (66.7% [SD 3.2%]); this difference was significant in 4 out of 5 runs (P<.001). When trained on ISIC, the unsegmented classifiers showed a numerically higher overall balanced accuracy (78.3% [SD 1.8%]) than the segmented classifiers (77.4% [SD 1.5%]); this difference was significant in 1 out of 5 runs (P=.004).
Conclusions: Image segmentation does not decrease overall performance, and it beneficially removes lesion-adjacent confounding factors. Thus, it is a viable option to address the negative impact that confounding factors have on deep learning models in dermatology. However, the segmentation step might introduce new pitfalls, which require further investigation.
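The summary reports balanced accuracy as a mean with a standard deviation over 5 training runs. As a rough illustration of how such a figure could be computed, the following is a minimal sketch, not the authors' evaluation code. The function names `balanced_accuracy` and `summarize_runs`, the label encoding (1 = melanoma, 0 = nevus), and the use of the sample standard deviation are all assumptions made for this example; the record does not state which SD variant was used.

```python
from statistics import mean, stdev


def balanced_accuracy(y_true, y_pred):
    """Return the mean of sensitivity and specificity for binary labels.

    Assumed encoding (hypothetical): 1 = melanoma, 0 = nevus.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    positives = sum(1 for t, _ in pairs if t == 1)
    negatives = len(pairs) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    true_neg = sum(1 for t, p in pairs if t == 0 and p == 0)
    sensitivity = true_pos / positives
    specificity = true_neg / negatives
    return 0.5 * (sensitivity + specificity)


def summarize_runs(scores):
    """Return (mean, sample standard deviation) of per-run scores.

    Assumption: sample SD (n - 1 denominator); needs at least 2 runs.
    """
    return mean(scores), stdev(scores)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    y_true = [1, 1, 0, 0]
    y_pred = [1, 0, 0, 0]
    print(balanced_accuracy(y_true, y_pred))
    print(summarize_runs([0.70, 0.80]))
```

Balanced accuracy weights both classes equally, which matters when a test set contains far more nevi than melanomas and plain accuracy would reward always predicting the majority class.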
Item Description: Viewed on 19.05.2021
First published: June 22, 2020
Physical Description: Online Resource