Lightweight visual transformers outperform convolutional neural networks for Gram-stained image classification: an empirical study

Bibliographic details
Main authors: Kim, Hee Eun (author), Maros, Máté E. (author), Miethke, Thomas (author), Kittel, Maximilian (author), Siegel, Fabian (author), Ganslandt, Thomas (author)
Document type: Article (Journal)
Language: English
Published: 30 April 2023
In: Biomedicines
Year: 2023, Volume: 11, Issue: 5, Pages: 1-14
ISSN: 2227-9059
DOI: 10.3390/biomedicines11051333
Online access: Publisher, free of charge, full text: https://doi.org/10.3390/biomedicines11051333
Publisher, free of charge, full text: https://www.mdpi.com/2227-9059/11/5/1333
Statement of responsibility: Hee E. Kim, Mate E. Maros, Thomas Miethke, Maximilian Kittel, Fabian Siegel and Thomas Ganslandt
Description
Summary: We aimed to automate Gram-stain analysis to speed up the detection of bacterial strains in patients suffering from infections. We performed comparative analyses of visual transformers (VT) using various configurations including model size (small vs. large), training epochs (1 vs. 100), and quantization schemes (tensor- or channel-wise) using float32 or int8 on publicly available (DIBaS, n = 660) and locally compiled (n = 8500) datasets. Six VT models (BEiT, DeiT, MobileViT, PoolFormer, Swin and ViT) were evaluated and compared to two convolutional neural networks (CNN), ResNet and ConvNeXT. An overview of performance, including accuracy, inference time and model size, was also visualized. Frames per second (FPS) of small models consistently surpassed their large counterparts by a factor of 1-2×. DeiT small was the fastest VT in int8 configuration (6.0 FPS). In conclusion, VTs consistently outperformed CNNs for Gram-stain classification in most settings even on smaller datasets.
Description: Viewed on 15.08.2023
Description: Online Resource