Bridging vision and text: applications and challenges of vision-language models in urological surgery - mini review

Bibliographic details
Main authors: Bogaert, Wouter (author), Carl, Nicolas (author), Kowalewski, Karl-Friedrich (author), Michel, Maurice Stephan (author), Mottrie, Alexandre (author), De Backer, Pieter (author)
Document type: Article (Journal)
Language: English
Published: January 2025
In: European urology focus
Year: 2025, Volume: 11, Issue: 1, Pages: 18-21
ISSN: 2405-4569
DOI: 10.1016/j.euf.2025.04.031
Online access: Publisher, license required, full text: https://doi.org/10.1016/j.euf.2025.04.031
Publisher, license required, full text: https://www.sciencedirect.com/science/article/pii/S2405456925001105
Statement of responsibility: Wouter Bogaert, Nicolas Carl, Karl-Friedrich Kowalewski, Maurice Stephan Michel, Alexandre Mottrie, Pieter De Backer
Description
Summary: Vision-language models (VLMs) integrate visual data, such as surgical videos and medical images, with textual information for advanced artificial intelligence (AI) capabilities in surgery. This mini review highlights recent developments in the application of VLMs to surgical tasks in urology, such as answering clinical questions about surgical images, recognizing surgical instruments, identifying surgical phases, and detecting errors during procedures. Despite the potential of VLMs, significant challenges remain, particularly the limited availability of high-quality data sets. Future progress depends on overcoming these limitations, enhancing the robustness and reliability of VLMs, and creating standardized data sets. Ultimately, VLMs represent a promising advance towards integrated, multimodal AI systems capable of supporting surgeons via automated guidance, educational support, and performance evaluation. - Patient summary - Our mini review explores new artificial intelligence (AI) tools that combine visual images and text to assist surgeons during operations. These AI tools can recognize instruments, identify surgical phases, and answer questions about surgery. Improved versions could help in making surgery safer and more efficient in the future.
Description: Available online: 16 May 2025, article version: 12 June 2025
Viewed on 11 November 2025
Description: Online resource