Bridging vision and text: applications and challenges of vision-language models in urological surgery: mini review
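| Main Authors: | Wouter Bogaert, Nicolas Carl, Karl-Friedrich Kowalewski, Maurice Stephan Michel, Alexandre Mottrie, Pieter De Backer |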
|---|---|
| Format: | Article (Journal) |
| Language: | English |
| Published: | January 2025 |
| In: | European urology focus, Year: 2025, Volume: 11, Issue: 1, Pages: 18-21 |
| ISSN: | 2405-4569 |
| DOI: | 10.1016/j.euf.2025.04.031 |
| Online Access: | Publisher, license required, full text: https://doi.org/10.1016/j.euf.2025.04.031 Publisher, license required, full text: https://www.sciencedirect.com/science/article/pii/S2405456925001105 |
| Author Notes: | Wouter Bogaert, Nicolas Carl, Karl-Friedrich Kowalewski, Maurice Stephan Michel, Alexandre Mottrie, Pieter De Backer |
| Summary: | Vision-language models (VLMs) integrate visual data, such as surgical videos and medical images, with textual information for advanced artificial intelligence (AI) capabilities in surgery. This mini review highlights recent developments in the application of VLMs to surgical tasks in urology, such as answering clinical questions about surgical images, recognizing surgical instruments, identifying surgical phases, and detecting errors during procedures. Despite the potential of VLMs, significant challenges remain, particularly the limited availability of high-quality data sets. Future progress depends on overcoming these limitations, enhancing the robustness and reliability of VLMs, and creating standardized data sets. Ultimately, VLMs represent a promising advance towards integrated, multimodal AI systems capable of supporting surgeons via automated guidance, educational support, and performance evaluation. - Patient summary - Our mini review explores new artificial intelligence (AI) tools that combine visual images and text to assist surgeons during operations. These AI tools can recognize instruments, identify surgical phases, and answer questions about surgery. Improved versions could help in making surgery safer and more efficient in the future. |
|---|---|
| Item Description: | Available online: 16 May 2025, article version: 12 June 2025. Accessed on 11 November 2025 |
| Physical Description: | Online Resource |