GPT versus ERNIE for National Traditional Chinese Medicine Licensing Examination: Does cultural background matter?

Bibliographic details
Main authors: Ghanad, Erfan (author), Weiß, Christel (author), Gao, Hui (author), Reißfelder, Christoph (author), Hummedah, Kamal (author), Han, Lei (author), Tong, Leihui (author), Li, Chengpeng (author), Yang, Cui (author)
Document type: Journal article
Language: English
Published: 2 July 2025
In: Journal of Integrative and Complementary Medicine
Year: 2025, Pages: ?
ISSN: 2768-3613
DOI: 10.1089/jicm.2024.0902
Online access: Publisher, licensed access, full text: https://doi.org/10.1089/jicm.2024.0902
Publisher, licensed access, full text: http://www.liebertpub.com/doi/10.1089/jicm.2024.0902
Statement of authorship: Erfan Ghanad, Christel Weiß, Hui Gao, Christoph Reißfelder, Kamal Hummedah, Lei Han, Leihui Tong, Chengpeng Li, and Cui Yang
Description
Abstract: Purpose: This study evaluates the performance of large language models (LLMs) in the context of the Chinese National Traditional Chinese Medicine Licensing Examination (TCMLE). Materials and Methods: We compared the performance of different versions of the Generative Pre-trained Transformer (GPT) and Enhanced Representation through Knowledge Integration (ERNIE) models using historical TCMLE questions. Results: ERNIE-4.0 outperformed all other models with an accuracy of 81.7%, followed by ERNIE-3.5 (75.2%), GPT-4o (74.8%), and GPT-4 Turbo (50.7%). For questions related to Western internal medicine, all models achieved accuracies above 86.7%. Conclusion: The study highlights the significance of cultural context in training data, which influences the performance of LLMs on specific medical examinations.
Description: Accessed 2 September 2025
Description: Online resource