Lin Shih-Yi, Chan Pak Ki, Hsu Wu-Huei, Kao Chia-Hung
Graduate Institute of Clinical Medical Science, College of Medicine, China Medical University, Taichung, Taiwan.
Division of Nephrology and Kidney Institute, China Medical University Hospital, Taichung, Taiwan.
Digit Health. 2024 Mar 5;10:20552076241237678. doi: 10.1177/20552076241237678. eCollection 2024 Jan-Dec.
Taiwan is well known for its high-quality healthcare system. The country's medical licensing exams offer a way to evaluate ChatGPT's medical proficiency.
We analyzed exam data from February 2022, July 2022, February 2023, and July 2023. Each exam comprised four papers of 80 single-choice questions each, classified as descriptive or picture-based. We evaluated ChatGPT-4; questions it answered incorrectly were re-prompted using a "chain of thought" approach. Accuracy rates were calculated as percentages.
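As a minimal illustration of the scoring workflow described above (not the authors' code; the question IDs, answers, and grading helper are hypothetical), the following Python sketch grades single-choice answers, computes accuracy as a percentage, and collects the incorrect items that would be re-prompted with chain of thought:

def accuracy_pct(correct: int, total: int) -> float:
    """Accuracy rate expressed as a percentage, e.g. 75/80 -> 93.75."""
    return 100.0 * correct / total

def grade(model_answers: dict[str, str], answer_key: dict[str, str]):
    """Return (accuracy %, list of question IDs answered incorrectly)."""
    wrong = [qid for qid, ans in model_answers.items() if ans != answer_key[qid]]
    correct = len(answer_key) - len(wrong)
    return accuracy_pct(correct, len(answer_key)), wrong

# Toy 4-question paper for illustration (real papers had 80 questions).
key     = {"Q1": "A", "Q2": "C", "Q3": "B", "Q4": "D"}
answers = {"Q1": "A", "Q2": "C", "Q3": "D", "Q4": "D"}

first_pass, misses = grade(answers, key)
print(f"First-pass accuracy: {first_pass:.2f}%")      # 75.00%
print(f"Re-prompt with chain of thought: {misses}")   # ['Q3']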
ChatGPT-4's accuracy in the medical exams ranged from 63.75% to 93.75% (February 2022-July 2023). The highest accuracy (93.75%) was in February 2022's Medicine Exam (3). The subjects with the highest rates of incorrect answers were ophthalmology (28.95%), breast surgery (27.27%), plastic surgery (26.67%), orthopedics (25.00%), and general surgery (24.59%). With "chain of thought" (CoT) prompting, accuracy on the re-prompted questions ranged from 0.00% to 88.89%, and the final overall accuracy rate ranged from 90% to 98%.
ChatGPT-4 passed Taiwan's medical licensing exams, and the "chain of thought" prompt raised its overall accuracy above 90%.