Wei Qiuhong, Wang Yanqin, Yao Zhengxiong, Cui Ying, Wei Bo, Li Tingyu, Xu Ximing
Children Nutrition Research Center, Children's Hospital of Chongqing Medical University; National Clinical Research Center for Child Health and Disorders; Ministry of Education Key Laboratory of Child Development and Disorders; China International Science and Technology Cooperation Base of Child Development and Critical Disorders; Chongqing Key Laboratory of Childhood Nutrition and Health, Chongqing, China.
College of Medical Informatics, Medical Data Science Academy, Chongqing Engineering Research Center for Clinical Big-Data and Drug Evaluation, Chongqing Medical University, Chongqing, China.
Pediatr Discov. 2023 Nov 20;1(3):e42. doi: 10.1002/pdi3.42. eCollection 2023 Dec.
With the advance of artificial intelligence technology, large language models such as ChatGPT are drawing substantial interest in the healthcare field. A growing body of research has evaluated ChatGPT's performance in various medical departments, yet its potential in pediatrics remains under-studied. In this study, we presented ChatGPT with a total of 4160 clinical consultation questions in both English and Chinese, covering 104 pediatric conditions, and repeated each question independently 10 times to assess the accuracy of its responses in pediatric disease treatment recommendations. ChatGPT achieved an overall accuracy of 82.2% (95% CI: 81.0%-83.4%), with superior performance in addressing common diseases (84.4%, 95% CI: 83.2%-85.7%), offering general treatment advice (83.5%, 95% CI: 81.9%-85.1%), and responding in English (93.0%, 95% CI: 91.9%-94.1%). However, it was prone to errors in disease definitions, medications, and surgical treatment. In conclusion, while ChatGPT shows promise in pediatric treatment recommendations with notable accuracy, cautious optimism is warranted regarding the potential application of large language models in enhancing patient care.
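The confidence intervals reported above are consistent with a standard normal-approximation (Wald) interval for a binomial proportion with n = 4160 questions; the sketch below illustrates that calculation. This is an assumption about the authors' method for illustration only, not a statement of how the study actually computed its intervals.

```python
import math

def wald_ci(p: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation 95% CI for a binomial proportion.

    p: observed proportion (e.g. 0.822 for 82.2% accuracy)
    n: number of independent trials (assumed here: 4160 questions)
    z: critical value, 1.96 for a 95% interval
    """
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return (p - z * se, p + z * se)

# Reproducing the overall accuracy interval from the abstract:
lo, hi = wald_ci(0.822, 4160)
print(f"{lo:.1%} - {hi:.1%}")  # → 81.0% - 83.4%, matching the reported CI
```

The match with the reported 81.0%-83.4% interval suggests accuracy was summarized at the question level (n = 4160) rather than over all 41,600 repeated responses, though the abstract does not state this explicitly.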