Chen Jingcheng, Ge Xiangyu, Yuan Chenyang, Chen Yanan, Li Xiangyu, Zhang Xi, Chen Shixiang, Zheng WeiYing, Miao Chunqin
Jiaxing Nanhu District People's Hospital, Jiaxing, Zhejiang Province, 314000, People's Republic of China.
The Second Hospital of Jiaxing, Jiaxing, Zhejiang Province, 314000, People's Republic of China.
BMC Oral Health. 2025 May 28;25(1):838. doi: 10.1186/s12903-025-06246-1.
This study collected and screened the 50 most common pre-treatment consultation questions asked by adult orthodontic patients in clinical practice. Responses to these questions were generated using three large language models: Ernie Bot, ChatGPT, and Gemini. The responses were evaluated across six dimensions: Professional Accuracy (PA), Accuracy of Content (AC), Clarity and Comprehensibility (CC), Personalization and Relevance (PR), Information Completeness (IC), and Empathy and Patient-Centeredness (EHC). Scores for each group across the dimensions fell mainly within the range of 3-4 points, with relatively few high-quality scores (5 points). Although large language models show some capability in answering open-ended questions, their use in medical consultation, particularly in orthodontics, requires caution and should be combined with professional guidance and verification. Future research and technological improvements should focus on enhancing the performance of artificial intelligence (AI) in accuracy, information completeness, and humanistic care to better meet the needs of diverse clinical scenarios.