ChatGPT performance in prosthodontics: Assessment of accuracy and repeatability in answer generation.

Affiliations

Assistant Professor, Department of Pre-Clinic Dentistry, Faculty of Biomedical and Health Sciences, European University of Madrid (UEM), Madrid, Spain.

Assistant Professor, Vice Dean of Dentistry, Department of Pre-Clinic Dentistry and Clinical Dentistry, Faculty of Biomedical and Health Sciences, European University of Madrid (UEM), Madrid, Spain.

Publication information

J Prosthet Dent. 2024 Apr;131(4):659.e1-659.e6. doi: 10.1016/j.prosdent.2024.01.018. Epub 2024 Feb 2.

DOI:10.1016/j.prosdent.2024.01.018

PMID:38310063

Abstract

STATEMENT OF PROBLEM

The artificial intelligence (AI) software program ChatGPT is based on large language models (LLMs) and is widely accessible. However, in prosthodontics, little is known about its performance in generating answers.

PURPOSE

The purpose of this study was to determine the performance of ChatGPT in generating answers about removable dental prostheses (RDPs) and tooth-supported fixed dental prostheses (FDPs).

MATERIAL AND METHODS

Thirty short questions were designed about RDPs and tooth-supported FDPs, and 30 answers were generated for each question using ChatGPT-4 in October 2023. The 900 generated answers were independently graded by experts using a 3-point Likert scale. The absolute and relative frequencies of the grades were reported. Accuracy was assessed using the Wald binomial method, while repeatability was evaluated using percentage agreement, the Brennan and Prediger coefficient, Conger generalized Cohen kappa, Fleiss kappa, Gwet AC, and Krippendorff alpha. Confidence intervals were set at 95%. Statistical analysis was performed using the Stata software program.
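As an illustration of two of the statistics named above, the following Python sketch computes a Wald (normal-approximation) confidence interval for a binomial proportion, and the pairwise percentage agreement together with Fleiss kappa. It is not the authors' analysis. The function names, the treatment of the repeated answers to one question as the "raters" of that question, and the example grades are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
from collections import Counter


def wald_interval(successes, n, z=1.96):
    """Wald confidence interval for a binomial proportion.

    Returns (p, lower, upper), with the bounds clipped to [0, 1].
    z=1.96 corresponds to a 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def agreement_stats(ratings):
    """Pairwise percentage agreement and Fleiss kappa.

    ratings: one list per item (here, hypothetically, per question),
    each holding the same number of category labels (one per repeat).
    Returns (observed_agreement, kappa); kappa is NaN when every
    rating falls in a single category, because chance agreement is 1.
    """
    n = len(ratings[0])
    if n < 2 or any(len(item) != n for item in ratings):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    category_totals = Counter()
    per_item_agreement = []
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Share of rating pairs within this item that agree.
        agreeing = sum(c * c for c in counts.values()) - n
        per_item_agreement.append(agreeing / (n * (n - 1)))

    observed = sum(per_item_agreement) / len(ratings)
    total = len(ratings) * n
    # Chance agreement from the overall category proportions.
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        return observed, float("nan")
    return observed, (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical counts, not the study data.
    p, lower, upper = wald_interval(successes=50, n=200)
    print(f"accuracy {p:.3f}, 95% CI {lower:.3f} to {upper:.3f}")

    # Hypothetical grades on a 3-point scale: 4 questions, 3 repeats each.
    grades = [[1, 1, 1], [1, 2, 2], [3, 3, 3], [2, 2, 3]]
    observed, kappa = agreement_stats(grades)
    print(f"percentage agreement {observed:.3f}, Fleiss kappa {kappa:.3f}")
```

The Wald interval is the simplest of the binomial intervals and can behave poorly for proportions near 0 or 1. The other coefficients listed in the methods (Brennan and Prediger, Conger, Gwet AC, Krippendorff alpha) differ mainly in how they estimate chance agreement and are not implemented here.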

RESULTS

The performance of ChatGPT in generating answers related to RDPs and tooth-supported FDPs was limited. The answers showed an accuracy of 25.6%, with a 95% confidence interval of 22.9% to 28.6%. The repeatability ranged from moderate to substantial.
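The labels "moderate" and "substantial" match the benchmark bands that Landis and Koch proposed for kappa-type coefficients. The abstract does not name the scale it used, so the mapping below is an assumption, and `landis_koch_label` is a hypothetical helper name.

```python
def landis_koch_label(kappa):
    """Map an agreement coefficient to the Landis and Koch verbal bands."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label
    raise ValueError("agreement coefficients cannot exceed 1")


if __name__ == "__main__":
    # Hypothetical coefficients, not values from the study.
    for value in (0.45, 0.72):
        print(value, landis_koch_label(value))
```

These cut-offs are conventions rather than statistical thresholds, so a coefficient just above or below a boundary should not be read as a meaningful difference.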

CONCLUSIONS

The results show that currently ChatGPT has limited ability to generate answers related to RDPs and tooth-supported FDPs. Therefore, ChatGPT cannot replace a dentist, and, if professionals were to use it, they should be aware of its limitations.
