Angle Orthod. 2024 Nov 1;94(6):602-607. doi: 10.2319/031224-207.1.
To evaluate the reliability of information produced by the artificial intelligence-based program ChatGPT in terms of accuracy and relevance, as assessed by orthodontists, dental students, and individuals seeking orthodontic treatment.
Frequently asked questions and questions of general curiosity in four basic areas of orthodontics were prepared and posed to ChatGPT (Version 4.0), and the answers were evaluated by three groups (senior dental students, individuals seeking orthodontic treatment, and orthodontists). The questions covered clear aligners (CA), lingual orthodontics (LO), esthetic braces (EB), and temporomandibular disorders (TMD). The answers were evaluated using the Global Quality Scale (GQS) and the Quality Criteria for Consumer Health Information (DISCERN) scale.
The total mean DISCERN score for answers on CA was 51.7 ± 9.38 for students, 57.2 ± 10.73 for patients, and 47.4 ± 4.78 for orthodontists (P = .001). GQS scores for LO differed among groups: students (3.53 ± 0.78), patients (4.40 ± 0.72), and orthodontists (3.63 ± 0.72) (P < .001). In the intergroup comparison of DISCERN scores for answers on TMD, the highest value was given by the patient group (57.83 ± 11.47) and the lowest by the orthodontist group (45.90 ± 11.84). For information quality on EB, GQS scores were >3 in all three groups (students: 3.50 ± 0.78; patients: 4.17 ± 0.87; orthodontists: 3.50 ± 0.82).
ChatGPT has significant potential as a tool for patient information and education in orthodontics, provided that it is further developed and the necessary updates are made.