The use of ChatGPT and Google Gemini in responding to orthognathic surgery-related questions: A comparative study.

Authors

Ahmed A. Abdel Aziz, Hams H. Abdelrahman, Mohamed G. Hassan

Affiliations

Department of Orthodontics, Faculty of Dentistry, Assiut University, Assiut, Egypt.

Department of Pediatric Dentistry and Dental Public Health, Faculty of Dentistry, Alexandria University, Alexandria, Egypt.

Publication Information

J World Fed Orthod. 2025 Feb;14(1):20-26. doi: 10.1016/j.ejwf.2024.09.004. Epub 2024 Oct 28.

Abstract

AIM

This study used a quantitative approach to compare the reliability of responses from ChatGPT-3.5, ChatGPT-4, and Google Gemini to orthognathic surgery-related questions.

MATERIAL AND METHODS

The authors adapted a set of 64 questions covering all domains and aspects of orthognathic surgery. One author submitted the questions to ChatGPT-3.5, ChatGPT-4, and Google Gemini. The AI-generated responses from the three platforms were recorded and evaluated by two blinded, independent experts. Reliability was assessed with a tool measuring accuracy of information and completeness. In addition, the reviewers recorded whether each platform gave definitive answers to close-ended questions, provided references and graphical elements, and advised scheduling a consultation with a specialist.
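The rating protocol described above (two blinded raters scoring each platform's responses, then aggregating into a reliability score) can be sketched as follows. This is a minimal illustration only; the sample ratings, the 1-5 scale, and the averaging scheme are assumptions, not the study's actual instrument or data.

```python
from statistics import mean

# Hypothetical ratings: each of two blinded raters scores every
# AI-generated response (here on an assumed 1-5 scale; the actual
# scoring tool used in the study is not reproduced here).
ratings = {
    "ChatGPT-3.5": {"rater_1": [5, 4, 5], "rater_2": [4, 4, 5]},
    "ChatGPT-4":   {"rater_1": [4, 4, 4], "rater_2": [5, 4, 4]},
    "Gemini":      {"rater_1": [4, 5, 4], "rater_2": [4, 4, 4]},
}

def reliability_score(platform_ratings):
    """Average both raters' scores across all responses for one platform."""
    all_scores = [s for rater in platform_ratings.values() for s in rater]
    return mean(all_scores)

scores = {name: reliability_score(r) for name, r in ratings.items()}
```

In practice, a study like this would also report inter-rater agreement (e.g., a kappa statistic) before pooling the two raters' scores.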

RESULTS

Although ChatGPT-3.5 achieved the highest information reliability score, the three LLMs showed similar overall reliability in responding to orthognathic surgery-related inquiries. Google Gemini, however, significantly more often included recommendations to consult a physician and provided graphical elements; both ChatGPT-3.5 and ChatGPT-4 lacked these features.

CONCLUSION

This study shows that ChatGPT-3.5, ChatGPT-4, and Google Gemini can provide reliable responses to inquiries about orthognathic surgery. However, Google Gemini stood out by incorporating additional references and illustrations in its responses. These findings highlight the need for further evaluation of AI capabilities across different healthcare domains.

