Accuracy of ChatGPT-Generated Information on Head and Neck and Oromaxillofacial Surgery: A Multicenter Collaborative Analysis.

Author information

Vaira Luigi Angelo, Lechien Jerome R, Abbate Vincenzo, Allevi Fabiana, Audino Giovanni, Beltramini Giada Anna, Bergonzani Michela, Bolzoni Alessandro, Committeri Umberto, Crimi Salvatore, Gabriele Guido, Lonardi Fabio, Maglitto Fabio, Petrocelli Marzia, Pucci Resi, Saponaro Gianmarco, Tel Alessandro, Vellone Valentino, Chiesa-Estomba Carlos Miguel, Boscolo-Rizzo Paolo, Salzano Giovanni, De Riu Giacomo

Affiliations

Maxillofacial Surgery Operative Unit, Department of Medicine, Surgery and Pharmacy, University of Sassari, Sassari, Italy.

Biomedical Sciences Department, PhD School of Biomedical Science, University of Sassari, Sassari, Italy.

Publication information

Otolaryngol Head Neck Surg. 2024 Jun;170(6):1492-1503. doi: 10.1002/ohn.489. Epub 2023 Aug 18.

Abstract

OBJECTIVE

To investigate the accuracy of Chat-Based Generative Pre-trained Transformer (ChatGPT) in answering questions and solving clinical scenarios of head and neck surgery.

STUDY DESIGN

Observational and evaluative study.

SETTING

Eighteen surgeons from 14 Italian head and neck surgery units.

METHODS

A total of 144 clinical questions covering different subspecialties of head and neck surgery and 15 comprehensive clinical scenarios were developed. The questions and scenarios were entered into ChatGPT4, and the researchers rated the resulting answers on Likert scales for accuracy (range 1-6), completeness (range 1-3), and quality of references.

RESULTS

The overall median score for the open-ended questions was 6 (interquartile range [IQR]: 5-6) for accuracy and 3 (IQR: 2-3) for completeness. Overall, the reviewers rated the answers as entirely or nearly entirely correct in 87.2% of cases and as comprehensive and covering all aspects of the question in 73% of cases. The artificial intelligence (AI) model answered 84.7% of the closed-ended questions correctly (11 wrong answers). In the clinical scenarios, ChatGPT provided a fully or nearly fully correct diagnosis in 81.7% of cases, and the proposed diagnostic or therapeutic procedure was judged to be complete in 56.7% of cases. The overall quality of the bibliographic references was poor, and the cited sources were nonexistent in 46.4% of cases.

CONCLUSION

The results demonstrate a generally good level of accuracy in the AI's answers. The AI's ability to resolve complex clinical scenarios is promising, but it still falls short of being a reliable support for the decision-making of specialists in head and neck surgery.

