Bridging the Gap: Can Large Language Models Match Human Expertise in Writing Neurosurgical Operative Notes?

Author information

Ali Abdullah, Kumar Rohit Prem, Polavarapu Hanish, Lavadi Raj Swaroop, Mahavadi Anil, Legarreta Andrew D, Hudson Joseph S, Shah Manan, Paul David, Mooney James, Dietz Nicholas, Fields Daryl P, Hamilton D Kojo, Agarwal Nitin

Affiliations

Department of Neurological Surgery, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, USA.

Department of Neurosurgery, SUNY Upstate Medical University, Syracuse, New York, USA.

Publication information

World Neurosurg. 2024 Dec;192:e34-e41. doi: 10.1016/j.wneu.2024.08.062. Epub 2024 Aug 15.

BACKGROUND

Proper documentation is essential for patient care. The popularity of artificial intelligence (AI) offers the potential for improvements in neurosurgical note-writing. This study aimed to assess how AI can optimize documentation in neurosurgical procedures.

METHODS

Thirty-six operative notes were included. All identifiable data were removed. Essential information, such as perioperative data and diagnosis, was sourced from these notes. ChatGPT 4.0 was trained to draft notes from surgical vignettes using each surgeon's note template. One hundred forty-four surveys with a surgeon or AI note were shared with 3 surgeons to evaluate accuracy, content, and organization using a 5-point scale. Accuracy was defined as the factual correctness; content, as the comprehensiveness; and organization, as the arrangement of the note. Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) scores quantified each note's readability.
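The two readability scores named above follow standard published formulas, which the following minimal sketch illustrates. The abstract does not say which tool the authors used, so the function names, the tokenization, and the vowel-group syllable heuristic here are assumptions made for illustration; dedicated readability software counts syllables more carefully and may give slightly different values.

```python
import re
from typing import Tuple


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the study's method):
    count vowel groups and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> Tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    # Standard Flesch Reading Ease formula: higher means easier to read.
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    # Standard Flesch-Kincaid Grade Level formula: approximate US school grade.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fre, fkgl


if __name__ == "__main__":
    fre, fkgl = readability("The cat sat on the mat.")
    print(f"FRE={fre:.2f}, FKGL={fkgl:.2f}")
```

Both scores depend only on average sentence length and average syllables per word, so longer sentences and longer words lower FRE and raise FKGL together.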

RESULTS

The mean AI accuracy was not different from the mean surgeon accuracy (4.44 vs. 4.33; P = 0.512), whereas the mean AI content was lower than the mean surgeon content (3.73 vs. 4.42; P < 0.001). The mean AI note FKGL was greater than the mean surgeon FKGL (13.13 vs. 9.99; P < 0.001), and the mean AI FRE was lower than the mean surgeon FRE (21.42 vs. 41.70; P < 0.001).

CONCLUSIONS

AI notes were on par with surgeon notes in terms of accuracy and organization but lacked in content. Additionally, AI notes used language at an advanced reading level. These findings support the potential for ChatGPT to enhance the efficiency of neurosurgery documentation.

