Evaluating the Efficacy of Large Language Models in CPT Coding for Craniofacial Surgery: A Comparative Analysis.

Authors

Isch Emily L, Sarikonda Advith, Sambangi Abhijeet, Carreras Angeleah, Sircar Adrija, Self D Mitchell, Habarth-Morales Theodore E, Caterson E J, Aycart Mario

Affiliations

Department of General Surgery, Thomas Jefferson University.

Sidney Kimmel Medical College at Thomas Jefferson University.

Publication information

J Craniofac Surg. 2025 May 1;36(3):831-835. doi: 10.1097/SCS.0000000000010575. Epub 2024 Sep 2.

DOI:10.1097/SCS.0000000000010575
PMID:39221924
Abstract

BACKGROUND

The advent of Large Language Models (LLMs) like ChatGPT has introduced significant advancements in various surgical disciplines. These developments have led to an increased interest in the utilization of LLMs for Current Procedural Terminology (CPT) coding in surgery. With CPT coding being a complex and time-consuming process, often exacerbated by the scarcity of professional coders, there is a pressing need for innovative solutions to enhance coding efficiency and accuracy.

METHODS

This observational study evaluated the effectiveness of 5 publicly available large language models (Perplexity.AI, Bard, BingAI, ChatGPT 3.5, and ChatGPT 4.0) in accurately identifying CPT codes for craniofacial procedures. A consistent query format was employed to test each model, ensuring the inclusion of detailed procedure components where necessary. The responses were classified as correct, partially correct, or incorrect based on their alignment with established CPT coding for the specified procedures.
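
The three-way grading described above can be illustrated with a minimal sketch. The abstract does not state the exact rubric, so the rule used here is an assumption: an exact match of the code set counts as correct, any overlap as partially correct, and no overlap as incorrect. The function name `grade_response` and the five-digit strings are hypothetical placeholders, not real procedure-to-code assignments.

```python
from enum import Enum


class Grade(Enum):
    CORRECT = "correct"
    PARTIAL = "partially correct"
    INCORRECT = "incorrect"


def grade_response(predicted, reference):
    """Grade a model's CPT codes against the reference codes.

    Assumed rubric (not taken from the study):
      - identical code sets      -> CORRECT
      - at least one shared code -> PARTIAL
      - no shared code           -> INCORRECT
    """
    predicted, reference = set(predicted), set(reference)
    if not reference:
        raise ValueError("reference code set must not be empty")
    if predicted == reference:
        return Grade.CORRECT
    if predicted & reference:
        return Grade.PARTIAL
    return Grade.INCORRECT


# Placeholder codes for illustration only.
reference_codes = ["11111", "22222"]
print(grade_response(["22222", "11111"], reference_codes).value)
print(grade_response(["11111"], reference_codes).value)
print(grade_response(["33333"], reference_codes).value)
```

Treating the codes as sets ignores ordering and duplicates; a stricter rubric could also penalize extra codes separately from missing ones.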

RESULTS

The results indicate that while there is no overall significant association between the type of AI model and the correctness of CPT code identification, there are notable differences in performance for simple and complex CPT codes among the models. Specifically, ChatGPT 4.0 showed higher accuracy for complex codes, whereas Perplexity.AI and Bard were more consistent with simple codes.
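
An association between model type and correctness category could be examined with a test of independence on a contingency table. The abstract does not name the statistical test that was used, so the Pearson chi-square statistic below is an assumption, and the counts in the example table are hypothetical values chosen only to show the calculation; they are not the study's data.

```python
def chi_square_independence(table):
    """Return the Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows of observed counts, e.g. one row per model
    and one column per grade (correct, partially correct, incorrect).
    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: 5 models (rows) x 3 grades (columns).
hypothetical_counts = [
    [6, 3, 1],
    [5, 3, 2],
    [4, 4, 2],
    [5, 2, 3],
    [7, 2, 1],
]
statistic, dof = chi_square_independence(hypothetical_counts)
print(f"chi-square = {statistic:.3f}, degrees of freedom = {dof}")
```

The statistic is then compared against the chi-square distribution with the returned degrees of freedom. With small expected counts, as is common when few procedures are queried, an exact test may be more appropriate.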

DISCUSSION

The use of AI chatbots for CPT coding in craniofacial surgery presents a promising avenue for reducing the administrative burden and associated costs of manual coding. Despite the lower accuracy rates compared with specialized, trained algorithms, the accessibility and minimal training requirements of the AI chatbots make them attractive alternatives. The study also suggests that priming AI models with operative notes may enhance their accuracy, offering a resource-efficient strategy for improving CPT coding in clinical practice.

CONCLUSIONS

This study highlights the feasibility and potential benefits of integrating LLMs into the CPT coding process for craniofacial surgery. The findings advocate for further refinement and training of AI models to improve their accuracy and practicality, suggesting a future where AI-assisted coding could become a standard component of surgical workflows, aligning with the ongoing digital transformation in health care.
