Assessment of Artificial Intelligence Chatbot Responses to Common Patient Questions on Bone Sarcoma.

Author information

Kameel Khabaz, Nicole J Newman-Hung, Jennifer R Kallini, Joseph Kendal, Alexander B Christ, Nicholas M Bernthal, Lauren E Wessel

Affiliations

David Geffen School of Medicine at UCLA, Los Angeles, California, USA.

Department of Orthopaedic Surgery, University of California, Los Angeles, California, USA.

Publication information

J Surg Oncol. 2025 Mar;131(4):719-724. doi: 10.1002/jso.27966. Epub 2024 Oct 29.

DOI: 10.1002/jso.27966
PMID: 39470681
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12065442/
Abstract

BACKGROUND AND OBJECTIVES

The potential impacts of artificial intelligence (AI) chatbots on care for patients with bone sarcoma are poorly understood. Elucidating the potential risks and benefits would allow surgeons to define appropriate roles for these tools in clinical care.

METHODS

Eleven questions on bone sarcoma diagnosis, treatment, and recovery were posed to three AI chatbots. Answers were assessed on a 5-point Likert scale for five clinical accuracy metrics: relevance to the question, balance and lack of bias, basis on established data, factual accuracy, and completeness in scope. Responses were also quantitatively assessed for empathy and readability, and the Patient Education Materials Assessment Tool (PEMAT) was used to assess understandability and actionability.
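The scoring scheme described above can be sketched in a few lines. The rater data below is a hypothetical stand-in (the study's actual ratings are not reproduced here), and the PEMAT percentage follows that tool's published convention of agreed items over applicable items:

```python
from statistics import mean

# Five clinical accuracy metrics, each rated on a 1-5 Likert scale per response.
METRICS = ["relevance", "balance", "established_data", "factual_accuracy", "completeness"]

def mean_likert(ratings):
    """Average each metric across all rated chatbot responses."""
    return {m: round(mean(r[m] for r in ratings), 2) for m in METRICS}

def pemat_score(items):
    """PEMAT: percent of applicable items rated 'agree' (None = not applicable)."""
    applicable = [i for i in items if i is not None]
    return round(100 * sum(applicable) / len(applicable), 2)

# Hypothetical ratings for two chatbot responses:
ratings = [
    {"relevance": 4, "balance": 4, "established_data": 4, "factual_accuracy": 3, "completeness": 4},
    {"relevance": 5, "balance": 4, "established_data": 3, "factual_accuracy": 4, "completeness": 3},
]
print(mean_likert(ratings))                    # per-metric means across responses
print(pemat_score([True, True, False, None]))  # → 66.67
```

Averaging per metric (rather than per response) is what lets the study report a single score such as 4.24 for relevance across all questions and chatbots.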

RESULTS

Chatbots scored highly on relevance (4.24) and balance/lack of bias (4.09) but lower on basing responses on established data (3.77), completeness (3.68), and factual accuracy (3.66). Responses generally scored well on understandability (84.30%), while actionability scores were low for questions on treatment (64.58%) and recovery (60.64%). GPT-4 exhibited the highest empathy (4.12). Mean readability scores ranged from 10.28 for diagnosis questions to 11.65 for recovery questions.
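The abstract does not name the readability formula, but grade-level scores in the 10-12 range are typical of the Flesch-Kincaid grade level. A minimal sketch, assuming that metric and a crude vowel-group syllable counter:

```python
import re

def fk_grade(text):
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59.
    Syllables are approximated by counting vowel groups per word."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syll = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return round(0.39 * n_words / sentences + 11.8 * n_syll / n_words - 15.59, 2)

print(fk_grade("The cat sat on the mat."))
```

A score near 11 corresponds roughly to an 11th-grade reading level, well above the sixth-grade level commonly recommended for patient education materials, which is the accessibility concern the conclusions raise.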

CONCLUSIONS

While AI chatbots are promising tools, current limitations in factual accuracy and completeness, as well as concerns of inaccessibility to populations with lower health literacy, may significantly limit their clinical utility.

Figures (from the PMC full text):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14e4/12065442/0437bd5dd00e/JSO-131-719-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14e4/12065442/37b0ee33f10a/JSO-131-719-g002.jpg
