
Assessment of Artificial Intelligence Chatbot Responses to Common Patient Questions on Bone Sarcoma.

Author Information

Khabaz Kameel, Newman-Hung Nicole J, Kallini Jennifer R, Kendal Joseph, Christ Alexander B, Bernthal Nicholas M, Wessel Lauren E

Affiliations

David Geffen School of Medicine at UCLA, Los Angeles, California, USA.

Department of Orthopaedic Surgery, University of California, Los Angeles, California, USA.

Publication Information

J Surg Oncol. 2025 Mar;131(4):719-724. doi: 10.1002/jso.27966. Epub 2024 Oct 29.


DOI: 10.1002/jso.27966
PMID: 39470681
Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC12065442/
Abstract

BACKGROUND AND OBJECTIVES: The potential impacts of artificial intelligence (AI) chatbots on care for patients with bone sarcoma are poorly understood. Elucidating the potential risks and benefits would allow surgeons to define appropriate roles for these tools in clinical care.

METHODS: Eleven questions on bone sarcoma diagnosis, treatment, and recovery were input into three AI chatbots. Answers were assessed on a 5-point Likert scale for five clinical accuracy metrics: relevance to the question, balance and lack of bias, basis on established data, factual accuracy, and completeness in scope. Responses were quantitatively assessed for empathy and readability, and their understandability and actionability were scored with the Patient Education Materials Assessment Tool (PEMAT).

RESULTS: Chatbots scored highly on relevance (4.24) and balance/lack of bias (4.09) but lower on basing responses on established data (3.77), completeness (3.68), and factual accuracy (3.66). Responses generally scored well on understandability (84.30%), while actionability scores were low for questions on treatment (64.58%) and recovery (60.64%). GPT-4 exhibited the highest empathy (4.12). Mean readability scores ranged from 10.28 for diagnosis questions to 11.65 for recovery questions.

CONCLUSIONS: While AI chatbots are promising tools, current limitations in factual accuracy and completeness, as well as concerns about inaccessibility to populations with lower health literacy, may significantly limit their clinical utility.
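The abstract does not specify how ratings were stored or which readability formula produced the grade-level scores of roughly 10-11. Purely as an illustrative sketch, the Python below shows how the five Likert-scale accuracy metrics could be averaged and how a grade-level readability score could be computed, assuming the Flesch-Kincaid Grade Level formula and a naive vowel-group syllable counter; the data layout and sample ratings are hypothetical, not taken from the study.

# Minimal sketch of the scoring pipeline described in the abstract.
# Assumptions (not stated in the paper): ratings are kept as plain dicts,
# and readability is the Flesch-Kincaid Grade Level.
import re
from statistics import mean

METRICS = [
    "relevance",
    "balance/lack of bias",
    "basis on established data",
    "factual accuracy",
    "completeness",
]

def mean_likert(ratings: list[dict[str, int]]) -> dict[str, float]:
    """Average 1-5 Likert ratings across raters/questions, per metric."""
    return {m: round(mean(r[m] for r in ratings), 2) for m in METRICS}

def count_syllables(word: str) -> int:
    """Crude syllable estimate: count contiguous vowel groups."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level:
    0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return round(0.39 * len(words) / len(sentences)
                 + 11.8 * syllables / len(words) - 15.59, 2)

if __name__ == "__main__":
    # Hypothetical ratings for one chatbot across two questions.
    ratings = [
        {"relevance": 5, "balance/lack of bias": 4,
         "basis on established data": 4, "factual accuracy": 4,
         "completeness": 3},
        {"relevance": 4, "balance/lack of bias": 4,
         "basis on established data": 3, "factual accuracy": 4,
         "completeness": 4},
    ]
    print(mean_likert(ratings))
    print(fk_grade("Bone sarcoma is a rare malignant tumor of bone. "
                   "Treatment usually combines surgery and chemotherapy."))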


Figures:
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14e4/12065442/0437bd5dd00e/JSO-131-719-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/14e4/12065442/37b0ee33f10a/JSO-131-719-g002.jpg

Similar Articles

[1]
Assessment of Artificial Intelligence Chatbot Responses to Common Patient Questions on Bone Sarcoma.

J Surg Oncol. 2025-3

[2]
Is Information About Musculoskeletal Malignancies From Large Language Models or Web Resources at a Suitable Reading Level for Patients?

Clin Orthop Relat Res. 2025-2-1

[3]
Information about labor epidural analgesia: an updated evaluation on the readability, accuracy, and quality of ChatGPT responses incorporating patient preferences and complex clinical scenarios.

Int J Obstet Anesth. 2025-8

[4]
Most Patients With Bone Sarcomas Seek Emotional Support and Information About Other Patients' Experiences: A Thematic Analysis.

Clin Orthop Relat Res. 2024-1-1

[5]
Development and Validation of a Large Language Model-Powered Chatbot for Neurosurgery: Mixed Methods Study on Enhancing Perioperative Patient Education.

J Med Internet Res. 2025-7-15

[6]
Thyroid Eye Disease and Artificial Intelligence: A Comparative Study of ChatGPT-3.5, ChatGPT-4o, and Gemini in Patient Information Delivery.

Ophthalmic Plast Reconstr Surg. 2024-12-24

[7]
Performance of Multimodal Artificial Intelligence Chatbots Evaluated on Clinical Oncology Cases.

JAMA Netw Open. 2024-10-1

[8]
Can ChatGPT provide parent education for oral immunotherapy?

Ann Allergy Asthma Immunol. 2025-7

[9]
Assessing ChatGPT responses to frequently asked patient questions in reconstructive urology.

Urol Pract. 2025-2-12

[10]
Evaluating the readability, quality, and reliability of responses generated by ChatGPT, Gemini, and Perplexity on the most commonly asked questions about Ankylosing spondylitis.

PLoS One. 2025-6-18

Cited By

[1]
[AI-enabled clinical decision support systems: challenges and opportunities].

Bundesgesundheitsblatt Gesundheitsforschung Gesundheitsschutz. 2025-6-25

[2]
Assessing the Accuracy of Artificial Intelligence Chatbots in the Diagnosis and Management of Meniscal Tears.

Cureus. 2025-5-14

References

[1]
Large language models in health care: Development, applications, and challenges.

Health Care Sci. 2023-7-24

[2]
The performance of ChatGPT on orthopaedic in-service training exams: A comparative study of the GPT-3.5 turbo and GPT-4 models in orthopaedic education.

J Orthop. 2023-11-23

[3]
ChatGPT and large language models in orthopedics: from education and surgery to research.

J Exp Orthop. 2023-12-1

[4]
Arthrosis diagnosis and treatment recommendations in clinical practice: an exploratory investigation with the generative AI model GPT-4.

J Orthop Traumatol. 2023-11-28

[5]
A pilot study on the efficacy of GPT-4 in providing orthopedic treatment recommendations from MRI reports.

Sci Rep. 2023-11-17

[6]
Assessing Biases in Medical Decisions via Clinician and AI Chatbot Responses to Patient Vignettes.

JAMA Netw Open. 2023-10-2

[7]
Large Language Model-Based Chatbot vs Surgeon-Generated Informed Consent Documentation for Common Procedures.

JAMA Netw Open. 2023-10-2

[8]
The Rapid Development of Artificial Intelligence: GPT-4's Performance on Orthopedic Surgery Board Questions.

Orthopedics. 2024

[9]
Applications of large language models in cancer care: current evidence and future perspectives.

Front Oncol. 2023-9-4

[10]
Fabrication and errors in the bibliographic citations generated by ChatGPT.

Sci Rep. 2023-9-7
