Evaluating ChatGPT's Accuracy in Providing Screening Mammography Recommendations among Older Women: Artificial Intelligence and Cancer Communication.

Author information

Braithwaite Dejana, Karanth Shama D, Divaker Joel, Schoenborn Nancy, Lin Kenneth, Richman Ilana, Hochhegger Bruno, O'Neill Suzanne, Schonberg Mara

Publication information

Res Sq. 2024 Jan 31:rs.3.rs-3911155. doi: 10.21203/rs.3.rs-3911155/v1.

DOI:10.21203/rs.3.rs-3911155/v1

PMID:38352437

Original article link: https://pmc.ncbi.nlm.nih.gov/articles/PMC10862946/

Abstract

The U.S. Preventive Services Task Force (USPSTF) recommends biennial screening mammography through age 74. Guidelines vary as to whether they recommend screening mammography for women aged 75 and older. This study aims to determine the ability of ChatGPT to provide appropriate recommendations for breast cancer screening in patients aged 75 years and older. Twelve questions and four clinical vignettes addressing fundamental concepts about breast cancer screening and prevention in patients aged 75 years and older were created and posed to ChatGPT three consecutive times to generate three sets of responses. The responses were graded by a multidisciplinary panel of experts at the intersection of breast cancer screening and aging. The responses were graded as 'appropriate', 'inappropriate', or 'unreliable' based on the reviewer's clinical judgment, the content of the response, and whether the content was consistent across the three responses. Appropriateness was determined by majority consensus. The responses generated by ChatGPT were appropriate for 11/17 questions (64%). Three questions were graded as inappropriate (18%) and two as unreliable (12%). Consensus was not reached on one question (6%), which was graded as 'no consensus'. Despite its limitations, ChatGPT has the potential to provide accurate healthcare information and could be used by healthcare professionals to assist in providing recommendations for breast cancer screening in patients aged 75 years and older. Physician oversight will be necessary because ChatGPT can provide inappropriate and unreliable responses and because accuracy is essential in medicine.
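
The grading step described in the abstract can be illustrated with a minimal Python sketch. It assumes a strict-majority rule (more than half of the reviewers must agree), which is only one possible reading of "majority consensus"; the abstract does not state the panel size or the exact rule. The function names `consensus_grade` and `summarize` are hypothetical, and the list `reported` is reconstructed from the counts given in the abstract rather than taken from per-question data.

```python
from collections import Counter


def consensus_grade(votes):
    """Return the grade chosen by a strict majority of reviewers.

    Assumption: 'majority consensus' means more than half of the votes.
    Anything else (including ties) is labeled 'no consensus'.
    """
    if not votes:
        return "no consensus"
    grade, count = Counter(votes).most_common(1)[0]
    return grade if count > len(votes) / 2 else "no consensus"


def summarize(final_grades):
    """Map each final grade to (count, percentage of all questions)."""
    total = len(final_grades)
    counts = Counter(final_grades)
    return {grade: (n, 100 * n / total) for grade, n in counts.items()}


# Final grades rebuilt from the counts reported in the abstract (17 items).
reported = (
    ["appropriate"] * 11
    + ["inappropriate"] * 3
    + ["unreliable"] * 2
    + ["no consensus"] * 1
)

for grade, (n, pct) in summarize(reported).items():
    print(f"{grade}: {n}/{len(reported)} ({pct:.1f}%)")
```

Run on the reported counts, the unrounded shares are 64.7%, 17.6%, 11.8%, and 5.9%. Three of these round to the abstract's 18%, 12%, and 6%, while 64.7% rounds to 65% rather than the 64% stated.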
