Can ChatGPT, an Artificial Intelligence Language Model, Provide Accurate and High-quality Patient Information on Prostate Cancer?

Authors

Coskun Burhan, Ocakoglu Gokhan, Yetemen Melih, Kaygisiz Onur

Affiliations

Bursa Uludag University, Department of Urology, Nilüfer, Bursa, Turkey.

Bursa Uludag University, Department of Biostatistics, Nilüfer, Bursa, Turkey.

Publication information

Urology. 2023 Oct;180:35-58. doi: 10.1016/j.urology.2023.05.040. Epub 2023 Jul 4.

DOI:10.1016/j.urology.2023.05.040
PMID:37406864
Abstract

OBJECTIVE

To evaluate the performance of ChatGPT, an artificial intelligence (AI) language model, in providing patient information on prostate cancer, and to compare the accuracy, similarity, and quality of the information to a reference source.

METHODS

Patient information material on prostate cancer was used as a reference source from the website of the European Association of Urology Patient Information. This was used to generate 59 queries. The accuracy of the model's content was determined with F1, precision, and recall scores. The similarity was assessed with cosine similarity, and the quality was evaluated using a 5-point Likert scale named General Quality Score (GQS).
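The abstract does not say how the answers were tokenized or vectorized, so the following is only a minimal sketch of how such metrics are commonly computed, not the authors' procedure. It assumes lowercase word tokens, token-overlap counts for precision, recall, and F1, and raw term-frequency vectors for cosine similarity. The function names `tokenize`, `overlap_scores`, and `cosine_similarity` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase alphanumeric word tokens; the study's
    # actual preprocessing is not described in the abstract.
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_scores(candidate: str, reference: str) -> tuple[float, float, float]:
    """Token-overlap precision, recall, and F1 of candidate vs. reference."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    # Multiset intersection: each shared token counts up to its
    # smaller frequency in the two texts.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())  # share of the answer found in the reference
    recall = overlap / sum(ref.values())      # share of the reference covered by the answer
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def cosine_similarity(candidate: str, reference: str) -> float:
    """Cosine of the angle between raw term-frequency vectors."""
    a = Counter(tokenize(candidate))
    b = Counter(tokenize(reference))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0
```

Under these assumptions, an answer that covers the reference text but adds a lot of extra material scores high on recall and low on precision, since the extra tokens enlarge only the precision denominator.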

RESULTS

ChatGPT was able to respond to all prostate cancer-related queries. The average F1 score was 0.426 (range: 0-1), precision score was 0.349 (range: 0-1), recall score was 0.549 (range: 0-1), and cosine similarity was 0.609 (range: 0-1). The average GQS was 3.62 ± 0.49 (range: 1-5), with no answers achieving the maximum GQS of 5. While ChatGPT produced a larger amount of information compared to the reference, the accuracy and quality of the content were not optimal, with all scores indicating a need for improvement in the model's performance.

CONCLUSION

Caution should be exercised when using ChatGPT as a patient information source for prostate cancer due to limitations in its performance, which may lead to inaccuracies and potential misunderstandings. Further studies, using different topics and language models, are needed to fully understand the capabilities and limitations of AI-generated patient information.
