Baturu Muharrem, Solakhan Mehmet, Kazaz Tanyeli Guneyligil, Bayrak Omer
Department of Urology, University of Gaziantep, Gaziantep, Turkey.
Department of Urology, Hasan Kalyoncu University, Gaziantep, Turkey.
Int J Impot Res. 2025 Apr;37(4):310-314. doi: 10.1038/s41443-024-00898-3. Epub 2024 May 7.
The present study assessed the accuracy of artificial intelligence-generated responses to frequently asked questions on erectile dysfunction. A cross-sectional analysis involved 56 erectile dysfunction-related questions searched on Google, categorized into nine sections: causes, diagnosis, treatment options, treatment complications, protective measures, relationship with other illnesses, treatment costs, treatment with herbal agents, and appointments. Responses from ChatGPT 3.5, ChatGPT 4, and BARD were evaluated by two experienced urology experts for accuracy, relevance, and comprehensibility using the F1 score and the global quality score (GQS). ChatGPT 3.5 and ChatGPT 4 achieved higher GQS than BARD in the categories of causes (4.5 ± 0.54, 4.5 ± 0.51, 3.15 ± 1.01, respectively, p < 0.001), treatment options (4.35 ± 0.6, 4.5 ± 0.43, 2.71 ± 1.38, respectively, p < 0.001), protective measures (5.0 ± 0, 5.0 ± 0, 4 ± 0.5, respectively, p = 0.013), relationship with other illnesses (4.58 ± 0.58, 4.83 ± 0.25, 3.58 ± 0.8, respectively, p = 0.006), and treatment with herbal agents (3 ± 0.61, 3.33 ± 0.83, 1.8 ± 1.09, respectively, p = 0.043). F1 scores in the causes (1), diagnosis (0.857), treatment options (0.726), and protective measures (1) categories indicated alignment with the guidelines. There was no significant difference between ChatGPT 3.5 and ChatGPT 4 in answer quality, but both outperformed BARD on the GQS. These results emphasize the need to continually enhance and validate AI-generated medical information and underscore the importance of artificial intelligence systems in delivering reliable information on erectile dysfunction.
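For readers unfamiliar with the metric, the F1 score used above is the harmonic mean of precision and recall. The minimal Python sketch below illustrates the calculation; the counts are hypothetical, chosen only to reproduce the 0.857 value reported for the diagnosis category, since the abstract does not specify how concordant and discordant answers were tallied.

    # Minimal sketch of the F1 calculation (hypothetical counts; the abstract
    # does not report how true/false positives were counted per category).
    def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
        """Return the F1 score, the harmonic mean of precision and recall."""
        precision = true_positives / (true_positives + false_positives)
        recall = true_positives / (true_positives + false_negatives)
        return 2 * precision * recall / (precision + recall)

    # Example: 6 guideline-concordant answers, 1 discordant answer, and 1
    # guideline point missed gives F1 = 0.857, the value reported for the
    # diagnosis category.
    print(round(f1_score(6, 1, 1), 3))  # 0.857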