The utility of artificial intelligence platforms for patient-generated questions in Mohs micrographic surgery: a multi-national, blinded expert panel evaluation.

Affiliations

Baylor University Medical Center, Dallas, TX, USA.

Texas A&M College of Medicine, Dallas, TX, USA.

Publication information

Int J Dermatol. 2024 Nov;63(11):1592-1598. doi: 10.1111/ijd.17382. Epub 2024 Aug 9.

DOI:10.1111/ijd.17382

PMID:39123288

Abstract

BACKGROUND

Artificial intelligence (AI) and large language models (LLMs) are transforming how patients inform themselves. LLMs offer potential as educational tools, but their value depends on the quality of the information they generate. Current literature examining AI as an informational tool in dermatology has been limited in its evaluation of AI's multifaceted roles and in the diversity of opinions it represents. Here, we evaluate LLMs as a patient-educational tool for Mohs micrographic surgery (MMS), in and out of the clinic, using an international expert panel.

METHODS

The most common patient questions about MMS were extracted from Google and entered into two LLMs and Google's search engine. Fifteen MMS surgeons evaluated the generated responses, examining their appropriateness as a patient-facing informational platform, their sufficiency in a clinical environment, and the accuracy of their content. Validated scales were used to assess the comprehensibility of each response.

RESULTS

The majority of reviewers deemed all LLM responses appropriate. Seventy-five percent of responses were rated as mostly accurate or higher. ChatGPT had the highest mean accuracy. The majority of the panel deemed 33% of responses sufficient for clinical practice. The mean comprehensibility scores for all platforms indicated that a 10th-grade reading level was required.

CONCLUSIONS

LLM-generated responses were rated as appropriate sources of patient information and as mostly accurate in their content. However, these platforms may not provide sufficient information to function in a clinical environment, and the reading level they require may be a barrier to their use. As the popularity of these platforms increases, it is important for dermatologists to be aware of these limitations.

Similar articles

The utility of artificial intelligence platforms for patient-generated questions in Mohs micrographic surgery: a multi-national, blinded expert panel evaluation.

Int J Dermatol. 2024 Nov;63(11):1592-1598. doi: 10.1111/ijd.17382. Epub 2024 Aug 9.

Dr. Google vs. Dr. ChatGPT: Exploring the Use of Artificial Intelligence in Ophthalmology by Comparing the Accuracy, Safety, and Readability of Responses to Frequently Asked Patient Questions Regarding Cataracts and Cataract Surgery.

Semin Ophthalmol. 2024 Aug;39(6):472-479. doi: 10.1080/08820538.2024.2326058. Epub 2024 Mar 22.

Assessing the Application of Large Language Models in Generating Dermatologic Patient Education Materials According to Reading Level: Qualitative Study.

JMIR Dermatol. 2024 May 16;7:e55898. doi: 10.2196/55898.

Assessing the accuracy, usefulness, and readability of artificial-intelligence-generated responses to common dermatologic surgery questions for patient education: A double-blinded comparative study of ChatGPT and Google Bard.

J Am Acad Dermatol. 2024 May;90(5):1078-1080. doi: 10.1016/j.jaad.2024.01.037. Epub 2024 Feb 1.

Harnessing artificial intelligence in bariatric surgery: comparative analysis of ChatGPT-4, Bing, and Bard in generating clinician-level bariatric surgery recommendations.

Surg Obes Relat Dis. 2024 Jul;20(7):603-608. doi: 10.1016/j.soard.2024.03.011. Epub 2024 Mar 24.

Evaluation of the Current Status of Artificial Intelligence for Endourology Patient Education: A Blind Comparison of ChatGPT and Google Bard Against Traditional Information Resources.

J Endourol. 2024 Aug;38(8):843-851. doi: 10.1089/end.2023.0696. Epub 2024 May 17.

Artificial Intelligence-Generated Patient Education Materials for Helicobacter pylori Infection: A Comparative Analysis.

Helicobacter. 2024 Jul-Aug;29(4):e13115. doi: 10.1111/hel.13115.

AAD/ACMS/ASDSA/ASMS 2012 appropriate use criteria for Mohs micrographic surgery: a report of the American Academy of Dermatology, American College of Mohs Surgery, American Society for Dermatologic Surgery Association, and the American Society for Mohs Surgery.

J Am Acad Dermatol. 2012 Oct;67(4):531-50. doi: 10.1016/j.jaad.2012.06.009. Epub 2012 Sep 5.

Artificial Intelligence for Mohs and Dermatologic Surgery: A Systematic Review and Meta-Analysis.

Dermatol Surg. 2024 Sep 1;50(9):799-806. doi: 10.1097/DSS.0000000000004297. Epub 2024 Jul 11.

Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study.

J Med Internet Res. 2023 Dec 28;25:e51580. doi: 10.2196/51580.

Cited by

Comparison of the readability of ChatGPT and Bard in medical communication: a meta-analysis.

BMC Med Inform Decis Mak. 2025 Sep 1;25(1):325. doi: 10.1186/s12911-025-03035-2.
