University Hospitals Birmingham NHS Foundation Trust, Solihull, UK.
University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK.
J Eur Acad Dermatol Venereol. 2024 Dec;38(12):2235-2239. doi: 10.1111/jdv.20237. Epub 2024 Jul 12.
Artificial intelligence (AI) tools have the potential to revolutionize many facets of medicine and medical sciences research. Numerous AI tools have been developed, and their functionality continues to improve through iteration.
This study aimed to assess the performance of three AI tools (The Literature, Microsoft's Copilot and Google's Gemini) in writing literature reviews on a range of dermatology topics.
Each tool was asked to write a literature review on five topics, each of which has recently been the subject of a peer-reviewed systematic review. The outputs of each tool were graded on evidence and analysis, conclusions and references, each criterion on a 5-point Likert scale, by three dermatologists who work in clinical practice, have completed the UK dermatology postgraduate training examination and partake in continuing professional development.
Across all five topics, the literature reviews written by Gemini scored highest. Gemini's mean score per review was 10.53, significantly higher than the mean scores achieved by The Literature (7.73) and Copilot (7.4) (p < 0.001).
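The abstract does not state which statistical test produced the p-value above. As a hedged illustration only, the sketch below assumes a Kruskal-Wallis test on per-review total scores (three criteria, each on a 5-point Likert scale, for a maximum of 15 per review); the score values are hypothetical placeholders chosen only so that the group means sit near those reported, not data from the study.

```python
# Illustrative sketch only: the study does not specify its statistical test,
# and these per-review totals are hypothetical placeholders.
from scipy import stats

# Hypothetical total scores per review (max 15: three criteria x 5-point Likert).
gemini = [11, 10, 11, 10, 11]      # placeholder mean 10.6 (reported: 10.53)
the_literature = [8, 7, 8, 8, 8]   # placeholder mean 7.8 (reported: 7.73)
copilot = [7, 8, 7, 8, 7]          # placeholder mean 7.4 (reported: 7.4)

# Kruskal-Wallis H-test: a non-parametric comparison of the three groups,
# a reasonable choice for ordinal Likert-derived scores (an assumption,
# not the authors' stated method).
h_stat, p_value = stats.kruskal(gemini, the_literature, copilot)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```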
This paper shows that AI-generated literature reviews can provide real-time summaries of medical literature across a range of dermatology topics, but limitations to their comprehensiveness and accuracy are apparent.