
Comparing new tools of artificial intelligence to the authentic intelligence of our global health students.

Authors

Thandla Shilpa R, Armstrong Grace Q, Menon Adil, Shah Aashna, Gueye David L, Harb Clara, Hernandez Estefania, Iyer Yasaswini, Hotchner Abigail R, Modi Riddhi, Mudigonda Anusha, Prokos Maria A, Rao Tharun M, Thomas Olivia R, Beltran Camilo A, Guerrieri Taylor, LeBlanc Sydney, Moorthy Skanda, Yacoub Sara G, Gardner Jacob E, Greenberg Benjamin M, Hubal Alyssa, Lapina Yuliana P, Moran Jacqueline, O'Brien Joseph P, Winnicki Anna C, Yoka Christina, Zhang Junwei, Zimmerman Peter A

Affiliations

Master of Public Health Program, Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, USA.

School of Medicine, Case Western Reserve University, Cleveland, OH, USA.

Publication information

BioData Min. 2024 Dec 18;17(1):58. doi: 10.1186/s13040-024-00408-7.

DOI:10.1186/s13040-024-00408-7
PMID:39696442
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11656723/
Abstract

INTRODUCTION

The transformative feature of artificial intelligence (AI) is its massive capacity for interpreting unstructured data and transforming it into coherent, meaningful context. In general, AI appears to have significant potential to alter traditional approaches to student research and its evaluation. In global health research, it is important for students and research experts to assess the strengths and limitations of generative AI (GenAI). Thus, the goal of our research was to evaluate the information literacy of GenAI against the expectations that graduate students meet in writing research papers.

METHODS

After the course Fundamentals of Global Health (INTH 401) at Case Western Reserve University (CWRU) ended, graduate students who had successfully completed the required research paper were recruited to compare their original papers with a paper they generated with ChatGPT-4o using the original assignment prompt. Students also completed a Google Forms survey to evaluate different sections of the AI-generated paper (e.g., adherence to Introduction guidelines, presentation of three perspectives, Conclusion) against their original papers, and to rate their overall satisfaction with the AI work. The student-to-ChatGPT-4o comparison also enabled evaluation of narrative elements and references.

RESULTS

Of the 54 students who completed the required research paper, 28 (51.8%) agreed to collaborate in the comparison project. A summary of the survey responses suggested that students evaluated the AI-generated paper as inferior or similar to their own paper (overall satisfaction average = 2.39 (1.61-3.17); Likert scale: 1 to 5, with lower scores indicating inferiority). The average individual student responses across the 5 Likert item queries showed that 17 scores were < 2.9, 7 scores were between 3.0 and 3.9, and 4 scores were ≥ 4.0, consistent with inferiority of the AI-generated paper. Evaluation of reference selection by ChatGPT-4o (n = 729 total references) showed that 54% (n = 396) were authentic and 46% (n = 333) did not exist. Of the authentic references, 26.5% (105/396) were relevant to the paper narrative; these relevant references made up 14.4% of the 729 total references.
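The percentages in the Results can be re-derived from the reported counts. The short Python sketch below is only an arithmetic check on the figures quoted in the abstract; the variable names and the `pct` helper are illustrative assumptions and are not taken from the paper.

```python
# Counts as reported in the Results paragraph of the abstract.
students_total = 54       # students who completed the required paper
students_enrolled = 28    # students who agreed to take part
total_refs = 729          # references produced by ChatGPT-4o
authentic = 396           # references that exist
nonexistent = 333         # references that do not exist
relevant = 105            # authentic references relevant to the narrative
likert_bins = {"< 2.9": 17, "3.0 to 3.9": 7, ">= 4.0": 4}


def pct(part: int, whole: int) -> float:
    """Hypothetical helper: percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Internal consistency of the reported counts.
assert authentic + nonexistent == total_refs
assert sum(likert_bins.values()) == students_enrolled

print("participation:", pct(students_enrolled, students_total))
print("authentic:", pct(authentic, total_refs))
print("nonexistent:", pct(nonexistent, total_refs))
print("relevant among authentic:", pct(relevant, authentic))
print("relevant among all:", pct(relevant, total_refs))
```

Under conventional rounding, 28/54 comes out as 51.9%, whereas the abstract reports 51.8%; the difference may simply reflect truncation rather than rounding. The remaining values agree with the reported 54%, 46%, 26.5%, and 14.4%.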

DISCUSSION

Our findings reveal strengths and limitations of AI tools in assisting with understanding the complexities of global health topics. Strengths mentioned by students included the ability of ChatGPT-4o to produce content very quickly and to suggest topics that they had not considered in the 3-perspective sections of their papers. Consistently presenting up-to-date facts and references, as well as further examining or summarizing the complexities of global health topics, appears to be a current limitation of ChatGPT-4o. Because ChatGPT-4o generated nonexistent references attributed to highly credible biomedical research journals, we conclude that ChatGPT-4o failed an important component of using information effectively. Moreover, misrepresenting trusted sources of public health information is highly concerning, particularly given experiences during the COVID-19 pandemic and, more recently, in reporting on the impact of and response to natural disasters. This is a significant limitation of GenAI's ability to meet the information literacy standards expected of graduate students.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b2a/11656723/9d738cef18f4/13040_2024_408_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b2a/11656723/fa0075220382/13040_2024_408_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b2a/11656723/a264ec494bb9/13040_2024_408_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b2a/11656723/6227b88066a8/13040_2024_408_Fig4_HTML.jpg

Similar articles

1
Comparing new tools of artificial intelligence to the authentic intelligence of our global health students.
BioData Min. 2024 Dec 18;17(1):58. doi: 10.1186/s13040-024-00408-7.
2
A Comparative Analysis of Artificial Intelligence Platforms: ChatGPT-4o and Google Gemini in Answering Questions About Birth Control Methods.
Cureus. 2025 Jan 1;17(1):e76745. doi: 10.7759/cureus.76745. eCollection 2025 Jan.
3
Can deepseek and ChatGPT be used in the diagnosis of oral pathologies?
BMC Oral Health. 2025 Apr 25;25(1):638. doi: 10.1186/s12903-025-06034-x.
4
Performance of AI-Chatbots to Common Temporomandibular Joint Disorders (TMDs) Patient Queries: Accuracy, Completeness, Reliability and Readability.
Orthod Craniofac Res. 2025 May 7. doi: 10.1111/ocr.12939.
5
Exploring artificial intelligence literacy and the use of ChatGPT and copilot in instruction on nursing academic report writing.
Nurse Educ Today. 2025 Apr;147:106570. doi: 10.1016/j.nedt.2025.106570. Epub 2025 Jan 14.
6
Evaluating AI-generated patient education materials for spinal surgeries: Comparative analysis of readability and DISCERN quality across ChatGPT and deepseek models.
Int J Med Inform. 2025 Jun;198:105871. doi: 10.1016/j.ijmedinf.2025.105871. Epub 2025 Mar 13.
7
Information from digital and human sources: A comparison of chatbot and clinician responses to orthodontic questions.
Am J Orthod Dentofacial Orthop. 2025 May 6. doi: 10.1016/j.ajodo.2025.04.008.
8
Comparative performance of artificial intelligence models in rheumatology board-level questions: evaluating Google Gemini and ChatGPT-4o.
Clin Rheumatol. 2024 Nov;43(11):3507-3513. doi: 10.1007/s10067-024-07154-5. Epub 2024 Sep 28.
9
Comparative analysis of ChatGPT-4o mini, ChatGPT-4o and Gemini Advanced in the treatment of postmenopausal osteoporosis.
BMC Musculoskelet Disord. 2025 Apr 16;26(1):369. doi: 10.1186/s12891-025-08601-3.
10
Comparing diagnostic skills in endodontic cases: dental students versus ChatGPT-4o.
BMC Oral Health. 2025 Mar 29;25(1):457. doi: 10.1186/s12903-025-05857-y.

Cited by

1
Evaluation of large language models in patient education and clinical decision support for rotator cuff injury: a two-phase benchmarking study.
BMC Med Inform Decis Mak. 2025 Aug 4;25(1):289. doi: 10.1186/s12911-025-03105-5.
