
ChatGPT sits the DFPH exam: large language model performance and potential to support public health learning.

Affiliations

Nottingham Centre for Public Health and Epidemiology, University of Nottingham, Nottingham City Hospital, Hucknall Rd, Nottingham, NG5 1PB, England.

NHS England, Seaton House, City Link, London Road, Nottingham, NG2 4LA, England.

Publication information

BMC Med Educ. 2024 Jan 11;24(1):57. doi: 10.1186/s12909-024-05042-9.

DOI:10.1186/s12909-024-05042-9
PMID:38212802
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10782695/
Abstract

BACKGROUND

Artificial intelligence-based large language models, like ChatGPT, have been rapidly assessed for both risks and potential in health-related assessment and learning. However, their applications in public health professional exams have not yet been studied. We evaluated the performance of ChatGPT in part of the Faculty of Public Health's Diplomat exam (DFPH).

METHODS

ChatGPT was provided with a bank of 119 publicly available DFPH question parts from past papers. Its performance was assessed by two active DFPH examiners. The degree of insight and level of understanding apparently displayed by ChatGPT was also assessed.

RESULTS

ChatGPT passed 3 of 4 papers, surpassing the current pass rate. It performed best on questions relating to research methods. Its answers had a high floor. Examiners identified ChatGPT answers with 73.6% accuracy and human answers with 28.6% accuracy. ChatGPT provided a mean of 3.6 unique insights per question and appeared to demonstrate a required level of learning on 71.4% of occasions.

CONCLUSIONS

Large language models have rapidly increasing potential as a learning tool in public health education. However, their factual fallibility and the difficulty of distinguishing their responses from that of humans pose potential threats to teaching and learning.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6007/10782695/6ccc35f006db/12909_2024_5042_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6007/10782695/00be67074026/12909_2024_5042_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6007/10782695/e247611aa219/12909_2024_5042_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6007/10782695/e6ec646951cf/12909_2024_5042_Fig3_HTML.jpg

Similar articles

1
ChatGPT sits the DFPH exam: large language model performance and potential to support public health learning.
BMC Med Educ. 2024 Jan 11;24(1):57. doi: 10.1186/s12909-024-05042-9.
2
Assessing question characteristic influences on ChatGPT's performance and response-explanation consistency: Insights from Taiwan's Nursing Licensing Exam.
Int J Nurs Stud. 2024 May;153:104717. doi: 10.1016/j.ijnurstu.2024.104717. Epub 2024 Feb 8.
3
Evaluating capabilities of large language models: Performance of GPT-4 on surgical knowledge assessments.
Surgery. 2024 Apr;175(4):936-942. doi: 10.1016/j.surg.2023.12.014. Epub 2024 Jan 20.
4
Will ChatGPT pass the Polish specialty exam in radiology and diagnostic imaging? Insights into strengths and limitations.
Pol J Radiol. 2023 Sep 18;88:e430-e434. doi: 10.5114/pjr.2023.131215. eCollection 2023.
5
Assessment of Artificial Intelligence Performance on the Otolaryngology Residency In-Service Exam.
OTO Open. 2023 Nov 29;7(4):e98. doi: 10.1002/oto2.98. eCollection 2023 Oct-Dec.
6
ChatGPT's performance in German OB/GYN exams - paving the way for AI-enhanced medical education and clinical practice.
Front Med (Lausanne). 2023 Dec 13;10:1296615. doi: 10.3389/fmed.2023.1296615. eCollection 2023.
7
Success of ChatGPT, an AI language model, in taking the French language version of the European Board of Ophthalmology examination: A novel approach to medical knowledge assessment.
J Fr Ophtalmol. 2023 Sep;46(7):706-711. doi: 10.1016/j.jfo.2023.05.006. Epub 2023 Aug 1.
8
Trialling a Large Language Model (ChatGPT) in General Practice With the Applied Knowledge Test: Observational Study Demonstrating Opportunities and Limitations in Primary Care.
JMIR Med Educ. 2023 Apr 21;9:e46599. doi: 10.2196/46599.
9
Performance of ChatGPT-3.5 and ChatGPT-4 on the European Board of Urology (EBU) exams: a comparative analysis.
World J Urol. 2024 Jul 26;42(1):445. doi: 10.1007/s00345-024-05137-4.
10
Unveiling the ChatGPT phenomenon: Evaluating the consistency and accuracy of endodontic question answers.
Int Endod J. 2024 Jan;57(1):108-113. doi: 10.1111/iej.13985. Epub 2023 Oct 9.

Cited by

1
Effectiveness of generative artificial intelligence-based teaching versus traditional teaching methods in medical education: a meta-analysis of randomized controlled trials.
BMC Med Educ. 2025 Aug 19;25(1):1175. doi: 10.1186/s12909-025-07750-2.
2
Large Language Model Architectures in Health Care: Scoping Review of Research Perspectives.
J Med Internet Res. 2025 Jun 19;27:e70315. doi: 10.2196/70315.
3
The role of artificial intelligence in medical education: an evaluation of Large Language Models (LLMs) on the Turkish Medical Specialty Training Entrance Exam.
BMC Med Educ. 2025 Apr 25;25(1):609. doi: 10.1186/s12909-025-07148-0.
4
Can a large language model create acceptable dental board-style examination questions? A cross-sectional prospective study.
J Dent Sci. 2025 Apr;20(2):895-900. doi: 10.1016/j.jds.2024.08.020. Epub 2024 Sep 11.
5
Generative AI Decision-Making Attributes in Complex Health Services: A Rapid Review.
Cureus. 2025 Jan 30;17(1):e78257. doi: 10.7759/cureus.78257. eCollection 2025 Jan.
6
Using ChatGPT for medical education: the technical perspective.
BMC Med Educ. 2025 Feb 7;25(1):201. doi: 10.1186/s12909-025-06785-9.
7
Comparative Analysis of the Response Accuracies of Large Language Models in the Korean National Dental Hygienist Examination Across Korean and English Questions.
Int J Dent Hyg. 2025 May;23(2):267-276. doi: 10.1111/idh.12848. Epub 2024 Oct 16.
8
eHealth Assistant AI Chatbot Using a Large Language Model to Provide Personalized Answers through Secure Decentralized Communication.
Sensors (Basel). 2024 Sep 23;24(18):6140. doi: 10.3390/s24186140.
9
Opportunities, challenges, and future directions of large language models, including ChatGPT in medical education: a systematic scoping review.
J Educ Eval Health Prof. 2024;21:6. doi: 10.3352/jeehp.2024.21.6. Epub 2024 Mar 15.
10
ChatGPT's Accuracy on Magnetic Resonance Imaging Basics: Characteristics and Limitations Depending on the Question Type.
Diagnostics (Basel). 2024 Jan 12;14(2):171. doi: 10.3390/diagnostics14020171.

References

1
Evaluating Artificial Intelligence Responses to Public Health Questions.
JAMA Netw Open. 2023 Jun 1;6(6):e2317517. doi: 10.1001/jamanetworkopen.2023.17517.
2
Practical Applications of ChatGPT in Undergraduate Medical Education.
J Med Educ Curric Dev. 2023 May 24;10:23821205231178449. doi: 10.1177/23821205231178449. eCollection 2023 Jan-Dec.
3
Performance of ChatGPT on the pharmacist licensing examination in Taiwan.
J Chin Med Assoc. 2023 Jul 1;86(7):653-658. doi: 10.1097/JCMA.0000000000000942. Epub 2023 Jul 5.
4
ChatGPT and the rise of large language models: the new AI-driven infodemic threat in public health.
Front Public Health. 2023 Apr 25;11:1166120. doi: 10.3389/fpubh.2023.1166120. eCollection 2023.
5
ChatGPT goes to the operating room: evaluating GPT-4 performance and its potential in surgical education and training in the era of large language models.
Ann Surg Treat Res. 2023 May;104(5):269-273. doi: 10.4174/astr.2023.104.5.269. Epub 2023 Apr 28.
6
ChatGPT Is Equivalent to First-Year Plastic Surgery Residents: Evaluation of ChatGPT on the Plastic Surgery In-Service Examination.
Aesthet Surg J. 2023 Nov 16;43(12):NP1085-NP1089. doi: 10.1093/asj/sjad130.
7
Performance of ChatGPT on UK Standardized Admission Tests: Insights From the BMAT, TMUA, LNAT, and TSA Examinations.
JMIR Med Educ. 2023 Apr 26;9:e47737. doi: 10.2196/47737.
8
Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models.
PLOS Digit Health. 2023 Feb 9;2(2):e0000198. doi: 10.1371/journal.pdig.0000198. eCollection 2023 Feb.
9
Artificial Hallucinations in ChatGPT: Implications in Scientific Writing.
Cureus. 2023 Feb 19;15(2):e35179. doi: 10.7759/cureus.35179. eCollection 2023 Feb.
10
AI for life: Trends in artificial intelligence for biotechnology.
N Biotechnol. 2023 May 25;74:16-24. doi: 10.1016/j.nbt.2023.02.001. Epub 2023 Feb 6.