Large language models outperform general practitioners in identifying complex cases of childhood anxiety.

Authors

Levkovich Inbar, Rabin Eyal, Brann Michal, Elyoseph Zohar

Affiliations

The Faculty of Education, Tel Hai College, Upper Galilee, Israel.

Department of Psychology and Education, The Open University of Israel, Ra'anana, Israel.

Publication information

Digit Health. 2024 Dec 15;10:20552076241294182. doi: 10.1177/20552076241294182. eCollection 2024 Jan-Dec.

DOI:10.1177/20552076241294182
PMID:39687523
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11648044/
Abstract

OBJECTIVE

Anxiety is prevalent in childhood but often remains undiagnosed due to its physical manifestations and significant comorbidity. Despite the availability of effective treatments, including medication and psychotherapy, research indicates that physicians struggle to identify childhood anxiety, particularly in complex and challenging cases. This study aims to explore the potential effectiveness of artificial intelligence (AI) language models in diagnosing childhood anxiety compared to general practitioners (GPs).

METHODS

During February 2024, we evaluated the ability of several large language models (LLMs; ChatGPT-3.5, ChatGPT-4, Claude.AI, and Gemini) to identify cases of childhood anxiety disorder, compared with reports of GPs.

RESULTS

AI tools exhibited significantly higher rates of identifying anxiety than GPs. Each AI tool accurately identified anxiety in at least one case: Claude.AI and Gemini identified at least four cases, ChatGPT-3.5 identified three cases, and ChatGPT-4 identified one or two cases. Additionally, 40% of GPs preferred to manage the cases within their practice, often with the help of a practice nurse, whereas AI tools generally recommended referral to specialized mental or somatic health services.

CONCLUSION

Preliminary findings indicate that LLMs, specifically Claude.AI and Gemini, exhibit notable diagnostic capabilities in identifying child anxiety, demonstrating a comparative advantage over GPs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6a6/11648044/2033278c01b7/10.1177_20552076241294182-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6a6/11648044/cd0c96687757/10.1177_20552076241294182-fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6a6/11648044/8b5d5290945d/10.1177_20552076241294182-fig3.jpg
