Institute of Health Informatics, University College London, London, United Kingdom.
Great Ormond Street Institute of Child Health, University College London, London, United Kingdom.
JMIR Ment Health. 2024 Oct 18;11:e57400. doi: 10.2196/57400.
Large language models (LLMs) are advanced artificial neural networks trained on extensive datasets to accurately understand and generate natural language. While they have received much attention and demonstrated potential in digital health, their application in mental health, particularly in clinical settings, has generated considerable debate.
This systematic review aims to critically assess the use of LLMs in mental health, specifically focusing on their applicability and efficacy in early screening, digital interventions, and clinical settings. By systematically collating and assessing the evidence from current studies, our work analyzes models, methodologies, data sources, and outcomes, thereby highlighting the potential of LLMs in mental health, the challenges they present, and the prospects for their clinical use.
Adhering to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, this review searched 5 open-access databases: MEDLINE (accessed via PubMed), IEEE Xplore, Scopus, JMIR, and ACM Digital Library. The keywords used were (mental health OR mental illness OR mental disorder OR psychiatry) AND (large language models). This study included articles published between January 1, 2017, and April 30, 2024, and excluded articles published in languages other than English.
In total, 40 articles were evaluated: 15 (38%) on detecting mental health conditions and suicidal ideation through text analysis, 7 (18%) on the use of LLMs as mental health conversational agents, and 18 (45%) on other applications and evaluations of LLMs in mental health. LLMs are effective at detecting mental health issues and providing accessible, destigmatized eHealth services. However, assessments also indicate that the current risks associated with clinical use might surpass their benefits. These risks include inconsistencies in generated text; the production of hallucinations; and the absence of a comprehensive, benchmarked ethical framework.
This systematic review examines the clinical applications of LLMs in mental health, highlighting their potential and inherent risks. The study identifies several issues: the lack of multilingual datasets annotated by experts, concerns regarding the accuracy and reliability of generated content, challenges in interpretability due to the "black box" nature of LLMs, and ongoing ethical dilemmas. These ethical concerns include the absence of a clear, benchmarked ethical framework; data privacy issues; and the potential for overreliance on LLMs by both physicians and patients, which could compromise traditional medical practices. As a result, LLMs should not be considered substitutes for professional mental health services. However, the rapid development of LLMs underscores their potential as valuable clinical aids, emphasizing the need for continued research and development in this area.
PROSPERO CRD42024508617; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=508617.