Omar Mahmud, Levkovich Inbar
Tel-Aviv University, Faculty of Medicine, Israel.
Oranim Academic College, Kiryat Tiv'on 36006, Israel.
J Affect Disord. 2025 Feb 15;371:234-244. doi: 10.1016/j.jad.2024.11.052. Epub 2024 Nov 22.
Depression is a substantial public health issue with global ramifications. Although earlier literature reviews have explored the intersection of artificial intelligence (AI) and mental health, they have not critically assessed the specific contributions of large language models (LLMs) in this domain. The objective of this systematic review was to examine the usefulness of LLMs in diagnosing and managing depression and to investigate their incorporation into clinical practice.
This review was based on a search of the PubMed, Embase, Web of Science, and Scopus databases covering January 2018 through March 2024. The review was registered in PROSPERO and adhered to PRISMA guidelines. Original research articles, preprints, and conference papers were included, while non-English and non-research publications were excluded. Data extraction was standardized, and the risk of bias was evaluated using the ROBINS-I, QUADAS-2, and PROBAST tools.
Our review included 34 studies that focused on the application of LLMs in detecting and classifying depression through clinical data and social media texts. LLMs such as RoBERTa and BERT demonstrated high effectiveness, particularly in early detection and symptom classification. Nevertheless, the integration of LLMs into clinical practice is in its nascent stage, with ongoing concerns about data privacy and ethical implications.
LLMs exhibit significant potential for transforming strategies for diagnosing and treating depression. Nonetheless, full integration of LLMs into clinical practice requires rigorous testing, ethical considerations, and enhanced privacy measures to ensure their safe and effective use.