Detection of suicidality from medical text using privacy-preserving large language models.

Author Information

Wiest Isabella Catharina, Verhees Falk Gerrik, Ferber Dyke, Zhu Jiefu, Bauer Michael, Lewitzka Ute, Pfennig Andrea, Mikolas Pavol, Kather Jakob Nikolas

Affiliations

Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany; and Department of Medicine II, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany.

Department of Psychiatry and Psychotherapy, Carl Gustav Carus University Hospital, Technical University Dresden, Dresden, Germany.

Publication Information

Br J Psychiatry. 2024 Dec;225(6):532-537. doi: 10.1192/bjp.2024.134.

Abstract

BACKGROUND

Attempts to use artificial intelligence (AI) in psychiatric disorders have shown moderate success, highlighting the potential of incorporating information from clinical assessments to improve these models. This study focuses on using large language models (LLMs) to detect suicide risk from medical text in psychiatric care.

AIMS

To extract information about suicidality status from the admission notes in electronic health records (EHRs) using privacy-sensitive, locally hosted LLMs, specifically evaluating the efficacy of Llama-2 models.
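A minimal sketch of such a locally hosted extraction step, assuming a Hugging Face transformers setup; the model identifier, prompt wording and yes/no label scheme are illustrative assumptions, not the authors' exact protocol:

    # Hedged sketch: prompting a locally hosted Llama-2-style chat model to
    # label suicidality status in one admission note. Running the model on
    # local hardware means no patient text leaves the institution.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-2-13b-chat-hf"  # placeholder; the study also evaluated a German fine-tune
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    note = "..."  # de-identified admission note text goes here
    prompt = (
        "[INST] You are a clinical information extractor. Read the admission "
        "note below and answer with exactly one word, 'yes' or 'no': does the "
        "note document current suicidality?\n\n" + note + " [/INST]"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=5, do_sample=False)  # greedy decoding for reproducibility
    answer = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    print(answer.strip().lower())  # expected: 'yes' or 'no'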

METHOD

We compared the performance of several variants of the open-source LLM Llama-2 in extracting suicidality status from 100 psychiatric reports against a ground truth defined by human experts, assessing accuracy, sensitivity, specificity and F1 score across different prompting strategies.
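The comparison against the expert ground truth reduces to standard binary-classification metrics. A short sketch, assuming scikit-learn; the label vectors are made-up illustrative values (1 = suicidality documented), not study data:

    # Hedged sketch of the evaluation described above.
    from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

    y_true = [1, 0, 1, 1, 0, 0]  # expert labels
    y_pred = [1, 0, 0, 1, 0, 1]  # model labels under one prompting strategy

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    accuracy = accuracy_score(y_true, y_pred)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    f1 = f1_score(y_true, y_pred)
    print(f"acc={accuracy:.3f} sens={sensitivity:.3f} spec={specificity:.3f} f1={f1:.3f}")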

RESULTS

A German fine-tuned Llama-2 model showed the highest accuracy (87.5%), sensitivity (83.0%) and specificity (91.8%) in identifying suicidality, with significant improvements in sensitivity and specificity across various prompt designs.

CONCLUSIONS

The study demonstrates the capability of LLMs, particularly Llama-2, to accurately extract information on suicidality from psychiatric records while preserving data privacy. This suggests their application in surveillance systems for psychiatric emergencies and in improving the clinical management of suicidality through systematic quality control and research.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e6de/11669470/b80bcdaebea9/S000712502400134X_fig1.jpg
