Tsui Fuchiang R, Shi Lingyun, Ruiz Victor, Ryan Neal D, Biernesser Candice, Iyengar Satish, Walsh Colin G, Brent David A
Tsui Laboratory, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.
Department of Anesthesiology and Critical Care Medicine, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.
JAMIA Open. 2021 Mar 17;4(1):ooab011. doi: 10.1093/jamiaopen/ooab011. eCollection 2021 Jan.
OBJECTIVE: Limited research exists on predicting first-time suicide attempts, which account for two-thirds of suicide decedents. We aimed to predict first-time suicide attempts using a large data-driven approach that applies natural language processing (NLP) and machine learning (ML) to unstructured (narrative) clinical notes and structured electronic health record (EHR) data.
METHODS: This case-control study included patients aged 10-75 years who were seen in emergency departments and inpatient units between 2007 and 2016. Cases were first-time suicide attempts identified from coded diagnoses; controls were patients without suicide attempts, selected at random regardless of demographics at a ratio of nine controls per case. Four data-driven ML models were evaluated using 2 years of historical EHR data preceding the suicide attempt or the control index visit, with prediction windows from 7 to 730 days. Patients without any historical notes were excluded. Model accuracy and robustness were evaluated on a blind dataset (30% of the cohort).
RESULTS: The study cohort included 45 238 patients (5099 cases, 40 139 controls) comprising 54 651 variables from 5.7 million structured records and 798 665 notes. Using both unstructured and structured data yielded significantly greater accuracy than structured data alone (area under the curve [AUC]: 0.932 vs. 0.901, P < .001). The best-predicting model used 1726 variables, with AUC = 0.932 (95% CI, 0.922-0.941). The model was robust across multiple prediction windows and across subgroups defined by demographics, point of most recent prior clinical contact, and depression diagnosis history.
CONCLUSIONS: Our large data-driven approach using both structured and unstructured EHR data demonstrated accurate and robust prediction of first-time suicide attempts, and has the potential to be deployed across various populations and clinical settings.
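As an illustration of the evaluation metric reported above, the following is a minimal sketch of computing an AUC and a 95% confidence interval on a held-out set. It is not the authors' code: the abstract does not state how the confidence interval was derived, so the percentile bootstrap is an assumption, and the function names `auc` and `bootstrap_ci`, the synthetic labels and scores, and the roughly one-in-ten case prevalence are all hypothetical choices made for the example.

```python
import random


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) identity, with average ranks for ties."""
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = len(pairs) - n_pos
    rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes to define an AUC
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int(alpha / 2 * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic held-out set: about one case per nine controls, with made-up scores.
rng = random.Random(42)
labels = [1 if rng.random() < 0.1 else 0 for _ in range(2000)]
scores = [rng.gauss(1.5 if y else 0.0, 1.0) for y in labels]

point = auc(labels, scores)
low, high = bootstrap_ci(labels, scores, n_boot=200)
print(f"AUC = {point:.3f} (95% CI, {low:.3f}-{high:.3f})")
```

The rank-sum form is used here only because it needs nothing beyond the standard library; a statistics package would normally be preferred for real data.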