Department of Health Care Policy, Harvard Medical School, Boston, Massachusetts.
Department of Psychiatry, Harvard Medical School, Boston, Massachusetts.
JAMA Psychiatry. 2023 Mar 1;80(3):230-240. doi: 10.1001/jamapsychiatry.2022.4634.
IMPORTANCE: The months after psychiatric hospital discharge are a time of high risk for suicide. Intensive postdischarge case management, although potentially effective in suicide prevention, is likely to be cost-effective only if targeted at high-risk patients. A previously developed machine learning (ML) model showed that postdischarge suicides can be predicted from electronic health records and geospatial data, but it is unknown whether prediction could be improved by adding further information.
OBJECTIVE: To determine whether model prediction could be improved by adding information extracted from clinical notes and public records.
DESIGN, SETTING, AND PARTICIPANTS: Models were trained to predict suicides in the 12 months after Veterans Health Administration (VHA) short-term (less than 365 days) psychiatric hospitalizations between the beginning of 2010 and September 1, 2012 (299 050 hospitalizations, with 916 hospitalizations followed within 12 months by suicides) and tested in the hospitalizations from September 2, 2012, to December 31, 2013 (149 738 hospitalizations, with 393 hospitalizations followed within 12 months by suicides). Validation focused on net benefit across a range of plausible decision thresholds. Predictor importance was assessed with Shapley additive explanations (SHAP) values. Data were analyzed from January to August 2022.
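Net benefit at a decision threshold can be illustrated with the standard decision-curve formula, NB = TP/N - (FP/N) x pt/(1 - pt), where pt is the threshold probability. The sketch below is only an illustration of that formula on made-up data; the function names `net_benefit` and `net_benefit_treat_all` are hypothetical, and it is not the authors' validation code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on everyone with predicted risk >= threshold.

    Standard decision-curve definition (an assumption about the exact variant
    used in the study): true positives are credited, false positives are
    penalized by the odds of the threshold probability.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that intervenes on every patient."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Toy labels and predicted risks, invented purely for illustration.
y_true = [1, 0, 0, 1]
y_prob = [0.9, 0.8, 0.1, 0.2]

for pt in (0.25, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), net_benefit_treat_all(y_true, pt))
```

A model is useful at a given threshold when its net benefit exceeds both zero (intervene on no one) and the intervene-on-everyone reference.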
MAIN OUTCOMES AND MEASURES: Suicides were defined by the National Death Index. Base model predictors included VHA electronic health records and patient residential data. The expanded predictors came from natural language processing (NLP) of clinical notes and a social determinants of health (SDOH) public records database.
RESULTS: The model included 448 788 unique hospitalizations. Net benefit over risk horizons between 3 and 12 months was generally highest for the model that included both NLP and SDOH predictors (area under the receiver operating characteristic curve range, 0.747-0.780; area under the precision-recall curve relative to the suicide rate range, 3.87-5.75). NLP and SDOH predictors also had the highest predictor class-level SHAP values (proportional SHAP = 64.0% and 49.3%, respectively), although the single highest positive variable-level SHAP value was for a count of medications classified by the US Food and Drug Administration as increasing suicide risk prescribed the year before hospitalization (proportional SHAP = 15.0%).
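Because suicide is rare, the area under the precision-recall curve is reported relative to the suicide rate: a ratio of 1 corresponds to a model with no discrimination, so values of 3.87-5.75 indicate precision several times the base rate. A minimal sketch of such a ratio follows, assuming average precision as the estimate of the area (the abstract does not say which estimator was used); the function names and data are hypothetical.

```python
def average_precision(y_true, y_score):
    """Average precision: mean of the precision at the rank of each positive case.

    Cases are ranked by descending score; tied scores are not specially handled.
    """
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("at least one positive case is required")
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` cases
    return total / n_pos


def auprc_relative_to_base_rate(y_true, y_score):
    """Ratio of average precision to outcome prevalence (1.0 = no discrimination)."""
    prevalence = sum(y_true) / len(y_true)
    return average_precision(y_true, y_score) / prevalence


# Toy data, invented purely for illustration.
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.1]
print(auprc_relative_to_base_rate(y_true, y_score))
```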
CONCLUSIONS AND RELEVANCE: In this study, clinical notes and public records were found to improve ML model prediction of suicide after psychiatric hospitalization. The model had positive net benefit over 3-month to 12-month risk horizons for plausible decision thresholds. Although caution is needed in inferring causality based on predictor importance, several key predictors have potential intervention implications that should be investigated in future studies.