Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, USA.
Perm J. 2022 Apr 5;26(1):85-93. doi: 10.7812/TPP/21.102.
The purpose of this study was to develop a natural language processing algorithm to identify suicidal ideation/attempt from free-text clinical notes.
Clinical notes from 2010 to 2018 containing prespecified keywords related to suicidal ideation/attempts were extracted from our organization's electronic health record system. A random sample of 864 clinical notes was selected and divided equally into 4 subsets (216 notes each). Experienced research chart abstractors reviewed each subset and classified the notes into 1 of 3 suicidal ideation/attempt categories: current, historical, or none. The first 3 data sets were used sequentially to develop the rule-based computerized algorithm, and the fourth data set was used to evaluate the algorithm's performance. The validated algorithm was then applied to the entire study sample of clinical notes.
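The sampling design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the integer note identifiers, the function name `split_into_subsets`, and the fixed seed are hypothetical and do not come from the study.

```python
import random


def split_into_subsets(note_ids, n_subsets=4, seed=0):
    """Shuffle note IDs and divide them into equally sized subsets."""
    if len(note_ids) % n_subsets != 0:
        raise ValueError("sample size must be divisible by the number of subsets")
    shuffled = list(note_ids)
    # A seeded generator keeps the illustrative split reproducible.
    random.Random(seed).shuffle(shuffled)
    size = len(shuffled) // n_subsets
    return [shuffled[i * size:(i + 1) * size] for i in range(n_subsets)]


# Placeholder identifiers standing in for the 864 sampled notes (assumption).
note_ids = list(range(864))
subsets = split_into_subsets(note_ids)

# The first 3 subsets drive rule development; the fourth is held out for validation.
development_sets, validation_set = subsets[:3], subsets[3]
```

Holding the fourth subset out until the rules are final is what makes the reported validation figures an estimate of performance on notes that did not shape the rules.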
In the validation data set, the computerized algorithm correctly identified 23 of the 26 confirmed current suicidal ideation/attempts and all 10 confirmed historical suicidal ideation/attempts. This corresponds to a sensitivity of 88.5% and a positive predictive value of 100.0% for current suicidal ideation/attempts, and a sensitivity and positive predictive value of 100.0% for historical suicidal ideation/attempts. After applying the computerized algorithm to the entire set of study notes, we identified a total of 1,050,287 current ideation/attempt events and 293,037 historical ideation/attempt events documented in clinical notes. Patients for whom current ideation/attempt events were documented were more likely to be female (59.5%), 25-44 years old (28.3%), and White (43.4%).
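The validation figures follow directly from the reported counts, as the short Python check below shows. The false-negative count for current events (3) is derived from 26 minus 23. The false-positive counts of 0 are inferred from the reported 100.0% positive predictive values rather than stated explicitly in the abstract.

```python
def sensitivity(true_pos, false_neg):
    """Share of confirmed events that the algorithm identified."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Share of algorithm-flagged events that were confirmed."""
    return true_pos / (true_pos + false_pos)


# Current ideation/attempt: 23 of 26 confirmed events identified.
current_sens = sensitivity(true_pos=23, false_neg=3)
# False positives assumed to be 0, inferred from the reported 100.0% PPV.
current_ppv = positive_predictive_value(true_pos=23, false_pos=0)

# Historical ideation/attempt: all 10 confirmed events identified.
historical_sens = sensitivity(true_pos=10, false_neg=0)
historical_ppv = positive_predictive_value(true_pos=10, false_pos=0)

print(f"current: sensitivity {100 * current_sens:.1f}%, PPV {100 * current_ppv:.1f}%")
print(f"historical: sensitivity {100 * historical_sens:.1f}%, PPV {100 * historical_ppv:.1f}%")
```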
Our study demonstrated that a computerized algorithm can effectively identify suicidal ideation/attempts from clinical notes. This algorithm can support suicide prevention research programs and patient care quality improvement initiatives.