Lepow Lauren A, Adekkanattu Prakash, Cusick Marika, Coon Hilary, Fennessy Brian, O'Connell Shane, Pierce Charlotte, Rabbany Jessica, Sharma Mohit, Olfson Mark, Bakian Amanda, Xiao Yunyu, Mullins Niamh, Nadkarni Girish N, Charney Alexander W, Pathak Jyotishman, Mann J John
medRxiv. 2024 Dec 20:2024.12.19.24319352. doi: 10.1101/2024.12.19.24319352.
Diagnostic codes in the Electronic Health Record (EHR) are known to be limited in reporting patient suicidality, and especially in differentiating the levels of suicide severity.
The authors developed and validated a portable natural language processing (NLP) algorithm for detection of suicidal ideation (SI) and suicide-related behavior and attempts (SB/SA) in EHR data. The algorithm was then deployed, and its ascertainment of SI and SB/SA was compared with that of International Statistical Classification of Diseases (ICD-9 and ICD-10) diagnostic codes.
A group of experts designed the pipeline to detect and distinguish suicide severity based on the Columbia-Suicide Severity Rating Scale (C-SSRS). Notes were manually annotated to create the "Gold Standard" with which the algorithm output was evaluated for accuracy.
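As an illustration of how algorithm output can be scored against manually annotated notes, the following is a minimal sketch of a per-class F1 computation. It is not the authors' evaluation code: the three-way label set (`none`, `SI`, `SB/SA`), the function name `per_class_f1`, and the toy labels are assumptions made for the example.

```python
from collections import Counter

# Hypothetical label set; the source does not specify how classes were encoded.
LABELS = ("none", "SI", "SB/SA")


def per_class_f1(gold, predicted):
    """Return {label: F1} comparing predicted labels with gold-standard labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1      # correct label for this note
        else:
            fp[p] += 1      # predicted label was wrongly assigned
            fn[g] += 1      # gold label was missed

    scores = {}
    for label in LABELS:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the label never occurs.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Made-up toy annotations, for illustration only (not study data).
gold = ["SI", "SI", "none", "SB/SA", "SB/SA", "none"]
pred = ["SI", "none", "none", "SB/SA", "SI", "none"]
scores = per_class_f1(gold, pred)
```

Reporting F1 per class rather than a single pooled accuracy keeps a rare class such as SB/SA from being masked by the much more common `none` class.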
The algorithm was developed at two academic medical centers, Weill Cornell Medicine (WCM) and the Mount Sinai Health System (MSHS), and tested at these two plus a third, the University of Utah Healthcare Center (UUHSC).
Notes were from participants with psychiatric encounters at the three institutions.
The two main outcomes were the accuracy scores of the NLP pipeline and comparison of ascertainment rates to ICD codes.
F1 scores ranged from 0.86 to 0.97 across the three sites. The NLP detection rate was almost 30 times higher for SB/SA, and almost 10 times higher for SI, than that of diagnostic codes. NLP detected almost all cases detected by diagnostic codes. No performance bias by race/ethnicity was found, and performance was comparable in psychiatric and non-psychiatric EHRs.
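The comparison of ascertainment between NLP and diagnostic codes can be sketched as below. This is a hedged example, not the study's analysis: it assumes each method yields a set of flagged patient identifiers within the same cohort, and the function name `compare_ascertainment`, its parameters, and the toy identifiers are hypothetical.

```python
def compare_ascertainment(nlp_ids, icd_ids, cohort_size):
    """Compare two sets of flagged patient IDs drawn from one cohort.

    Returns the detection rate of each method, the NLP-to-ICD rate ratio,
    and the fraction of ICD-flagged patients that NLP also flagged.
    """
    nlp_ids, icd_ids = set(nlp_ids), set(icd_ids)
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")

    result = {
        "nlp_rate": len(nlp_ids) / cohort_size,
        "icd_rate": len(icd_ids) / cohort_size,
    }
    if icd_ids:
        # Same denominator for both rates, so the ratio reduces to a count ratio.
        result["rate_ratio"] = len(nlp_ids) / len(icd_ids)
        result["icd_cases_found_by_nlp"] = len(nlp_ids & icd_ids) / len(icd_ids)
    else:
        # No ICD-coded cases: ratio and overlap are undefined.
        result["rate_ratio"] = None
        result["icd_cases_found_by_nlp"] = None
    return result


# Made-up identifiers, for illustration only (not study data).
summary = compare_ascertainment(
    nlp_ids=range(1, 31),      # 30 patients flagged by NLP
    icd_ids={1, 2, 99},        # 3 patients flagged by ICD codes
    cohort_size=1000,
)
```

The rate ratio corresponds to a statement such as "10 times higher", while the overlap fraction corresponds to the claim that NLP detected almost all ICD-coded cases; the two quantities answer different questions and are worth reporting separately.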
EHRs from cohorts with psychiatric diagnoses or encounters at WCM, MSHS, and UUHSC had SI and SB/SA extracted using an NLP algorithm based on parameters defined by the C-SSRS. Validity was determined by comparing the algorithm output to manual annotations of clinical notes by domain experts. NLP-detection of SI and SB/SA was compared with that of ICD codes across a range of demographic groups. Algorithm performance was also examined for bias in minoritized groups and in non-psychiatric notes.
Can we automate the extraction of data available in clinical notes to accurately detect and distinguish patients with suicidal ideation (SI) and suicidal behavior (SB)? Our natural language processing (NLP) approach identified and distinguished SI and SB at three different hospital systems with benchmarked accuracy scores above 0.85. The rate of detecting SI and SB using the algorithm was 10 to 30 times that of diagnostic codes found in the Electronic Health Record. Our algorithm renders the use of International Classification of Diseases (ICD) diagnostic codes for SI and SB ascertainment obsolete.