Early Psychosis: Interventions and Clinical-Detection (EPIC) Lab, Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK.
South London and Maudsley NHS Foundation Trust, London, UK.
Schizophr Bull. 2021 Mar 16;47(2):405-414. doi: 10.1093/schbul/sbaa126.
This study applied novel data mining methods, such as natural language processing (NLP), to electronic health records (EHRs) to screen for and detect individuals at risk for psychosis.
The study included all patients receiving a first index diagnosis of nonorganic and nonpsychotic mental disorder within the South London and Maudsley (SLaM) NHS Foundation Trust between January 1, 2008, and July 28, 2018. Least Absolute Shrinkage and Selection Operator (LASSO)-regularized Cox regression was used to refine, and then externally validate, a five-item individualized, transdiagnostic, clinically based risk calculator that had previously been developed (Harrell's C = 0.79) and piloted for implementation. The refined version included 14 additional NLP predictors: tearfulness, poor appetite, weight loss, insomnia, cannabis, cocaine, guilt, irritability, delusions, hopelessness, disturbed sleep, poor insight, agitation, and paranoia.
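For readers unfamiliar with the method, the standard form of the LASSO-penalized Cox objective can be written as below. This is the general textbook formulation, not a formula taken from the study; the notation (covariate vector x_i, event indicator δ_i, risk set R(t_i), penalty weight λ, and p candidate predictors) is assumed here for illustration.

```latex
\hat{\beta} = \arg\max_{\beta} \left\{
  \sum_{i:\,\delta_i = 1} \left[ x_i^{\top}\beta
  - \log \sum_{j \in R(t_i)} \exp\!\left( x_j^{\top}\beta \right) \right]
  - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert
\right\}
```

The first term is the Cox log partial likelihood, summed over patients who experienced the event. The second term is the L1 penalty, which shrinks coefficients toward zero and can set some of them exactly to zero as λ increases.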
A total of 92 151 patients with a first index diagnosis of nonorganic and nonpsychotic mental disorder within the SLaM Trust were included in the derivation (n = 28 297) or external validation (n = 63 854) data sets. Mean age was 33.6 years, 50.7% were women, and 67.0% were of white race/ethnicity. Mean follow-up was 1590 days. The overall 6-year risk of psychosis in secondary mental health care was 3.4% (95% CI, 3.3%-3.6%). External validation indicated strong performance on unseen data (Harrell's C = 0.85; 95% CI, 0.84-0.86), an increase of 0.06 over the original model (Harrell's C = 0.79).
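Harrell's C, the discrimination measure reported above, is the proportion of comparable patient pairs in which the patient with the earlier event was also assigned the higher predicted risk. The following is a minimal sketch of that calculation, not the study's code: the function name `harrell_c` and the toy data are hypothetical, and pairs with tied follow-up times are skipped for simplicity.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times: observed follow-up time for each subject
    events: 1 if the event was observed, 0 if the subject was censored
    risk_scores: model output, where higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # a is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five subjects, two of them censored.
times = [5, 8, 12, 20, 30]
events = [1, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.4, 0.7, 0.1]
c_index = harrell_c(times, events, scores)  # 6 of 7 comparable pairs are concordant
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.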
Using NLP on EHRs can considerably enhance the prognostic accuracy of psychosis risk calculators. This can help identify patients at risk of psychosis who require assessment and specialized care, facilitating earlier detection and potentially improving patient outcomes.