Depression Clinic and Research Program, Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA.
Psychol Med. 2012 Jan;42(1):41-50. doi: 10.1017/S0033291711000997. Epub 2011 Jun 20.
Electronic medical records (EMR) provide a unique opportunity for efficient, large-scale clinical investigation in psychiatry. However, such studies will require development of tools to define treatment outcome.
Natural language processing (NLP) was applied to classify notes from 127 504 patients with a billing diagnosis of major depressive disorder, drawn from out-patient psychiatry practices affiliated with multiple large New England hospitals. Classifications were compared with results obtained using billing data (ICD-9 codes) alone and with a clinical gold standard based on chart review by a panel of senior clinicians. These cross-sectional classifications were then used to define longitudinal treatment outcomes, which were compared with a clinician-rated gold standard.
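The step from visit-level classifications to a longitudinal outcome can be illustrated with a minimal sketch. It is not the authors' algorithm: the `Visit` fields, the label strings, and the rule that a new drug name starts a new trial are all assumptions made for illustration, and it only mirrors the two outcome groups the abstract reports (remission with a single antidepressant, and non-remission despite at least two trials).

```python
from dataclasses import dataclass


@dataclass
class Visit:
    day: int             # days since the first visit (hypothetical field)
    antidepressant: str  # drug prescribed at this visit (hypothetical field)
    remitted: bool       # visit-level mood state from the cross-sectional classifier


def longitudinal_outcome(visits):
    """Collapse visit-level labels into one patient-level outcome label."""
    trials = []  # distinct antidepressants, in order of first appearance
    for visit in sorted(visits, key=lambda v: v.day):
        if visit.antidepressant not in trials:
            trials.append(visit.antidepressant)
        if visit.remitted:
            # Remission counts for the first group only if it occurred
            # during the first antidepressant trial.
            if len(trials) == 1:
                return "remitted_single_treatment"
            return "indeterminate"
    # No remitted visit was observed.
    if len(trials) >= 2:
        return "non_remitter"
    return "indeterminate"


# Toy patients, invented purely to exercise the rules above.
patient_a = [Visit(0, "drug_A", False), Visit(30, "drug_A", True)]
patient_b = [Visit(0, "drug_A", False), Visit(60, "drug_B", False)]
print(longitudinal_outcome(patient_a))  # remitted_single_treatment
print(longitudinal_outcome(patient_b))  # non_remitter
```

Patients who fit neither group are labelled `indeterminate` here; a real definition would also need rules for trial adequacy (dose and duration), which the abstract does not specify.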
Models incorporating NLP were superior to those relying on billing data alone for classifying current mood state (area under the receiver operating characteristic curve of 0.85-0.88 v. 0.54-0.55). When these cross-sectional visits were integrated to define longitudinal outcomes and incorporate treatment data, 15% of the cohort remitted with a single antidepressant treatment, while 13% were identified as failing to remit despite at least two antidepressant trials. Non-remitting patients were more likely to be non-Caucasian (p<0.001).
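For readers unfamiliar with the metric, the area under the receiver operating characteristic curve (AUC) equals the probability that a randomly chosen positive case is scored higher than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way. The scores and labels are toy values invented for illustration, not study data, and the function name `auc` is an assumption.

```python
def auc(scores, labels):
    """AUC via pairwise comparison of positive and negative scores (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = depressed at the visit per chart review, 0 = not depressed.
labels = [1, 1, 1, 0, 0, 0]
graded_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]  # a graded classifier score
binary_flags = [1, 0, 1, 1, 0, 1]               # an uninformative binary flag

print(round(auc(graded_scores, labels), 3))  # 0.889
print(auc(binary_flags, labels))             # 0.5
```

An AUC of 0.5 corresponds to chance-level discrimination, which is why values of 0.54-0.55 for billing codes alone indicate almost no ability to separate mood states.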
The application of bioinformatics tools such as NLP should enable accurate and efficient determination of longitudinal outcomes, enabling existing EMR data to be applied to clinical research, including biomarker investigations. Continued development will be required to better address moderators of outcome such as adherence and co-morbidity.