Poulin Chris, Shiner Brian, Thompson Paul, Vepstas Linas, Young-Xu Yinong, Goertzel Benjamin, Watts Bradley, Flashman Laura, McAllister Thomas
The Geisel School of Medicine at Dartmouth College & The Thayer School of Engineering at Dartmouth College, Hanover, New Hampshire, United States of America ; The Durkheim Project, Portsmouth, New Hampshire, United States of America.
United States Department of Veterans Affairs, White River Junction VA Medical Center, White River Junction, Vermont, United States of America.
PLoS One. 2014 Jan 28;9(1):e85733. doi: 10.1371/journal.pone.0085733. eCollection 2014.
We developed linguistics-driven prediction models to estimate the risk of suicide. These models were generated from unstructured clinical notes taken from a national sample of U.S. Veterans Administration (VA) medical records. We created three matched cohorts: veterans who committed suicide, veterans who used mental health services and did not commit suicide, and veterans who did not use mental health services and did not commit suicide during the observation period (n = 70 in each group). From the clinical notes, we generated datasets of single keywords and multi-word phrases, and constructed prediction models using a machine-learning algorithm based on a genetic programming framework. The resulting inference accuracy was consistently 65% or higher. Our data therefore suggest that computerized text analytics can be applied to unstructured medical records to estimate the risk of suicide. The resulting system could allow clinicians to screen seemingly healthy patients at the primary care level and to continuously evaluate suicide risk among psychiatric patients.
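The abstract mentions generating datasets of single keywords and multi-word phrases from clinical notes. As a rough illustration of that feature-generation step only, the sketch below counts one-word and two-word sequences in a piece of text. It is not the authors' pipeline and does not implement their genetic programming model; the function names `tokenize` and `ngram_features`, the tokenization rule, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(note):
    """Lowercase a note and split it into alphabetic word tokens.

    Assumption: a simple regex tokenizer; the paper's abstract does not
    say how the notes were tokenized.
    """
    return re.findall(r"[a-z']+", note.lower())


def ngram_features(note, max_n=2):
    """Count single keywords (n = 1) and multi-word phrases (n = 2..max_n)."""
    tokens = tokenize(note)
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the note.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Made-up example text, not drawn from any medical record.
note = "Patient reports poor sleep. Patient denies pain."
features = ngram_features(note)
```

In this sketch, phrases are allowed to span sentence boundaries (for example "sleep patient"), which a more careful implementation would likely avoid. The resulting counts could serve as input columns for whatever classifier is chosen downstream.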