Department of Research Methodology, Measurement and Data Analysis, Behavioral Sciences Faculty, University of Twente, P.O. Box 217, GW, 7500AE Enschede, The Netherlands.
Psychiatry Res. 2012 Aug 15;198(3):441-7. doi: 10.1016/j.psychres.2012.01.032. Epub 2012 Mar 29.
Much evidence has shown that people's physical and mental health can be predicted by the words they use. However, such verbal information is seldom used in screening and diagnosis, probably because words are difficult to handle with traditional quantitative methods. Two challenges stand out: first, extracting robust information from diverse patterns of expression; second, transforming unstructured text into a structured dataset. The present study developed a new textual assessment method that screens for posttraumatic stress disorder (PTSD) by applying text mining techniques to lexical features of self-narratives. Using 300 self-narratives collected online, we extracted highly discriminative keywords with the chi-square algorithm and constructed a textual assessment model to classify individuals by the presence or absence of PTSD. The model's classifications agreed closely with psychiatrists' diagnoses of PTSD, and the analysis revealed some expressive characteristics in the writings of PTSD patients. Although the results of text analysis are not completely analogous to those of structured interviews in PTSD diagnosis, text mining is a promising addition to the assessment of PTSD in clinical and research settings.
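The abstract names chi-square keyword extraction but gives no implementation details. As a minimal sketch only, the following shows one common way to rank terms by a 2x2 Pearson chi-square statistic (term present or absent versus label positive or negative). The function names `chi_square_2x2` and `top_keywords`, the whitespace tokenization, and the four toy documents are assumptions made for illustration; they are not taken from the study and are not real narratives.

```python
from collections import Counter


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]]: rows = term present/absent, cols = positive/negative."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so the term carries no signal
    return n * (a * d - b * c) ** 2 / denom


def top_keywords(docs, labels, k):
    """Return the k terms most associated with the labels (either direction).

    docs   : list of strings (hypothetical narratives)
    labels : list of 0/1 values, 1 = positive class
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_df, neg_df = Counter(), Counter()
    for text, label in zip(docs, labels):
        # Document frequency: count each term once per document.
        terms = set(text.lower().split())
        (pos_df if label else neg_df).update(terms)

    scores = {}
    for term in set(pos_df) | set(neg_df):
        a, b = pos_df[term], neg_df[term]   # docs containing the term
        c, d = n_pos - a, n_neg - b         # docs lacking the term
        scores[term] = chi_square_2x2(a, b, c, d)

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Invented toy corpus, for illustration only.
toy_docs = [
    "nightmare flashback fear",
    "flashback fear sleep",
    "work sleep holiday",
    "holiday work friends",
]
toy_labels = [1, 1, 0, 0]
print(top_keywords(toy_docs, toy_labels, 2))
```

Because the statistic is two-sided, a high score marks a term as discriminative for either class; a screening model would still need to record which class each selected keyword favors before using it as a feature.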