PSL-EPHE, Paris, France.
ISCPIF, Institut des Systèmes Complexes, Paris Île-de-France, France.
Sci Rep. 2024 May 30;14(1):12468. doi: 10.1038/s41598-024-61557-7.
Post-traumatic stress disorder (PTSD) lacks clear biomarkers in clinical practice. This study investigates language as a potential diagnostic biomarker for PTSD. We analyze an original cohort of 148 individuals exposed to the November 13, 2015, terrorist attacks in Paris. The interviews, conducted 5-11 months after the event, involve individuals from similar socioeconomic backgrounds who were exposed to the same incident, answered identical questions, and were assessed with uniform PTSD measures. Using this dataset to collect nuanced insights that might be clinically relevant, we propose a three-step interdisciplinary methodology that integrates expertise from psychiatry, linguistics, and the Natural Language Processing (NLP) community to examine the relationship between language and PTSD. The first step assesses a clinical psychiatrist's ability to diagnose PTSD from interview transcripts alone. The second step uses statistical analysis and machine learning models to create language features based on psycholinguistic hypotheses and evaluate their predictive strength. The third step applies a hypothesis-free deep learning approach to the classification of PTSD in our cohort. Results show that the clinical psychiatrist diagnosed PTSD with an Area Under the Curve (AUC) of 0.72, comparable to a gold standard questionnaire (AUC ≈ 0.80). The machine learning model achieved a diagnostic AUC of 0.69, and the deep learning approach an AUC of 0.64. An examination of model errors informs our discussion. Importantly, the study controls for confounding factors, establishes associations between language and DSM-5 subsymptoms, and integrates automated methods with qualitative analysis. This study provides a direct and methodologically robust description of the relationship between PTSD and language.
Our work lays the groundwork for advancing early and accurate diagnosis and using linguistic markers to assess the effectiveness of pharmacological treatments and psychotherapies.
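Because all three steps are compared on AUC, it may help to recall what that number measures: the probability that a randomly chosen PTSD-positive participant receives a higher score than a randomly chosen PTSD-negative one, with ties counted as one half. The following is a minimal sketch of that pairwise definition, not the study's evaluation pipeline; the function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not taken from the cohort.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: iterable of classifier scores, higher meaning "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, for illustration only).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
print(round(roc_auc(toy_labels, toy_scores), 3))
```

Under this definition an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.64 to 0.80 should be read.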