Belouali Anas, Gupta Samir, Sourirajan Vaibhav, Yu Jiawei, Allen Nathaniel, Alaoui Adil, Dutton Mary Ann, Reinhard Matthew J
Innovation Center for Biomedical Informatics, Georgetown University Medical Center, Washington, DC, USA.
War Related Illness and Injury Study Center, Veterans Affairs Medical Center, Washington, DC, USA.
BioData Min. 2021 Feb 2;14(1):11. doi: 10.1186/s13040-021-00245-y.
Screening for suicidal ideation in high-risk groups such as U.S. veterans is crucial for early detection and suicide prevention. Currently, screening relies on clinical interviews or self-report measures, both of which depend on subjects disclosing their suicidal thoughts. Innovative approaches are needed to develop objective and clinically applicable assessments. Speech has been investigated as an objective marker of various mental states, including suicidal ideation. In this work, we developed a classifier that applies machine learning and natural language processing to speech markers to screen for suicidal ideation in U.S. veterans.
Veterans submitted 588 narrative audio recordings via a mobile app in a real-life setting. Participants also completed self-report psychiatric scales and questionnaires. Recordings were analyzed to extract voice characteristics, including prosodic, phonation, and glottal features. The recordings were also transcribed to extract textual features for linguistic analysis. We evaluated the acoustic and linguistic features using both statistical significance testing and ensemble feature selection. We also examined the performance of different machine learning algorithms on multiple combinations of features to classify recordings as suicidal or non-suicidal.
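The abstract does not say how the ensemble feature selection was constructed. As a rough illustration of the general idea only, the sketch below scores each feature with two simple criteria, ranks the features under each criterion, and keeps those with the best mean rank. The two scoring functions, the mean-rank aggregation, the feature names, and the toy values are all assumptions made for this example, not the authors' method or data.

```python
from statistics import mean


def abs_correlation(values, labels):
    """Absolute Pearson correlation between a feature and a binary label."""
    mx, my = mean(values), mean(labels)
    cov = sum((v - mx) * (y - my) for v, y in zip(values, labels))
    sx = sum((v - mx) ** 2 for v in values) ** 0.5
    sy = sum((y - my) ** 2 for y in labels) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return abs(cov / (sx * sy))


def auc_separation(values, labels):
    """Distance of the single-feature AUC from chance (0.5)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return abs(wins / (len(pos) * len(neg)) - 0.5)


def ranks(scores):
    """Map each feature name to its rank (1 = highest score)."""
    ordered = sorted(scores, key=lambda name: -scores[name])
    return {name: i + 1 for i, name in enumerate(ordered)}


def ensemble_select(features, labels, scorers, k):
    """Keep the k features with the best mean rank across all scorers."""
    all_ranks = [
        ranks({name: scorer(vals, labels) for name, vals in features.items()})
        for scorer in scorers
    ]
    mean_rank = {name: mean(r[name] for r in all_ranks) for name in features}
    return sorted(features, key=lambda name: mean_rank[name])[:k]


# Toy data: 1 = suicidal ideation reported, 0 = not reported (hypothetical).
labels = [1, 1, 1, 0, 0, 0]
features = {
    "pitch_variability": [0.20, 0.30, 0.25, 0.80, 0.90, 0.85],
    "pause_ratio": [0.50, 0.60, 0.40, 0.30, 0.45, 0.20],
    "noise": [1.0, 2.0, 3.0, 1.1, 2.1, 2.9],
}

selected = ensemble_select(features, labels, [abs_correlation, auc_separation], k=2)
print(selected)
```

Aggregating ranks rather than raw scores keeps criteria on different scales comparable, which is one common reason to combine selectors this way.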
A combined set of 15 acoustic and linguistic speech features was identified by the ensemble feature selection. A Random Forest classifier using the selected features identified suicidal ideation in veterans with 86% sensitivity, 70% specificity, and an area under the receiver operating characteristic curve (AUC) of 80%.
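For readers less familiar with the reported metrics, the following minimal sketch shows how sensitivity, specificity, and AUC are computed from true labels and classifier scores. The labels, scores, and the 0.5 decision threshold are made-up toy values chosen for illustration; they are not the study's data and do not reproduce its results.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 positive and 4 negative recordings with classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three numbers are reported together.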
Speech analysis of audio recordings collected from veterans in everyday life settings using smartphones offers a promising approach for detecting suicidal ideation. A machine learning classifier may eventually help clinicians identify and monitor high-risk veterans.