Park Jin-Hyun, Shin Yu-Bin, Jung Dooyoung, Hur Ji-Won, Pack Seung Pil, Lee Heon-Jeong, Lee Hwamin, Cho Chul-Hyun
Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea.
Department of Psychiatry, Korea University College of Medicine, Seoul, Republic of Korea.
Front Psychiatry. 2025 Jan 7;15:1504190. doi: 10.3389/fpsyt.2024.1504190. eCollection 2024.
Machine learning (ML) is an effective tool for predicting mental states and a key technology in digital psychiatry. This study aimed to develop ML algorithms that predict membership in the upper tertile group for various anxiety symptoms, using multimodal data from virtual reality (VR) therapy sessions for patients with social anxiety disorder (SAD), and to evaluate predictive performance for each data type.
This study enrolled 32 individuals diagnosed with SAD; the final dataset comprised 132 samples from 25 participants. Multimodal (physiological and acoustic) data were collected during VR sessions that simulated social anxiety scenarios. Acoustic features were extracted with the extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS), and statistical attributes were extracted from time series-based physiological responses. We developed ML models that predict the upper tertile group for various anxiety symptoms in SAD using Random Forest, extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost). Hyperparameters were tuned through grid search or random search, and the models were validated using stratified cross-validation and leave-one-out cross-validation.
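The labeling and validation steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `upper_tertile_labels` and `stratified_folds`, the strict "greater than" rule at the cut point, and the toy scores are all assumptions made for illustration, and only the Python standard library is used.

```python
from collections import defaultdict
from statistics import quantiles


def upper_tertile_labels(scores):
    """Return 1 for scores above the upper tertile cut point, else 0."""
    # quantiles(..., n=3) returns two cut points; the second one
    # separates the top third of the scores from the rest.
    _, upper_cut = quantiles(scores, n=3)
    # Assumption: only scores strictly above the cut count as "upper".
    return [1 if s > upper_cut else 0 for s in scores]


def stratified_folds(labels, k):
    """Assign sample indices to k folds, keeping class ratios similar."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    # Deal each class round-robin so every fold gets a similar share.
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Hypothetical symptom-scale scores, one per sample.
scores = [10, 20, 30, 40, 50, 60, 70, 80, 90]
labels = upper_tertile_labels(scores)
folds = stratified_folds(labels, k=3)
print(labels)
print(folds)
```

This sketch splits by sample and does not group samples by participant. The abstract does not state whether the study's cross-validation kept each participant's samples within a single fold.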
CatBoost with multimodal features performed well, particularly for the Social Phobia Scale, with an area under the receiver operating characteristic curve (AUROC) of 0.852. It also performed strongly in predicting cognitive symptoms, with the highest AUROC of 0.866 for the Post-Event Rumination Scale. For generalized anxiety, LightGBM's prediction for the State-Trait Anxiety Inventory-trait achieved an AUROC of 0.819. In the same analyses, models using only physiological features had AUROCs of 0.626, 0.744, and 0.671, respectively, whereas models using only acoustic features had AUROCs of 0.788, 0.823, and 0.754.
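For readers less familiar with the metric, AUROC can be read as the probability that a randomly chosen upper-tertile sample receives a higher predicted score than a randomly chosen sample from the other group. The sketch below computes it directly from that pairwise definition. The function name `auroc` and the toy labels and scores are illustrative assumptions, not data from the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = upper tertile) and model scores.
example = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example)
```

Under this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.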
This study showed that an ML algorithm using integrated multimodal data can predict upper tertile anxiety symptoms in patients with SAD with higher performance than models using only the acoustic or only the physiological data obtained during a VR session. The results can serve as evidence for personalizing VR sessions and demonstrate the value of multimodal data in clinical use.