Bi Shengxian, Li Gang, Tan Huawei, Chen Yingchun, Guo Dandan
School of Medicine and Health Management, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, P.R. China.
School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, P.R. China.
BMC Psychiatry. 2025 Aug 13;25(1):787. doi: 10.1186/s12888-025-07178-4.
Understanding the spatiotemporal characteristics of depression risk in middle-aged and elderly individuals is crucial for early identification and intervention. However, most current studies that predict depression risk with machine learning (ML) methods overlook the spatiotemporal heterogeneity of this risk.
This study used five waves of data from the China Health and Retirement Longitudinal Study (CHARLS) and constructed nine long short-term memory (LSTM) frameworks that combine convolutional neural network (CNN), bidirectional LSTM (BiLSTM), and attention components, with the aim of improving the accuracy and stability of depression risk prediction. Dynamic time windows were used to handle time series of inconsistent lengths, matching the structure of public databases. SHAP (SHapley Additive exPlanations) analysis was used to quantify and visualize the impact of each feature on the prediction results.
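To make the idea of handling sequences of inconsistent lengths concrete, the following is a minimal sketch of one plausible windowing scheme. It is not taken from the study: the function name `build_windows`, the left-padding with zeros, the mask convention, and the choice to predict each wave from all earlier waves are assumptions made for illustration only.

```python
from typing import List, Tuple

PAD = 0.0  # assumed padding value for missing earlier waves

Sample = Tuple[List[List[float]], List[int], int]


def build_windows(waves: List[List[float]], labels: List[int], max_len: int) -> List[Sample]:
    """Turn one respondent's survey waves into (history, mask, target) samples.

    For every wave t >= 1, the history is the up-to-max_len waves that
    precede t, left-padded to a fixed length so that respondents with
    different numbers of observed waves can share one batch.
    """
    if not waves:
        return []
    n_features = len(waves[0])
    samples: List[Sample] = []
    for t in range(1, len(waves)):
        history = waves[max(0, t - max_len):t]
        pad_count = max_len - len(history)
        padded = [[PAD] * n_features for _ in range(pad_count)] + history
        # mask marks real observations with 1 and padding with 0
        mask = [0] * pad_count + [1] * len(history)
        samples.append((padded, mask, labels[t]))
    return samples


# Hypothetical respondent observed in only three waves, two features per wave
waves = [[1.0, 7.0], [2.0, 6.5], [3.0, 6.0]]
labels = [0, 0, 1]
for padded, mask, target in build_windows(waves, labels, max_len=4):
    print(mask, target)
```

In a sketch like this, the mask would let a downstream sequence model ignore padded positions, so a respondent observed in three waves and one observed in five can be trained together.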
Among the nine LSTM frameworks, the CNN-BiLSTM-Attention model demonstrated a potential improvement in predictive performance (AUC between 0.68 and 0.71). It also exhibited the highest stability during feature reduction (∆AUC = 0.0052). SHAP analysis of the LSTM and CNN-BiLSTM-Attention models identified health status and functionality as key factors influencing depression risk in middle-aged and elderly individuals, with pain, gender, sleep duration, and IADL (Instrumental Activities of Daily Living) being the most significant factors.
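The abstract does not define how ∆AUC was computed. As a hedged illustration, the sketch below assumes it is the absolute difference between the AUC obtained with the full feature set and the AUC obtained after feature reduction. The helper names `auc` and `delta_auc` and the toy scores are hypothetical, not values from the study.

```python
from typing import Sequence


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case is
    scored higher than a randomly chosen negative case (ties count 0.5).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def delta_auc(auc_full: float, auc_reduced: float) -> float:
    """Assumed stability measure: absolute AUC change after feature reduction."""
    return abs(auc_full - auc_reduced)


# Toy data only: predicted risks from a full and a reduced feature set
y = [0, 0, 1, 1]
full_scores = [0.1, 0.4, 0.35, 0.8]
reduced_scores = [0.1, 0.4, 0.35, 0.3]
print(delta_auc(auc(y, full_scores), auc(y, reduced_scores)))
```

Under this reading, a smaller ∆AUC means the model's discrimination changes less when features are removed, which is the sense in which a value such as 0.0052 would indicate stability.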
The LSTM + SHAP analysis framework showed considerable practical value for handling complex, high-dimensional spatiotemporal data. Future clinical interventions and public health policies should place greater emphasis on pain management and chronic disease management in middle-aged and elderly populations to reduce the risk of depression.