Bipolar disorder: Construction and analysis of a joint diagnostic model using random forest and feedforward neural networks.

Author information

Sun Ping, Wang Xiangwen, Wang Shenghai, Jia Xueyu, Feng Shunkang, Chen Jun, Fang Yiru

Affiliations

Qingdao Mental Health Center, Shandong 266034, China.

Clinical Research Center, Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai 200030, China.

Publication information

IBRO Neurosci Rep. 2024 Jul 31;17:145-153. doi: 10.1016/j.ibneur.2024.07.007. eCollection 2024 Dec.

Abstract

BACKGROUND

This study aimed to construct a diagnostic model for the depressive phase of Bipolar Disorder (BD) from patients' peripheral tissue RNA data, combining Random Forest and Feedforward Neural Network methods.

METHODS

Datasets GSE23848, GSE39653, and GSE69486 were selected, and differential gene expression analysis was conducted using the limma package in R. Key genes from the differentially expressed genes were identified using the Random Forest method. These key genes' expression levels in each sample were used to train a Feedforward Neural Network model. Techniques like L1 regularization, early stopping, and dropout layers were employed to prevent model overfitting. Model performance was then validated, followed by GO, KEGG, and protein-protein interaction network analyses.
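
The three overfitting controls named above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `patience` and `min_delta` parameters, and the example loss values are all hypothetical, and the abstract does not report the hyperparameters actually used.

```python
import random


def l1_penalty(weights, lam):
    """L1 regularization term added to the loss: lam * sum(|w|)."""
    return lam * sum(abs(w) for w in weights)


def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate` (rate < 1)
    and rescale the survivors so the expected activation is unchanged."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# Hypothetical validation-loss curve, for illustration only.
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.7]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print("stop at epoch", stop_epoch, "and restore weights from epoch", best_epoch)
print("L1 term:", l1_penalty([1.0, -2.0, 0.5], lam=0.01))
print("dropout:", dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=random.Random(0)))
```

In a deep learning framework these would normally be a kernel regularizer, a dropout layer, and an early-stopping callback; the plain-Python versions only make the logic explicit.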

RESULTS

The final model was a Feedforward Neural Network with two hidden layers and two dropout layers, comprising 2345 trainable parameters. Model performance on the validation set, assessed through 1000 bootstrap resampling iterations, demonstrated a specificity of 0.769 (95% CI 0.571-1.000), sensitivity of 0.818 (95% CI 0.533-1.000), AUC of 0.832 (95% CI 0.642-0.979), and accuracy of 0.792 (95% CI 0.625-0.958). Enrichment analysis of key genes indicated no significant enrichment in any known pathways.
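
A percentile bootstrap of the kind described can be sketched as follows. The labels and predictions below are hypothetical: they are chosen only so that the point estimates match the reported values (9/11 for sensitivity, 10/13 for specificity, 19/24 for accuracy), and the abstract does not state the actual composition of the validation set. The function names and the seed are likewise assumptions. AUC is left out because it requires the model's predicted scores, which are not given.

```python
import random


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = BD)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        return None  # a class is missing, so a metric is undefined
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def percentile(sorted_vals, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def bootstrap_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample the samples with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    draws = {"sensitivity": [], "specificity": [], "accuracy": []}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        m = confusion_metrics([y_true[i] for i in idx], [y_pred[i] for i in idx])
        if m is None:
            continue  # skip resamples in which one class is absent
        for name, value in m.items():
            draws[name].append(value)
    return {
        name: (percentile(sorted(v), alpha / 2), percentile(sorted(v), 1 - alpha / 2))
        for name, v in draws.items()
    }


# Hypothetical validation set: 11 cases (9 detected) and 13 controls (10 cleared).
y_true = [1] * 11 + [0] * 13
y_pred = [1] * 9 + [0] * 2 + [0] * 10 + [1] * 3

point = confusion_metrics(y_true, y_pred)
ci = bootstrap_ci(y_true, y_pred, n_boot=1000)
for name in point:
    print(name, round(point[name], 3), tuple(round(b, 3) for b in ci[name]))
```

With a validation set this small each metric can take only a handful of distinct values, which is one reason the reported intervals are wide and reach 1.000.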

CONCLUSION

Key genes with biological significance were identified based on the decrease in Gini impurity (the Gini importance measure) within the Random Forest model. The combined use of Random Forest and Feedforward Neural Network to establish a diagnostic model showed good classification performance in Bipolar Disorder.
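
The importance measure behind this gene ranking can be illustrated for a single split. The sketch below assumes the standard definition used by Random Forest implementations: the impurity of the parent node minus the size-weighted impurity of its two children. A forest averages this decrease over every split and every tree that uses a given gene. The expression values, labels, and threshold are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum over classes of p_k squared."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(labels, feature_values, threshold):
    """Impurity decrease from splitting on feature_values <= threshold."""
    left = [y for x, y in zip(feature_values, labels) if x <= threshold]
    right = [y for x, y in zip(feature_values, labels) if x > threshold]
    n = len(labels)
    weighted_children = (
        len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(labels) - weighted_children


# Hypothetical data: 0 = control, 1 = BD; one expression value per sample.
labels = [0, 0, 1, 1]
informative_gene = [1.0, 2.0, 5.0, 6.0]    # separates the classes at 3.0
uninformative_gene = [1.0, 5.0, 2.0, 6.0]  # mixes the classes at 3.0

print(gini_decrease(labels, informative_gene, threshold=3.0))
print(gini_decrease(labels, uninformative_gene, threshold=3.0))
```

A gene whose splits repeatedly produce large decreases ranks as a key gene, while a gene whose splits leave the classes mixed contributes almost nothing.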
