Machine learning and bioinformatic analysis of brain and blood mRNA profiles in major depressive disorder: A case-control study.

Affiliations

Department of Human Genetics, McGill University, Montreal, Quebec, Canada.

Faculty of Science, McGill University, Montreal, Quebec, Canada.

Publication information

Am J Med Genet B Neuropsychiatr Genet. 2021 Mar;186(2):101-112. doi: 10.1002/ajmg.b.32839. Epub 2021 Mar 1.

Abstract

This study used supervised machine learning (ML) to analyze messenger RNA gene expression data from cases with major depressive disorder (MDD) and controls. We built on the methodology of prior studies to obtain more generalizable and reproducible results. First, we trained a classifier on gene expression data from the dorsolateral prefrontal cortex (DLPFC) of post-mortem MDD cases (n = 126) and controls (n = 103). The average area under the receiver operating characteristic curve (AUC) across 10-fold cross-validation was 0.72, compared with an average AUC of 0.55 for a baseline classifier (p = .0048). The classifier achieved an AUC of 0.76 on a previously unused testing set. We also performed external validation using DLPFC gene expression values from an independent cohort of matched MDD cases (n = 29) and controls (n = 29), measured on an Affymetrix microarray rather than the Illumina microarray used for the original cohort (AUC: 0.62). We highlighted gene sets differentially expressed in MDD that were enriched for genes identified by the ML algorithm. Next, we assessed ML classification performance on blood-based microarray gene expression data from MDD cases (n = 1,581) and controls (n = 369). We observed a mean AUC of 0.64 on 10-fold cross-validation, which was significantly above baseline (p = .0020). Performance on the testing set was similar (AUC: 0.61). Finally, we analyzed classification performance within covariate subgroups. We identified an interaction between smoking status and recall in MDD case prediction: 58% of cases who were smokers were predicted correctly, versus 43% of cases who were non-smokers. Overall, our results suggest that ML in combination with gene expression data and covariates could further our understanding of the pathophysiology of MDD.
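The two metrics the abstract reports, AUC and recall within a covariate subgroup, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `auc` and `recall_by_subgroup`, the label coding (1 = MDD case, 0 = control), and the toy values are all assumptions made for illustration, and no data from the study are used.

```python
from collections import defaultdict


def auc(scores, labels):
    """AUC as the probability that a randomly chosen case (label 1)
    receives a higher score than a randomly chosen control (label 0).
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def recall_by_subgroup(y_true, y_pred, groups):
    """Recall (fraction of true cases predicted as cases), computed
    separately for each subgroup label. Controls are ignored."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            totals[group] += 1
            hits[group] += int(pred == 1)
    return {group: hits[group] / totals[group] for group in totals}


# Toy, made-up inputs (hypothetical; not taken from the study).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))

toy_true = [1, 1, 1, 1, 0, 1]
toy_pred = [1, 0, 1, 1, 0, 0]
toy_groups = ["smoker", "smoker", "non-smoker", "non-smoker", "non-smoker", "smoker"]
print(recall_by_subgroup(toy_true, toy_pred, toy_groups))
```

Under this definition an AUC of 0.5 corresponds to chance-level ranking, which is why the abstract compares the classifier against a baseline AUC near 0.55. Because recall is computed over cases only, the 58% versus 43% comparison concerns how often true MDD cases were detected in each smoking subgroup, not overall accuracy.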
