Decoding depression: a comprehensive multi-cohort exploration of blood DNA methylation using machine learning and deep learning approaches.

Affiliation

Department of Surgical Sciences, Functional Pharmacology and Neuroscience, Uppsala University, Uppsala, Sweden.

Publication information

Transl Psychiatry. 2024 Jul 15;14(1):287. doi: 10.1038/s41398-024-02992-y.

Abstract

The causes of depression are complex, and current diagnostic methods rely solely on psychiatric evaluation, with no laboratory biomarkers incorporated into clinical practice. We investigated the stability of blood DNA methylation signatures of depression in six different populations using six public and two domestic cohorts (n = 1942), conducting both a mega-analysis and a meta-analysis of the individual studies. We evaluated 12 machine learning and deep learning strategies for depression classification, both in cross-validation (CV) and in hold-out tests, using merged data from 8 separate batches and constructing models with both biased and unbiased feature selection. We found 1987 CpG sites related to depression in both the mega- and meta-analysis at the nominal level, and the associated genes were nominally related to axon guidance and immune pathways based on enrichment analysis and eQTM data. Random forest classifiers achieved the highest performance (AUC 0.73 in CV and 0.76 in hold-out tests) on the batch-level processed data. In contrast, methylation showed low predictive power (all AUCs < 0.57) for all classifiers in CV and no predictive power in hold-out tests when used with harmonized data. All models achieved significantly better performance (>14% gain in AUC) with pre-selected features (selection bias), with some models (joint autoencoder-classifier) reaching AUCs of up to 0.91 in the final testing regardless of data preparation. Different algorithmic feature selection approaches may outperform limma; however, random forest models perform well regardless of the strategy. The results provide an overview of potential future biomarkers for depression and highlight many important methodological aspects of DNA methylation-based depression profiling, including the use of machine learning strategies.
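
The gap the abstract reports between biased and unbiased feature selection can be illustrated with a toy simulation. The sketch below is not the authors' pipeline. It assumes pure-noise synthetic data (no real methylation values), a simple difference-in-class-means ranking as a stand-in for limma, a nearest-centroid classifier, and accuracy rather than AUC. All function names and parameters are hypothetical.

```python
import random

def top_k_features(X, y, rows, k):
    """Rank features by absolute difference in class means over `rows`."""
    pos = [i for i in rows if y[i] == 1]
    neg = [i for i in rows if y[i] == 0]
    scores = []
    for j in range(len(X[0])):
        mean_pos = sum(X[i][j] for i in pos) / len(pos)
        mean_neg = sum(X[i][j] for i in neg) / len(neg)
        scores.append((abs(mean_pos - mean_neg), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]

def centroid(X, rows, feats):
    """Mean vector of the selected features over the given rows."""
    return [sum(X[i][j] for i in rows) / len(rows) for j in feats]

def sq_dist(x, c, feats):
    """Squared Euclidean distance between sample x and centroid c."""
    return sum((x[j] - cj) ** 2 for j, cj in zip(feats, c))

def cv_accuracy(X, y, k, n_folds, select_inside_folds):
    """k-fold CV accuracy of a nearest-centroid classifier.

    select_inside_folds=False: features are chosen once on ALL samples,
    so the test labels leak into the selection (selection bias).
    select_inside_folds=True: features are re-chosen on each training fold.
    """
    all_rows = list(range(len(X)))
    global_feats = top_k_features(X, y, all_rows, k)
    correct = 0
    for f in range(n_folds):
        test = [i for i in all_rows if i % n_folds == f]
        train = [i for i in all_rows if i % n_folds != f]
        feats = top_k_features(X, y, train, k) if select_inside_folds else global_feats
        c1 = centroid(X, [i for i in train if y[i] == 1], feats)
        c0 = centroid(X, [i for i in train if y[i] == 0], feats)
        for i in test:
            pred = 1 if sq_dist(X[i], c1, feats) < sq_dist(X[i], c0, feats) else 0
            correct += int(pred == y[i])
    return correct / len(X)

def simulate(n_samples=60, n_features=2000, k=20, n_folds=5, seed=0):
    """Return (biased, unbiased) CV accuracy on data with no true signal."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # balanced labels, unrelated to X
    biased = cv_accuracy(X, y, k, n_folds, select_inside_folds=False)
    unbiased = cv_accuracy(X, y, k, n_folds, select_inside_folds=True)
    return biased, unbiased

if __name__ == "__main__":
    biased, unbiased = simulate()
    print(f"selection on all samples:    {biased:.2f}")
    print(f"selection inside each fold:  {unbiased:.2f}")
```

Because the labels carry no information about the features, an honest estimate should sit near chance. Selecting features on all samples before cross-validation typically pushes the apparent accuracy well above that, which is the kind of inflation the abstract attributes to pre-selected features.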

