Xiao Hanyu, Wang Jieqiong, Wan Shibiao
Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, United States, 68198.
Department of Neurological Sciences, University of Nebraska Medical Center, Omaha, NE, United States, 68198.
bioRxiv. 2024 Sep 27:2024.09.25.614862. doi: 10.1101/2024.09.25.614862.
As the most common subtype of dementia, Alzheimer's disease (AD) is characterized by a progressive decline in cognitive functions, especially memory, thinking, and reasoning. Early diagnosis and intervention make it possible to take measures that reduce or slow further progression of the disease, helping to prevent severe decline in brain function. The current framework for AD diagnosis depends on detecting A/T/(N) biomarkers in cerebrospinal fluid or brain imaging data, whose acquisition is invasive and expensive. Moreover, the pathophysiological changes of AD accumulate across amino acids, metabolism, neuroinflammation, and other processes, resulting in heterogeneity among newly registered patients. Recently, next generation sequencing (NGS) technologies have been found to be a non-invasive, efficient, and less costly alternative for AD screening. However, most existing studies rely on a single omics only. To address these concerns, we introduce WIMOAD, a weighted integration of multi-omics data for AD diagnosis. WIMOAD synergistically leverages specialized classifiers for patients' paired gene expression and methylation data for multi-stage classification. The resulting scores are then stacked with MLP-based meta-models for performance improvement. The predictions of the two distinct meta-models are integrated with optimized weights to produce the model's final decision. In the classification tasks, WIMOAD achieves significantly higher performance than using a single omics alone. Its overall performance also exceeds that of most existing approaches, highlighting its ability to effectively discern intricate patterns in multi-omics data and their correlations with clinical diagnosis results.
In addition, WIMOAD is biologically interpretable: it leverages SHapley Additive exPlanations (SHAP) to elucidate the contribution of each gene from each omics to the model output. We believe WIMOAD is a very promising tool for accurate AD diagnosis and effective biomarker discovery across different progression stages, which will eventually have a consequential impact on early treatment intervention and personalized therapy design for AD.
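The weighted-integration step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the weights are optimized, so the code assumes a convex combination of two per-sample probabilities, a grid search over the weight, and validation accuracy as the objective. The function names (`fuse`, `accuracy`, `best_weight`) and the toy scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def fuse(p_expr: Sequence[float], p_meth: Sequence[float], w: float) -> List[float]:
    """Convex combination of two per-sample probability vectors.

    w is the weight on the expression meta-model; (1 - w) goes to methylation.
    (Assumed form: the abstract only says "optimized weights".)
    """
    return [w * a + (1.0 - w) * b for a, b in zip(p_expr, p_meth)]


def accuracy(probs: Sequence[float], labels: Sequence[int], threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded probability matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def best_weight(
    p_expr: Sequence[float],
    p_meth: Sequence[float],
    labels: Sequence[int],
    steps: int = 20,
) -> Tuple[float, float]:
    """Grid-search w in [0, 1]; return the first weight with the highest accuracy."""
    best_w, best_acc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        acc = accuracy(fuse(p_expr, p_meth, w), labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


if __name__ == "__main__":
    # Made-up validation scores for four samples (illustration only).
    labels = [1, 0, 1, 0]
    p_expr = [0.9, 0.6, 0.4, 0.2]   # hypothetical expression meta-model output
    p_meth = [0.6, 0.2, 0.7, 0.55]  # hypothetical methylation meta-model output

    print("expression only :", accuracy(p_expr, labels))
    print("methylation only:", accuracy(p_meth, labels))
    print("fused (w, acc)  :", best_weight(p_expr, p_meth, labels))
```

On these toy scores each single-omics score vector misclassifies at least one sample, while an intermediate weight classifies all four correctly. In practice the weight would be chosen on held-out data to avoid overfitting it to the evaluation set.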