Department of Computer Science and Engineering, Center for Network and Data Science (CNDS), and Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN, 46556, USA.
BMC Bioinformatics. 2021 Oct 25;22(1):520. doi: 10.1186/s12859-021-04439-3.
This study focuses on the supervised prediction of aging-related genes from -omics data. Gene expression methods for this task capture aging-specific information but ignore interactions between genes (i.e., between their protein products), whereas protein-protein interaction (PPI) network methods account for PPIs but use PPIs that are not context-specific. We recently integrated the two data types into an aging-specific PPI subnetwork, which yielded more accurate aging-related gene predictions. However, a dynamic aging-specific subnetwork did not improve prediction performance compared to a static aging-specific subnetwork, despite the aging process being dynamic. This could be because the dynamic subnetwork was inferred using a naive Induced subgraph approach. Separately, we recently inferred a dynamic aging-specific subnetwork using network propagation (NP), a methodologically more advanced approach, which improved upon the Induced dynamic aging-specific subnetwork in a different task: unsupervised analysis of the aging process.
Here, we evaluate whether our existing NP-based dynamic subnetwork improves upon the dynamic and static subnetworks constructed by the Induced approach in the task of supervised prediction of aging-related genes. The existing NP-based subnetwork is unweighted, i.e., it gives equal importance to each of the aging-specific PPIs. Because accounting for aging-specific edge weights might be important, we additionally propose a weighted NP-based dynamic aging-specific subnetwork. We demonstrate that a predictive machine learning model trained and tested on the weighted subnetwork yields higher accuracy when predicting aging-related genes than predictive models run on the existing unweighted dynamic or static subnetworks, regardless of whether the existing subnetworks were inferred using NP or the Induced approach.
Our proposed weighted dynamic aging-specific subnetwork and its corresponding predictive model could guide the discovery of novel aging-related gene candidates for future wet lab validation with higher confidence than the existing data and models.
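To make the contrast between the two subnetwork-inference ideas concrete, the following is a minimal sketch on a toy network. It is not the authors' implementation: the gene names, the `active` set, the `restart` value, and the use of random walk with restart as the propagation scheme are all assumptions chosen for illustration. An induced subgraph keeps only PPIs whose endpoints are both active at a given age, while propagation spreads scores from active genes across the whole network, so an inactive gene that links active ones can still receive a non-zero score.

```python
from collections import defaultdict


def induced_subgraph(edges, active):
    """Keep only PPIs whose two endpoints are both in the active gene set."""
    return [(u, v) for u, v in edges if u in active and v in active]


def propagate(edges, seeds, restart=0.5, iters=100):
    """Random walk with restart on an undirected, unweighted PPI network.

    Generic illustration only; the propagation scheme and parameter
    values are assumptions, not taken from the paper.
    """
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    nodes = sorted(nbrs)
    # Initial distribution: uniform over the seed (active) genes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(iters):
        # Restart mass returns to the seeds; the rest flows to neighbors.
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            share = (1.0 - restart) * p[n] / len(nbrs[n])
            for m in nbrs[n]:
                nxt[m] += share
        p = nxt
    return p


# Toy path network A - B - C - D; gene B is not active at this age.
edges = [("A", "B"), ("B", "C"), ("C", "D")]
active = {"A", "C", "D"}

induced = induced_subgraph(edges, active)
scores = propagate(edges, active)
```

In this toy case the induced subgraph retains only the C-D interaction and leaves A disconnected, whereas propagation assigns B a positive score because it connects active genes. Edge weights could, under similar assumptions, be derived from such scores, but how the paper defines its aging-specific weights is not stated in the abstract.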