Towards future directions in data-integrative supervised prediction of human aging-related genes.

Authors

Li Qi, Newaz Khalique, Milenković Tijana

Affiliations

Department of Computer Science and Engineering, Lucy Family Institute for Data & Society, and Eck Institute for Global Health (EIGH), University of Notre Dame, Notre Dame, IN 46556, USA.

Center for Data and Computing in Natural Sciences (CDCS), Institute for Computational Systems Biology, Universität Hamburg, Hamburg 20146, Germany.

Publication

Bioinform Adv. 2022 Nov 2;2(1):vbac081. doi: 10.1093/bioadv/vbac081. eCollection 2022.

Abstract

MOTIVATION

Identification of human genes involved in the aging process is critical because the incidence of many diseases increases with age. A state-of-the-art approach for this purpose infers a weighted dynamic aging-specific subnetwork by mapping gene expression (GE) levels at different ages onto the protein-protein interaction network (PPIN). Then, it analyzes this subnetwork in a supervised manner by training a predictive model to learn how network topologies of known aging- versus non-aging-related genes change across ages. Finally, it uses the trained model to predict novel aging-related gene candidates. However, the best current subnetwork resulting from this approach still yields suboptimal prediction accuracy. This could be because it was inferred using outdated GE and PPIN data. Here, we evaluate whether analyzing a weighted dynamic aging-specific subnetwork inferred from newer GE and PPIN data improves prediction accuracy over analyzing the best current subnetwork inferred from outdated data.
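The mapping step described above can be illustrated with a deliberately simplified sketch. It assumes an unweighted induced-subgraph rule (keep a PPIN edge at a given age only if both proteins are expressed above a threshold) and uses node degree per age as the only topological feature. The abstract does not specify the inference method, the weighting scheme, or the features the authors use, so the function names, the threshold, and the toy data below are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]


def age_specific_subnetworks(
    ppin_edges: Iterable[Edge],
    expression_by_age: Dict[int, Dict[str, float]],
    threshold: float,
) -> Dict[int, List[Edge]]:
    """Build one PPIN snapshot per age (assumed rule: both endpoints expressed)."""
    edges = list(ppin_edges)
    snapshots: Dict[int, List[Edge]] = {}
    for age, levels in expression_by_age.items():
        # Genes whose expression at this age meets the (hypothetical) threshold.
        active = {gene for gene, value in levels.items() if value >= threshold}
        snapshots[age] = [(u, v) for u, v in edges if u in active and v in active]
    return snapshots


def degree_profiles(
    snapshots: Dict[int, List[Edge]], genes: Iterable[str]
) -> Dict[str, List[int]]:
    """For each gene, list its degree in every snapshot, ordered by age."""
    gene_list = list(genes)
    profiles: Dict[str, List[int]] = {gene: [] for gene in gene_list}
    for age in sorted(snapshots):
        degree: Dict[str, int] = {}
        for u, v in snapshots[age]:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        for gene in gene_list:
            profiles[gene].append(degree.get(gene, 0))
    return profiles


# Toy data (invented for illustration only).
toy_ppin = [("A", "B"), ("B", "C"), ("C", "D")]
toy_expression = {
    20: {"A": 5.0, "B": 5.0, "C": 5.0, "D": 0.1},
    60: {"A": 0.1, "B": 5.0, "C": 5.0, "D": 5.0},
}
toy_snapshots = age_specific_subnetworks(toy_ppin, toy_expression, threshold=1.0)
toy_profiles = degree_profiles(toy_snapshots, ["A", "B", "C", "D"])
print(toy_profiles)
```

In a supervised setting, each gene's per-age profile would serve as its feature vector, with known aging-related and non-aging-related genes providing the labels for training a classifier.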

RESULTS

Unexpectedly, we find that not to be the case. To understand this, we perform aging-related pathway and Gene Ontology term enrichment analyses. We find that the suboptimal prediction accuracy, regardless of which GE or PPIN data is used, may be caused by the current knowledge about which genes are aging-related being incomplete, or by the current methods for inferring or analyzing an aging-specific subnetwork being unable to capture all of the aging-related knowledge. These findings can potentially guide future directions towards improving supervised prediction of aging-related genes via -omics data integration.
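As an illustration of what a term enrichment analysis computes, the sketch below uses a one-sided hypergeometric test, a common choice for Gene Ontology and pathway over-representation. The abstract does not state which test or tool the authors used, so this is an assumed stand-in, and the function name and example numbers are hypothetical.

```python
from math import comb


def enrichment_pvalue(
    universe_size: int, term_size: int, selected_size: int, overlap: int
) -> float:
    """P(X >= overlap) when drawing selected_size genes without replacement
    from a universe in which term_size genes are annotated with the term."""
    max_overlap = min(term_size, selected_size)
    total = comb(universe_size, selected_size)
    # Sum the hypergeometric probability mass over the upper tail.
    tail = sum(
        comb(term_size, i) * comb(universe_size - term_size, selected_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical example: 200 predicted genes out of 10,000, with 12 of them
# among the 150 genes annotated with some aging-related term.
p = enrichment_pvalue(universe_size=10_000, term_size=150, selected_size=200, overlap=12)
print(f"enrichment p-value: {p:.3g}")
```

When many terms are tested, the resulting p-values would normally be corrected for multiple testing before any term is called enriched.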

AVAILABILITY AND IMPLEMENTATION

All data and code are available at Zenodo, DOI: 10.5281/zenodo.6995045.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics Advances online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c09/9710570/4dfd4f23498c/vbac081f1.jpg
