Protein Interaction Network-based Deep Learning Framework for Identifying Disease-Associated Human Proteins.

Author information

Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur 721302, India.

Publication information

J Mol Biol. 2021 Sep 17;433(19):167149. doi: 10.1016/j.jmb.2021.167149. Epub 2021 Jul 14.

Abstract

Infectious diseases are among the foremost public health problems in humans. Identifying novel disease-associated proteins can help reveal new therapeutic targets. Here, we develop a Graph Convolutional Network (GCN)-based model called PINDeL to identify disease-associated host proteins by integrating the human Protein Locality Graph and its corresponding topological features. By combining a GCN with the protein interaction network, PINDeL achieves its highest accuracy of 83.45%, with AUROC and AUPRC values of 0.90 and 0.88, respectively. With high accuracy, recall, F1-score, specificity, AUROC, and AUPRC, PINDeL outperforms existing machine-learning and deep-learning techniques for disease gene/protein identification in humans. Applied to an independent dataset of 24320 proteins that were not used for training, validation, or testing, PINDeL predicts 6448 new disease-protein associations, of which we verify 3196 disease-proteins through experimental evidence such as disease ontology, Gene Ontology, and KEGG pathway enrichment analyses. Our investigation indicates that 748 experimentally verified proteins are indeed responsible for pathogen-host protein interactions, of which 22 disease-proteins are associated with multiple diseases such as cancer, aging, chem-dependency, pharmacogenomics, normal variation, infection, and immune-related diseases. This Graph Convolutional Network-based prediction model is useful for large-scale disease-protein association prediction and hence can provide crucial insights into disease pathogenesis and aid in developing novel therapeutics.
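For readers unfamiliar with graph convolution, the following is a minimal sketch of a single GCN propagation step on a toy interaction graph. It is not the PINDeL implementation: the abstract does not specify the architecture, so this assumes the commonly used symmetric-normalized rule ReLU(D^-1/2 (A + I) D^-1/2 X W). The four-protein graph, the two topological features, the function name `gcn_layer`, and the random weights are all hypothetical and chosen only for illustration.

```python
import numpy as np

def gcn_layer(adj, feats, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    Assumed standard formulation, not taken from the paper.
    """
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    deg = a_hat.sum(axis=1)                   # degrees are >= 1 thanks to self-loops
    d_inv_sqrt = np.diag(deg ** -0.5)
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt  # symmetric normalization
    return np.maximum(a_norm @ feats @ weights, 0.0)

# Hypothetical undirected interaction graph over 4 proteins: 0-1-2-3
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# Hypothetical per-protein topological features: node degree and a constant bias term
feats = np.column_stack([adj.sum(axis=1), np.ones(4)])

# Random weights stand in for parameters that would be learned during training
rng = np.random.default_rng(0)
weights = rng.normal(size=(2, 3))

hidden = gcn_layer(adj, feats, weights)
print(hidden.shape)  # one 3-dimensional embedding per protein
```

Each row of `hidden` mixes a protein's own features with those of its interaction partners. A classifier for disease association would typically stack such layers and end with a sigmoid or softmax output, but those details are assumptions beyond what the abstract states.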
