A novel function prediction approach using protein overlap networks.

Author information

Liang Shide, Zheng Dandan, Standley Daron M, Guo Huarong, Zhang Chi

Affiliation

Systems Immunology Lab, Immunology Frontier Research Center, Osaka University, Suita, Osaka 565-0871, Japan.

Publication information

BMC Syst Biol. 2013 Jul 17;7:61. doi: 10.1186/1752-0509-7-61.

Abstract

BACKGROUND

Construction of a reliable network remains the bottleneck for network-based protein function prediction. We built an artificial network model, called a protein overlap network (PON), for each of the entire genomes of yeast, fly, worm, and human. Each node of the network represents a protein, and two proteins are connected if they share a domain according to the InterPro database.
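The construction rule above can be illustrated with a minimal sketch. It assumes the InterPro annotations have already been loaded into a plain mapping from protein to a set of domain identifiers; the function name `build_pon`, the protein names, and the toy domain assignments are hypothetical and not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def build_pon(protein_domains):
    """Return an adjacency map in which two proteins are linked
    whenever they share at least one domain.

    protein_domains: dict mapping protein ID -> set of domain IDs
    (assumed input format, for illustration only).
    """
    # Invert the mapping: domain -> proteins that contain it.
    domain_members = defaultdict(set)
    for protein, domains in protein_domains.items():
        for domain in domains:
            domain_members[domain].add(protein)

    # Every protein is a node, even if it ends up with no neighbors.
    network = {protein: set() for protein in protein_domains}

    # All proteins sharing a domain are pairwise connected.
    for members in domain_members.values():
        for a, b in combinations(sorted(members), 2):
            network[a].add(b)
            network[b].add(a)
    return network


# Hypothetical toy data: four proteins, three domain identifiers.
toy_domains = {
    "P1": {"IPR000001", "IPR000002"},
    "P2": {"IPR000002"},
    "P3": {"IPR000003"},
    "P4": {"IPR000001", "IPR000003"},
}
pon = build_pon(toy_domains)
```

In this toy network, P1 is linked to P2 and P4, and P3 is linked only to P4. Grouping proteins by domain first avoids comparing every pair of proteins directly, which may matter for genome-scale inputs.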

RESULTS

The function of a protein can be predicted by counting the occurrence frequency of Gene Ontology (GO) terms associated with the domains of its direct neighbors. The average success rate and coverage were 34.3% and 43.9%, respectively, for the test genomes, and increased to 37.9% and 51.3% when a composite PON of the four species was used for the prediction. By comparison, the success rate was 7.0% in the random control procedure. We also made predictions with GO term annotations of the second-layer nodes using the composite network and obtained a success rate above 30% and coverage above 30%, even for small genomes. Further improvement was achieved by statistical analysis of manually annotated GO terms for each neighboring protein.
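The neighbor-counting idea can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it assumes each direct neighbor contributes every GO term linked to any of its domains at most once, and that the most frequent terms are reported as the prediction. The function name `predict_go_terms`, the `top_k` cutoff, the tie-breaking rule, and all toy data are hypothetical.

```python
from collections import Counter


def predict_go_terms(protein, network, protein_domains, domain_go, top_k=3):
    """Rank GO terms by how many direct neighbors carry them.

    network:         protein -> set of neighboring proteins
    protein_domains: protein -> set of domain IDs
    domain_go:       domain ID -> set of GO term IDs
    (all three formats are assumptions made for this sketch)
    """
    counts = Counter()
    for neighbor in network.get(protein, ()):
        # Gather the GO terms of this neighbor's domains,
        # counting each term at most once per neighbor.
        terms = set()
        for domain in protein_domains.get(neighbor, ()):
            terms.update(domain_go.get(domain, ()))
        counts.update(terms)

    # Highest count first; ties broken alphabetically so the
    # result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical toy data: query protein Q with three neighbors.
toy_network = {"Q": {"A", "B", "C"}}
toy_protein_domains = {"A": {"D1"}, "B": {"D1", "D2"}, "C": {"D3"}}
toy_domain_go = {
    "D1": {"GO:0005524"},
    "D2": {"GO:0004672", "GO:0005524"},
    "D3": {"GO:0003677"},
}
predictions = predict_go_terms(
    "Q", toy_network, toy_protein_domains, toy_domain_go, top_k=2
)
```

Here `GO:0005524` is supported by two neighbors (A and B) and ranks first. The same function could be applied to second-layer nodes by passing a network whose neighbor sets contain neighbors of neighbors, although how the paper weights those nodes is not stated in the abstract.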

CONCLUSIONS

The PONs are composed of dense modules accompanied by a few long-distance connections. Based on the PONs, we developed multiple approaches that are effective for protein function prediction.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c88f/3720179/89f0fb0f1302/1752-0509-7-61-1.jpg
