
Identification of human protein complexes from local sub-graphs of protein-protein interaction network based on random forest with topological structure features.

Affiliation

School of Chemistry and Chemical Engineering, Sun Yat-Sen University, Guangzhou, PR China.

Publication information

Anal Chim Acta. 2012 Mar 9;718:32-41. doi: 10.1016/j.aca.2011.12.069. Epub 2012 Jan 9.

DOI:10.1016/j.aca.2011.12.069
PMID:22305895
Abstract

In the post-genomic era, one of the most important and challenging tasks is to identify protein complexes and to elucidate their molecular mechanisms in specific biological processes. Previous computational approaches usually identify protein complexes from protein interaction networks based on dense sub-graphs and incomplete prior information. They also pay little attention to the biological properties of proteins, and there is no common metric for evaluating their performance. A new method for identifying protein complexes and elucidating their functions is therefore needed. In this study, a novel approach is proposed that identifies protein complexes using random forest and topological structure. Each protein complex is represented by a graph of interactions, in which a descriptor of the protein primary structure characterizes the biological properties of each protein and weights the corresponding vertex. Topological structure features are developed and used to characterize protein complexes. The random forest algorithm is used to build a prediction model and to identify protein complexes from local sub-graphs instead of dense sub-graphs. As a demonstration, the approach is applied to human protein interaction data, and satisfactory results are obtained in a 10-fold cross-validation test: accuracy of 80.24%, sensitivity of 81.94%, specificity of 80.07%, and a Matthews correlation coefficient of 0.4087. Some new protein complexes are identified, and Gene Ontology analysis suggests that they are likely to be true complexes and to play important roles in the pathogenesis of some diseases. PCI-RFTS, an executable program for protein complex identification, is freely available from the authors on request.

