Sun Jingchun, Xu Jinlin, Liu Zhen, Liu Qi, Zhao Aimin, Shi Tieliu, Li Yixue
School of Life Sciences & Technology, Shanghai Jiaotong University, Shanghai 200240, China.
Bioinformatics. 2005 Aug 15;21(16):3409-15. doi: 10.1093/bioinformatics/bti532. Epub 2005 Jun 9.
The increasing availability of complete genome sequences provides an excellent opportunity for the further development of tools for functional studies in proteomics. Several experimental approaches and in silico algorithms have been developed to cluster proteins into biologically meaningful networks, which may provide new biological insights, especially into the functions of the many uncharacterized proteins. Among these methods, the phylogenetic profiles method has been widely used to predict protein-protein interactions. It involves two steps: selecting a set of reference organisms and identifying homologous proteins in them. To date, no published report has systematically studied how the selection of reference genomes and the identification of homologous proteins affect the accuracy of this method.
In this study, we optimized the phylogenetic profiles method by integrating phylogenetic relationships among reference organisms and sequence homology information to improve prediction accuracy. Our results revealed that the selection of the reference organism set and the criteria for homology identification are two critical factors for the prediction accuracy of this method. Our refined phylogenetic profiles method performs better than previous methods and potentially provides more reliable functional linkages.
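The two factors named above can be illustrated with a minimal Python sketch of the basic phylogenetic profiles idea. It is not the refined method from this study. The protein names, genome labels, E-values, and the exact-match comparison rule are all assumptions chosen for illustration. Each protein gets a presence/absence vector across the reference genomes, a homolog is called present when its best-hit E-value is at or below a cutoff, and two proteins are linked when their profiles are identical and informative.

```python
from itertools import combinations

# Hypothetical best-hit E-values: protein -> {reference genome -> E-value}.
# Toy numbers for illustration only; not data from the study.
best_hit_evalue = {
    "protA": {"g1": 1e-40, "g2": 1e-12, "g3": 1.0, "g4": 1e-30},
    "protB": {"g1": 1e-35, "g2": 1e-8, "g3": 0.5, "g4": 1e-25},
    "protC": {"g1": 1.0, "g2": 1e-20, "g3": 1e-15, "g4": 2.0},
}


def build_profile(hits, reference_genomes, evalue_cutoff):
    """Return a 0/1 tuple: 1 if a homolog is called in that genome."""
    return tuple(
        1 if hits.get(genome, float("inf")) <= evalue_cutoff else 0
        for genome in reference_genomes
    )


def predict_linkages(evalues, reference_genomes, evalue_cutoff):
    """Link protein pairs whose profiles are identical and informative."""
    profiles = {
        protein: build_profile(hits, reference_genomes, evalue_cutoff)
        for protein, hits in evalues.items()
    }
    linked = []
    for a, b in combinations(sorted(profiles), 2):
        profile_a = profiles[a]
        # All-present or all-absent profiles carry no co-occurrence signal.
        informative = 0 < sum(profile_a) < len(profile_a)
        if informative and profile_a == profiles[b]:
            linked.append((a, b))
    return linked


all_genomes = ["g1", "g2", "g3", "g4"]
strict = predict_linkages(best_hit_evalue, all_genomes, 1e-10)
relaxed = predict_linkages(best_hit_evalue, all_genomes, 1e-5)
small_set = predict_linkages(best_hit_evalue, ["g1", "g4"], 1e-5)
print(strict, relaxed, small_set)
```

In this toy data, relaxing the homology cutoff from 1e-10 to 1e-5 makes protA and protB share a profile, while shrinking the reference set to two genomes leaves every profile uninformative. Both choices change which linkages are predicted, which is the sensitivity the abstract describes.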