Sun Jingchun, Li Yixue, Zhao Zhongming
Bioinformatics Laboratory, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA.
Biochem Biophys Res Commun. 2007 Feb 23;353(4):985-91. doi: 10.1016/j.bbrc.2006.12.146. Epub 2006 Dec 27.
The phylogenetic profile method has been widely applied in the prediction of protein-protein interactions (PPIs). Studies often use all of the available complete genomes for this method. With more than 400 genomes complete and new ones on the horizon, it remains unclear how reference organisms should be selected for profile construction and how that selection influences PPI prediction. Here, we performed a systematic assessment of reference organism selection using 225 complete genomes and their evolutionary tree. Our results suggest that reference organisms should be selected from moderately and highly genetically distant organisms, should cover all three domains (Bacteria, Archaea, and Eukarya), and should be evenly distributed at the fifth hierarchical level of the evolutionary tree. Our study provides important guidance on the construction of phylogenetic profiles for PPI prediction and functional genomics, a task that has become challenging due to the large and increasing number of available candidate organisms.
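As a rough illustration of the phylogenetic profile idea summarized above, the following minimal sketch encodes each protein as a presence/absence vector over a chosen set of reference genomes and flags protein pairs whose profiles are similar. The abstract does not state which similarity measure or threshold the authors used; Jaccard similarity, the 0.8 cutoff, and all genome and protein names here are assumptions chosen only for illustration.

```python
from itertools import combinations


def build_profile(genomes_with_homolog, reference_genomes):
    """Return a 0/1 tuple: 1 where the protein has a homolog in that reference genome."""
    return tuple(1 if g in genomes_with_homolog else 0 for g in reference_genomes)


def jaccard(p, q):
    """Jaccard similarity of two binary profiles (an assumed measure, not the paper's)."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


def predict_pairs(homolog_table, reference_genomes, threshold=0.8):
    """List protein pairs whose profile similarity meets the (hypothetical) threshold."""
    profiles = {
        name: build_profile(hits, reference_genomes)
        for name, hits in homolog_table.items()
    }
    pairs = []
    for a, b in combinations(sorted(profiles), 2):
        score = jaccard(profiles[a], profiles[b])
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


# Hypothetical reference set spanning Bacteria, Archaea, and Eukarya.
reference_genomes = ["bact1", "bact2", "arch1", "euk1", "euk2"]

# Hypothetical homolog table: protein -> genomes in which a homolog was detected.
homolog_table = {
    "protA": {"bact1", "arch1", "euk1"},
    "protB": {"bact1", "arch1", "euk1"},
    "protC": {"bact2", "euk2"},
}

predicted = predict_pairs(homolog_table, reference_genomes)
print(predicted)
```

Changing the contents of `reference_genomes` changes every profile and therefore the predicted pairs, which is why the choice of reference organisms matters for this method.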