Theoretical Biology and Bioinformatics group, Department of Biology, Faculty of Science, Utrecht University, Padualaan 8, Utrecht, 3584CH, Utrecht, the Netherlands.
BMC Bioinformatics. 2010 Feb 12;11:86. doi: 10.1186/1471-2105-11-86.
Homology is a crucial concept in comparative genomics. BLAST is probably the most widely used algorithm for homology detection in comparative genomics. Usually a stringent score cutoff is applied to distinguish putative homologs from possible false positive hits. As a consequence, some BLAST hits that are in fact homologous are discarded.
Analogous to the use of genomic context in genome alignments, we test whether conserved functional context can be used to select candidate homologs from insignificant BLAST hits. We make a co-complex network alignment between complex subunits in yeast and human and find that proteins with an insignificant BLAST hit that are part of homologous complexes are likely to be homologous themselves. Further analysis of the distant homologs recovered with the co-complex network alignment shows that a large majority of them are in fact ancient paralogs.
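The selection idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes, purely for illustration, that BLAST hits are given as (yeast protein, human protein, E-value) tuples, that complexes are given as sets of subunit names, and that a weak hit counts as supported when some other pair of subunits from the same two complexes has a significant hit. The function name, the data layout, and both E-value thresholds (`strict` and `loose`) are hypothetical.

```python
from itertools import product


def candidate_distant_homologs(hits, yeast_complexes, human_complexes,
                               strict=1e-5, loose=10.0):
    """Return weak BLAST hits that are supported by co-complex context.

    hits            -- iterable of (yeast_protein, human_protein, evalue)
    yeast_complexes -- dict: complex id -> set of yeast subunit names
    human_complexes -- dict: complex id -> set of human subunit names
    strict, loose   -- hypothetical E-value thresholds (assumptions, not
                       values taken from the paper)
    """
    # Hits passing the stringent cutoff are treated as putative homologs.
    significant = {(y, h) for y, h, e in hits if e <= strict}
    # Hits between the two thresholds would normally be discarded.
    weak = {(y, h) for y, h, e in hits if strict < e <= loose}

    candidates = set()
    for y_members, h_members in product(yeast_complexes.values(),
                                        human_complexes.values()):
        for y, h in weak:
            if y not in y_members or h not in h_members:
                continue
            # Require independent support: a significant hit between
            # *other* subunits of the same pair of complexes.
            supported = any(
                (y2, h2) in significant
                for y2 in y_members - {y}
                for h2 in h_members - {h}
            )
            if supported:
                candidates.add((y, h))
    return sorted(candidates)
```

Calling the function with a handful of hits and two small complex dictionaries returns the weak pairs whose complexes also share a significant hit between other subunits. A weak hit between proteins whose complexes share no such support is left out.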
Our results show that, even though evolution takes place at the sequence and genome level, co-complex networks can be used as circumstantial evidence to improve confidence in the homology of distantly related sequences.