van der Heijden René T J M, Snel Berend, van Noort Vera, Huynen Martijn A
Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences, Radboud University Nijmegen Medical Center, Nijmegen, The Netherlands.
BMC Bioinformatics. 2007 Mar 8;8:83. doi: 10.1186/1471-2105-8-83.
Orthology is one of the cornerstones of gene function prediction. Dividing the phylogenetic relations between genes into either orthologs or paralogs is, however, an oversimplification. Even in two-species gene phylogenies, the complicated, non-transitive nature of phylogenetic relations gives rise to inparalogs and outparalogs. For situations with more than two species, we lack the semantics to describe the phylogenetic relations specifically, let alone to exploit them. Published procedures for extracting orthologous groups from phylogenetic trees neither allow orthology to be identified at various levels of resolution nor document the relations between the orthologous groups.
We introduce "levels of orthology" to describe the multi-level nature of gene relations. The concept is implemented in the program LOFT (Levels of Orthology From Trees), which assigns hierarchical orthology numbers to genes based on a phylogenetic tree. To decide which nodes in a tree represent speciation events and which represent gene duplications, LOFT can be instructed either to perform classical species-tree reconciliation or to use the species overlap between partitions in the tree. The hierarchical orthology numbers assigned by LOFT effectively summarize the phylogenetic relations between genes. The resulting high-resolution orthologous groups are depicted in colour, facilitating visual inspection of (large) trees. A benchmark for orthology prediction that takes into account the varying levels of orthology between genes shows that the phylogeny-based, high-resolution orthology assignments made by LOFT are reliable.
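The species-overlap option and the hierarchical numbering can be illustrated with a minimal Python sketch. It is not the LOFT implementation. It assumes that a node whose two partitions share at least one species is treated as a gene duplication and any other node as a speciation, and that each duplication appends one sub-level to the orthology number. The tree encoding (nested tuples, leaves written as "species|gene"), the function names, and the ".1"/".2" suffixes are all hypothetical choices made for this example.

```python
from typing import Dict, Optional, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is "species|gene",
# an internal node is a (left, right) tuple.
Tree = Union[str, Tuple["Tree", "Tree"]]


def species_set(tree: Tree) -> Set[str]:
    """Return the set of species found below a node."""
    if isinstance(tree, str):
        return {tree.split("|", 1)[0]}
    left, right = tree
    return species_set(left) | species_set(right)


def assign_orthology_numbers(
    tree: Tree, label: str = "1", out: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Assign a hierarchical number to every leaf (simplified scheme)."""
    if out is None:
        out = {}
    if isinstance(tree, str):
        out[tree] = label
        return out
    left, right = tree
    if species_set(left) & species_set(right):
        # Assumed rule: shared species => duplication => add a sub-level.
        assign_orthology_numbers(left, label + ".1", out)
        assign_orthology_numbers(right, label + ".2", out)
    else:
        # No shared species => speciation => both sides keep the number.
        assign_orthology_numbers(left, label, out)
        assign_orthology_numbers(right, label, out)
    return out


# Toy tree: one duplication in the human/mouse ancestor, after the
# split from yeast.
tree = (
    (("human|A1", "mouse|A1"), ("human|A2", "mouse|A2")),
    "yeast|A",
)
numbers = assign_orthology_numbers(tree)
for gene, number in sorted(numbers.items()):
    print(gene, number)
```

In this toy tree the yeast gene receives 1, the A1 genes receive 1.1, and the A2 genes receive 1.2, so the numbers record both the high-resolution groups and how they relate to each other. The sketch makes no attempt to distinguish independent lineage-specific duplications that happen to produce the same suffix; a real numbering scheme has to handle that case.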
The "levels of orthology" concept offers high-resolution, reliable orthology while preserving the relations between orthologous groups. A Windows version and a preliminary Java version of LOFT are available from the LOFT website, http://www.cmbi.ru.nl/LOFT.