Wu Guohong Albert, Jun Se-Ran, Sims Gregory E, Kim Sung-Hou
Department of Chemistry, University of California, Berkeley, CA 94720, USA.
Proc Natl Acad Sci U S A. 2009 Aug 4;106(31):12826-31. doi: 10.1073/pnas.0905115106. Epub 2009 Jun 24.
The vast sequence divergence among virus groups poses a major challenge to alignment-based sequence comparison across virus families. Using an alignment-free comparison method, we construct a whole-proteome phylogeny for a population of 142 large dsDNA eukaryote viruses from 11 viral families. The method is based on feature frequency profiles (FFP), where the length of the feature (l-mer) is selected to be optimal for phylogenomic inference. We observe that (i) the FFP phylogeny segregates the population into clades whose membership agrees remarkably well with the current classification by the International Committee on Taxonomy of Viruses, with the one exception that the mimivirus joins the phycodnavirus family; (ii) the FFP tree detects potential evolutionary relationships among some viral families; (iii) the relative position of the 3 herpesvirus subfamilies in the FFP tree differs from that in gene alignment-based analyses; (iv) the FFP tree suggests taxonomic positions for certain "unclassified" viruses; and (v) the FFP method identifies candidates for horizontal gene transfer between virus families.
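As a rough illustration of the idea behind a feature frequency profile, the sketch below counts overlapping l-mers across all proteins of a proteome, normalizes the counts to frequencies, and compares two profiles. The abstract does not state the distance measure or the tree-building step, so the Jensen-Shannon divergence used here is an assumption, and the function names (`ffp`, `js_divergence`) and the toy sequences are hypothetical. It is not the authors' implementation.

```python
from collections import Counter
from math import log2


def ffp(proteome, l):
    """Return a feature frequency profile: l-mer -> relative frequency."""
    counts = Counter()
    for protein in proteome:
        # Slide a window of width l along each protein sequence.
        for i in range(len(protein) - l + 1):
            counts[protein[i:i + l]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no l-mers of the requested length")
    return {kmer: n / total for kmer, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two sparse profiles.

    Assumed distance measure; the abstract does not specify one.
    """
    div = 0.0
    for kmer in set(p) | set(q):
        pi, qi = p.get(kmer, 0.0), q.get(kmer, 0.0)
        mi = 0.5 * (pi + qi)  # mixture of the two profiles
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


# Toy "proteomes" (hypothetical sequences, not real viral data).
virus_a = ["MKVLAAGIV", "MKTLAAG"]
virus_b = ["MKVLAAGLV", "MRTLAAG"]
virus_c = ["WWPPHHQQC", "CCHHPPW"]

profiles = {
    name: ffp(seqs, l=3)
    for name, seqs in [("a", virus_a), ("b", virus_b), ("c", virus_c)]
}
d_ab = js_divergence(profiles["a"], profiles["b"])
d_ac = js_divergence(profiles["a"], profiles["c"])
print(f"d(a, b) = {d_ab:.3f}, d(a, c) = {d_ac:.3f}")
```

In this toy example, proteomes a and b share most of their 3-mers and therefore end up closer to each other than a and c, which share none. A matrix of such pairwise distances could then be passed to a standard distance-based tree method; that step, like the choice of l = 3, is illustrative only.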