Department of Biochemistry, Genetics and Immunology, University of Vigo, E-36310 Vigo, Spain.
BMC Bioinformatics. 2010 May 20;11:268. doi: 10.1186/1471-2105-11-268.
Evolutionary events such as recombination, hybridization, or gene transfer necessitate the use of phylogenetic networks to properly depict the evolution of DNA and protein sequences. Although several theoretical classes have been proposed to characterize these networks, they make stringent assumptions that the evolutionary process is unlikely to meet. We have recently shown that the complexity of simulated networks is a function of the population recombination rate, and that at moderate and high recombination rates the resulting networks cannot be categorized. However, it is not known whether these results extend to networks estimated from real data.
We introduce a web server for the categorization of explicit phylogenetic networks that covers the most relevant theoretical classes developed so far. Using this tool, we analyzed statistical parsimony phylogenetic networks estimated from approximately 5,000 DNA alignments obtained from the NCBI PopSet and Polymorphix databases. The level of characterization was correlated with nucleotide diversity, and a high proportion of the networks derived from these data sets could be formally characterized.
We have developed a public web server, NetTest (freely available from the software section at http://darwin.uvigo.es), to formally characterize the complexity of phylogenetic networks. Using NetTest, we found that most statistical parsimony networks estimated with the program TCS could be assigned to a known network class. The level of network characterization was correlated with nucleotide diversity and depended on the intraspecific/interspecific level of the data, although no significant differences were detected among genes. More research on the properties of phylogenetic networks is clearly needed.
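For readers unfamiliar with the statistic, nucleotide diversity is commonly defined as the average proportion of sites that differ between pairs of sequences in an alignment. The abstract does not state how it was computed in this study, so the following is only a minimal sketch of that standard definition. The function name `nucleotide_diversity` and the handling of gaps and ambiguous bases are assumptions made for illustration; this is not code from NetTest or TCS.

```python
from itertools import combinations


def nucleotide_diversity(alignment):
    """Average pairwise proportion of differing sites for aligned sequences.

    `alignment` is a list of equal-length strings. Sites where either
    sequence has a gap or an ambiguous base are skipped for that pair
    (an assumption of this sketch, not a detail taken from the paper).
    """
    if len(alignment) < 2:
        raise ValueError("need at least two sequences")
    length = len(alignment[0])
    if length == 0 or any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be non-empty and of equal length")

    total = 0.0
    pairs = 0
    for a, b in combinations(alignment, 2):
        # Keep only sites where both sequences carry an unambiguous base.
        sites = [
            (x, y)
            for x, y in zip(a.upper(), b.upper())
            if x in "ACGT" and y in "ACGT"
        ]
        if not sites:
            continue
        differences = sum(1 for x, y in sites if x != y)
        total += differences / len(sites)
        pairs += 1

    if pairs == 0:
        raise ValueError("no comparable sites in any pair of sequences")
    return total / pairs


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT"]  # made-up data
    print(nucleotide_diversity(toy_alignment))
```

In the toy alignment, two of the three sequence pairs differ at one of four sites and the third pair is identical, so the value is the mean of 0.25, 0.25, and 0.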