Computational and Molecular Population Genetics (CMPG), Institute of Ecology and Evolution, University of Bern, Bern, Switzerland.
Mol Biol Evol. 2014 Apr;31(4):817-31. doi: 10.1093/molbev/mst271. Epub 2013 Dec 25.
Phylogenetic reconstruction of the evolutionary history of closely related organisms may be difficult because of the presence of unsorted lineages and of a relatively high proportion of heterozygous sites, which are usually not handled well by phylogenetic programs. Genomic data may provide enough fixed polymorphisms to resolve phylogenetic trees, but the diploid nature of sequence data remains analytically challenging. Here, we performed a phylogenomic reconstruction of the evolutionary history of the common vole (Microtus arvalis) with a focus on the influence of heterozygosity on the estimation of intraspecific divergence times. We used genome-wide sequence information from 15 voles distributed across the European range. We provide a novel approach to integrate heterozygous information in existing phylogenetic programs by repeated random haplotype sampling from sequences with multiple unphased heterozygous sites. We evaluated how using full, partial, or no heterozygous information for tree reconstruction affects divergence time estimates. All results consistently showed four deep and strongly supported evolutionary lineages in the vole data. Based on calibration with radiocarbon-dated paleontological material, these diverging lineages split only at the end of or after the last glacial maximum. However, the incorporation of information from heterozygous sites had a significant impact on absolute and relative branch length estimations. Ignoring heterozygous information led to an overestimation of divergence times between the evolutionary lineages of M. arvalis. We conclude that the exclusion of heterozygous sites from evolutionary analyses may cause biased and misleading divergence time estimates in closely related taxa.
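The idea of repeated random haplotype sampling can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that unphased heterozygous sites are encoded as two-base IUPAC ambiguity codes and that each site is resolved independently and with equal probability; the function names, sample names, and sequences are hypothetical.

```python
import random

# Two-base IUPAC ambiguity codes and the two alleles each may represent.
# Assumption: heterozygous sites in the input are encoded this way.
IUPAC_HET = {
    "R": "AG", "Y": "CT", "S": "CG",
    "W": "AT", "K": "GT", "M": "AC",
}


def sample_haplotype(seq, rng):
    """Return one haploid sequence, resolving each heterozygous code at random.

    Homozygous bases, gaps, and other symbols (e.g. N) are left unchanged.
    """
    return "".join(
        rng.choice(IUPAC_HET[base]) if base in IUPAC_HET else base
        for base in seq.upper()
    )


def repeated_haplotype_sampling(genotypes, n_replicates, seed=None):
    """Build n_replicates haploid data sets from unphased diploid sequences.

    genotypes maps a sample name to its sequence with IUPAC-coded
    heterozygous sites. Each replicate is a dict of the same shape in
    which every heterozygous site has been replaced by one sampled allele.
    """
    rng = random.Random(seed)
    return [
        {name: sample_haplotype(seq, rng) for name, seq in genotypes.items()}
        for _ in range(n_replicates)
    ]


if __name__ == "__main__":
    # Hypothetical toy data: vole_1 is heterozygous at positions 3 and 5.
    toy = {"vole_1": "ACRTY", "vole_2": "ACGTC"}
    for replicate in repeated_haplotype_sampling(toy, n_replicates=3, seed=1):
        print(replicate)
```

In such a scheme, each replicate data set would be analysed separately with an existing phylogenetic program, and the resulting trees or divergence time estimates would then be summarized across replicates.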