Ji Xiang, Griffing Alexander, Thorne Jeffrey L
Bioinformatics Research Center, North Carolina State University; Department of Statistics, North Carolina State University.
Bioinformatics Research Center, North Carolina State University; Department of Biological Sciences, North Carolina State University.
Mol Biol Evol. 2016 Sep;33(9):2469-76. doi: 10.1093/molbev/msw114. Epub 2016 Jun 13.
Interlocus gene conversion (IGC) homogenizes repeats. Although genomes can be repeat-rich, the evolutionary importance of IGC is poorly understood, and additional statistical tools for characterizing it are needed. We propose a composite likelihood strategy for incorporating IGC into widely used probabilistic models of sequence change that originates with point mutation. We estimated the percentage of nucleotide substitutions that originate with an IGC event rather than a point mutation in 14 groups of yeast ribosomal protein-coding genes, and found values ranging from 20% to 38%. We designed and applied a procedure to determine whether these percentages are inflated by artifacts arising from model misspecification. The results of this procedure are consistent with IGC having had an important role in the evolution of each of these 14 gene families. We further investigated the properties of our IGC approach via simulation. In contrast to usual practice, our findings suggest that IGC can and should be considered when multigene family evolution is investigated.
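To make concrete the idea of a "percentage of substitutions that originate with an IGC event", the following is a minimal toy sketch and not the authors' implementation. It assumes a single site in two paralogs, a Jukes-Cantor point mutation process with rate `mu` per specific nucleotide change, and a symmetric IGC rate `tau` at which either paralog overwrites the other. The names `rates`, `stationary`, `igc_fraction`, `mu`, and `tau` are all illustrative assumptions, and the sketch ignores the phylogeny and the composite likelihood estimation described in the abstract.

```python
from itertools import product

NUCS = "ACGT"
# Joint state: (nucleotide in paralog 1, nucleotide in paralog 2).
STATES = list(product(NUCS, repeat=2))


def rates(mu, tau):
    """Return (point, igc): dicts mapping (from_state, to_state) to a rate.

    Toy assumption: every single-nucleotide change has point mutation rate mu,
    and a change that copies the other paralog's nucleotide gains an extra tau.
    """
    point, igc = {}, {}
    for (i, j) in STATES:
        for k in NUCS:
            if k != j:  # paralog 2 changes from j to k
                point[(i, j), (i, k)] = mu
                if k == i:  # paralog 1 converts paralog 2
                    igc[(i, j), (i, k)] = tau
            if k != i:  # paralog 1 changes from i to k
                point[(i, j), (k, j)] = mu
                if k == j:  # paralog 2 converts paralog 1
                    igc[(i, j), (k, j)] = tau
    return point, igc


def stationary(point, igc, iters=5000):
    """Approximate the stationary distribution by iterating a uniformized chain."""
    total = {key: point[key] + igc.get(key, 0.0) for key in point}
    out = {s: 0.0 for s in STATES}
    for (src, _), r in total.items():
        out[src] += r
    lam = 1.1 * max(out.values())  # uniformization rate, above every exit rate
    pi = {s: 1.0 / len(STATES) for s in STATES}
    for _ in range(iters):
        nxt = {s: pi[s] * (1.0 - out[s] / lam) for s in STATES}
        for (src, dst), r in total.items():
            nxt[dst] += pi[src] * r / lam
        pi = nxt
    return pi


def igc_fraction(mu, tau):
    """Expected long-run fraction of changes caused by IGC in the toy model."""
    point, igc = rates(mu, tau)
    pi = stationary(point, igc)
    igc_flux = sum(pi[src] * r for (src, _), r in igc.items())
    point_flux = sum(pi[src] * r for (src, _), r in point.items())
    return igc_flux / (igc_flux + point_flux)


if __name__ == "__main__":
    for tau in (0.0, 1.0, 3.0):
        print(f"mu=1.0 tau={tau}: IGC fraction = {igc_fraction(1.0, tau):.4f}")
```

Under these toy assumptions the fraction works out to `tau / (2 * tau + 4 * mu)`, so it is 0 when `tau = 0` and can never exceed one half, because IGC can only act on sites where the paralogs currently differ. The percentages reported in the abstract come from richer models fitted to real alignments and should not be compared with this formula directly.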