Department of Human Genetics, University of Chicago, Chicago, IL, USA.
Department of Biology, Indiana University, Bloomington, IN, USA.
Nat Ecol Evol. 2018 Aug;2(8):1280-1288. doi: 10.1038/s41559-018-0584-5. Epub 2018 Jul 2.
Phylogenetic tests of adaptive evolution, such as the widely used branch-site test (BST), assume that nucleotide substitutions occur singly and independently. Recent research has shown that errors at adjacent sites often occur during DNA replication, and the resulting multinucleotide mutations (MNMs) are overwhelmingly likely to be non-synonymous. To evaluate whether the BST misinterprets sequence patterns produced by MNMs as false support for positive selection, we analysed two genome-scale datasets, one from mammals and one from flies. We found that codons with multiple differences account for virtually all the support for lineage-specific positive selection in the BST. Simulations under conditions derived from these alignments but without positive selection show that realistic rates of MNMs cause a strong and systematic bias towards false inferences of selection. This bias is sufficient under empirically derived conditions to produce false positive inferences as often as the BST infers positive selection from the empirical data. Although some genes with BST-positive results may have evolved adaptively, the test cannot distinguish sequence patterns produced by authentic positive selection from those caused by neutral fixation of MNMs. Many published inferences of adaptive evolution using this technique may therefore be artefacts of model violation caused by unincorporated neutral mutational processes. We introduce a model that incorporates MNMs and may help to ameliorate this bias.
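To make the notion of "codons with multiple differences" concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes two aligned, in-frame coding sequences of equal length (for example, an inferred ancestral sequence and its descendant) and tallies codons by how many nucleotide positions differ. The function names `codon_difference_counts` and `multi_diff_fraction` are hypothetical and chosen only for illustration.

```python
from collections import Counter


def codon_difference_counts(seq_a: str, seq_b: str) -> Counter:
    """Tally codons by the number of nucleotide differences (0 to 3).

    Assumes both sequences are aligned, in frame, and of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if len(seq_a) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")

    counts = Counter()
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        # Skip codons with anything other than unambiguous A, C, G, T.
        if set(codon_a + codon_b) - set("ACGT"):
            continue
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        counts[n_diff] += 1
    return counts


def multi_diff_fraction(counts: Counter) -> float:
    """Fraction of changed codons that differ at two or more positions."""
    changed = sum(n for d, n in counts.items() if d >= 1)
    multi = sum(n for d, n in counts.items() if d >= 2)
    return multi / changed if changed else 0.0


if __name__ == "__main__":
    ancestral = "ATGAAACCCGGG"   # toy sequences, not real data
    derived = "ATGAGTCCAGGG"
    counts = codon_difference_counts(ancestral, derived)
    print(dict(counts))
    print(multi_diff_fraction(counts))
```

A tally like this only describes the sequence pattern. It cannot say whether a codon with multiple differences arose from a single MNM, from successive single-nucleotide substitutions, or under selection, which is the ambiguity the abstract highlights.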