Sabath Niv, Landan Giddy, Graur Dan
Department of Biology and Biochemistry, University of Houston, Houston, Texas, United States of America.
PLoS One. 2008;3(12):e3996. doi: 10.1371/journal.pone.0003996. Epub 2008 Dec 22.
Inferring the intensity of positive selection in protein-coding genes is important because it sheds light on the process of adaptation. Recently, it has been reported that overlapping genes, which are ubiquitous in all domains of life, seem to exhibit inordinate degrees of positive selection. Here, we present a new method for the simultaneous estimation of selection intensities in overlapping genes. We show that the appearance of positive selection is caused by assuming that selection operates independently on each gene in an overlapping pair, an assumption that ignores the unique evolutionary constraints on overlapping coding regions. Our method uses an exact evolutionary model, thereby obviating the need for approximation or intensive computation. We test the method by simulating the evolution of overlapping genes of different types and under diverse evolutionary scenarios. Our results indicate that the independent estimation approach leads to the false appearance of positive selection even when the gene is in reality subject to negative selection. Finally, we use our method to estimate selection in two influenza A genes for which positive selection was previously inferred. We find no evidence for positive selection in either case.
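The constraint on overlapping coding regions mentioned in the abstract can be illustrated with a small sketch. It is not the estimation method of the paper; it only classifies a single nucleotide substitution as synonymous or nonsynonymous in each of two same-strand reading frames, to show that one change can be silent in one gene and amino-acid-altering in the other. The function names (`substitution_effect`, `joint_effect_counts`), the example sequence, and the choice of a +1 frame shift are assumptions made for illustration; the codon table is the standard genetic code.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_start(seq, pos, frame_offset):
    """Start index of the complete codon covering `pos` in the reading frame
    that begins at `frame_offset`, or None if no complete codon covers it."""
    if pos < frame_offset:
        return None
    start = frame_offset + ((pos - frame_offset) // 3) * 3
    return start if start + 3 <= len(seq) else None


def substitution_effect(seq, pos, new_base, frame_offset):
    """Classify replacing seq[pos] with new_base in one reading frame.
    A change to or from a stop codon is counted as nonsynonymous."""
    start = codon_start(seq, pos, frame_offset)
    if start is None:
        return "outside"
    mutated = seq[:pos] + new_base + seq[pos + 1:]
    old_aa = CODON_TABLE[seq[start:start + 3]]
    new_aa = CODON_TABLE[mutated[start:start + 3]]
    return "synonymous" if old_aa == new_aa else "nonsynonymous"


def joint_effect_counts(seq, offset_a, offset_b):
    """Tally (effect in frame A, effect in frame B) over every possible single
    substitution at positions covered by complete codons in both frames."""
    counts = Counter()
    for pos, old_base in enumerate(seq):
        if codon_start(seq, pos, offset_a) is None:
            continue
        if codon_start(seq, pos, offset_b) is None:
            continue
        for new_base in BASES:
            if new_base != old_base:
                pair = (substitution_effect(seq, pos, new_base, offset_a),
                        substitution_effect(seq, pos, new_base, offset_b))
                counts[pair] += 1
    return counts


if __name__ == "__main__":
    example = "GCTGAA"  # hypothetical sequence; frame A starts at 0, frame B at 1
    print(substitution_effect(example, 2, "C", 0))  # GCT -> GCC in frame A
    print(substitution_effect(example, 2, "C", 1))  # CTG -> CCG in frame B
    print(joint_effect_counts(example, 0, 1))
```

Because most third-position changes in one frame fall on a first or second codon position in the other, a tally like this tends to contain few substitutions that are synonymous in both frames, which is one way to see why treating the two genes independently can distort estimates of selection.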