School of Life Sciences, Arizona State University, Tempe, AZ, United States of America.
PLoS Genet. 2018 Dec 28;14(12):e1007859. doi: 10.1371/journal.pgen.1007859. eCollection 2018 Dec.
Since the initial description of the genomic patterns expected under models of positive selection acting on standing genetic variation and on multiple beneficial mutations (so-called soft selective sweeps), researchers have sought to identify these patterns in natural population data. Indeed, over the past two years, large-scale data analyses have argued that soft sweeps are pervasive across organisms of very different effective population size and mutation rate: humans, Drosophila, and HIV. Yet, others have evaluated the relevance of these models to natural populations, as well as the identifiability of the models relative to other known population-level processes, arguing that soft sweeps are likely to be rare. Here, we look to reconcile these opposing results by carefully evaluating three recent studies and their underlying methodologies. Using population genetic theory, as well as extensive simulation, we find that all three examples are prone to extremely high false-positive rates, incorrectly identifying soft sweeps under both hard sweep and neutral models. Furthermore, we demonstrate that well-fit demographic histories combined with rare hard sweeps serve as the more parsimonious explanation. These findings represent a necessary response to the growing tendency of invoking parameter-heavy, assumption-laden models of pervasive positive selection, and neglecting best practices regarding the construction of proper demographic null models.