Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China.
School of Future Technology, University of Chinese Academy of Sciences, Beijing 100049, China.
Mol Biol Evol. 2024 Oct 4;41(10). doi: 10.1093/molbev/msae192.
Identifying soft selective sweeps using genomic data is a challenging yet crucial task in population genetics. In this study, we present HaploSweep, a novel method for detecting and categorizing soft and hard selective sweeps based on haplotype structure. Through simulations spanning a broad range of selection intensities, softness levels, and demographic histories, we demonstrate that HaploSweep outperforms iHS, nSL, and H12 in detecting soft sweeps. HaploSweep achieves high classification accuracy (0.9247 for CHB, 0.9484 for CEU, and 0.9829 for YRI) when applied to simulations in line with the human Out-of-Africa demographic model. We also observe that the classification accuracy remains robust across different demographic models. Additionally, we introduce a refined method to accurately distinguish soft shoulders adjacent to hard sweeps from soft sweeps. Applying HaploSweep to genomic data of the CHB, CEU, and YRI populations from the 1000 Genomes Project led to the discovery of several new genes that bear strong evidence of population-specific soft sweeps (including HRNR, AMBRA1, CBFA2T2, DYNC2H1, and RANBP2), with prevalent associations to immune functions and metabolic processes. The validated performance of HaploSweep, demonstrated through both simulated and real data, underscores its potential as a valuable tool for detecting and understanding the role of soft sweeps in adaptive evolution.