Distinguishing positive selection from neutral evolution: boosting the performance of summary statistics.

Affiliation

CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China.

Publication information

Genetics. 2011 Jan;187(1):229-44. doi: 10.1534/genetics.110.122614. Epub 2010 Nov 1.

Abstract

Summary statistics are widely used in population genetics, but they share a drawback: no simple sufficient summary statistic exists that captures all the information required to distinguish between different evolutionary hypotheses. Here, we apply boosting, a recent statistical method that combines simple classification rules to maximize their joint predictive performance. We show that our implementation of boosting has high power to detect selective sweeps. Demographic events, such as bottlenecks, do not result in a large excess of false positives. Our boosting implementation performs well in comparison with other neutrality tests. Furthermore, we evaluated the relative contribution of different summary statistics to the identification of selection and found that integrated haplotype homozygosity is very informative for recent sweeps, whereas older sweeps are better detected by Tajima's π. Overall, Watterson's θ was found to contribute the most information for distinguishing between bottlenecks and selection.
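
The idea in the abstract, combining several simple rules on summary statistics into one classifier, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say which boosting algorithm the authors used, so textbook AdaBoost with one-threshold rules (decision stumps) is swapped in here. The function names, the 0/1 haplotype strings, and the toy feature values are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import exp, log


def watterson_theta(haplotypes):
    """Watterson's estimator: segregating sites S divided by a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    seg = sum(1 for j in range(n_sites) if len({h[j] for h in haplotypes}) > 1)
    a_n = sum(1.0 / i for i in range(1, n))
    return seg / a_n


def tajima_pi(haplotypes):
    """Tajima's pi: mean number of pairwise differences between haplotypes."""
    pairs = list(combinations(haplotypes, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / len(pairs)


def stump_predict(row, j, t, sign):
    """Simple rule: predict `sign` if feature j is <= t, otherwise -sign."""
    return sign if row[j] <= t else -sign


def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted training error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                err = sum(wi for wi, row, yi in zip(w, X, y)
                          if stump_predict(row, j, t, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best


def adaboost(X, y, rounds):
    """Textbook AdaBoost; labels are +1 (sweep) and -1 (neutral) in this toy setup."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, t, sign = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) on perfect stumps
        alpha = 0.5 * log((1 - err) / err)
        model.append((alpha, j, t, sign))
        # Up-weight the samples this stump got wrong, then renormalise.
        w = [wi * exp(-alpha * yi * stump_predict(row, j, t, sign))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sign of the alpha-weighted vote over all stumps."""
    score = sum(alpha * stump_predict(row, j, t, sign)
                for alpha, j, t, sign in model)
    return 1 if score >= 0 else -1


# Made-up sample of four haplotypes over four sites.
sample = ["0000", "0001", "0011", "0111"]
print(watterson_theta(sample), tajima_pi(sample))

# Made-up feature rows [theta_W, pi] per simulated window; +1 = sweep, -1 = neutral.
X_train = [[1.0, 5.0], [1.2, 4.0], [3.0, 4.5], [3.5, 5.5]]
y_train = [1, 1, -1, -1]
toy_model = adaboost(X_train, y_train, rounds=3)
print([predict(toy_model, row) for row in X_train])
```

In practice, the training rows would come from coalescent simulations under neutral, bottleneck, and sweep scenarios, with more statistics per window (for example, integrated haplotype homozygosity). The weights `alpha` attached to each stump then give a rough indication of which statistic contributes most to the classification.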

Similar articles

1
Distinguishing positive selection from neutral evolution: boosting the performance of summary statistics.
Genetics. 2011 Jan;187(1):229-44. doi: 10.1534/genetics.110.122614. Epub 2010 Nov 1.
2
Detecting Positive Selection in Populations Using Genetic Data.
Methods Mol Biol. 2020;2090:87-123. doi: 10.1007/978-1-0716-0199-0_5.
4
The pitfalls and virtues of population genetic summary statistics: Detecting selective sweeps in recent divergences.
J Evol Biol. 2021 Jun;34(6):893-909. doi: 10.1111/jeb.13738. Epub 2020 Dec 16.
5
Detection and Classification of Hard and Soft Sweeps from Unphased Genotypes by Multilocus Genotype Identity.
Genetics. 2018 Dec;210(4):1429-1452. doi: 10.1534/genetics.118.301502. Epub 2018 Oct 12.
6
Detecting positive selection from genome scans of linkage disequilibrium.
BMC Genomics. 2010 Jan 5;11:8. doi: 10.1186/1471-2164-11-8.
7
Soft shoulders ahead: spurious signatures of soft and partial selective sweeps result from linked hard sweeps.
Genetics. 2015 May;200(1):267-84. doi: 10.1534/genetics.115.174912. Epub 2015 Feb 25.
8
Optimal neutrality tests based on the frequency spectrum.
Genetics. 2010 Sep;186(1):353-65. doi: 10.1534/genetics.110.118570. Epub 2010 Jul 6.
9
Fully Bayesian tests of neutrality using genealogical summary statistics.
BMC Genet. 2008 Oct 31;9:68. doi: 10.1186/1471-2156-9-68.
10
An investigation of the statistical power of neutrality tests based on comparative and population genetic data.
Mol Biol Evol. 2009 Feb;26(2):273-83. doi: 10.1093/molbev/msn231. Epub 2008 Oct 14.

Cited by

1
Signatures of soft selective sweeps predominate in the yellow fever mosquito.
bioRxiv. 2025 Jul 10:2025.07.06.663360. doi: 10.1101/2025.07.06.663360.
2
Genomic Anomaly Detection with Functional Data Analysis.
Genes (Basel). 2025 Jun 15;16(6):710. doi: 10.3390/genes16060710.
3
Accessible, realistic genome simulation with selection using stdpopsim.
bioRxiv. 2025 Mar 23:2025.03.23.644823. doi: 10.1101/2025.03.23.644823.
4
Digital Image Processing to Detect Adaptive Evolution.
Mol Biol Evol. 2024 Dec 6;41(12). doi: 10.1093/molbev/msae242.
5
Tree Sequences as a General-Purpose Tool for Population Genetic Inference.
Mol Biol Evol. 2024 Nov 1;41(11). doi: 10.1093/molbev/msae223.
6
Tree sequences as a general-purpose tool for population genetic inference.
bioRxiv. 2024 Oct 5:2024.02.20.581288. doi: 10.1101/2024.02.20.581288.
7
IntroUNET: Identifying introgressed alleles via semantic segmentation.
PLoS Genet. 2024 Feb 20;20(2):e1010657. doi: 10.1371/journal.pgen.1010657. eCollection 2024 Feb.
9
Enrichment of hard sweeps on the X chromosome compared to autosomes in six Drosophila species.
bioRxiv. 2023 Dec 7:2023.06.21.545888. doi: 10.1101/2023.06.21.545888.
