Gene sampling strategies for multi-locus population estimates of genetic diversity (theta).

Affiliation

Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana, United States of America.

Publication information

PLoS One. 2007 Jan 17;2(1):e160. doi: 10.1371/journal.pone.0000160.

DOI:10.1371/journal.pone.0000160

PMID:17225863

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1764684/

Abstract

BACKGROUND

Theoretical work suggests that data from multiple nuclear loci provide better estimates of population genetic parameters than do single loci, but just how many loci are needed and how much sequence is required from each has been little explored.

METHODOLOGY/PRINCIPAL FINDINGS: To investigate how much data are required to estimate the population genetic parameter theta (4N_eμ) accurately under ideal circumstances, we simulated DNA sequence datasets under three values of theta per site (0.1, 0.01, 0.001), varying both the total number of base pairs sequenced per individual and the number of equal-length loci. From these datasets we estimated theta using the maximum likelihood coalescent framework implemented in the computer program Migrate. Our results corroborated the theoretical expectation that increasing the number of loci improves the accuracy of the estimate more than increasing the sequence length at single loci does. However, when theta was low (0.001), the per-locus sequence length was also important for estimating theta accurately, something that has not been emphasized in previous work.
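The trade-off between number of loci and per-locus length can be explored with a minimal sketch. It is not the study's method: it swaps Migrate's maximum likelihood estimator for Watterson's moment estimator (segregating sites divided by the harmonic factor a_n), and it assumes an infinite-sites mutation model, no recombination within loci, and fully independent loci. The sample size of 10 sequences, the 5000 bp total, and all function names are hypothetical choices made for illustration.

```python
import random


def harmonic(n):
    """a_n = sum_{i=1}^{n-1} 1/i, the normalizer in Watterson's estimator."""
    return sum(1.0 / i for i in range(1, n))


def poisson(lam, rng):
    """Draw a Poisson(lam) count by counting unit-rate arrivals in [0, lam]."""
    count = 0
    t = rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count


def segregating_sites(n, theta_locus, rng):
    """Simulate the number of segregating sites at one non-recombining locus."""
    total_length = 0.0
    for k in range(n, 1, -1):
        # Waiting time while k lineages remain, in units of 2*N_e generations.
        t_k = rng.expovariate(k * (k - 1) / 2.0)
        total_length += k * t_k
    # Mutations fall on the tree at rate theta/2 per unit of branch length.
    return poisson(theta_locus / 2.0 * total_length, rng)


def estimate_theta_per_site(n, theta_site, total_bp, n_loci, rng):
    """Average Watterson's per-site estimate over n_loci equal-length loci."""
    locus_len = total_bp // n_loci
    a_n = harmonic(n)
    per_locus = [
        segregating_sites(n, theta_site * locus_len, rng) / (a_n * locus_len)
        for _ in range(n_loci)
    ]
    return sum(per_locus) / n_loci


def squared_cv(values):
    """Sample variance divided by the squared mean."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return var / mean ** 2


if __name__ == "__main__":
    rng = random.Random(1)
    for n_loci in (1, 5, 25):
        # Same total sequencing effort, split across more or fewer loci.
        reps = [estimate_theta_per_site(10, 0.01, 5000, n_loci, rng)
                for _ in range(200)]
        print(n_loci, "loci: squared CV =", squared_cv(reps))
```

Under these assumptions, the spread of the estimates shrinks as the same number of base pairs is divided among more independent loci, because each locus contributes an independent genealogy. The sketch cannot reproduce the study's likelihood-based results, and the numbers it prints depend on the assumed sample size and seed.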

CONCLUSIONS/SIGNIFICANCE: Accurate estimation of theta required data from at least 25 independently evolving loci. Beyond this, there was little added benefit in terms of decreasing the squared coefficient of variation of the coalescent estimates relative to the extra effort required to sample more loci.
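The squared coefficient of variation mentioned above is the variance of a set of estimates divided by the square of their mean. A minimal sketch of that calculation follows, assuming the sample (n - 1) variance is used; the abstract does not say which variance definition the authors applied, and the function name and input values are hypothetical and are not results from the study.

```python
import statistics


def squared_cv(estimates):
    """Return the sample variance divided by the squared mean of the estimates."""
    mean = statistics.mean(estimates)
    if mean == 0:
        raise ValueError("squared CV is undefined when the mean is zero")
    return statistics.variance(estimates) / mean ** 2


if __name__ == "__main__":
    # Made-up replicate estimates of theta per site, for illustration only.
    replicate_estimates = [0.009, 0.011, 0.010, 0.012, 0.008]
    print(squared_cv(replicate_estimates))
```

Because the measure is scaled by the squared mean, it allows precision to be compared across theta values that differ by orders of magnitude, such as 0.1 and 0.001.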
