Evaluation of methods for estimating coalescence times using ancestral recombination graphs.

Author information

Brandt Débora Y C, Wei Xinzhu, Deng Yun, Vaughn Andrew H, Nielsen Rasmus

Affiliations

Department of Integrative Biology, University of California Berkeley, Berkeley, CA 94720, USA.

Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA.

Publication information

Genetics. 2022 May 5;221(1). doi: 10.1093/genetics/iyac044.

Abstract

The ancestral recombination graph is a structure that describes the joint genealogies of sampled DNA sequences along the genome. Recent computational methods have made impressive progress toward scalably estimating whole-genome genealogies. In addition to inferring the ancestral recombination graph, some of these methods can also provide ancestral recombination graphs sampled from a defined posterior distribution. Obtaining good samples of ancestral recombination graphs is crucial for quantifying statistical uncertainty and for estimating population genetic parameters such as effective population size, mutation rate, and allele age. Here, we use standard neutral coalescent simulations to benchmark the estimates of pairwise coalescence times from 3 popular ancestral recombination graph inference programs: ARGweaver, Relate, and tsinfer+tsdate. We compare (1) the true coalescence times to the inferred times at each locus; (2) the distribution of coalescence times across all loci to the expected exponential distribution; (3) whether the sampled coalescence times have the properties expected of a valid posterior distribution. We find that inferred coalescence times at each locus are most accurate in ARGweaver, and often more accurate in Relate than in tsinfer+tsdate. However, all 3 methods tend to overestimate small coalescence times and underestimate large ones. Lastly, the posterior distribution of ARGweaver is closer to the expected posterior distribution than Relate's, but this higher accuracy comes at a substantial trade-off in scalability. The best choice of method will depend on the number and length of input sequences and on the goal of downstream analyses, and we provide guidelines for the best practices.
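
Comparison (2) in the abstract can be illustrated with a small sketch. Under the standard neutral coalescent with constant population size, the pairwise coalescence time, measured in units of 2N generations for a diploid population of size N, is exponentially distributed with rate 1. The Python sketch below is not the authors' benchmarking code: it draws times directly from that exponential instead of running an ARG inference program, and the function names, the sample size, and the `shrink_toward_mean` distortion (a hypothetical stand-in for estimates that overestimate small times and underestimate large ones) are assumptions made purely for illustration.

```python
import math
import random


def ks_statistic_vs_exponential(times, rate=1.0):
    """Kolmogorov-Smirnov distance between the empirical CDF of `times`
    and the CDF of an exponential distribution with the given rate."""
    xs = sorted(times)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        cdf = 1.0 - math.exp(-rate * x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(cdf - i / n), abs((i + 1) / n - cdf))
    return d


def shrink_toward_mean(times, weight=0.5, mean=1.0):
    """Hypothetical distortion (an assumption, not a model of any program):
    pull every time toward the expected value, so that small times are
    overestimated and large times are underestimated."""
    return [(1.0 - weight) * t + weight * mean for t in times]


rng = random.Random(42)

# "True" pairwise coalescence times in coalescent units (Exp(1)).
true_times = [rng.expovariate(1.0) for _ in range(5000)]

# Stand-in for inferred times with the bias pattern described in the abstract.
shrunk_times = shrink_toward_mean(true_times)

d_true = ks_statistic_vs_exponential(true_times)
d_shrunk = ks_statistic_vs_exponential(shrunk_times)

print(f"KS distance, exact exponential draws: {d_true:.3f}")
print(f"KS distance, shrunk estimates:        {d_shrunk:.3f}")
```

The first distance should be close to zero. The second should be much larger, because after shrinking no estimate can fall below `weight * mean`, so the lower tail of the exponential distribution is missing entirely.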
