Evaluating ARG-estimation methods in the context of estimating population-mean polygenic score histories.

Authors

Peng Dandan, Mulder Obadiah J, Edge Michael D

Affiliation

Department of Quantitative and Computational Biology, University of Southern California, 1050 Childs Way, Los Angeles, CA 90089, USA.

Publication details

Genetics. 2025 Apr 17;229(4). doi: 10.1093/genetics/iyaf033.

Abstract

Scalable methods for estimating marginal coalescent trees across the genome present new opportunities for studying evolution and have generated considerable excitement, with new methods extending scalability to thousands of samples. Benchmarking of the available methods has revealed general tradeoffs between accuracy and scalability, but performance in downstream applications has not always been easily predictable from general performance measures, suggesting that specific features of the ancestral recombination graph (ARG) may be important for specific downstream applications of estimated ARGs. To exemplify this point, we benchmark ARG estimation methods with respect to a specific set of methods for estimating the historical time course of a population-mean polygenic score (PGS) using the marginal coalescent trees encoded by the ARG. Here, we examine the performance in simulation of seven ARG estimation methods: ARGweaver, RENT+, Relate, tsinfer+tsdate, ARG-Needle, ASMC-clust, and SINGER, using their estimated coalescent trees and examining bias, mean squared error, confidence interval coverage, and Type I and II error rates of the downstream methods. Although it does not scale to the sample sizes attainable by other new methods, SINGER produced the most accurate estimated PGS histories in many instances, even when Relate, tsinfer+tsdate, ARG-Needle, and ASMC-clust used samples 10 or more times as large as those used by SINGER. In general, the best choice of method depends on the number of samples available and the historical time period of interest. In particular, the unprecedented sample sizes allowed by Relate, tsinfer+tsdate, ARG-Needle, and ASMC-clust are of greatest importance when the recent past is of interest; further back in time, most of the tree has coalesced, and differences in contemporary sample size are less salient.
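As a rough illustration of three of the evaluation criteria named in the abstract (bias, mean squared error, and confidence interval coverage), the sketch below summarizes replicate estimates of a single quantity against a known simulated truth. It is not the authors' code: the function name `summarize_estimates`, the normal-approximation interval (estimate ± 1.96 standard errors), and the toy data are all assumptions made for the example.

```python
import random
import statistics


def summarize_estimates(truth, estimates, std_errors, z=1.96):
    """Bias, MSE, and CI coverage for replicate estimates of one quantity.

    Hypothetical helper: assumes each estimate comes with a standard error
    and that a normal-approximation interval (estimate +/- z * SE) is used.
    """
    errors = [est - truth for est in estimates]
    bias = statistics.fmean(errors)
    mse = statistics.fmean([e * e for e in errors])
    # An interval "covers" when the true value lies within z standard errors.
    covered = [abs(est - truth) <= z * se
               for est, se in zip(estimates, std_errors)]
    coverage = sum(covered) / len(covered)
    return {"bias": bias, "mse": mse, "coverage": coverage}


# Toy replicates (made-up numbers, not results from the paper):
# unbiased estimates whose reported standard error matches their true spread.
rng = random.Random(0)
truth = 0.5
std_errors = [0.1] * 2000
estimates = [rng.gauss(truth, 0.1) for _ in std_errors]
summary = summarize_estimates(truth, estimates, std_errors)
print(summary)
```

With well-calibrated standard errors, as in the toy data, coverage should sit near the nominal 95%; coverage well below that would suggest the reported intervals are too narrow.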

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d520/12005257/d8f41cb84743/iyaf033f1.jpg
