Comparing local ancestry inference models in populations of two- and three-way admixture.

Author information

Schubert Ryan, Andaleon Angela, Wheeler Heather E

Affiliations

Department of Mathematics and Statistics, Loyola University Chicago, Chicago, IL, United States of America.

Department of Biology, Loyola University Chicago, Chicago, IL, United States of America.

Publication information

PeerJ. 2020 Oct 2;8:e10090. doi: 10.7717/peerj.10090. eCollection 2020.

Abstract

Local ancestry estimation infers the regional ancestral origin of chromosomal segments in admixed populations using reference populations and a variety of statistical models. Integrating local ancestry into complex trait genetics has the potential to increase detection of genetic associations and improve genetic prediction models in understudied admixed populations, including African Americans and Hispanics. Five methods for local ancestry estimation that have been used in human complex trait genetics are LAMP-LD (2012), RFMix (2013), ELAI (2014), Loter (2018), and MOSAIC (2019). As users rather than developers, we sought to directly compare the accuracy, runtime, memory usage, and usability of these software tools to determine which is best for incorporation into association study pipelines. We find that in the majority of cases RFMix has the highest median accuracy, with the ranking of the remaining software dependent on the ancestral architecture of the population tested. Additionally, we estimate how the memory usage and runtime of each software scale with sample size (big-O complexity) and find that, for most software, both increase linearly. The only exception is RFMix, whose runtime increases quadratically and whose memory usage increases linearly with sample size. Effective local ancestry estimation tools are necessary to increase diversity and prevent population disparities in human genetics studies. RFMix performs the best across methods; however, depending on the application, other methods perform just as well with the benefit of shorter runtimes. Scripts used to format data, run software, and estimate accuracy can be found at https://github.com/WheelerLab/LAI_benchmarking.
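The abstract does not say how the scaling with sample size was estimated. One common approach, sketched here purely as an illustration and not as the authors' procedure, is to fit a straight line to the measurements in log-log space and read the growth order off the slope. The function name `scaling_exponent` and all numbers below are hypothetical; the timings are synthetic and are not results from the paper.

```python
import numpy as np


def scaling_exponent(sample_sizes, measurements):
    """Return the slope of a least-squares line fitted in log-log space.

    A slope near 1 suggests linear growth with sample size,
    a slope near 2 suggests quadratic growth.
    """
    log_n = np.log(np.asarray(sample_sizes, dtype=float))
    log_y = np.log(np.asarray(measurements, dtype=float))
    slope, _intercept = np.polyfit(log_n, log_y, 1)
    return slope


# Synthetic, made-up measurements for illustration only.
n = np.array([10, 20, 40, 80, 160])
linear_runtime = 3.0 * n            # grows like n
quadratic_runtime = 0.5 * n ** 2    # grows like n^2

print(round(scaling_exponent(n, linear_runtime), 2))
print(round(scaling_exponent(n, quadratic_runtime), 2))
```

With real benchmarks the slope will rarely be an exact integer, so it is usually read as an approximate growth order, ideally from repeated runs at each sample size.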
