A Hidden Markov Model Approach for Simultaneously Estimating Local Ancestry and Admixture Time Using Next Generation Sequence Data in Samples of Arbitrary Ploidy.

Author information

Corbett-Detig Russell, Nielsen Rasmus

Affiliations

Genomics Institute and Department of Biomolecular Engineering, UC Santa Cruz, Santa Cruz, CA, United States of America.

Department of Integrative Biology, UC Berkeley, Berkeley, CA, United States of America.

Publication information

PLoS Genet. 2017 Jan 3;13(1):e1006529. doi: 10.1371/journal.pgen.1006529. eCollection 2017 Jan.

Abstract

Admixture, the mixing of genomes from divergent populations, is increasingly appreciated as a central process in evolution. To characterize and quantify patterns of admixture across the genome, a number of methods have been developed for local ancestry inference. However, existing approaches have several shortcomings. First, all local ancestry inference methods require some prior assumption about the expected ancestry tract lengths. Second, existing methods generally require genotypes, which are not feasible to obtain for many next-generation sequencing projects. Third, many methods assume samples are diploid; however, a wide variety of sequencing applications will fail to meet this assumption. To address these issues, we introduce a novel hidden Markov model for estimating local ancestry that models the read pileup data rather than genotypes, is generalized to arbitrary ploidy, and can estimate the time since admixture during local ancestry inference. We demonstrate that our method can simultaneously estimate the time since admixture and local ancestry with good accuracy, and that it performs well on samples of high ploidy, i.e. 100 or more chromosomes. As this method is very general, we expect it will be useful for local ancestry inference in a wider variety of populations than has previously been possible. We then applied our method to pooled sequencing data derived from populations of Drosophila melanogaster on an ancestry cline on the east coast of North America. We find that local recombination rates are negatively correlated with the proportion of African ancestry, suggesting that selection against foreign ancestry is the least efficient in low recombination regions. Finally, we show that clinal outlier loci are enriched for genes associated with gene regulatory functions, consistent with a role of regulatory evolution in ecological adaptation of admixed D. melanogaster populations. Our results illustrate the potential of local ancestry inference for elucidating fundamental evolutionary processes.
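To make the core idea concrete, the following is a minimal sketch of how a hidden Markov model can score read pileups directly and pick an admixture time. It is not the authors' implementation. It assumes a single haploid chromosome (ploidy 1), two ancestral populations, a single admixture pulse, a fixed per-read error rate, and a simple grid search over candidate times. The function names (`emission`, `log_likelihood`, `estimate_time`), the input layout, and the example data are all hypothetical.

```python
import math

def emission(ref_reads, alt_reads, alt_freq, err=0.01):
    """P(read counts | ancestry) for one haploid chromosome, up to a constant.

    Assumption: the chromosome carries the alt allele with probability
    alt_freq (its frequency in the ancestral population), and each read
    reports the wrong allele with probability err.
    """
    if_alt = (1 - err) ** alt_reads * err ** ref_reads
    if_ref = (1 - err) ** ref_reads * err ** alt_reads
    return alt_freq * if_alt + (1 - alt_freq) * if_ref

def log_likelihood(sites, t, m=0.5):
    """Forward-algorithm log-likelihood of the pileup for admixture time t.

    sites: list of (morgans_from_previous_site, ref_reads, alt_reads, f0, f1),
           where f0 and f1 are alt-allele frequencies in populations 0 and 1.
    t:     generations since a single admixture pulse (assumed model).
    m:     assumed fraction of ancestry contributed by population 1.
    """
    stationary = [1 - m, m]
    forward = None  # scaled forward probabilities at the previous site
    total = 0.0
    for dist, ref, alt, f0, f1 in sites:
        emit = [emission(ref, alt, f0), emission(ref, alt, f1)]
        if forward is None:
            prior = stationary  # the first site's distance is ignored
        else:
            # With probability p_recomb the ancestry is redrawn from the
            # admixture proportions; otherwise it is copied unchanged.
            p_recomb = 1 - math.exp(-t * dist)
            prior = [(1 - p_recomb) * forward[k] + p_recomb * stationary[k]
                     for k in range(2)]
        unscaled = [prior[k] * emit[k] for k in range(2)]
        scale = sum(unscaled)
        total += math.log(scale)  # accumulate in log space to avoid underflow
        forward = [u / scale for u in unscaled]
    return total

def estimate_time(sites, candidate_times):
    """Return the candidate admixture time with the highest likelihood."""
    return max(candidate_times, key=lambda t: log_likelihood(sites, t))

# Made-up data: 100 sites whose reads all show the alt allele, then 100 sites
# whose reads all show the ref allele, spaced 0.001 Morgans apart.
example_sites = ([(0.001, 0, 3, 0.05, 0.95)] * 100
                 + [(0.001, 3, 0, 0.05, 0.95)] * 100)
best_t = estimate_time(example_sites, [1, 10, 100, 1000])
```

In this toy data the reads imply a single ancestry switch over 0.2 Morgans, so long tracts favor a recent admixture time. The abstract describes a model generalized to arbitrary ploidy; a sketch of that kind would need hidden states that count how many chromosomes carry each ancestry, which this example does not attempt.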

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b255/5242547/5939f8429daf/pgen.1006529.g001.jpg
