Genotype-Frequency Estimation from High-Throughput Sequencing Data.

Author information

Department of Biology, Indiana University, Bloomington, Indiana 47405.

Publication information

Genetics. 2015 Oct;201(2):473-86. doi: 10.1534/genetics.115.179077. Epub 2015 Jul 29.

Abstract

Rapidly improving high-throughput sequencing technologies provide unprecedented opportunities for carrying out population-genomic studies with various organisms. To take full advantage of these methods, it is essential to correctly estimate allele and genotype frequencies, and here we present a maximum-likelihood method that accomplishes these tasks. The proposed method fully accounts for uncertainties resulting from sequencing errors and biparental chromosome sampling. At moderately high depths of coverage, it yields essentially unbiased estimates with minimal sampling variances, regardless of the mating system and structure of the population. Moreover, we have developed statistical tests for examining the significance of polymorphisms and their genotypic deviations from Hardy-Weinberg equilibrium. We examine the performance of the proposed method by computer simulations and apply it to low-coverage human data generated by high-throughput sequencing. The results show that the proposed method improves our ability to carry out population-genomic analyses in important ways. The software package of the proposed method is freely available from https://github.com/Takahiro-Maruki/Package-GFE.
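
To make the general idea concrete, the following is a minimal sketch of likelihood-based genotype-frequency estimation at a single biallelic site. It is not the authors' algorithm and not the interface of Package-GFE. It assumes a fixed, known, symmetric error rate, reads that show only the major or the minor allele, and a plain EM iteration over the three genotype frequencies. The function names, the `error_rate` default, and the input layout are all hypothetical.

```python
def genotype_likelihoods(n_major, n_minor, error_rate):
    """Likelihood of one individual's read counts under each genotype.

    Returns values for (major/major, major/minor, minor/minor).
    The binomial coefficient is omitted because it is the same for all
    three genotypes and cancels in the EM update.
    Assumption: a read reports the wrong allele with probability error_rate.
    """
    e = error_rate
    return (
        (1 - e) ** n_major * e ** n_minor,   # homozygous major
        0.5 ** (n_major + n_minor),          # heterozygote: either chromosome sampled
        e ** n_major * (1 - e) ** n_minor,   # homozygous minor
    )


def estimate_genotype_freqs(read_counts, error_rate=0.01, n_iter=200, tol=1e-10):
    """EM estimate of genotype frequencies from per-individual read counts.

    read_counts: list of (n_major, n_minor) tuples, one per individual.
    No Hardy-Weinberg assumption is made: the three frequencies are free
    apart from summing to one.
    """
    liks = [genotype_likelihoods(a, b, error_rate) for a, b in read_counts]
    freqs = [1 / 3, 1 / 3, 1 / 3]  # uninformative starting point
    for _ in range(n_iter):
        totals = [0.0, 0.0, 0.0]
        for lik in liks:
            # E-step: posterior probability of each genotype for this individual
            weighted = [f * l for f, l in zip(freqs, lik)]
            norm = sum(weighted)
            for g in range(3):
                totals[g] += weighted[g] / norm
        # M-step: new frequencies are the mean posterior probabilities
        new_freqs = [t / len(liks) for t in totals]
        converged = max(abs(a - b) for a, b in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


if __name__ == "__main__":
    # Hypothetical data: 6 apparent major homozygotes, 3 heterozygotes, 1 minor homozygote
    counts = [(10, 0)] * 6 + [(5, 5)] * 3 + [(0, 10)]
    f_mm, f_mn, f_nn = estimate_genotype_freqs(counts)
    major_allele_freq = f_mm + 0.5 * f_mn
    print(f_mm, f_mn, f_nn, major_allele_freq)
```

With ten reads per individual the estimates land close to the naive genotype counts. At lower depths the heterozygote likelihood overlaps more with the homozygote likelihoods, which is where accounting for chromosome sampling and sequencing error matters most. Very high depths would call for working in log space to avoid underflow, which this sketch leaves out.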
