
Estimation of allele frequencies from high-coverage genome-sequencing projects.

Author information

Lynch Michael

Affiliation

Department of Biology, Indiana University, Bloomington, Indiana 47405, USA.

Publication information

Genetics. 2009 May;182(1):295-301. doi: 10.1534/genetics.109.100479. Epub 2009 Mar 16.

DOI: 10.1534/genetics.109.100479
PMID: 19293142
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2674824/
Abstract

A new generation of high-throughput sequencing strategies will soon lead to the acquisition of high-coverage genomic profiles of hundreds to thousands of individuals within species, generating unprecedented levels of information on the frequencies of nucleotides segregating at individual sites. However, because these new technologies are error prone and yield uneven coverage of alleles in diploid individuals, they also introduce the need for novel methods for analyzing the raw read data. A maximum-likelihood method for the estimation of allele frequencies is developed, eliminating both the need to arbitrarily discard individuals with low coverage and the requirement for an extrinsic measure of the sequence error rate. The resultant estimates are nearly unbiased with asymptotically minimal sampling variance, thereby defining the limits to our ability to estimate population-genetic parameters and providing a logical basis for the optimal design of population-genomic surveys.
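
To make the idea of jointly estimating an allele frequency and an error rate from raw reads more concrete, here is a minimal sketch. It is not the estimator from the paper: it assumes a biallelic site, Hardy-Weinberg genotype proportions, a symmetric per-read error that swaps the two alleles, and a plain grid search. The function names (`read_likelihood`, `log_likelihood`, `estimate`) and the toy read counts are hypothetical and chosen only for illustration.

```python
import math

def read_likelihood(n, k, p, e):
    """Likelihood of k minor-allele reads out of n for one diploid individual.

    Assumptions (illustrative only): major-allele frequency p, Hardy-Weinberg
    genotype proportions, and a per-read error rate e that swaps the alleles.
    """
    genotype_priors = (p * p, 2 * p * (1 - p), (1 - p) ** 2)  # MM, Mm, mm
    minor_read_probs = (e, 0.5, 1 - e)  # P(read shows m | genotype)
    coeff = math.comb(n, k)
    return sum(
        prior * coeff * q ** k * (1 - q) ** (n - k)
        for prior, q in zip(genotype_priors, minor_read_probs)
    )

def log_likelihood(samples, p, e):
    """Sum of per-individual log-likelihoods; samples is a list of (n, k)."""
    total = 0.0
    for n, k in samples:
        lik = read_likelihood(n, k, p, e)
        if lik <= 0.0:
            return float("-inf")  # parameters incompatible with the data
        total += math.log(lik)
    return total

def estimate(samples, p_steps=100, e_max=0.1, e_steps=20):
    """Grid-search the (p, e) pair that maximizes the log-likelihood."""
    best_ll, best_p, best_e = float("-inf"), None, None
    for i in range(p_steps + 1):
        p = i / p_steps
        for j in range(e_steps + 1):
            e = e_max * j / e_steps
            ll = log_likelihood(samples, p, e)
            if ll > best_ll:
                best_ll, best_p, best_e = ll, p, e
    return best_p, best_e

# Toy data: (reads covering the site, reads showing the minor allele).
samples = [(10, 0)] * 6 + [(10, 5)] * 3 + [(10, 10)]
p_hat, e_hat = estimate(samples)
print(p_hat, e_hat)
```

Because each individual contributes a likelihood term weighted by its own read count, individuals with low coverage stay in the analysis and simply carry less information, and the error rate is fitted from the same reads rather than supplied from outside. A real analysis would use a proper optimizer and an error model covering all four nucleotides.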
