Predicting discovery rates of genomic features.

Author

Simon Gravel

Affiliation

Department of Human Genetics and Génome Québec Innovation Centre, McGill University, Montréal, Quebec H3A 0G1, Canada

Publication

Genetics. 2014 Jun;197(2):601-10. doi: 10.1534/genetics.114.162149. Epub 2014 Mar 17.

DOI:10.1534/genetics.114.162149
PMID:24637199
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4063918/
Abstract

Successful sequencing experiments require judicious sample selection. However, this selection must often be performed on the basis of limited preliminary data. Predicting the statistical properties of the final sample based on preliminary data can be challenging, because numerous uncertain model assumptions may be involved. Here, we ask whether we can predict "omics" variation across many samples by sequencing only a fraction of them. In the infinite-genome limit, we find that a pilot study sequencing 5% of a population is sufficient to predict the number of genetic variants in the entire population within 6% of the correct value, using an estimator agnostic to demography, selection, or population structure. To reach similar accuracy in a finite genome with millions of polymorphisms, the pilot study would require ∼15% of the population. We present computationally efficient jackknife and linear programming methods that exhibit substantially less bias than the state of the art when applied to simulated data and subsampled 1000 Genomes Project data. Extrapolating based on the National Heart, Lung, and Blood Institute Exome Sequencing Project data, we predict that 7.2% of sites in the capture region would be variable in a sample of 50,000 African Americans and 8.8% in a European sample of equal size. Finally, we show how the linear programming method can also predict discovery rates of various genomic features, such as the number of transcription factor binding sites across different cell types.
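
To make the extrapolation problem in the abstract concrete, the sketch below predicts how many new variants a larger sample would reveal, starting from the site frequency spectrum of a pilot sample. It uses the classical Good-Toulmin estimator, not the jackknife or linear programming methods of the paper, which are not specified in this abstract. The function name and the toy spectrum are hypothetical and chosen only for illustration. The estimator is only trustworthy when the sample grows by a factor of at most two (t ≤ 1).

```python
def good_toulmin_new_variants(sfs, t):
    """Predict the number of new variants found when the sample grows
    from n to (1 + t) * n, using the Good-Toulmin alternating series.

    sfs[i - 1] is the number of variant sites whose allele was seen
    exactly i times in the pilot sample (hypothetical toy input).
    t is the relative increase in sample size; the series is only
    well behaved for 0 <= t <= 1.
    """
    if not 0 <= t <= 1:
        raise ValueError("t must lie in [0, 1] for this estimator")
    # Alternating sum: +t*sfs[0] - t^2*sfs[1] + t^3*sfs[2] - ...
    return sum(
        (-1) ** (i + 1) * t ** i * count
        for i, count in enumerate(sfs, start=1)
    )


# Toy spectrum (illustrative numbers, not data from the paper):
# 100 singletons, 40 doubletons, 20 tripletons, 10 seen four times.
toy_sfs = [100, 40, 20, 10]
observed = sum(toy_sfs)                            # 170 variants in the pilot
new = good_toulmin_new_variants(toy_sfs, 1.0)      # doubling the sample
print(observed, new, observed + new)               # 170 70.0 240.0
```

Because the series alternates in sign with weights that grow as powers of t, it becomes unstable for t > 1. That instability is one reason extrapolating from a small pilot to a much larger sample, as the abstract describes, calls for more careful estimators.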
