Analysis and optimal design for association studies using next-generation sequencing with case-control pools.

Authors

Liang Wei E, Thomas Duncan C, Conti David V

Affiliation

Department of Preventive Medicine, University of Southern California, Los Angeles, California.

Publication information

Genet Epidemiol. 2012 Dec;36(8):870-81. doi: 10.1002/gepi.21681. Epub 2012 Sep 12.

DOI:10.1002/gepi.21681
PMID:22972696
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4139478/
Abstract

With its potential to discover a much greater amount of genetic variation, next-generation sequencing is fast becoming an emergent tool for genetic association studies. However, the cost of sequencing all individuals in a large-scale population study is still high in comparison to most alternative genotyping options. While the ability to identify individual-level data is lost (without bar-coding), sequencing pooled samples can substantially lower costs without compromising the power to detect significant associations. We propose a hierarchical Bayesian model that estimates the association of each variant using pools of cases and controls, accounting for the variation in read depth across pools and sequencing error. To investigate the performance of our method across a range of number of pools, number of individuals within each pool, and average coverage, we undertook extensive simulations varying effect sizes, minor allele frequencies, and sequencing error rates. In general, the number of pools and pool size have dramatic effects on power while the total depth of coverage per pool has only a moderate impact. This information can guide the selection of a study design that maximizes power subject to cost, sample size, or other laboratory constraints. We provide an R package (hiPOD: hierarchical Pooled Optimal Design) to find the optimal design, allowing the user to specify a cost function, cost, and sample size limitations, and distributions of effect size, minor allele frequency, and sequencing error rate.
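
The design trade-off described in the abstract (number of pools, pool size, and depth against cost) can be illustrated with a small Monte Carlo sketch. This is not the authors' hierarchical Bayesian model and not the hiPOD package: it swaps in a simple Welch-type comparison of pool-level read fractions with a normal-approximation cutoff, holds depth fixed per pool, and uses a made-up cost function. All function names, parameters, and cost values below are assumptions chosen for illustration.

```python
import random


def simulate_alt_read_fraction(rng, pool_size, allele_freq, depth, error_rate):
    """Simulate the observed alternate-allele read fraction for one pool."""
    n_chrom = 2 * pool_size  # diploid individuals
    # Stage 1: sample the chromosomes that end up in the pool.
    alt_chrom = sum(rng.random() < allele_freq for _ in range(n_chrom))
    pool_freq = alt_chrom / n_chrom
    # Stage 2: sample reads; a symmetric error flips the called allele.
    p_alt_read = pool_freq * (1 - error_rate) + (1 - pool_freq) * error_rate
    alt_reads = sum(rng.random() < p_alt_read for _ in range(depth))
    return alt_reads / depth


def welch_z(a, b):
    """Welch-type statistic comparing mean read fractions (needs >= 2 pools per group)."""
    def mean_var(x):
        m = sum(x) / len(x)
        return m, sum((v - m) ** 2 for v in x) / (len(x) - 1)

    ma, va = mean_var(a)
    mb, vb = mean_var(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return 0.0 if se == 0 else (ma - mb) / se


def estimate_power(n_pools, pool_size, depth, maf=0.2, odds_ratio=2.0,
                   error_rate=0.01, n_sim=200, seed=1):
    """Fraction of simulated studies with |z| > 1.96 (crude normal approximation)."""
    rng = random.Random(seed)
    # Case allele frequency implied by an allelic odds ratio (assumed effect model).
    odds = odds_ratio * maf / (1 - maf)
    maf_case = odds / (1 + odds)
    hits = 0
    for _ in range(n_sim):
        cases = [simulate_alt_read_fraction(rng, pool_size, maf_case, depth, error_rate)
                 for _ in range(n_pools)]
        controls = [simulate_alt_read_fraction(rng, pool_size, maf, depth, error_rate)
                    for _ in range(n_pools)]
        if abs(welch_z(cases, controls)) > 1.96:
            hits += 1
    return hits / n_sim


def design_cost(n_pools, pool_size, depth, per_pool=100.0, per_read=1.0, per_sample=5.0):
    """Hypothetical cost: case and control arms, each with n_pools pools."""
    return 2 * n_pools * (per_pool + depth * per_read + pool_size * per_sample)


def best_design(designs, budget, **power_kwargs):
    """Return the affordable (n_pools, pool_size, depth) with the highest simulated power."""
    affordable = [d for d in designs if design_cost(*d) <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda d: estimate_power(*d, **power_kwargs))


if __name__ == "__main__":
    grid = [(2, 10, 50), (4, 10, 50), (4, 20, 100)]
    print(best_design(grid, budget=2000, n_sim=100))
```

Because the test statistic uses the between-pool variance of read fractions, it reflects both sampling stages (individuals into pools, reads from pools). With only a handful of pools the normal cutoff is optimistic, so the resulting power values are best read as relative comparisons between designs rather than calibrated estimates.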

Similar articles

1
Analysis and optimal design for association studies using next-generation sequencing with case-control pools.
Genet Epidemiol. 2012 Dec;36(8):870-81. doi: 10.1002/gepi.21681. Epub 2012 Sep 12.
2
Design of association studies with pooled or un-pooled next-generation sequencing data.
Genet Epidemiol. 2010 Jul;34(5):479-91. doi: 10.1002/gepi.20501.
3
Estimation of population allele frequencies from next-generation sequencing data: pool-versus individual-based genotyping.
Mol Ecol. 2013 Jul;22(14):3766-79. doi: 10.1111/mec.12360. Epub 2013 Jun 4.
4
A unified approach for allele frequency estimation, SNP detection and association studies based on pooled sequencing data using EM algorithms.
BMC Genomics. 2013;14 Suppl 1(Suppl 1):S1. doi: 10.1186/1471-2164-14-S1-S1. Epub 2013 Jan 21.
5
Low-, high-coverage, and two-stage DNA sequencing in the design of the genetic association study.
Genet Epidemiol. 2017 Apr;41(3):187-197. doi: 10.1002/gepi.22015. Epub 2016 Nov 4.
6
Efficient study design for next generation sequencing.
Genet Epidemiol. 2011 May;35(4):269-77. doi: 10.1002/gepi.20575.
7
SNP calling by sequencing pooled samples.
BMC Bioinformatics. 2012 Sep 20;13:239. doi: 10.1186/1471-2105-13-239.
8
Confounded by sequencing depth in association studies of rare alleles.
Genet Epidemiol. 2011 May;35(4):261-8. doi: 10.1002/gepi.20574.
9
Accurate detection of subclonal single nucleotide variants in whole genome amplified and pooled cancer samples using HaloPlex target enrichment.
BMC Genomics. 2013 Dec 5;14(1):856. doi: 10.1186/1471-2164-14-856.
10
On optimal pooling designs to identify rare variants through massive resequencing.
Genet Epidemiol. 2011 Apr;35(3):139-47. doi: 10.1002/gepi.20561. Epub 2011 Jan 19.

Cited by

1
On the design and analysis of next-generation sequencing genotyping for a cohort with haplotype-informative reads.
Methods. 2015 Jun;79-80:41-6. doi: 10.1016/j.ymeth.2015.01.016. Epub 2015 Jan 30.
2
Two-stage family-based designs for sequencing studies.
BMC Proc. 2014 Jun 17;8(Suppl 1):S32. doi: 10.1186/1753-6561-8-S1-S32. eCollection 2014.
3
A stepwise likelihood ratio test procedure for rare variant selection in case-control studies.
J Hum Genet. 2014 Apr;59(4):198-205. doi: 10.1038/jhg.2014.1. Epub 2014 Jan 23.
4
Two-phase and family-based designs for next-generation sequencing studies.
Front Genet. 2013 Dec 13;4:276. doi: 10.3389/fgene.2013.00276.
5
An EM algorithm based on an internal list for estimating haplotype distributions of rare variants from pooled genotype data.
BMC Genet. 2013 Sep 13;14:82. doi: 10.1186/1471-2156-14-82.

References

1
Estimating allele frequency from next-generation sequencing of pooled mitochondrial DNA samples.
Front Genet. 2011 Aug 17;2:51. doi: 10.3389/fgene.2011.00051. eCollection 2011.
2
Incorporating model uncertainty in detecting rare variants: the Bayesian risk index.
Genet Epidemiol. 2011 Nov;35(7):638-49. doi: 10.1002/gepi.20613. Epub 2011 Aug 26.
3
Testing for an unusual distribution of rare variants.
PLoS Genet. 2011 Mar;7(3):e1001322. doi: 10.1371/journal.pgen.1001322. Epub 2011 Mar 3.
4
Deep sequencing of the Nicastrin gene in pooled DNA, the identification of genetic variants that affect risk of Alzheimer's disease.
PLoS One. 2011 Feb 25;6(2):e17298. doi: 10.1371/journal.pone.0017298.
5
On optimal pooling designs to identify rare variants through massive resequencing.
Genet Epidemiol. 2011 Apr;35(3):139-47. doi: 10.1002/gepi.20561. Epub 2011 Jan 19.
6
Identification of rare alleles and their carriers using compressed se(que)nsing.
Nucleic Acids Res. 2010 Oct;38(19):e179. doi: 10.1093/nar/gkq675. Epub 2010 Aug 10.
7
Design of association studies with pooled or un-pooled next-generation sequencing data.
Genet Epidemiol. 2010 Jul;34(5):479-91. doi: 10.1002/gepi.20501.
8
Pooled association tests for rare variants in exon-resequencing studies.
Am J Hum Genet. 2010 Jun 11;86(6):832-8. doi: 10.1016/j.ajhg.2010.04.005. Epub 2010 May 13.
9
Highly-multiplexed barcode sequencing: an efficient method for parallel analysis of pooled samples.
Nucleic Acids Res. 2010 Jul;38(13):e142. doi: 10.1093/nar/gkq368. Epub 2010 May 11.
10
The next generation of molecular markers from massively parallel sequencing of pooled DNA samples.
Genetics. 2010 Sep;186(1):207-18. doi: 10.1534/genetics.110.114397. Epub 2010 May 10.