Optimal sequencing strategies for identifying disease-associated singletons.

Authors

Rashkin Sara, Jun Goo, Chen Sai, Abecasis Goncalo R

Affiliations

Center for Statistical Genetics, Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, United States of America.

Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, California, United States of America.

Publication information

PLoS Genet. 2017 Jun 22;13(6):e1006811. doi: 10.1371/journal.pgen.1006811. eCollection 2017 Jun.

DOI:10.1371/journal.pgen.1006811
PMID:28640830
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5501675/
Abstract

With the increasing focus of genetic association studies on the identification of trait-associated rare variants through sequencing, it is important to identify the most cost-effective sequencing strategies for these studies. Deep sequencing will accurately detect and genotype the largest number of rare variants per individual, but may limit sample size. Low-pass sequencing will miss some variants in each individual but has been shown to provide a cost-effective alternative for studies of common variants. Here, we investigate the impact of sequencing depth on studies of rare variants, focusing on singletons: the variants that are sampled in a single individual and are hardest to detect at low sequencing depths. We first estimate the sensitivity to detect singleton variants in both simulated data and in down-sampled deep genome and exome sequence data. We then explore the power of association studies comparing burden of singleton variants in cases and controls under a variety of conditions. We show that the power to detect singletons increases with coverage, typically plateauing for coverage above roughly 25x. Next, we show that, when total sequencing capacity is fixed, the power of association studies focused on singletons is typically maximized for coverage of 15-20x, independent of relative risk, disease prevalence, singleton burden, and case-control ratio. Our results suggest sequencing depth of 15-20x as an appropriate compromise between singleton detection power and sample size for studies of rare variants in complex disease.
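
The depth versus sample-size trade-off described in the abstract can be illustrated with a small toy model. The sketch below is not the authors' simulation, and every modelling choice in it is an assumption made for illustration: read depth at a site is taken to be Poisson-distributed, a heterozygous singleton is "detected" when at least `min_alt_reads` reads carry the alternate allele, total capacity is a fixed `capacity` budget of depth units so that sample size equals `capacity / depth`, and the efficiency of a burden comparison is taken to scale with the number of samples times the detection sensitivity. The function names and default values are hypothetical. Because the model ignores false positives, genotyping error, and the case-control design, it should not be expected to reproduce the 15-20x optimum reported above; it only shows why sensitivity saturates with depth and why an intermediate depth can maximize efficiency under a fixed budget.

```python
import math


def singleton_sensitivity(depth, min_alt_reads=3):
    """Probability of seeing at least `min_alt_reads` alternate-allele reads
    at a heterozygous site, assuming Poisson read depth with mean `depth`.

    Each read carries the alternate allele with probability 1/2, so the
    alternate-read count is Poisson with mean depth / 2 (Poisson thinning).
    The calling threshold is an illustrative assumption.
    """
    lam = depth / 2.0
    p_below = sum(
        math.exp(-lam) * lam ** j / math.factorial(j)
        for j in range(min_alt_reads)
    )
    return 1.0 - p_below


def relative_efficiency(depth, capacity=30000.0, min_alt_reads=3):
    """Toy efficiency score under a fixed sequencing budget.

    Sample size is capacity / depth; the score is sample size multiplied by
    per-sample detection sensitivity (an assumed proxy, not a power formula).
    """
    n_samples = capacity / depth
    return n_samples * singleton_sensitivity(depth, min_alt_reads)


def best_depth(depths, capacity=30000.0, min_alt_reads=3):
    """Return the depth in `depths` with the highest toy efficiency score."""
    return max(
        depths,
        key=lambda d: relative_efficiency(d, capacity, min_alt_reads),
    )


if __name__ == "__main__":
    for d in (5, 10, 15, 20, 25, 30):
        print(d, round(singleton_sensitivity(d), 4),
              round(relative_efficiency(d), 1))
    print("toy optimum:", best_depth(range(2, 61)))
```

Raising `min_alt_reads` mimics a stricter variant caller and pushes the toy optimum toward higher depth, which is one way to see how the preferred coverage depends on how conservatively singletons are called.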

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a697/5501675/ac3876f2db6b/pgen.1006811.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a697/5501675/881158e0df41/pgen.1006811.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a697/5501675/998d2892895f/pgen.1006811.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a697/5501675/1e41df66d06a/pgen.1006811.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a697/5501675/f8ddb3908089/pgen.1006811.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a697/5501675/c0926d1a695c/pgen.1006811.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a697/5501675/1aced890308f/pgen.1006811.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a697/5501675/d9a9c528c807/pgen.1006811.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a697/5501675/91fe31c789de/pgen.1006811.g009.jpg
