Quantifying single nucleotide variant detection sensitivity in exome sequencing.

Affiliation

MRC Human Genetics Unit, MRC Institute for Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, UK.

Publication information

BMC Bioinformatics. 2013 Jun 18;14:195. doi: 10.1186/1471-2105-14-195.

Abstract

BACKGROUND

The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage, which is expected to systematically affect the sensitivity with which genuine polymorphisms are detected. To fully interpret the polymorphisms identified in a genetic study, it is often essential both to detect polymorphisms and to understand where, and with what probability, real polymorphisms may have been missed.

RESULTS

Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial distribution. This calibrated model allows us to produce single-nucleotide-resolution SNV sensitivity estimates, which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give "power estimates" for an entire study. We estimate that a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict that 5-15% of heterozygous and 1-4% of homozygous SNVs in the targeted regions will be missed.
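To make the "naive binomial expectation" concrete, the sketch below computes the probability of seeing a heterozygous non-reference allele at a given read depth under idealised binomial sampling, and averages per-base values over a region. It is an illustration only and not the calibrated empirical model of the paper: the function names, the requirement of at least two supporting reads, the 0.5 allele fraction, and the use of a simple mean as the merge rule are all assumptions introduced here.

```python
from math import comb


def naive_het_sensitivity(depth: int, min_alt_reads: int = 2,
                          alt_fraction: float = 0.5) -> float:
    """Probability that at least `min_alt_reads` of `depth` reads carry the
    non-reference allele, assuming reads ~ Binomial(depth, alt_fraction).

    Assumption: a heterozygous site counts as detectable once `min_alt_reads`
    supporting reads are seen. Real callers apply further filters.
    """
    if depth < min_alt_reads:
        return 0.0
    # Probability of observing fewer than min_alt_reads supporting reads.
    p_miss = sum(
        comb(depth, k) * alt_fraction ** k * (1 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_miss


def region_sensitivity(depths, per_base=naive_het_sensitivity) -> float:
    """Summarise a region (exon, gene, ...) as the mean per-base sensitivity.

    Assumption: an unweighted mean is used as the merge rule for illustration.
    """
    depths = list(depths)
    if not depths:
        raise ValueError("region must contain at least one position")
    return sum(per_base(d) for d in depths) / len(depths)


if __name__ == "__main__":
    for d in (3, 8, 13, 20):
        print(d, round(naive_het_sensitivity(d), 4))
    # Hypothetical per-base depths for a small, unevenly covered exon.
    print(round(region_sensitivity([0, 4, 9, 13, 22, 30]), 4))
```

Under these idealised assumptions the binomial model already exceeds 99% at 13X, whereas the abstract reports that 13X is needed empirically to reach 95% for heterozygous SNVs. That gap is the shortfall the calibrated model is meant to capture; substituting an empirically fitted depth-to-sensitivity function for `per_base` would give region summaries of the kind the abstract describes.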

CONCLUSIONS

Non-reference alleles in the heterozygous state have a high chance of being missed when commonly applied read coverage thresholds are used, despite the widely held assumption that polymorphism detection is good at these coverage levels. Such alleles are likely to be of functional importance in population-based studies of rare diseases, in somatic mutations in cancer, and in explaining the "missing heritability" of quantitative traits.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/98e9/3695811/233ca4c643e2/1471-2105-14-195-1.jpg