Identifying rare variants with optimal depth of coverage and cost-effective overlapping pool sequencing.

Affiliation

State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China.

Publication information

Genet Epidemiol. 2013 Dec;37(8):820-30. doi: 10.1002/gepi.21769. Epub 2013 Oct 28.

Abstract

Genome-wide association studies have identified hundreds of genetic variants associated with complex diseases, although most variants identified so far explain only a small proportion of heritability, suggesting that rare variants are responsible for the missing heritability. Identification of rare variants through large-scale resequencing is becoming increasingly important but remains prohibitively expensive despite the rapid decline in sequencing costs. Nevertheless, overlapping pool sequencing based on group testing, in which pooled rather than individual samples are sequenced, greatly reduces the effort of sample preparation as well as the cost of screening for rare variants. Here, we propose an overlapping pool sequencing approach to screen for rare variants with optimal sequencing depth, together with a corresponding cost model. We formulated a model to compute the optimal depth for sufficient observations of variants in pooled sequencing. Using the shifted transversal design algorithm, appropriate parameters for overlapping pool sequencing can be selected to minimize cost and guarantee accuracy. Because of the mixing constraint and the high depth required for pooled sequencing, the results showed that it is more cost-effective to divide a large population into smaller blocks, each tested independently using an optimized strategy. Finally, we conducted an experiment to screen for carriers of variants with a frequency of 1%. With simulated pools and publicly available human exome sequencing data, the experiment achieved 99.93% accuracy. With overlapping pool sequencing, the cost of screening for carriers of variants with a frequency of 1% in 200 diploid individuals dropped to at least 66% when the target sequencing region was set to 30 Mb.
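
The abstract names the shifted transversal design (STD) as the pooling scheme but gives no construction details. The following is a minimal sketch of how STD pools are commonly built, assuming the standard formulation with a prime q and k layers of q pools each. The function names and the parameter values (n = 200, q = 7, k = 5) are hypothetical and chosen only for illustration; they are not the parameters reported in the paper.

```python
from itertools import combinations


def compression_power(n: int, q: int) -> int:
    """Smallest integer g such that q**(g + 1) >= n."""
    g = 0
    while q ** (g + 1) < n:
        g += 1
    return g


def std_pools(n: int, q: int, k: int) -> list[set[int]]:
    """Build the k*q pools of STD(n; q; k) for samples 0..n-1.

    Assumes q is prime and k <= q + 1 (not checked for primality here).
    Each layer places every sample in exactly one of its q pools.
    """
    if k > q + 1:
        raise ValueError("k must not exceed q + 1")
    g = compression_power(n, q)
    pools = [set() for _ in range(k * q)]
    for j in range(k):              # layer index
        for i in range(n):          # sample index
            if j < q:
                # Polynomial in j whose coefficients reduce to the base-q digits of i.
                shift = sum((j ** c) * (i // q ** c) for c in range(g + 1))
            else:
                # Extra layer j == q uses only the leading base-q digit.
                shift = i // q ** g
            pools[j * q + shift % q].add(i)
    return pools


def max_shared_pools(pools: list[set[int]], n: int) -> int:
    """Largest number of pools shared by any two distinct samples."""
    member = [frozenset(p for p, s in enumerate(pools) if i in s) for i in range(n)]
    return max(len(a & b) for a, b in combinations(member, 2))


# Hypothetical parameters: 200 samples, prime q = 7, k = 5 layers.
pools = std_pools(200, 7, 5)
print(len(pools), "pools; any two samples share at most",
      max_shared_pools(pools, 200), "of them")
```

In this construction, two distinct samples can share at most `compression_power(n, q)` pools, which is what allows carriers to be identified from the pattern of positive pools. Choosing q and k trades the number of pools (sequencing cost) against decoding robustness.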
