Quantitative group testing-based overlapping pool sequencing to identify rare variant carriers.

Affiliation

State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China.

Publication information

BMC Bioinformatics. 2014 Jun 17;15:195. doi: 10.1186/1471-2105-15-195.

DOI:10.1186/1471-2105-15-195
PMID:24934981
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4229885/
Abstract

BACKGROUND

Genome-wide association studies have revealed that rare variants are responsible for a large portion of the heritability of some complex human diseases. This highlights the increasing importance of detecting and screening for rare variants. Although the massively parallel sequencing technologies have greatly reduced the cost of DNA sequencing, the identification of rare variant carriers by large-scale re-sequencing remains prohibitively expensive because of the huge challenge of constructing libraries for thousands of samples. Recently, several studies have reported that techniques from group testing theory and compressed sensing could help identify rare variant carriers in large-scale samples with few pooled sequencing experiments and a dramatically reduced cost.

RESULTS

Based on quantitative group testing, we propose an overlapping pool sequencing strategy that recovers variant carriers among large numbers of individuals at much lower cost than conventional methods. We used random k-set pool designs to mix samples, and optimized the design parameters according to an indicative probability. Based on a mathematical model of the sequencing depth distribution, an optimal threshold was selected to declare a pool positive or negative. Then, using the quantitative information contained in the sequencing results, we designed a heuristic Bayesian probability decoding algorithm to identify variant carriers. Finally, we conducted in silico experiments to find variant carriers among 200 simulated Escherichia coli strains. With the simulated pools and publicly available Illumina sequencing data, our method correctly identified the variant carriers for 91.5-97.9% of variants, with variant frequencies ranging from 0.5% to 1.5%.
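The pooling and pool-calling steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the depth model or the indicative probability, so the sketch assumes haploid samples, a binomial model for variant read counts, a fixed per-base error rate, and a threshold that minimises the sum of false-positive and false-negative probabilities for a pool holding a single carrier. The function names and the parameter values (40 pools, k = 4, depth 200, error rate 0.005) are hypothetical; only the 200 samples come from the abstract.

```python
import math
import random


def random_k_set_design(n_samples, n_pools, k, rng):
    """Assign every sample to k distinct pools chosen uniformly at random."""
    return [sorted(rng.sample(range(n_pools), k)) for _ in range(n_samples)]


def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def choose_threshold(depth, pool_size, error_rate):
    """Return the read-count threshold t minimising P(false +) + P(false -).

    Assumed model (not taken from the paper): a pool without carriers shows
    variant reads only through sequencing errors; a pool with one carrier
    among pool_size haploid samples shows them at roughly 1 / pool_size.
    """
    p_neg = error_rate
    p_pos = (1 - error_rate) / pool_size + error_rate * (1 - 1 / pool_size)
    fp = 1.0 - binom_pmf(0, depth, p_neg)  # P(X >= 1 | no carrier)
    fn = binom_pmf(0, depth, p_pos)        # P(X < 1 | one carrier)
    best_t, best_cost = 1, float("inf")
    for t in range(1, depth + 1):
        if fp + fn < best_cost:
            best_t, best_cost = t, fp + fn
        fp -= binom_pmf(t, depth, p_neg)   # now P(X >= t + 1 | no carrier)
        fn += binom_pmf(t, depth, p_pos)   # now P(X < t + 1 | one carrier)
    return best_t


def call_pools(variant_reads, threshold):
    """Declare a pool positive when its variant read count reaches the threshold."""
    return [reads >= threshold for reads in variant_reads]


rng = random.Random(0)
design = random_k_set_design(n_samples=200, n_pools=40, k=4, rng=rng)
mean_pool_size = 200 * 4 / 40  # 20 samples per pool on average
threshold = choose_threshold(depth=200, pool_size=mean_pool_size, error_rate=0.005)
print("pools of sample 0:", design[0], "| threshold:", threshold)
```

Because pool sizes vary under a random design, a per-pool threshold computed from each pool's actual size would be a natural refinement of this sketch.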

CONCLUSIONS

Using the number of reads, variant carriers could be identified precisely even though samples were randomly selected and pooled. Our method performed better than the published DNA Sudoku design and compressed sequencing, especially in reducing the required data throughput and cost.
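To make concrete how read counts, rather than only positive or negative pool calls, can pinpoint carriers, the following Python sketch decodes a pooled experiment greedily. It is a stand-in for illustration, not the heuristic Bayesian algorithm of the paper, whose details the abstract does not give. It assumes haploid samples, a binomial read-count model with a fixed error rate, and an independent prior carrier probability per sample; all function names and parameter values are hypothetical.

```python
import math
import random


def pool_log_lik(reads, depth, n_carriers, pool_size, err):
    """Binomial log-likelihood of a pool's variant read count (constant dropped)."""
    frac = n_carriers / pool_size
    p = frac * (1 - err) + (1 - frac) * err
    p = min(max(p, 1e-12), 1 - 1e-12)  # keep both logarithms finite
    return reads * math.log(p) + (depth - reads) * math.log(1 - p)


def greedy_map_decode(design, n_pools, reads, depth, err, prior):
    """Add carriers one at a time while the log-posterior keeps increasing."""
    sizes = [0] * n_pools
    for pools in design:
        for j in pools:
            sizes[j] += 1
    in_pool = [0] * n_pools  # carriers per pool under the current hypothesis
    carriers = set()
    log_prior_odds = math.log(prior / (1 - prior))
    while True:
        best_gain, best_sample = 0.0, None
        for s, pools in enumerate(design):
            if s in carriers:
                continue
            gain = log_prior_odds
            for j in pools:
                gain += pool_log_lik(reads[j], depth, in_pool[j] + 1, sizes[j], err)
                gain -= pool_log_lik(reads[j], depth, in_pool[j], sizes[j], err)
            if gain > best_gain:
                best_gain, best_sample = gain, s
        if best_sample is None:
            return sorted(carriers)
        carriers.add(best_sample)
        for j in design[best_sample]:
            in_pool[j] += 1


def simulate_reads(design, n_pools, carriers, depth, err, rng):
    """Draw a variant read count for every pool from the assumed binomial model."""
    sizes, hits = [0] * n_pools, [0] * n_pools
    for s, pools in enumerate(design):
        for j in pools:
            sizes[j] += 1
            if s in carriers:
                hits[j] += 1
    reads = []
    for j in range(n_pools):
        frac = hits[j] / sizes[j] if sizes[j] else 0.0
        p = frac * (1 - err) + (1 - frac) * err
        reads.append(sum(rng.random() < p for _ in range(depth)))
    return reads


rng = random.Random(1)
design = [sorted(rng.sample(range(40), 4)) for _ in range(200)]  # random 4-set design
truth = {17, 123}  # hypothetical carriers, about 1% of 200 samples
reads = simulate_reads(design, 40, truth, depth=400, err=0.005, rng=rng)
decoded = greedy_map_decode(design, 40, reads, depth=400, err=0.005, prior=0.01)
print("true carriers:", sorted(truth), "| decoded:", decoded)
```

A greedy search of this kind can fail when two samples share the same pool set or when several carrier combinations explain the counts equally well, which is one reason the design parameters matter.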

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bc42/4229885/3353f9031a34/1471-2105-15-195-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bc42/4229885/48d319bf6e2f/1471-2105-15-195-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bc42/4229885/72a00826eabc/1471-2105-15-195-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bc42/4229885/2a06816e13cd/1471-2105-15-195-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bc42/4229885/beda3e05a49e/1471-2105-15-195-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bc42/4229885/51c412eeeb84/1471-2105-15-195-6.jpg

Similar articles

1
Identifying rare variants with optimal depth of coverage and cost-effective overlapping pool sequencing.
Genet Epidemiol. 2013 Dec;37(8):820-30. doi: 10.1002/gepi.21769. Epub 2013 Oct 28.
2
Pooled-DNA Sequencing for Elucidating New Genomic Risk Factors, Rare Variants Underlying Alzheimer's Disease.
Methods Mol Biol. 2016;1303:299-314. doi: 10.1007/978-1-4939-2627-5_18.
3
Rare variant discovery and calling by sequencing pooled samples with overlaps.
Bioinformatics. 2013 Jan 1;29(1):29-38. doi: 10.1093/bioinformatics/bts645. Epub 2012 Oct 27.
4
Weighted pooling--practical and cost-effective techniques for pooled high-throughput sequencing.
Bioinformatics. 2012 Jun 15;28(12):i197-206. doi: 10.1093/bioinformatics/bts208.
5
SNP calling by sequencing pooled samples.
BMC Bioinformatics. 2012 Sep 20;13:239. doi: 10.1186/1471-2105-13-239.
6
Efficient identification of rare variants in large populations: deep re-sequencing the CRP locus in the CARDIA study.
Nucleic Acids Res. 2013 Apr;41(7):e85. doi: 10.1093/nar/gkt092. Epub 2013 Feb 13.
7
A unified approach for allele frequency estimation, SNP detection and association studies based on pooled sequencing data using EM algorithms.
BMC Genomics. 2013;14 Suppl 1(Suppl 1):S1. doi: 10.1186/1471-2164-14-S1-S1. Epub 2013 Jan 21.
8
Effective discovery of rare variants by pooled target capture sequencing: A comparative analysis with individually indexed target capture sequencing.
Mutat Res. 2018 May;809:24-31. doi: 10.1016/j.mrfmmm.2018.03.007. Epub 2018 Mar 30.
9
Detection of rare genomic variants from pooled sequencing using SPLINTER.
J Vis Exp. 2012 Jun 23(64):3943. doi: 10.3791/3943.

Cited by

1
A joint use of pooling and imputation for genotyping SNPs.
BMC Bioinformatics. 2022 Oct 13;23(1):421. doi: 10.1186/s12859-022-04974-7.
2
High-throughput estimation of allele frequencies using combined pooled-population sequencing and haplotype-based data processing.
Plant Methods. 2022 Mar 21;18(1):34. doi: 10.1186/s13007-022-00852-8.
3
An accurate clone-based haplotyping method by overlapping pool sequencing.
Nucleic Acids Res. 2016 Jul 8;44(12):e112. doi: 10.1093/nar/gkw284. Epub 2016 Apr 19.
