DNA Sudoku--harnessing high-throughput sequencing for multiplexed specimen analysis.

Authors

Erlich Yaniv, Chang Kenneth, Gordon Assaf, Ronen Roy, Navon Oron, Rooks Michelle, Hannon Gregory J

Affiliation

Watson School of Biological Sciences, Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA.

Publication information

Genome Res. 2009 Jul;19(7):1243-53. doi: 10.1101/gr.092957.109. Epub 2009 May 15.

DOI:10.1101/gr.092957.109

PMID:19447965

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2704425/

Abstract

Next-generation sequencers have sufficient power to analyze simultaneously DNAs from many different specimens, a practice known as multiplexing. Such schemes rely on the ability to associate each sequence read with the specimen from which it was derived. The current practice of appending molecular barcodes prior to pooling is practical for parallel analysis of up to many dozen samples. Here, we report a strategy that permits simultaneous analysis of tens of thousands of specimens. Our approach relies on the use of combinatorial pooling strategies in which pools rather than individual specimens are assigned barcodes. Thus, the identity of each specimen is encoded within the pooling pattern rather than by its association with a particular sequence tag. Decoding the pattern allows the sequence of an original specimen to be inferred with high confidence. We verified the ability of our encoding and decoding strategies to accurately report the sequence of individual samples within a large number of mixed specimens in two ways. First, we simulated data both from a clone library and from a human population in which a sequence variant associated with cystic fibrosis was present. Second, we actually pooled, sequenced, and decoded identities within two sets of 40,000 bacterial clones comprising approximately 20,000 different artificial microRNAs targeting Arabidopsis or human genes. We achieved greater than 97% accuracy in these trials. The strategies reported here can be applied to a wide variety of biological problems, including the determination of genotypic variation within large populations of individuals.
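
The idea of encoding specimen identity in the pooling pattern can be illustrated with a minimal sketch. The abstract does not specify the pooling design, so the modular rule used here (specimen `s` goes into pool `s mod w` for each of several pairwise coprime window sizes `w`) is an assumption, as are the function names `pool_ids`, `encode`, and `decode` and the toy sequence labels. Only pools would carry barcodes; a specimen is decoded as the set of sequences seen in every pool it was placed in.

```python
from collections import defaultdict


def pool_ids(specimen, windows):
    """Return the pools a specimen is placed in: one per window size w,
    identified by (w, specimen mod w). Assumed design, not from the abstract."""
    return [(w, specimen % w) for w in windows]


def encode(sequences, windows):
    """Simulate pooling: map each pool to the set of sequences it contains.

    sequences: dict mapping specimen index -> sequence (toy string labels).
    """
    pools = defaultdict(set)
    for specimen, seq in sequences.items():
        for pool in pool_ids(specimen, windows):
            pools[pool].add(seq)
    return pools


def decode(pools, n_specimens, windows):
    """Infer each specimen's sequence(s) as those present in all of its pools."""
    calls = {}
    for specimen in range(n_specimens):
        observed = [pools[p] for p in pool_ids(specimen, windows)]
        calls[specimen] = set.intersection(*observed)
    return calls


if __name__ == "__main__":
    windows = (5, 7, 9)  # pairwise coprime, so patterns are unique below 5*7*9
    sequences = {i: f"seq{i}" for i in range(30)}
    pools = encode(sequences, windows)
    calls = decode(pools, len(sequences), windows)
    print(len(pools), "barcoded pools for", len(sequences), "specimens")
    print(calls[17])
```

In this toy case every specimen carries a distinct sequence, so each one decodes unambiguously. When several specimens share the same sequence, an intersection-based decoder like this one can return extra candidates, which is one reason a real decoder would need to be more careful than this sketch.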

Similar articles

Cloning the mouse homolog of the human cystic fibrosis transmembrane conductance regulator gene.

Genomics. 1991 Jun;10(2):301-7. doi: 10.1016/0888-7543(91)90312-3.

Comparison of next generation sequencing technologies for transcriptome characterization.

BMC Genomics. 2009 Aug 1;10:347. doi: 10.1186/1471-2164-10-347.

Insertion of natural intron 6a-6b into a human cDNA-derived gene therapy vector for cystic fibrosis improves plasmid stability and permits facile RNA/DNA discrimination.

J Gene Med. 1999 Sep-Oct;1(5):312-21. doi: 10.1002/(SICI)1521-2254(199909/10)1:5<312::AID-JGM55>3.0.CO;2-#.

Microarray analysis in cystic fibrosis.

J Cyst Fibros. 2004 Aug;3 Suppl 2:29-33. doi: 10.1016/j.jcf.2004.05.006.

Analysis of 31 CFTR mutations by polymerase chain reaction/oligonucleotide ligation assay in a pilot screening of 4476 newborns for cystic fibrosis.

J Med Screen. 1999;6(2):67-9. doi: 10.1136/jms.6.2.67.

High-throughput SuperSAGE for digital gene expression analysis of multiple samples using next generation sequencing.

PLoS One. 2010 Aug 6;5(8):e12010. doi: 10.1371/journal.pone.0012010.

A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing.

Nucleic Acids Res. 2007;35(19):e130. doi: 10.1093/nar/gkm760. Epub 2007 Oct 11.

Genetic variation within the ovine cystic fibrosis transmembrane conductance regulator gene.

Mutat Res. 1998 May;382(3-4):93-8. doi: 10.1016/s1383-5726(97)00012-5.

Cystic fibrosis mutation detection by hybridization to light-generated DNA probe arrays.

Hum Mutat. 1996;7(3):244-55. doi: 10.1002/(SICI)1098-1004(1996)7:3<244::AID-HUMU9>3.0.CO;2-A.

Cited by

Combining whole genome sequencing and non-adaptive group testing for large-scale ethnicity screens.

BMC Bioinformatics. 2025 Jul 24;26(1):192. doi: 10.1186/s12859-025-06192-3.

An exciting future for microbial molecular biology and physiology.

mBio. 2025 Aug 13;16(8):e0069425. doi: 10.1128/mbio.00694-25. Epub 2025 Jun 30.

Optimized Replication of Arrayed Bacterial Mutant Libraries Increases Access to Biological Resources.

Microbiol Spectr. 2023 Aug 17;11(4):e0169323. doi: 10.1128/spectrum.01693-23. Epub 2023 Jul 11.

Optimized replication of arrayed bacterial mutant libraries increase access to biological resources.

bioRxiv. 2023 Apr 25:2023.04.25.537918. doi: 10.1101/2023.04.25.537918.

Best Practices in Designing, Sequencing, and Identifying Random DNA Barcodes.

J Mol Evol. 2023 Jun;91(3):263-280. doi: 10.1007/s00239-022-10083-z. Epub 2023 Jan 18.

Performance Analysis of Electromyogram Signal Compression Sampling in a Wireless Body Area Network.

Micromachines (Basel). 2022 Oct 15;13(10):1748. doi: 10.3390/mi13101748.

A joint use of pooling and imputation for genotyping SNPs.

BMC Bioinformatics. 2022 Oct 13;23(1):421. doi: 10.1186/s12859-022-04974-7.

Barcoded bulk QTL mapping reveals highly polygenic and epistatic architecture of complex traits in yeast.

Elife. 2022 Feb 11;11:e73983. doi: 10.7554/eLife.73983.

Unified platform for genetic and serological detection of COVID-19 with single-molecule technology.

PLoS One. 2021 Jul 26;16(7):e0255096. doi: 10.1371/journal.pone.0255096. eCollection 2021.

Complex natural product production methods and options.

Synth Syst Biotechnol. 2021 Jan 5;6(1):1-11. doi: 10.1016/j.synbio.2020.12.001. eCollection 2021 Mar.

References

Quantification of rare allelic variants from pooled genomic DNA.

Nat Methods. 2009 Apr;6(4):263-5. doi: 10.1038/nmeth.1307. Epub 2009 Mar 1.

Real-time DNA sequencing from single polymerase molecules.

Science. 2009 Jan 2;323(5910):133-8. doi: 10.1126/science.1162986. Epub 2008 Nov 20.

Accurate whole human genome sequencing using reversible terminator chemistry.

Nature. 2008 Nov 6;456(7218):53-9. doi: 10.1038/nature07517.

Identification of genetic variants using bar-coded multiplexed sequencing.

Nat Methods. 2008 Oct;5(10):887-93. doi: 10.1038/nmeth.1251. Epub 2008 Sep 14.

Multiplex sequencing of plant chloroplast genomes using Solexa sequencing-by-synthesis technology.

Nucleic Acids Res. 2008 Nov;36(19):e122. doi: 10.1093/nar/gkn502. Epub 2008 Aug 27.

Alta-Cyclic: a self-optimizing base caller for next-generation sequencing.

Nat Methods. 2008 Aug;5(8):679-82. doi: 10.1038/nmeth.1230. Epub 2008 Jul 6.

Error-correcting barcoded primers for pyrosequencing hundreds of samples in multiplex.

Nat Methods. 2008 Mar;5(3):235-7. doi: 10.1038/nmeth.1184. Epub 2008 Feb 10.

Anticipating the 1,000 dollar genome.

Genome Biol. 2006;7(7):112. doi: 10.1186/gb-2006-7-7-112.

Production of complex nucleic acid libraries using highly parallel in situ oligonucleotide synthesis.

Nat Methods. 2004 Dec;1(3):241-8. doi: 10.1038/nmeth724. Epub 2004 Nov 18.

BLAT--the BLAST-like alignment tool.

Genome Res. 2002 Apr;12(4):656-64. doi: 10.1101/gr.229202.
