DNA Sudoku--harnessing high-throughput sequencing for multiplexed specimen analysis.

Author information

Erlich Yaniv, Chang Kenneth, Gordon Assaf, Ronen Roy, Navon Oron, Rooks Michelle, Hannon Gregory J

Affiliations

Watson School of Biological Sciences, Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA.

Publication information

Genome Res. 2009 Jul;19(7):1243-53. doi: 10.1101/gr.092957.109. Epub 2009 May 15.

Abstract

Next-generation sequencers have sufficient power to analyze simultaneously DNAs from many different specimens, a practice known as multiplexing. Such schemes rely on the ability to associate each sequence read with the specimen from which it was derived. The current practice of appending molecular barcodes prior to pooling is practical for parallel analysis of up to many dozen samples. Here, we report a strategy that permits simultaneous analysis of tens of thousands of specimens. Our approach relies on the use of combinatorial pooling strategies in which pools rather than individual specimens are assigned barcodes. Thus, the identity of each specimen is encoded within the pooling pattern rather than by its association with a particular sequence tag. Decoding the pattern allows the sequence of an original specimen to be inferred with high confidence. We verified the ability of our encoding and decoding strategies to accurately report the sequence of individual samples within a large number of mixed specimens in two ways. First, we simulated data both from a clone library and from a human population in which a sequence variant associated with cystic fibrosis was present. Second, we actually pooled, sequenced, and decoded identities within two sets of 40,000 bacterial clones comprising approximately 20,000 different artificial microRNAs targeting Arabidopsis or human genes. We achieved greater than 97% accuracy in these trials. The strategies reported here can be applied to a wide variety of biological problems, including the determination of genotypic variation within large populations of individuals.
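The encoding idea in the abstract, barcoding pools and reading a specimen's identity from its pooling pattern, can be illustrated with a small sketch. The abstract does not specify the pooling design, so the modular scheme below (specimen `s` goes into pool `s % w` for each of several pairwise-coprime window sizes `w`) is an assumption made for illustration. The names `build_pools`, `positive_pools`, and `decode`, and the values `N_SPECIMENS` and `WINDOWS`, are hypothetical.

```python
from itertools import combinations
from math import gcd

# Hypothetical parameters chosen for illustration only.
N_SPECIMENS = 50
WINDOWS = (7, 8, 9)  # pairwise-coprime pooling windows (an assumption)


def build_pools(n_specimens, windows):
    """Map each barcoded pool (window index, residue) to its specimens."""
    pools = {}
    for specimen in range(n_specimens):
        for wi, w in enumerate(windows):
            pools.setdefault((wi, specimen % w), set()).add(specimen)
    return pools


def positive_pools(carriers, windows):
    """Pools in which a variant carried by `carriers` would be observed."""
    return {(wi, c % w) for c in carriers for wi, w in enumerate(windows)}


def decode(observed, n_specimens, windows):
    """Specimens whose pools, one per window, all show the variant."""
    return [s for s in range(n_specimens)
            if all((wi, s % w) in observed for wi, w in enumerate(windows))]


# The windows must share no common factor for the patterns to be unique.
assert all(gcd(a, b) == 1 for a, b in combinations(WINDOWS, 2))

pools = build_pools(N_SPECIMENS, WINDOWS)
observed = positive_pools({37}, WINDOWS)
print(len(pools))                              # 24 barcodes for 50 specimens
print(decode(observed, N_SPECIMENS, WINDOWS))  # [37]
```

In this toy setting, 24 barcoded pools stand in for 50 individually barcoded specimens, and the single carrier is recovered from the pattern of pools in which the variant appears. When several specimens carry the same variant, this simple intersection rule may also return non-carriers, so a practical decoder would need to weigh additional evidence.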
