Cronn Richard, Liston Aaron, Parks Matthew, Gernandt David S, Shen Rongkun, Mockler Todd
Pacific Northwest Research Station, USDA Forest Service, Corvallis, OR 97331, USA.
Nucleic Acids Res. 2008 Nov;36(19):e122. doi: 10.1093/nar/gkn502. Epub 2008 Aug 27.
Organellar DNA sequences are widely used in evolutionary and population genetic studies; however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to simultaneously sequence multiple genomes using the Illumina Genome Analyzer. We PCR-amplified plastomes of approximately 120 kb from eight species (seven Pinus, one Picea) in 35 reactions. Pooled products were ligated to modified adapters that included 3 bp indexing tags, and samples were multiplexed at four genomes per lane. Tagged microreads were assembled by de novo and reference-guided assembly methods, using previously published Pinus plastomes as surrogate references. Assemblies for these eight genomes are estimated at 88-94% complete, with an average sequence depth of 55x to 186x. Mononucleotide repeats interrupt contig assembly with increasing repeat length, and we estimate that the limit for their assembly is 16 bp. Comparisons to 37 kb of Sanger sequence show a validated error rate of 0.056%, and conspicuous errors are evident from the assembly process. This efficient sequencing approach yields high-quality draft genomes and should have immediate applicability to genomes with comparable complexity.
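The tag-based multiplexing and the mononucleotide-repeat limit described in the abstract can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the tag sequences, the sample names, and the function names `demultiplex` and `long_mononucleotide_runs` are hypothetical, the tag is assumed to sit at the start of each read, and only exact tag matches are assigned.

```python
import re
from collections import defaultdict

# Hypothetical 3 bp index tags, one per genome in a lane of four.
# These are illustrative and are NOT the tags used in the study.
TAGS = {"AAT": "sample_1", "CCG": "sample_2", "GGA": "sample_3", "TTC": "sample_4"}
TAG_LEN = 3


def demultiplex(reads, tags=TAGS):
    """Group reads by their leading 3 bp tag and trim the tag off.

    Assumes the tag is the first TAG_LEN bases of the read. Reads whose
    prefix matches no tag exactly are kept untrimmed under 'unassigned'.
    """
    bins = defaultdict(list)
    for read in reads:
        sample = tags.get(read[:TAG_LEN].upper())
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(read[TAG_LEN:])
    return dict(bins)


def long_mononucleotide_runs(seq, min_len=16):
    """Return (start, length, base) for each homopolymer run of at least min_len bases.

    The default of 16 mirrors the assembly limit estimated in the abstract;
    whether the cut-off is inclusive is an assumption of this sketch.
    """
    runs = []
    for match in re.finditer(r"A+|C+|G+|T+", seq.upper()):
        length = len(match.group())
        if length >= min_len:
            runs.append((match.start(), length, match.group()[0]))
    return runs
```

For example, `demultiplex(["AATACGT", "NNNACGT"])` places the trimmed read `ACGT` under `sample_1` and leaves `NNNACGT` unassigned, while `long_mononucleotide_runs` can be run over a reference plastome to list the repeats most likely to break contigs.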