Tyler Andrea D, Christianson Sara, Knox Natalie C, Mabon Philip, Wolfe Joyce, Van Domselaar Gary, Graham Morag R, Sharma Meenu K
National Microbiology Laboratory, National Reference Centre for Mycobacteriology, Public Health Agency of Canada, Winnipeg, Manitoba, Canada.
Science Technology Cores & Services Division, National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, Manitoba, Canada.
PLoS One. 2016 Feb 5;11(2):e0148676. doi: 10.1371/journal.pone.0148676. eCollection 2016.
The advent and widespread application of next-generation sequencing (NGS) technologies to the study of microbial genomes have led to a substantial increase in the number of studies that apply whole genome sequencing (WGS) to microbial genomic epidemiology. However, microorganisms such as Mycobacterium tuberculosis (MTB) present particular problems for sequencing and downstream analysis because of their distinctive physiology and the composition of their genomes. In this study, we compare the quality of sequence data generated using the Nextera XT and TruSeq preparation kits for library construction prior to Illumina sequencing-by-synthesis. Our results confirm that MTB NGS data quality is highly dependent on the purity of the DNA sample submitted for sequencing and on its guanine-cytosine content (GC-content). Our data additionally demonstrate that the choice of library preparation method plays an important role in mitigating downstream sequencing quality issues. Importantly for MTB, the Illumina TruSeq library preparation kit produces more uniform data quality than the Nextera XT method, regardless of the quality of the input DNA. Furthermore, the Nextera XT method commonly misses specific genomic sequence motifs, as well as regions of especially high GC-content relative to the rest of the MTB genome. Because coverage bias is highly undesirable, this study illustrates the importance of selecting an appropriate protocol when performing NGS studies, so that sound inferences can be made regarding mycobacterial genomes.
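The relationship between GC-content and coverage uniformity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline. The function names (`gc_fraction`, `windowed_gc_and_depth`, `coverage_cv`), the window size, and the input layout (a reference sequence string plus a per-base depth list of the same length, for example exported by a tool such as `samtools depth`) are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def gc_fraction(seq):
    """Return the fraction of G and C bases in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def windowed_gc_and_depth(reference, depth, window=100):
    """Pair GC fraction with mean read depth for each non-overlapping full window.

    `reference` is the reference sequence and `depth` the per-base read depth;
    both are hypothetical inputs and must have the same length.
    """
    if len(reference) != len(depth):
        raise ValueError("reference and depth must have the same length")
    rows = []
    for start in range(0, len(reference) - window + 1, window):
        end = start + window
        rows.append((gc_fraction(reference[start:end]), mean(depth[start:end])))
    return rows


def coverage_cv(depth):
    """Coefficient of variation of depth; lower values mean more uniform coverage."""
    average = mean(depth)
    return pstdev(depth) / average if average else float("inf")
```

Under these assumptions, running `windowed_gc_and_depth` on the depth profiles from two library preparations of the same isolate and comparing mean depth in the highest-GC windows, together with `coverage_cv` for each profile, gives a rough view of whether one method under-covers GC-rich regions.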