Van den Hoecke Silvie, Verhelst Judith, Vuylsteke Marnik, Saelens Xavier
Department of Medical Protein Research, VIB, B-9052, Ghent, Belgium.
Department of Biomedical Molecular Biology, Ghent University, B-9052, Ghent, Belgium.
BMC Genomics. 2015 Feb 14;16(1):79. doi: 10.1186/s12864-015-1284-z.
Influenza viruses exist as a large group of closely related viral genomes, also called quasispecies. The composition of this influenza viral quasispecies can be determined by an accurate and sensitive sequencing technique and data analysis pipeline. We compared the suitability of two benchtop next-generation sequencers for whole genome influenza A quasispecies analysis: the Illumina MiSeq sequencing-by-synthesis and the Ion Torrent PGM semiconductor sequencing technique.
We first compared the accuracy and sensitivity of both sequencers using plasmid DNA and different ratios of wild type and mutant plasmid. Illumina MiSeq sequencing reads were one and a half times more accurate than those of the Ion Torrent PGM. The majority of sequencing errors were substitutions on the Illumina MiSeq and insertions and deletions, mostly in homopolymer regions, on the Ion Torrent PGM. To evaluate the suitability of the two techniques for determining the genome diversity of influenza A virus, we generated plasmid-derived PR8 virus and grew this virus in vitro. We also optimized an RT-PCR protocol to obtain uniform coverage of all eight genomic RNA segments. The sequencing reads obtained with both sequencers could successfully be assembled de novo into the segmented influenza virus genome. After mapping the reads to the reference genome, we found that reliable detection of variants in the viral genome required a variant frequency of 0.5% or higher. This threshold exceeds the background error rate resulting from the RT-PCR reaction and the sequencing method. Most of the variants in the PR8 virus genome were present in hemagglutinin, and these mutations were detected by both sequencers.
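The 0.5% frequency threshold described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `call_variants`, the per-position base-count input format, and the `min_depth` coverage cutoff are all assumptions made for illustration. Only the 0.5% frequency cutoff comes from the text.

```python
from collections import Counter

# 0.5% frequency cutoff reported in the abstract.
MIN_FREQUENCY = 0.005


def call_variants(reference, pileup, min_frequency=MIN_FREQUENCY, min_depth=1000):
    """Report non-reference bases whose frequency meets the cutoff.

    reference: reference sequence as a string.
    pileup: one Counter per reference position, mapping base -> read count
            (hypothetical input format, assumed for this sketch).
    min_depth: assumed coverage floor; not a value taken from the paper.
    Returns a list of (position, ref_base, alt_base, frequency), 1-based.
    """
    variants = []
    for pos, (ref_base, counts) in enumerate(zip(reference, pileup), start=1):
        depth = sum(counts.values())
        if depth < min_depth:
            # Too few reads to estimate a low frequency reliably.
            continue
        for base, n in sorted(counts.items()):
            if base == ref_base:
                continue
            freq = n / depth
            if freq >= min_frequency:
                variants.append((pos, ref_base, base, freq))
    return variants


# Toy data: position 1 carries a 1% variant, position 2 a 0.1% variant
# (below the cutoff), and position 3 has insufficient coverage.
example_reference = "ACG"
example_pileup = [
    Counter({"A": 9900, "G": 100}),
    Counter({"C": 9990, "T": 10}),
    Counter({"G": 50, "A": 50}),
]
example_calls = call_variants(example_reference, example_pileup)
print(example_calls)
```

In this toy input only the position 1 variant is reported. A fixed cutoff is the simplest possible rule; it treats every position alike and does not model the platform-specific error profiles (substitutions versus homopolymer indels) mentioned above.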
Our approach underlines the power and limitations of two commonly used next-generation sequencers for the analysis of influenza virus gene diversity. We conclude that the Illumina MiSeq platform is better suited for detecting variant sequences whereas the Ion Torrent PGM platform has a shorter turnaround time. The data analysis pipeline that we propose here will also help to standardize variant calling in small RNA genomes based on next-generation sequencing data.