Faculty of Veterinary Medicine, Department of Virology, Parasitology and Immunology, Ghent University, Salisburylaan 133, 9820, Merelbeke, Belgium.
PathoSense, Merelbeke, Belgium.
BMC Bioinformatics. 2020 Nov 11;21(1):517. doi: 10.1186/s12859-020-03856-0.
Implementing third-generation sequencing for whole genome sequencing (WGS) all-in-one diagnostics in human and veterinary medicine requires the rapid and accurate generation of consensus genomes. In recent years, Oxford Nanopore Technologies (ONT) has released various new devices (e.g. the Flongle R9.4.1 flow cell) and bioinformatics tools (e.g. the Bonito basecaller, released in 2019), allowing cheap and user-friendly introduction into various NGS workflows. While single-read accuracy, overall consensus accuracy, and completeness of genome sequences have improved dramatically, further improvements are required when working with infrequently sequenced organisms such as Mycoplasma bovis. Because M. bovis is an important primary respiratory pathogen in cattle, rapid diagnostics are crucial for timely and targeted disease control and prevention. Current complete diagnostics (including identification, strain typing, and antimicrobial resistance (AMR) detection) require combined culture-based and molecular approaches, of which the former can take 1-2 weeks. At present, cheap and quick long-read all-in-one WGS approaches can only be implemented if higher accuracy and genome completeness can be obtained.
Here, a taxon-specific, custom-trained Bonito v.0.1.3 basecalling model (custom-pg45) was implemented in various WGS assembly bioinformatics pipelines. Using MinION sequencing data, we showed improved consensus accuracies of up to Q45.2 and Q46.7 for reference-based and Canu de novo assembled M. bovis genomes, respectively. Furthermore, the custom-pg45 model resulted in a mean consensus accuracy of Q45.0 and a genome completeness of 94.6% for nine M. bovis field strains. Improvements were also observed for the single-use Flongle sequencer (mean accuracy of Q36.0 and 80.3% genome completeness).
These results indicate that taxon-specific basecalling of MinION and single-use Flongle Nanopore long reads is of great value for rapid all-in-one WGS tools, as shown here for Mycoplasma bovis.
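To put the reported Q-values in perspective, the sketch below converts them to per-base error rates, assuming they follow the standard Phred scale (Q = -10 log10 of the error probability). The function names and the 1,000,000 bp genome length are illustrative assumptions, not values taken from the study.

```python
import math


def phred_to_error_rate(q: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-q / 10)


def error_rate_to_phred(p: float) -> float:
    """Convert a per-base error probability back to a Phred-scaled quality value."""
    return -10 * math.log10(p)


def expected_errors(q: float, genome_length: int) -> float:
    """Expected number of erroneous bases in a consensus of the given length."""
    return phred_to_error_rate(q) * genome_length


if __name__ == "__main__":
    # Hypothetical round genome length, used only for illustration.
    ILLUSTRATIVE_GENOME_LENGTH = 1_000_000

    # Q-values quoted in the abstract.
    reported = {
        "MinION, reference-based": 45.2,
        "MinION, Canu de novo": 46.7,
        "MinION, nine field strains (mean)": 45.0,
        "Flongle (mean)": 36.0,
    }
    for label, q in reported.items():
        rate = phred_to_error_rate(q)
        errors = expected_errors(q, ILLUSTRATIVE_GENOME_LENGTH)
        print(f"{label}: Q{q} -> error rate {rate:.2e}, "
              f"about {errors:.0f} errors per {ILLUSTRATIVE_GENOME_LENGTH:,} bp")
```

Under this assumption, each 10-point gain in Q corresponds to a tenfold drop in error rate, so the gap between roughly Q36 and Q45 is close to an order of magnitude in consensus errors.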