David Matei, Dursi L J, Yao Delia, Boutros Paul C, Simpson Jared T
Ontario Institute for Cancer Research, Toronto M5G 0A3, Canada.
Department of Pharmacology and Toxicology, University of Toronto, Toronto M5S 1A8, Canada.
Bioinformatics. 2017 Jan 1;33(1):49-55. doi: 10.1093/bioinformatics/btw569. Epub 2016 Sep 10.
The highly portable Oxford Nanopore MinION sequencer has enabled new applications of genome sequencing directly in the field. However, the MinION currently relies on a cloud computing platform, Metrichor (metrichor.com), for translating locally generated sequencing data into basecalls.
To allow offline and private analysis of MinION data, we created Nanocall. Nanocall is the first freely available, open-source basecaller for Oxford Nanopore sequencing data and does not require an internet connection. Using R7.3 chemistry on two E. coli and two human samples, with natural as well as PCR-amplified DNA, Nanocall reads have ∼68% identity, directly comparable to Metrichor '1D' data. Nanocall is also efficient and fully parallelized, processing ∼2500 Kbp of sequence per core hour at its fastest settings; on a 4-core desktop computer, it could basecall a MinION sequencing run in real time. Metrichor can also integrate the '1D' sequencing of the template and complement strands of a single DNA molecule into a '2D' read. Nanocall does not currently support this, and adding the capability will be an important future development. In summary, Nanocall is the first open-source, freely available, offline basecaller for Oxford Nanopore sequencing data.
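The real-time claim rests on simple throughput arithmetic, which can be sketched as below. This is only an illustration: the function names are hypothetical, perfectly linear scaling across cores is an assumption (the abstract states only that Nanocall is fully parallelized), and the sequencer output rates used in the checks are made-up placeholders, not measured MinION figures.

```python
def basecalling_capacity_kbp_per_hour(kbp_per_core_hour: float, cores: int) -> float:
    """Aggregate basecalling capacity, ASSUMING perfectly linear scaling over cores."""
    return kbp_per_core_hour * cores


def keeps_up(sequencer_kbp_per_hour: float,
             kbp_per_core_hour: float = 2500.0,
             cores: int = 4) -> bool:
    """True if the assumed capacity is at least the sequencer's output rate.

    The defaults are the figures quoted in the abstract (~2500 Kbp per core
    hour, 4-core desktop). sequencer_kbp_per_hour is a hypothetical input
    that must come from the actual run.
    """
    capacity = basecalling_capacity_kbp_per_hour(kbp_per_core_hour, cores)
    return capacity >= sequencer_kbp_per_hour


# Capacity of a 4-core desktop at the fastest settings quoted in the abstract.
capacity = basecalling_capacity_kbp_per_hour(2500.0, 4)
print(capacity)
```

Under these assumptions, basecalling keeps pace with any run that produces no more than about 10,000 Kbp of sequence per hour; a faster run would need more cores or would fall behind.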
Nanocall is available at github.com/mateidavid/nanocall, released under the MIT license.
Contact: matei.david@oicr.on.ca
Supplementary information: Supplementary data are available at Bioinformatics online.