Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles and Vrije Universiteit Brussel, Triomflaan CP 263, 1050 Brussels, Belgium.
Nucleic Acids Res. 2017 Feb 28;45(4):e18. doi: 10.1093/nar/gkw955.
Advances in next-generation sequencing (NGS) technology have led to the development of many assembly algorithms, but few of them focus on assembling organelle genomes. These genomes are used in phylogenetic studies and food identification, and they are the most frequently deposited eukaryotic genomes in GenBank. Producing organelle genome assemblies from whole genome sequencing (WGS) data would be the most accurate and least laborious approach, but a tool specifically designed for this task is lacking. We developed NOVOPlasty, a seed-and-extend algorithm that assembles organelle genomes from WGS data, starting from a single related or distant seed sequence. The algorithm has been tested on several new (Gonioctena intermedia and Avicennia marina) and public (Arabidopsis thaliana and Oryza sativa) whole genome Illumina data sets, where it outperforms known assemblers in assembly accuracy and coverage. In our benchmark, NOVOPlasty assembled all tested circular genomes in less than 30 min with a maximum memory requirement of 16 GB and an accuracy over 99.99%. In conclusion, NOVOPlasty is the sole de novo assembler that provides a fast and straightforward extraction of the extranuclear genomes from WGS data in one circular high-quality contig. The software is open source and can be downloaded at https://github.com/ndierckx/NOVOPlasty.
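To make the seed-and-extend idea concrete, the following is a minimal toy sketch in Python. It is not NOVOPlasty's implementation: the function names (`build_index`, `seed_and_extend`), the k-mer voting scheme, and the parameters `k` and `max_len` are all assumptions introduced for illustration. The sketch assumes error-free reads, extends in one direction only, and requires the end of the seed to match the reads exactly, so it does not model the use of a distant seed. It indexes every k-mer in the reads together with the base that follows it, repeatedly appends the most frequently observed next base to the growing contig, and reports the contig as circular once its end overlaps its own start.

```python
from collections import Counter, defaultdict


def build_index(reads, k):
    """Map each k-mer found in the reads to a Counter of the bases that follow it."""
    index = defaultdict(Counter)
    for read in reads:
        for i in range(len(read) - k):
            index[read[i:i + k]][read[i + k]] += 1
    return index


def seed_and_extend(seed, reads, k=8, max_len=100_000):
    """Greedily extend `seed` to the right, one base at a time.

    Returns (contig, is_circular). Toy assumptions: error-free reads,
    one strand only, and a seed whose last k bases occur in the reads.
    """
    if len(seed) < k:
        raise ValueError("seed must be at least k bases long")
    index = build_index(reads, k)
    contig = seed
    while len(contig) < max_len:
        votes = index.get(contig[-k:])
        if not votes:
            # No read continues past the current end of the contig.
            return contig, False
        # Append the base most often seen after the current k-mer.
        contig += votes.most_common(1)[0][0]
        # Circular closure: the contig's end now repeats its own start.
        if len(contig) >= len(seed) + k and contig[-k:] == contig[:k]:
            return contig[:-k], True
    return contig, False
```

With reads that tile a small circular sequence at every position, this sketch returns a rotation of that sequence beginning at the seed, together with a flag indicating that the contig closed on itself. Real WGS data would additionally call for handling sequencing errors, reverse-complement reads, repeats, and nuclear reads that share k-mers with the organelle genome, none of which is attempted here.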