Soorni Aboozar, Haak David, Zaitlin David, Bombarely Aureliano
Department of Horticulture, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061, USA.
Department of Horticulture, Faculty of Horticultural Sciences and Plant Protection, University of Tehran, Karaj, 31587, Iran.
BMC Genomics. 2017 Jan 7;18(1):49. doi: 10.1186/s12864-016-3412-9.
The development of long-read sequencing technologies, such as single-molecule real-time (SMRT) sequencing by PacBio, has revolutionized the sequencing of small genomes. Sequencing organelle genomes using PacBio long-read data is a cost-effective, straightforward approach. Nevertheless, few simple-to-use programs are currently available to perform the assembly from raw reads.
We present Organelle-PBA, a Perl program designed specifically for the assembly of chloroplast and mitochondrial genomes. For chloroplast genomes, the program selects the chloroplast reads from a whole-genome sequencing pool, maps the reads to a reference sequence from a closely related species, and then performs read correction and de novo assembly using Sprai. Organelle-PBA completes the assembly process with an additional scaffolding step using SSPACE-LongRead. The program then detects the chloroplast inverted repeats, and reassembles and re-orients the assembly based on the organelle origin of the reference. We evaluated the performance of the software using PacBio reads from different species, at different levels of read coverage, and with different reference genomes. Finally, we present the assembly of two novel chloroplast genomes from the species Picea glauca (Pinaceae) and Sinningia speciosa (Gesneriaceae).
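The inverted-repeat detection step mentioned above can be illustrated with a toy sketch. This is not how Organelle-PBA implements it (the abstract does not describe the method); it is a minimal seed-and-extend search, assuming a linear contig, exact matches only, and an uppercase A/C/G/T/N alphabet. The function names `reverse_complement` and `find_inverted_repeat` and the seed length `k` are hypothetical, introduced only for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def find_inverted_repeat(seq, k=8):
    """Find the longest exact inverted repeat in a linear sequence.

    Returns (start_a, start_b, length) such that
    seq[start_a:start_a + length] equals the reverse complement of
    seq[start_b:start_b + length], with copy A lying entirely before
    copy B. Returns None if no repeat of at least k bases exists.
    Illustrative assumption: exact matching, no wrap-around.
    """
    n = len(seq)
    rc = reverse_complement(seq)

    # Index every k-mer of the reverse-complemented sequence.
    index = {}
    for j in range(n - k + 1):
        index.setdefault(rc[j:j + k], []).append(j)

    best = None
    for i in range(n - k + 1):
        for j in index.get(seq[i:i + k], ()):
            # Skip seeds that are not the left end of a maximal match.
            if i > 0 and j > 0 and seq[i - 1] == rc[j - 1]:
                continue
            # Extend the seed to the right while bases keep matching.
            length = k
            while (i + length < n and j + length < n
                   and seq[i + length] == rc[j + length]):
                length += 1
            # Convert the match position on rc back to forward coordinates.
            start_b = n - j - length
            # Keep only pairs where copy A precedes copy B without overlap;
            # this also drops the mirrored duplicate of each pair.
            if start_b < i + length:
                continue
            if best is None or length > best[2]:
                best = (i, start_b, length)
    return best
```

For a synthetic contig built as a large single-copy region, a repeat, a small single-copy region, and the reverse complement of the repeat, `find_inverted_repeat` returns the start of each repeat copy and the repeat length. A real chloroplast assembly would need tolerance for mismatches and for the circular topology, which this sketch deliberately leaves out.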
Organelle-PBA is an easy-to-use Perl-based software pipeline that was written specifically to assemble mitochondrial and chloroplast genomes from whole-genome PacBio reads. The program is available at https://github.com/aubombarely/Organelle_PBA.