Rozov Roye, Brown Kav Aya, Bogumil David, Shterzer Naama, Halperin Eran, Mizrahi Itzhak, Shamir Ron
Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv, Israel.
The Department of Life Sciences & the National Institute for Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
Bioinformatics. 2017 Feb 15;33(4):475-482. doi: 10.1093/bioinformatics/btw651.
Plasmids and other mobile elements are central contributors to microbial evolution and genome innovation. Recently, they have been found to play important roles in antibiotic resistance and in the production of metabolites used in industrial and agricultural applications. However, characterizing them through deep sequencing remains challenging, despite rapid drops in sequencing cost and increases in throughput. Here, we attempt to ameliorate this situation by introducing a new circular element assembly algorithm. It leverages assembly graphs provided by a conventional de novo assembler, together with alignments of paired-end reads, to assemble cyclic sequences likely to be plasmids, phages and other circular elements.
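The idea of extracting cyclic sequences from an assembly graph can be illustrated with a minimal sketch. It is not Recycler's implementation: the toy graph, the function names, the coverage-uniformity criterion and the 0.25 threshold are all assumptions made for illustration, and the paired-end alignment evidence mentioned above is not modelled here.

```python
from statistics import mean, pstdev


def simple_cycles(graph):
    """Enumerate simple cycles of a small directed graph.

    Each cycle is reported once, starting at its smallest node.
    """
    cycles = []

    def dfs(start, node, path, on_path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                cycles.append(list(path))
            elif nxt > start and nxt not in on_path:
                path.append(nxt)
                on_path.add(nxt)
                dfs(start, nxt, path, on_path)
                path.pop()
                on_path.remove(nxt)

    for start in sorted(graph):
        dfs(start, start, [start], {start})
    return cycles


def coverage_cv(cycle, coverage):
    """Coefficient of variation of read coverage along a cycle."""
    values = [coverage[node] for node in cycle]
    return pstdev(values) / mean(values)


def candidate_cycles(graph, coverage, max_cv=0.25):
    """Keep cycles whose coverage is roughly uniform (assumed criterion)."""
    return [c for c in simple_cycles(graph) if coverage_cv(c, coverage) <= max_cv]


# Hypothetical toy assembly graph: contigs are nodes, overlaps are edges.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
coverage = {"a": 50, "b": 52, "c": 48, "d": 10, "e": 90}

print(candidate_cycles(graph, coverage))
```

In this toy input, the cycle through `a`, `b` and `c` has nearly uniform coverage and is kept, while the cycle through `d` and `e` is discarded because its coverage varies widely. Exhaustive cycle enumeration as done here is only practical for very small graphs.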
We introduce Recycler, the first tool that can extract complete circular contigs from isolate microbial genome, plasmidome and metagenome sequence data. We show that Recycler greatly increases the number of true plasmids recovered relative to other approaches while remaining highly accurate. We demonstrate this trend via simulations of plasmidomes, comparisons of predictions with reference data for isolate samples, and assessments of annotation accuracy on metagenome data. In addition, we provide validation by DNA amplification of 77 plasmids predicted by Recycler from the different sequenced samples, in which Recycler showed a mean accuracy of 89% across all data types (isolate, microbiome and plasmidome).
Recycler is available at http://github.com/Shamir-Lab/Recycler.
Supplementary data are available at Bioinformatics online.