DTU Bioengineering, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark.
Institute of Biology, Leiden University, 2333BE, Leiden, Netherlands.
Sci Data. 2024 Nov 21;11(1):1267. doi: 10.1038/s41597-024-04118-x.
This study presents 121 new genomes of spore-forming Bacillales from strains collected globally from a variety of habitats, assembled using Oxford Nanopore long-read and MGI short-read sequences. Bacilli are renowned for their capacity to produce diverse secondary metabolites with uses in agriculture, biotechnology, and medicine. These secondary metabolites are encoded within secondary metabolite biosynthetic gene clusters (smBGCs), which have attracted significant research interest as potential sources of new bioactive compounds. Our dataset includes 62 complete genomes, 2 at chromosome level, and 57 at contig level, with genome sizes ranging from 3.50 Mb to 7.15 Mb. Phylotaxonomic analysis revealed that these genomes span 16 genera, with 69 of them belonging to Bacillus. A total of 1,176 predicted BGCs were identified by in silico genome mining. We anticipate that the open-access data presented here will expand the reported genomic information of spore-forming Bacillales and facilitate a deeper understanding of the genetic basis of their potential for secondary metabolite production.