Korzhenkov A A, Toshchakov S V, Podosokorskaya O A, Patrushev M V, Kublanov I V
National Research Center "Kurchatov Institute", Moscow 123182, Russia.
Winogradsky Institute of Microbiology of Federal Research Centre "Fundamentals of Biotechnology" of the Russian Academy of Sciences, Russia, 117312, Moscow, 60-let Oktyabrya prospect 7/2.
Data Brief. 2020 Sep 24;33:106336. doi: 10.1016/j.dib.2020.106336. eCollection 2020 Dec.
The draft genome sequence of sp. strain 1523vc, a thermophilic bacterium isolated from a hot spring in the Uzon Caldera (Kamchatka, Russia), is presented. The total assembly size was 2,713,207 bp, with a predicted completeness of 99.38%. Structural annotation of the genome revealed 2674 protein-coding genes, 127 pseudogenes and 77 RNA genes. Pangenome analysis of the 7 currently available high-quality spp. genomes, including 1523vc, revealed 4673 gene clusters. Of these, 1130 clusters formed the core genome of the genus. Of the remaining 3543 pangenome gene clusters, 385 were found exclusively in the 1523vc genome. Of the 2801 CDS, 101 were found to encode carbohydrate-active enzymes (CAZymes). The majority of the CAZymes were predicted to be involved in the degradation of beta-linked polysaccharides such as chitin, cellulose and hemicelluloses, reflecting the metabolism of strain 1523vc, which was isolated on cellulose. Five of the 101 CAZyme genes were unique to strain 1523vc: one each from families GH23, GT56 and GH15, and two from family CE9. The draft genome of strain 1523vc was deposited at DDBJ/EMBL/GenBank under the accessions JABEQB000000000, PRJNA629090 and SAMN14766777 for the genome, BioProject and BioSample, respectively.
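The counts reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustrative consistency check and not part of the authors' analysis: the constant and function names are hypothetical, and the reading that the 2801 CDS equal the protein-coding genes plus the pseudogenes is an observation from the numbers, not something the abstract states.

```python
# Counts copied from the abstract; all names here are illustrative.
PAN_CLUSTERS = 4673     # gene clusters across the 7 genomes
CORE_CLUSTERS = 1130    # clusters forming the core genome
UNIQUE_1523VC = 385     # clusters found only in strain 1523vc
PROTEIN_CODING = 2674   # protein-coding genes
PSEUDOGENES = 127       # pseudogenes
TOTAL_CDS = 2801        # CDS count used for the CAZyme statistic
CAZYMES = 101           # CDS annotated as CAZymes


def pangenome_summary():
    """Derive simple ratios from the reported counts."""
    non_core = PAN_CLUSTERS - CORE_CLUSTERS
    return {
        "non_core_clusters": non_core,
        "core_fraction": CORE_CLUSTERS / PAN_CLUSTERS,
        "unique_fraction_of_non_core": UNIQUE_1523VC / non_core,
        # Assumption: CDS = protein-coding genes + pseudogenes.
        "genes_plus_pseudogenes": PROTEIN_CODING + PSEUDOGENES,
        "cazyme_fraction_of_cds": CAZYMES / TOTAL_CDS,
    }


if __name__ == "__main__":
    for key, value in pangenome_summary().items():
        print(f"{key}: {value}")
```

Under these assumptions the non-core remainder matches the 3543 given in the abstract, and the sum of protein-coding genes and pseudogenes matches the 2801 CDS.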