Yu Hye-Won, Im Ji-Hoon, Kong Won-Sik, Park Young-Jin
Department of Medicinal Biosciences, Research Institute for Biomedical & Health Science, College of Biomedical and Health Science, Konkuk University, 268 Chungwon-daero Chungju-si 27478, Korea.
Mushroom Research Division, National Institute of Horticultural and Herbal Science, Rural Development Administration, 92, Bisan-ro, Eumseong-gun 27709, Korea.
Microorganisms. 2020 Dec 23;9(1):20. doi: 10.3390/microorganisms9010020.
The purpose of this study was to determine the genome sequence of var. using next-generation sequencing (NGS) and to identify the genes encoding carbohydrate-active enzymes (CAZymes) in the genome. The optimal ABySS de novo assembly (k-mer size 71) had a total length of 33,223,357 bp (49.53% GC content). A total of 15,337 gene structures were identified in the var. genome by ab initio gene prediction with the Funannotate pipeline. Ortholog analysis revealed that 11,966 (96.6%) of the 15,337 predicted genes belonged to orthogroups and that 170 genes were specific to var. CAZymes are divided into six classes: auxiliary activities (AAs), glycosyltransferases (GTs), carbohydrate esterases (CEs), polysaccharide lyases (PLs), glycoside hydrolases (GHs), and carbohydrate-binding modules (CBMs). A total of 551 genes encoding CAZymes were identified in the var. genome using the dbCAN meta server (HMMER, Hotpep, and DIAMOND searches), which consisted of 54-95 AAs, 145-188 GHs, 55-73 GTs, 6-19 PLs, 13-59 CEs, and 7-67 CBMs. CAZymes can be widely used to produce bio-based products (food, paper, textiles, animal feed, and biofuels). Therefore, information about the CAZyme repertoire of the var. genome will help in understanding its lignocellulose-degrading machinery, and in-depth studies will provide opportunities to use this fungus in biotechnological and industrial applications.
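The six-class breakdown described in the abstract can be illustrated with a minimal sketch that tallies CAZyme family labels by class. It assumes only that family labels follow the usual CAZy naming pattern of a class prefix followed by a number (for example `GH5` or `AA9`, optionally with a subfamily suffix such as `GH5_7`). The function names and the example labels are hypothetical and are not taken from the study or from the dbCAN software.

```python
import re

# The six CAZyme classes named in the abstract.
CAZY_CLASSES = ("AA", "GT", "CE", "PL", "GH", "CBM")

# Assumed label shape: class prefix followed by a family number, e.g. "GH5" or "GH5_7".
_FAMILY_RE = re.compile(r"^(AA|GT|CE|PL|GH|CBM)\d+")


def cazy_class(family):
    """Return the class prefix of a family label, or None if it does not match."""
    match = _FAMILY_RE.match(family.strip())
    return match.group(1) if match else None


def tally_classes(families):
    """Count how many family labels fall into each of the six classes."""
    counts = {cls: 0 for cls in CAZY_CLASSES}
    for family in families:
        cls = cazy_class(family)
        if cls is not None:
            counts[cls] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical labels for illustration only; not data from the study.
    example = ["GH5_7", "AA9", "CBM1", "GH16", "GT2", "unknown"]
    print(tally_classes(example))
```

Labels that do not match the assumed pattern are skipped rather than guessed at, so a tally produced this way is only as complete as the annotation it is given.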