dbCAN-PUL: a database of experimentally characterized CAZyme gene clusters and their substrates.
Author information
Department of Biological Sciences, Northern Illinois University, DeKalb, IL 60115, USA.
Nebraska Food for Health Center, Department of Food Science and Technology, University of Nebraska, Lincoln, NE 68588, USA.
Publication information
Nucleic Acids Res. 2021 Jan 8;49(D1):D523-D528. doi: 10.1093/nar/gkaa742.
PULs (polysaccharide utilization loci) are discrete gene clusters of CAZymes (Carbohydrate Active EnZymes) and other genes that work together to digest and utilize carbohydrate substrates. While PULs have been extensively characterized in Bacteroidetes, PULs from other bacterial phyla, as well as from archaea and metagenomes, remain to be catalogued in a database for efficient retrieval. We have developed an online database, dbCAN-PUL (http://bcb.unl.edu/dbCAN_PUL/), to display experimentally verified CAZyme-containing PULs from the literature with pertinent metadata, sequences, and annotation. Compared to other online CAZyme and PUL resources, dbCAN-PUL has the following new features: (i) batch download of PUL data by target substrate, species/genome, genus, or experimental characterization method; (ii) annotation for each PUL that displays associated metadata such as substrate(s), experimental characterization method(s), and protein sequence information; (iii) links to external annotation pages for CAZymes (CAZy), transporters (UniProt), and other genes; (iv) display of homologous gene clusters in GenBank sequences via the integrated MultiGeneBlast tool; and (v) an integrated BLASTX service for users to query their sequences against PUL proteins in dbCAN-PUL. With these features, dbCAN-PUL will be an important repository for CAZyme and PUL research, complementing our other web servers and databases (dbCAN2, dbCAN-seq).
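As a rough illustration of feature (i), the sketch below filters a small set of PUL records by substrate, genus, or characterization method. It does not reflect the actual dbCAN-PUL schema or download format: the `PULRecord` class, its field names, the `filter_puls` helper, and every record in `EXAMPLE_RECORDS` are hypothetical placeholders invented for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PULRecord:
    """Hypothetical record; field names are illustrative, not the dbCAN-PUL schema."""
    pul_id: str
    organism: str                 # "Genus species"
    substrates: List[str]
    methods: List[str]            # experimental characterization methods
    cazyme_families: List[str] = field(default_factory=list)


def filter_puls(records: List[PULRecord],
                substrate: Optional[str] = None,
                genus: Optional[str] = None,
                method: Optional[str] = None) -> List[PULRecord]:
    """Return records matching every criterion that is supplied (case-insensitive)."""
    def matches(rec: PULRecord) -> bool:
        if substrate and substrate.lower() not in [s.lower() for s in rec.substrates]:
            return False
        # The genus is taken to be the first word of the organism name.
        if genus and rec.organism.split()[0].lower() != genus.lower():
            return False
        if method and method.lower() not in [m.lower() for m in rec.methods]:
            return False
        return True

    return [rec for rec in records if matches(rec)]


# Placeholder data only: organisms and IDs are made up, not database entries.
EXAMPLE_RECORDS = [
    PULRecord("EX0001", "Examplebacter alpha", ["xylan"], ["gene deletion"], ["GH10", "GH43"]),
    PULRecord("EX0002", "Examplebacter beta", ["pectin"], ["RNA-seq"], ["PL1"]),
    PULRecord("EX0003", "Samplococcus gamma", ["xylan", "cellulose"],
              ["RNA-seq", "enzyme assay"], ["GH5"]),
]

if __name__ == "__main__":
    for rec in filter_puls(EXAMPLE_RECORDS, substrate="xylan"):
        print(rec.pul_id, rec.organism, rec.substrates)
```

Combining criteria narrows the result, so `filter_puls(EXAMPLE_RECORDS, substrate="xylan", method="RNA-seq")` keeps only records that satisfy both conditions.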