Department of Biotechnology, Centre for Synthetic Biology (CSB), Ghent University, Ghent, Belgium.
Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Stuttgart, Germany.
PLoS One. 2024 Jul 11;19(7):e0306410. doi: 10.1371/journal.pone.0306410. eCollection 2024.
Carbohydrate-active enzymes (CAZymes) are found in all domains of life and play a crucial role in metabolic and physiological processes. CAZymes often have a modular structure, comprising not only catalytic domains but also associated domains such as carbohydrate-binding modules (CBMs) and linker domains. Exploring the modular diversity of CAZy families can uncover catalysts with novel properties and provide further insight into their biological functions and evolutionary relationships. Here we present the carbohydrate-active enzyme domain analysis tool (CANDy), an assembly of several novel scripts, tools and databases that allows users to analyze the domain architecture of all protein sequences in a given CAZy family. CANDy's usability is demonstrated on glycoside hydrolase family 48, a small yet underexplored family containing multi-domain enzymes. Our analysis reveals 35 distinct domain assemblies, of which eight are known architectures and the remainder await characterization. Moreover, we substantiate the occurrence of horizontal gene transfer from prokaryotes to insect orthologs and provide evidence for the subsequent removal of auxiliary domains, likely through a gene fission event. CANDy is available at https://github.com/PyEED/CANDy.
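To make the notion of counting "distinct domain assemblies" concrete, the following is a minimal sketch of how per-protein domain annotations could be reduced to architecture strings and tallied. It is not CANDy's implementation or API: the `annotations` dictionary, the protein IDs, the coordinates, and the helper names `architecture` and `count_architectures` are all hypothetical and chosen for illustration only.

```python
from collections import Counter

# Hypothetical toy annotations: protein ID -> list of (domain name, start, end).
# Illustrative values only; this is neither CANDy output nor real GH48 data.
annotations = {
    "protA": [("GH48", 40, 680), ("CBM3", 700, 850)],
    "protB": [("CBM3", 700, 850), ("GH48", 40, 680)],  # same domains, listed out of order
    "protC": [("GH48", 10, 650)],
    "protD": [("GH9", 30, 480), ("GH48", 520, 1150)],
}


def architecture(domains):
    """Return domain names ordered from N- to C-terminus, joined by '-'."""
    ordered = sorted(domains, key=lambda d: d[1])  # sort by start position
    return "-".join(name for name, _start, _end in ordered)


def count_architectures(annotations):
    """Count how many proteins share each domain architecture string."""
    return Counter(architecture(d) for d in annotations.values())


counts = count_architectures(annotations)
for arch, n in counts.most_common():
    print(arch, n)
```

Sorting by start position before joining means that two proteins with the same domains in the same N-to-C order are grouped together regardless of the order in which the annotations were listed, while a different domain order counts as a different architecture.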