Fungal Physiology, Westerdijk Fungal Biodiversity Institute, & Fungal Molecular Physiology, Utrecht University, Utrecht, The Netherlands.
Microb Genom. 2021 Dec;7(12). doi: 10.1099/mgen.0.000674.
Pectinolytic enzymes are a diverse group of enzymes that break down pectin, a complex and abundant plant cell-wall polysaccharide. In nature, pectinolytic enzymes play an essential role in allowing bacteria and fungi to depolymerize and utilize pectin. In addition, pectinases have been widely applied in various industries, such as the food, wine, textile, paper and pulp industries. Due to their important biological function and increasing industrial potential, the discovery of novel pectinolytic enzymes has received global interest. However, traditional enzyme characterization relies heavily on biochemical experiments, which are time-consuming, laborious and expensive. To accelerate the identification of novel pectinolytic enzymes, an automated approach is needed. We developed a machine learning (ML) approach for predicting pectinases in the industrial workhorse fungus, Aspergillus niger. The prediction integrated a diverse range of features, including evolutionary profile, gene expression, transcriptional regulation and biochemical characteristics. Results on both the training and the independent testing datasets showed that our method achieved over 90% accuracy and recalled over 60% of pectinolytic genes. Application of the ML model to the A. niger genome led to the identification of 83 pectinases, covering both previously described pectinases and novel pectinases that do not belong to any known pectinolytic enzyme family. Our study demonstrated the tremendous potential of ML in the discovery of new industrial enzymes through integrating heterogeneous (post-)genomics data.
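The abstract reports accuracy and recall separately, which matters when the target class is a small fraction of the genome: a classifier can be right on most genes overall while still missing some true pectinases. The sketch below illustrates how the two metrics are computed from binary labels. It is not the authors' evaluation code; the function name `accuracy_and_recall` and the ten toy labels are hypothetical and chosen only for illustration.

```python
def accuracy_and_recall(y_true, y_pred):
    """Return (accuracy, recall) for binary labels, where 1 = pectinase."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed pectinases
    accuracy = (tp + tn) / len(pairs)
    # Recall: fraction of true pectinases that the model recovers.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Hypothetical toy data (not from the study): 10 genes, 3 true pectinases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2f}, recall={rec:.2f}")
```

In this toy case one missed pectinase leaves accuracy at 9 of 10 genes but recall at 2 of 3, showing why both figures are reported.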