Department of Plant Biology, Carnegie Institution for Science, Stanford, CA, USA.
Brief Bioinform. 2018 Sep 28;19(5):1022-1034. doi: 10.1093/bib/bbx020.
Specialized metabolites (also called natural products or secondary metabolites) derived from bacteria, fungi, marine organisms and plants constitute an important source of antibiotics, anti-cancer agents, insecticides, immunosuppressants and herbicides. Many specialized metabolites in bacteria and fungi are biosynthesized via metabolic pathways whose enzymes are encoded by clustered genes on a chromosome. Metabolic gene clusters comprise a group of physically co-localized genes that together encode enzymes for the biosynthesis of a specific metabolite. Although metabolic gene clusters are generally not known to occur outside of microbes, several plant metabolic gene clusters have been discovered in recent years. The discovery of novel metabolic pathways is being enabled by the increasing availability of high-quality genome sequencing coupled with the development of powerful computational toolkits to identify metabolic gene clusters. To provide a comprehensive overview of various bioinformatics methods for detecting gene clusters, we compare and contrast key aspects of algorithmic logic behind several computational tools, including 'NP.searcher', 'ClustScan', 'CLUSEAN', 'antiSMASH', 'SMURF', 'MIDDAS-M', 'ClusterFinder', 'CASSIS/SMIPS' and 'C-Hunter' among others. We also review additional tools such as 'NRPSpredictor' and 'SBSPKS' that can infer substrate specificity for previously identified gene clusters. The continual development of bioinformatics methods to predict gene clusters will help shed light on how organisms assemble multi-step metabolic pathways for adaptation to various ecological niches.
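As a minimal illustration of the "physically co-localized genes" definition above, the sketch below groups enzyme-coding genes that lie close together on the same chromosome. It is not the algorithm of any tool named in this abstract. The `Gene` record, the `is_enzyme` flag, and the `max_gap_bp` and `min_enzymes` thresholds are hypothetical choices made only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    name: str
    chrom: str
    start: int       # start coordinate in base pairs
    end: int         # end coordinate in base pairs
    is_enzyme: bool  # hypothetical flag, e.g. set by a prior annotation step


def find_colocalized_clusters(
    genes: List[Gene],
    max_gap_bp: int = 20_000,  # assumed threshold, not taken from the source
    min_enzymes: int = 3,      # assumed threshold, not taken from the source
) -> List[List[Gene]]:
    """Group enzyme genes whose neighbours on the same chromosome
    are separated by at most max_gap_bp base pairs."""
    enzymes = sorted(
        (g for g in genes if g.is_enzyme),
        key=lambda g: (g.chrom, g.start),
    )
    clusters: List[List[Gene]] = []
    current: List[Gene] = []
    for gene in enzymes:
        # Close the running group when the chromosome changes or the gap is too wide.
        if current and (
            gene.chrom != current[-1].chrom
            or gene.start - current[-1].end > max_gap_bp
        ):
            if len(current) >= min_enzymes:
                clusters.append(current)
            current = []
        current.append(gene)
    # Keep the final group if it is large enough.
    if len(current) >= min_enzymes:
        clusters.append(current)
    return clusters
```

A purely positional rule like this one ignores which enzyme families are present, so it only conveys the co-localization idea and would likely over- or under-call real clusters.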