Gerlt John A
Departments of Biochemistry and Chemistry, Institute for Genomic Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois 61801, United States.
Biochemistry. 2017 Aug 22;56(33):4293-4308. doi: 10.1021/acs.biochem.7b00614.
The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization can also be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist in the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.