Jeon Jehyun, Lee Jaehee, Jung Se-Min, Shin Jae Hong, Song Woon Ju, Rho Mina
Department of Computer Science, Hanyang University, Seoul, Republic of Korea.
Department of Chemistry, Seoul National University, Seoul, Republic of Korea.
mSystems. 2021 Jun 29;6(3):e0005321. doi: 10.1128/mSystems.00053-21. Epub 2021 May 27.
Halogenases create diverse natural products by utilizing halide ions and are of great interest in the synthesis of potential pharmaceuticals and agrochemicals. An increasing number of halogenases discovered in microorganisms are annotated as flavin-dependent halogenases (FDHs), but their chemical reactivities are markedly different, and the genomic contents associated with this functional distinction have not yet been revealed. Even though reactivity and regioselectivity are essential to halogenation activity, FDHs are annotated inaccurately in protein sequence repositories, without characterization of their functional activities. We carried out a comprehensive sequence analysis and biochemical characterization of FDHs. Using a probabilistic model that we built in this study, FDHs were discovered from 2,787 bacterial genomes and 17 sediment metagenomes. We analyzed the essential genomic determinants that are responsible for substrate binding and subsequent reactions: four flavin adenine dinucleotide-binding, one halide-binding, and four tryptophan-binding sites. Compared with previous studies, our study utilizes large-scale genomic information to propose a comprehensive set of sequence motifs that are related to the active sites and regioselectivity. We reveal that the genomic patterns and phylogenetic locations of the FDHs determine the enzymatic reactivities, which was experimentally validated in terms of substrate scope and regioselectivity. A large portion of publicly available FDHs needs to be reevaluated to designate their correct functions. Our genomic models establish comprehensive links among genotypic information, reactivity, and regioselectivity of FDHs, thereby laying an important foundation for future discovery and classification of novel FDHs.
Halogenases play an important role as tailoring enzymes in biosynthetic pathways. Flavin-dependent tryptophan halogenases (Trp-FDHs) are among the enzymes that have broad substrate scope and high selectivity. From bacterial genomes and metagenomes, we found highly diverse halogenase sequences by using a well-trained profile hidden Markov model built from experimentally validated halogenases. The characterization of genotype, steady-state activity, substrate scope, and regioselectivity has established comprehensive links between the information encoded in the genomic sequence and the reactivity of the FDHs reported here. By constructing models for accurate and detailed sequence markers, our work should guide future discovery and classification of novel FDHs.