Holliday Gemma L, Akiva Eyal, Meng Elaine C, Brown Shoshana D, Calhoun Sara, Pieper Ursula, Sali Andrej, Booker Squire J, Babbitt Patricia C
Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, United States.
Methods Enzymol. 2018;606:1-71. doi: 10.1016/bs.mie.2018.06.004. Epub 2018 Jul 24.
The radical SAM superfamily contains over 100,000 homologous enzymes that catalyze a remarkably broad range of reactions required for life, including metabolism, nucleic acid modification, and biogenesis of cofactors. The highly conserved SAM-binding motif, which is responsible for forming the key 5'-deoxyadenosyl radical intermediate, is a structural feature that simplifies identification of superfamily members. Even so, our understanding of their structure-function relationships is complicated by the modular nature of their structures, which exhibit varied and complex domain architectures. To gain new insight into these relationships, we classified the entire set of sequences into similarity-based subgroups that could be visualized using sequence similarity networks. This superfamily-wide analysis reveals important features that had not previously been appreciated from studies focused on one or a few members. Functional information mapped to the networks indicates which members have been experimentally or structurally characterized, their known reaction types, and their phylogenetic distribution. Despite the biological importance of radical SAM chemistry, the vast majority of superfamily members have never been experimentally characterized in any way, suggesting that many new reactions remain to be discovered. In addition to 20 subgroups with at least one known function, we identified additional subgroups made up entirely of sequences of unknown function. Importantly, our results indicate that even general reaction types fail to track well with our sequence similarity-based subgroupings, raising major challenges for function prediction both for currently identified members and for the new members that continue to be discovered. Interactive similarity networks and other data from this analysis are available from the Structure-Function Linkage Database.
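The abstract refers to similarity-based subgroups visualized as sequence similarity networks. As a rough illustration of the general idea only, the sketch below treats sequences as nodes, connects two nodes when their pairwise similarity score meets a cutoff, and reports the connected components as candidate groups. The abstract does not state the scoring method, the cutoff, or the grouping procedure the authors used, so the sequence names, the scores, the 0.5 threshold, and the helper functions here are all hypothetical.

```python
from collections import defaultdict


def similarity_network(scores, threshold):
    """Keep only the sequence pairs whose score meets the threshold."""
    return [pair for pair, score in scores.items() if score >= threshold]


def connected_groups(nodes, edges):
    """Group nodes into connected components using union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for n in nodes:
        groups[find(n)].add(n)
    # Largest groups first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Hypothetical toy data: names and scores are invented for illustration.
nodes = ["seqA", "seqB", "seqC", "seqD", "seqE"]
scores = {
    ("seqA", "seqB"): 0.9,
    ("seqB", "seqC"): 0.8,
    ("seqC", "seqD"): 0.3,  # below the cutoff, so no edge is drawn
    ("seqD", "seqE"): 0.7,
}

edges = similarity_network(scores, threshold=0.5)
groups = connected_groups(nodes, edges)
print(groups)
```

Raising the cutoff removes weaker edges and splits the network into more, smaller groups, while lowering it merges them. That sensitivity is one reason the choice of threshold matters when such groupings are interpreted functionally.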