Moriya Yuki, Yamada Takuji, Okuda Shujiro, Nakagawa Zenichi, Kotera Masaaki, Tokimatsu Toshiaki, Kanehisa Minoru, Goto Susumu
Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan.
Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan.
J Chem Inf Model. 2016 Mar 28;56(3):510-6. doi: 10.1021/acs.jcim.5b00216. Epub 2016 Feb 17.
Although several databases contain data on many metabolites and reactions in biochemical pathways, there is still a large gap between the number of experimentally identified enzymes and the number of known metabolites. Many genes encoding catalytic enzymes are therefore presumed to be unknown. Previous studies have estimated the number of candidate enzyme genes, but they required information beyond the structures of the metabolites, such as gene expression and gene order in the genome. In this study, we developed a novel method to identify a candidate enzyme gene for a reaction using only the chemical structures of the substrate-product pair (reactant pair). The proposed method searches a reference database for similar reactant pairs and returns ortholog groups that may mediate the given reaction. We applied the method to two experimentally validated reactions. Histidine transaminase was correctly identified. Although the method could not directly identify asparagine oxo-acid transaminase, it found the paralog gene most similar to the correct enzyme gene. We also applied the method to infer candidate enzyme genes in the mesaconate pathway. The advantage of the method lies in predicting possible genes for orphan enzyme reactions, that is, reactions for which no associated gene sequence has yet been determined. We believe that this approach will facilitate experimental identification of genes for orphan enzymes.
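The search described above (finding similar reactant pairs in a reference database and reporting their associated ortholog groups) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure, so the Tanimoto coefficient on feature sets is an assumption, and the feature labels, the ortholog group names (`OG_A`, `OG_B`, `OG_C`), the function names, and the threshold are all hypothetical.

```python
from collections import defaultdict

def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def pair_similarity(query, reference):
    """Average of substrate-substrate and product-product similarity.

    Assumption: the paper's actual scoring scheme may differ.
    """
    (q_sub, q_prod), (r_sub, r_prod) = query, reference
    return 0.5 * (tanimoto(q_sub, r_sub) + tanimoto(q_prod, r_prod))

def rank_ortholog_groups(query, reference_db, threshold=0.5):
    """Return (ortholog_group, best_score) tuples, highest score first.

    Each reference entry is (reactant_pair, [ortholog groups]). A group is
    scored by its most similar reference reactant pair; pairs scoring below
    the threshold are ignored.
    """
    best = defaultdict(float)
    for ref_pair, groups in reference_db:
        score = pair_similarity(query, ref_pair)
        if score < threshold:
            continue
        for group in groups:
            best[group] = max(best[group], score)
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy reference database with hypothetical structural features and groups.
REFERENCE_DB = [
    ((frozenset({"amine", "carboxyl", "ring"}),
      frozenset({"keto", "carboxyl", "ring"})), ["OG_A"]),
    ((frozenset({"amine", "carboxyl", "amide"}),
      frozenset({"keto", "carboxyl", "amide"})), ["OG_B"]),
    ((frozenset({"hydroxyl", "phosphate"}),
      frozenset({"keto", "phosphate"})), ["OG_C"]),
]

# Hypothetical query: a transamination-like substrate-product pair.
QUERY = (
    frozenset({"amine", "carboxyl", "ring", "imidazole"}),
    frozenset({"keto", "carboxyl", "ring", "imidazole"}),
)

if __name__ == "__main__":
    for group, score in rank_ortholog_groups(QUERY, REFERENCE_DB):
        print(group, round(score, 2))
```

In practice the feature sets would come from real chemical structure fingerprints, and the reference entries from a curated reaction database. Scoring each ortholog group by its best-matching reactant pair is one simple design choice among several.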