Poot-Hernandez Augusto Cesar, Rodriguez-Vazquez Katya, Perez-Rueda Ernesto
Unidad de Bioinformática y Manejo de la Información. Instituto de Fisiología Celular. Universidad Nacional Autónoma de México, Ciudad Universitaria, México, Mexico.
Departamento de Ingeniería de Sistemas Computacionales y Automatización, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, Ciudad Universitaria, México, Mexico.
MethodsX. 2023 Mar 11;10:102118. doi: 10.1016/j.mex.2023.102118. eCollection 2023.
An easy and fast strategy for the functional comparison of metabolic maps is described. The KEGG metabolic maps are transformed into linear Enzymatic Step Sequences (ESS) using the Breadth First Search (BFS) algorithm. To do this, the KGML files are retrieved and directed graph representations are created, where the nodes represent enzymes or enzymatic complexes and each edge represents a compound that is the 'product' of one reaction and a 'substrate' for the next. Then, a set of initialization nodes is selected and used as the roots for the construction of the BFS tree. This tree is used as a guide for the construction of the ESS. From each leaf (terminal node), the path is traced backwards until it reaches the root metabolic map and with two or fewer neighbors in the graph. In a second step, the ESS are compared with a Dynamic Programming algorithm, considering an "ad hoc" substitution matrix and minimizing the global score. The dissimilarity values between two EC numbers range from 0 to 1, where 0 indicates similar EC numbers and 1 indicates different EC numbers. Finally, the alignment is evaluated using a normalized entropy-based function, with values ≤ 0.27 considered significant.
• The KEGG metabolic maps are transformed into linear Enzymatic Step Sequences (ESS) using the Breadth First Search (BFS) algorithm.
• Nodes represent enzymes or enzymatic complexes, and each edge represents a compound that is the 'product' of one reaction and a 'substrate' for the next.
• The ESS are compared with a Dynamic Programming algorithm, considering an "ad hoc" substitution matrix and minimizing the global score.
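The BFS step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the adjacency dictionary, the function names `bfs_tree` and `enzymatic_step_sequences`, and the toy graph are assumptions made for illustration, and the backward trace stops only at the root rather than applying the full stopping criteria mentioned in the abstract.

```python
from collections import deque


def bfs_tree(graph, root):
    """Return {node: parent} for the BFS tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in parent:  # first visit fixes the BFS parent
                parent[nxt] = node
                queue.append(nxt)
    return parent


def enzymatic_step_sequences(graph, root):
    """Trace every leaf of the BFS tree back to the root (simplified rule)."""
    parent = bfs_tree(graph, root)
    has_child = {p for p in parent.values() if p is not None}
    leaves = [n for n in parent if n not in has_child]
    sequences = []
    for leaf in leaves:
        path = []
        node = leaf
        while node is not None:  # walk backwards: leaf -> ... -> root
            path.append(node)
            node = parent[node]
        sequences.append(path[::-1])  # report in root-to-leaf order
    return sequences


# Hypothetical toy graph: nodes are EC numbers, edges are shared compounds.
toy_graph = {
    "2.7.1.1": ["5.3.1.9", "1.1.1.49"],
    "5.3.1.9": ["2.7.1.11"],
    "2.7.1.11": ["4.1.2.13"],
}

for ess in enzymatic_step_sequences(toy_graph, "2.7.1.1"):
    print(" -> ".join(ess))
```

In a real workflow the graph would be built from the parsed KGML files and the procedure repeated for each initialization node; both steps are omitted here.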
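The comparison of two ESS by dynamic programming can be sketched as a global alignment that minimizes the total dissimilarity. The abstract does not give the "ad hoc" substitution matrix or the gap cost, so both are assumptions here: `ec_dissimilarity` is a hypothetical stand-in that scores EC numbers by the number of leading classification levels they share, and `gap=1.0` is an arbitrary choice. The entropy-based evaluation of the alignment is not covered.

```python
def ec_dissimilarity(a, b):
    """Assumed stand-in matrix: 0.0 for identical four-level EC numbers,
    1.0 when even the first level differs."""
    shared = 0
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        shared += 1
    return 1.0 - shared / 4.0


def align_score(seq1, seq2, gap=1.0):
    """Minimum global alignment score between two ESS (lists of EC numbers)."""
    n, m = len(seq1), len(seq2)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap  # seq1 prefix aligned against gaps only
    for j in range(1, m + 1):
        dp[0][j] = j * gap  # seq2 prefix aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + ec_dissimilarity(seq1[i - 1], seq2[j - 1]),
                dp[i - 1][j] + gap,  # gap in seq2
                dp[i][j - 1] + gap,  # gap in seq1
            )
    return dp[n][m]


ess_a = ["2.7.1.1", "5.3.1.9"]
ess_b = ["2.7.1.2", "5.3.1.9"]
print(align_score(ess_a, ess_b))
```

Because the score is minimized rather than maximized, lower values mean more similar sequences, which matches the 0 (similar) to 1 (different) scale of the dissimilarity values.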