College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China.
State Key Laboratory of Livestock and Poultry Breeding, Guangdong Public Laboratory of Animal Breeding and Nutrition, Guangdong Provincial Key Laboratory of Animal Breeding and Nutrition, Institute of Animal Science, Guangdong Academy of Agricultural Sciences, Guangzhou 510640, China.
Comput Math Methods Med. 2021 Jan 4;2021:6683051. doi: 10.1155/2021/6683051. eCollection 2021.
Metabolic pathways are an important type of biological pathway. They produce the essential molecules and energy that sustain living organisms. Each metabolic pathway consists of a chain of chemical reactions, each of which requires the participation of enzymes. Chemicals and enzymes are thus the two major components of every metabolic pathway. Although several metabolic pathways have been uncovered, the metabolic pathway system is still far from complete: some chemicals or enzymes belonging to a given pathway remain undiscovered. Besides traditional experiments for detecting such hidden chemicals or enzymes, an alternative is to design efficient computational methods. In this study, we propose a multilabel classifier, called iMPTCE-Hnetwork, that uniformly assigns chemicals and enzymes to the metabolic pathway types reported in KEGG. The classifier uses embedding features derived, through the network embedding algorithm Mashup, from a heterogeneous network in which chemicals and enzymes are nodes and the interactions between chemicals and enzymes are edges. The classifier was built with the RAndom k-labELsets (RAKEL) algorithm, using a support vector machine with a polynomial kernel as the base classifier. Ten-fold cross-validation indicated that the classifier performed well, with accuracy higher than 0.800 and exact match higher than 0.750. Several comparisons were made to demonstrate the superiority of iMPTCE-Hnetwork.
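The two reported measures can be illustrated with a minimal sketch. It assumes the definitions commonly used for multilabel classification: accuracy as the mean ratio of correctly predicted labels to the union of true and predicted labels per sample, and exact match as the fraction of samples whose predicted label set equals the true set. The abstract does not state the formulas, so these definitions, the function names, and the toy pathway-type labels below are assumptions for illustration only, not the authors' code or data.

```python
from typing import List, Set


def multilabel_accuracy(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> float:
    """Mean per-sample overlap |true & pred| / |true | pred| (assumed definition)."""
    scores = []
    for true, pred in zip(true_sets, pred_sets):
        union = true | pred
        # Two empty label sets are treated as a perfect prediction.
        scores.append(len(true & pred) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


def exact_match(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the true set exactly."""
    hits = sum(1 for true, pred in zip(true_sets, pred_sets) if true == pred)
    return hits / len(true_sets)


# Hypothetical toy data: each sample (a chemical or an enzyme) may belong to
# one or more pathway types. The label names are illustrative only.
true_labels = [{"carbohydrate"}, {"lipid", "energy"}, {"amino_acid"}, {"energy"}]
pred_labels = [{"carbohydrate"}, {"lipid"}, {"amino_acid", "energy"}, {"energy"}]

acc = multilabel_accuracy(true_labels, pred_labels)
em = exact_match(true_labels, pred_labels)
print(f"accuracy = {acc:.3f}, exact match = {em:.3f}")
```

Exact match is the stricter of the two: a sample with one missing or one extra label still contributes partial credit to accuracy but counts as a miss for exact match, which is consistent with the lower threshold reported for exact match in the abstract.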