Niu Bing, Huang Guohua, Zheng Linfeng, Wang Xueyuan, Chen Fuxue, Zhang Yuhui, Huang Tao
Shanghai Key Laboratory of Bio-Energy Crops, School of Life Science, Shanghai University, 333 Nancheng Road, Shanghai 200444, China.
Institute of Systems Biology, Shanghai University, Shanghai, China; Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Shanghai 200444, China.
Biomed Res Int. 2013;2013:674215. doi: 10.1155/2013/674215. Epub 2013 Dec 22.
It is important to predict substrate-enzyme interactions, and the products they yield in metabolic pathways, both correctly and efficiently. In this work, a novel approach was introduced that encodes substrate and product molecules with molecular descriptors and enzyme molecules with physicochemical properties. Based on this encoding, the k-nearest neighbor (KNN) algorithm was adopted to build the substrate-enzyme-product interaction network. Feature selection retained 160 of the 290 features as the optimal set representing the main factors of substrate-enzyme-product interaction. These features fall into ten categories: elemental analysis, geometry, chemistry, amino acid composition, predicted secondary structure, hydrophobicity, polarizability, solvent accessibility, normalized van der Waals volume, and polarity. The resulting model achieved a Matthews correlation coefficient (MCC) of 0.423 and an overall prediction accuracy of 89.1% in a 10-fold cross-validation test.
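The two ingredients named above, nearest-neighbor classification and the MCC and accuracy metrics, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the Euclidean distance, the value of k, and the two-dimensional toy vectors are all hypothetical, and the abstract does not specify how the 160 selected features are scaled or combined.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=1):
    """Label `query` by majority vote among its k nearest training vectors.

    Assumption: Euclidean distance; the abstract does not name the metric.
    """
    # Rank training samples by distance to the query vector.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy vectors standing in for encoded
# substrate-enzyme-product triples (1 = interacting, 0 = not).
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8)]
train_y = [0, 0, 0, 1, 1]

label_near_positive = knn_predict(train_X, train_y, (0.95, 0.9), k=3)
label_near_negative = knn_predict(train_X, train_y, (0.05, 0.05), k=3)

# A classifier that always predicts "negative" on a 90:10 split scores
# high on accuracy, while its MCC is zero.
acc_trivial = accuracy(tp=0, tn=90, fp=0, fn=10)
mcc_trivial = mcc(tp=0, tn=90, fp=0, fn=10)
```

Reporting both metrics is a common design choice because accuracy alone can look strong on imbalanced data, whereas MCC uses all four confusion-matrix cells. Whether the data set in this study is imbalanced is not stated in the abstract.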