Department of Biotechnology, Faculty of Advanced Sciences and Technologies, University of Isfahan, Isfahan, Iran.
J Theor Biol. 2011 Jul 21;281(1):18-23. doi: 10.1016/j.jtbi.2011.04.017. Epub 2011 Apr 28.
Gamma-aminobutyric acid type A receptors (GABA(A)Rs) belong to the ligand-gated ion channel (LGIC) superfamily. GABA(A)Rs are highly diverse in the central nervous system, and these channels play a key role in regulating behavior. Predicting GABA(A)Rs from the amino acid sequence would therefore be helpful for research on these receptors. We have developed a method to predict these proteins using features derived from Chou's pseudo-amino acid composition concept, with a support vector machine as the machine learning approach. The efficiency of the predictor was assessed by five-fold cross-validation. The method achieved an overall accuracy of 94.12% and a Matthews correlation coefficient (MCC) of 0.88. Furthermore, to evaluate the effect and power of each feature, the minimum Redundancy and Maximum Relevance (mRMR) feature selection method was applied. An interesting finding of this study is that all six characters (hydrophobicity, hydrophilicity, side chain mass, pK1, pK2 and pI), alone or in combination, appear among the five top-ranked features (pK2 and pI, hydrophobicity and mass, pK1, hydrophilicity and mass) obtained from the mRMR feature selection method. The results show a biologically justifiable ranking of attributes: pK2 and pI; hydrophobicity, hydrophilicity and mass; mass and pK1; pK2 and mass. Based on our results, combining Chou's pseudo-amino acid composition with a support vector machine is an effective approach for the prediction of GABA(A)Rs.
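As an illustration of the evaluation metrics named above, the following minimal sketch computes overall accuracy and the Matthews correlation coefficient from confusion-matrix counts, and splits sample indices into five folds. It is not the authors' code: the function names (`accuracy`, `matthews_corrcoef`, `k_fold_indices`) and the contiguous, unshuffled fold layout are assumptions made for illustration, and any counts passed in are hypothetical rather than taken from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here; the abstract does not state how this case is handled).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous folds.

    Fold sizes differ by at most one. No shuffling or stratification
    is done; both are assumptions of this sketch.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop
```

In a cross-validation run, a classifier would be trained on each `train` split and its predictions on the matching `test` split pooled into the four counts before calling `accuracy` and `matthews_corrcoef`. MCC is reported alongside accuracy because it stays informative when the positive and negative classes differ in size.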