Predicting membrane protein types by fusing composite protein sequence features into pseudo amino acid composition.

Authors

Hayat Maqsood, Khan Asifullah

Affiliation

Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Nilore, Islamabad, Pakistan.

Publication

J Theor Biol. 2011 Feb 21;271(1):10-7. doi: 10.1016/j.jtbi.2010.11.017. Epub 2010 Nov 24.

Abstract

Membrane proteins are a vital class of proteins that serve as channels, receptors, and energy transducers in a cell. Prediction of membrane protein types is an important research area in bioinformatics. Knowledge of membrane protein types provides valuable information for predicting novel examples of those types. However, classification of membrane protein types can be both time-consuming and error-prone because of the inherent similarity among the types. In this paper, a neural network-based membrane protein type prediction system is proposed. Composite protein sequence representation (CPSR) is used to extract the features of a protein sequence. It comprises seven feature sets: amino acid composition, sequence length, 2-gram exchange group frequency, hydrophobic group, electronic group, sum of hydrophobicity, and R-group. Principal component analysis is then employed to reduce the dimensionality of the feature vector. The probabilistic neural network (PNN), generalized regression neural network, and support vector machine (SVM) are used as classifiers. A high success rate of 86.01% is obtained using SVM in the jackknife test. In the independent dataset test, PNN yields the highest accuracy of 95.73%. These classifiers also show improved performance on other measures such as sensitivity, specificity, Matthews correlation coefficient, and F-measure. The experimental results show that the prediction performance of the proposed scheme for classifying membrane protein types is the best reported so far. This improvement may largely be credited to the learning capabilities of neural networks and to the composite feature extraction strategy, which exploits seven different properties of protein sequences. The proposed Mem-Predictor can be accessed at http://111.68.99.218/Mem-Predictor.
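As a rough illustration of two ingredients named in the abstract, the sketch below computes an amino acid composition vector with the sequence length appended, and derives sensitivity, specificity, F-measure, and the Matthews correlation coefficient from binary confusion counts. It is not the authors' implementation. The function names, the handling of non-standard residues, the use of the counted-residue total as the length, and the zero-division fallbacks are all assumptions made for this example.

```python
import math

# The 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return 20 residue fractions followed by the sequence length.

    Assumption: letters outside the 20 standard residues are skipped,
    and the length is the number of residues actually counted.
    """
    counted = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not counted:
        raise ValueError("sequence contains no standard amino acids")
    total = len(counted)
    composition = [counted.count(aa) / total for aa in AMINO_ACIDS]
    return composition + [float(total)]


def binary_metrics(tp, fp, tn, fn):
    """Compute the four measures from binary confusion counts.

    Assumption: any ratio with a zero denominator is reported as 0.0.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    pr_sum = precision + sensitivity
    f_measure = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical toy sequence and counts, for demonstration only.
    print(amino_acid_composition("MKTLLVAGG"))
    print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

For a multi-class problem such as membrane protein type prediction, measures like these would typically be computed per class in a one-versus-rest fashion. Whether the paper does so is not stated in the abstract.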
