Novel hybrid method for the evaluation of parameters contributing in determination of protein structural classes.

Author information

Jahandideh Samad, Abdolmaleki Parviz, Jahandideh Mina, Hayatshahi Sayyed Hamed Sadat

Affiliation

Department of Biophysics, Faculty of Science, Tarbiat Modares University, Gisha, Tehran, Iran.

Publication information

J Theor Biol. 2007 Jan 21;244(2):275-81. doi: 10.1016/j.jtbi.2006.08.011. Epub 2006 Aug 22.

Abstract

Because the gap between sequenced proteins and proteins with determined structures keeps widening, prediction of protein structural classes has become an important problem. Given the close sequence-structure relationship, it is very important to use efficient sequence-derived parameters when developing class predictors. The multinomial logistic regression model was used for the first time to evaluate the contribution of sequence parameters to determining the protein structural class. An in-house program generated the parameters, comprising the single amino acid and all dipeptide composition frequencies. The most effective parameters were then selected by multinomial logistic regression. The variables selected in the multinomial logistic model were Valine among the single amino acid composition frequencies and Ala-Gly, Cys-Arg, Asp-Cys, Glu-Tyr, Gly-Glu, His-Tyr, Lys-Lys, Leu-Asp, Leu-Arg, Pro-Cys, Gln-Met, Gln-Thr, Ser-Trp, Val-Asn and Trp-Asn among the dipeptide composition frequencies. A neural network model was also constructed and fed with the parameters selected by multinomial logistic regression to build a hybrid predictor. In this study, self-consistency and jackknife tests on a database constructed by Zhou [1998. An intriguing controversy over protein structural class prediction. J. Protein Chem. 17(8), 729-738] containing 498 proteins were used to verify the performance of this hybrid method, and the results were compared with some prior works. The results showed that our two-stage hybrid model approach is very promising and may play a complementary role to the existing powerful approaches.
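The feature-generation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' in-house program: the function names `composition_features` and `selected_vector` are hypothetical, and the normalization (single-residue counts divided by the number of standard residues, dipeptide counts divided by the number of adjacent standard-residue pairs) is an assumption, since the abstract does not state how the frequencies were computed. The `SELECTED` list simply rewrites the sixteen variables named in the abstract in one-letter code.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Variables named in the abstract, in one-letter code:
# Valine plus the 15 selected dipeptides (Ala-Gly -> "AG", and so on).
SELECTED = ["V", "AG", "CR", "DC", "EY", "GE", "HY", "KK",
            "LD", "LR", "PC", "QM", "QT", "SW", "VN", "WN"]


def composition_features(sequence):
    """Return 20 single-residue and 400 dipeptide frequencies as a dict.

    Assumption: frequencies are counts divided by the number of standard
    residues (singles) or of adjacent standard-residue pairs (dipeptides).
    Pairs containing a non-standard symbol are skipped.
    """
    seq = sequence.upper()
    singles = [aa for aa in seq if aa in AMINO_ACIDS]
    pairs = [a + b for a, b in zip(seq, seq[1:])
             if a in AMINO_ACIDS and b in AMINO_ACIDS]

    # Start from zero so every one of the 420 features is always present.
    features = {aa: 0.0 for aa in AMINO_ACIDS}
    features.update({a + b: 0.0 for a, b in product(AMINO_ACIDS, repeat=2)})

    for key, count in Counter(singles).items():
        features[key] = count / len(singles)
    for key, count in Counter(pairs).items():
        features[key] = count / len(pairs)
    return features


def selected_vector(sequence):
    """Reduce the 420 features to the 16 variables listed in the abstract."""
    features = composition_features(sequence)
    return [features[name] for name in SELECTED]


if __name__ == "__main__":
    # Toy sequence for illustration only; it is not taken from the dataset.
    vector = selected_vector("MKKLVVAGLRAGKK")
    print(dict(zip(SELECTED, vector)))
```

A vector like the one returned by `selected_vector` is the kind of input that the second stage, the neural network, would receive for each protein; the selection itself would come from fitting a multinomial logistic regression on all 420 features, which this sketch does not attempt.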
