Classification of conformational stability of protein mutants from 3D pseudo-folding graph representation of protein sequences using support vector machines.

Authors

Fernández Michael, Caballero Julio, Fernández Leyden, Abreu Jose Ignacio, Acosta Gianco

Affiliation

Molecular Modeling Group, Center for Biotechnological Studies, Faculty of Agronomy, University of Matanzas, 44740 Matanzas, Cuba.

Publication

Proteins. 2008 Jan 1;70(1):167-75. doi: 10.1002/prot.21524.

Abstract

This work reports a novel 3D pseudo-folding graph representation of protein sequences for modeling purposes. Amino acid Euclidean distance matrices (EDMs) encode primary structural information. Amino Acid Pseudo-Folding 3D Distances Count (AAp3DC) descriptors, calculated from the EDMs of a large data set of 1363 single mutants of 64 proteins, were tested for building a classifier of the sign of the change in thermal unfolding Gibbs free energy (ΔΔG) upon single mutation. An optimum support vector machine (SVM) with a radial basis function (RBF) kernel recognized stable and unstable mutants with accuracies over 70% in cross-validation tests. To the best of our knowledge, this result for stable mutant recognition is the highest ever reported for a sequence-based predictor with more than 1000 mutants. Furthermore, the model adequately classified mutations associated with diseases of human prion protein and human transthyretin.
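
The two building blocks named in the abstract, a Euclidean distance matrix and an RBF kernel, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how pseudo-folding coordinates are assigned or how the AAp3DC descriptors are defined, so the coordinates, the bin edges, the function names, and the simple distance-bin count below are all hypothetical stand-ins, not the authors' method. The kernel is the standard RBF form, exp(-gamma * ||x - y||^2).

```python
import math

def euclidean_distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def distance_counts(edm, bin_edges):
    """Count residue pairs (i < j) whose distance falls in each [lo, hi) bin.

    Hypothetical stand-in for a distance-count descriptor; the paper's
    actual AAp3DC definition is not given in the abstract.
    """
    counts = [0] * (len(bin_edges) - 1)
    n = len(edm)
    for i in range(n):
        for j in range(i + 1, n):
            for k in range(len(counts)):
                if bin_edges[k] <= edm[i][j] < bin_edges[k + 1]:
                    counts[k] += 1
                    break
    return counts

def rbf_kernel(x, y, gamma):
    """Standard RBF kernel between two descriptor vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Toy 4-residue chain with made-up coordinates (not real data).
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
edm = euclidean_distance_matrix(coords)
descriptor = distance_counts(edm, [0.0, 1.5, 2.0])  # hypothetical bin edges
similarity = rbf_kernel(descriptor, descriptor, gamma=0.1)
```

In an SVM, a kernel value like `similarity` would be computed between the descriptor vectors of every pair of mutants; the kernel width `gamma` would normally be tuned by cross-validation, which is presumably what the abstract means by an "optimum" SVM.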
