Fariselli Piero, Martelli Pier Luigi, Savojardo Castrense, Casadio Rita
Biocomputing Group, Department of Biology, University of Bologna, 40126 Bologna and Department of Computer Science and Engineering, University of Bologna, 40127 Bologna, Italy.
Bioinformatics. 2015 Sep 1;31(17):2816-21. doi: 10.1093/bioinformatics/btv291. Epub 2015 May 7.
A tool that reliably predicts the impact of variations on protein stability is important both for protein engineering and for understanding the effects of Mendelian and somatic mutations in the genome. Next-generation sequencing studies are steadily increasing the number of known protein sequences. Given the large gap between the number of known protein sequences and the number of known structures, there is a need for tools that annotate the effect of mutations from the protein sequence alone, without relying on the structure. Here, we describe INPS, a novel approach for annotating the effect of non-synonymous mutations on protein stability from sequence. INPS is based on support vector machine (SVM) regression and is trained to predict the thermodynamic free energy change upon single-point variations in protein sequences.
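As a rough illustration of the sequence-only setting described above, the sketch below shows how a single-point variation might be checked against a protein sequence and turned into a numeric vector that a regressor such as an SVM could take as input. The one-hot encoding, the helper names `parse_variation` and `encode_variation`, and the toy sequence are assumptions made for illustration only; the abstract does not specify which descriptors INPS actually uses.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def parse_variation(variation):
    """Split a string such as 'K2E' into (wild_type, position, mutant)."""
    wild_type, mutant = variation[0], variation[-1]
    position = int(variation[1:-1])  # 1-based position in the sequence
    return wild_type, position, mutant


def encode_variation(sequence, variation):
    """Hypothetical descriptor: one-hot wild type followed by one-hot mutant.

    Returns a list of 40 floats. This is an illustrative encoding,
    not the feature set used by INPS.
    """
    wild_type, position, mutant = parse_variation(variation)
    if wild_type not in AMINO_ACIDS or mutant not in AMINO_ACIDS:
        raise ValueError("unknown residue in variation")
    if wild_type == mutant:
        raise ValueError("wild-type and mutant residues are identical")
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError("wild-type residue does not match the sequence")

    features = [0.0] * (2 * len(AMINO_ACIDS))
    features[AMINO_ACIDS.index(wild_type)] = 1.0
    features[len(AMINO_ACIDS) + AMINO_ACIDS.index(mutant)] = 1.0
    return features


toy_sequence = "MKTAYIAKQR"  # made-up sequence, not a real protein
features = encode_variation(toy_sequence, "K2E")
```

A vector of this kind, paired with a measured free energy change for each variation, is the sort of input-output pair a regression model would be trained on. A realistic predictor would need far richer sequence-derived descriptors than this toy encoding.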
We show that INPS performs similarly to state-of-the-art structure-based methods when tested in cross-validation on a non-redundant dataset. INPS also performs well on a newly generated dataset of variations occurring in the tumor suppressor protein p53. Our results suggest that INPS is suited for computing the effect of non-synonymous polymorphisms on protein stability when the protein structure is not available. We also show that INPS predictions are complementary to those of the state-of-the-art structure-based method mCSM. When the two methods are combined, the overall prediction on the p53 set scores significantly higher than that of either method alone.
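The idea of combining two complementary predictors can be sketched as follows. This minimal example assumes that the combination is a plain average of the two predicted free energy changes and that agreement with experiment is scored by the Pearson correlation; the abstract states neither the combination rule nor the scoring measure, so both are assumptions. All numbers are made-up toy values, not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def combine(pred_a, pred_b):
    """Assumed combination rule: average the two predictions per variation."""
    return [(a + b) / 2 for a, b in zip(pred_a, pred_b)]


# Toy values in kcal/mol, invented purely for illustration.
experimental = [-1.0, -2.0, 0.5, -3.0]
sequence_based = [-0.5, -2.5, 0.0, -2.0]   # stand-in for a sequence-based predictor
structure_based = [-1.5, -1.0, 1.0, -3.5]  # stand-in for a structure-based predictor

combined = combine(sequence_based, structure_based)

scores = {
    "sequence_based": pearson(experimental, sequence_based),
    "structure_based": pearson(experimental, structure_based),
    "combined": pearson(experimental, combined),
}
```

Averaging helps only when the errors of the two predictors are not strongly correlated, which is one way to read the claim that the predictions are complementary.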
The presented method is available as a web server at http://inps.biocomp.unibo.it.
Supplementary Materials are available at Bioinformatics online.