Calabrese Remo, Capriotti Emidio, Fariselli Piero, Martelli Pier Luigi, Casadio Rita
Laboratory of Biocomputing, CIRB/Department of Biology, University of Bologna, Bologna 40126, Italy.
Hum Mutat. 2009 Aug;30(8):1237-44. doi: 10.1002/humu.21047.
Single nucleotide polymorphisms (SNPs) are the simplest and most frequent form of human DNA variation, and they are also valuable as genetic markers of disease susceptibility. The most investigated SNPs are missense mutations resulting in residue substitutions in the protein. Here we propose SNPs&GO, an accurate method that, starting from a protein sequence, can predict whether a mutation is disease related by exploiting the protein's functional annotation. The scoring efficiency of SNPs&GO is as high as 82%, with a Matthews correlation coefficient equal to 0.63 on a large set of annotated nonsynonymous mutations in proteins, including 16,330 disease-related and 17,432 neutral polymorphisms. SNPs&GO collects in a unique framework information derived from protein sequence, evolutionary information, and function as encoded in Gene Ontology terms, and it outperforms other available predictive methods.
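The two figures quoted in the abstract, the scoring efficiency and the Matthews correlation coefficient, can both be derived from a binary confusion matrix. The following is a minimal sketch, assuming that "scoring efficiency" means overall accuracy and that disease-related mutations are the positive class. The function names and the confusion-matrix counts are hypothetical and purely illustrative; they are not the SNPs&GO results and do not come from the paper.

```python
import math


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction). Returns 0.0 when the denominator is zero,
    a common convention for degenerate matrices.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, chosen only to illustrate the arithmetic.
tp, fn = 80, 20  # disease-related mutations: correctly / incorrectly predicted
tn, fp = 90, 10  # neutral polymorphisms: correctly / incorrectly predicted

acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy = {acc:.2f}, MCC = {mcc:.2f}")
```

Reporting both values is useful when the two classes differ in size, because the Matthews correlation coefficient accounts for all four cells of the confusion matrix, whereas accuracy alone can look high when one class dominates.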