Blind prediction of deleterious amino acid variations with SNPs&GO.

Author information

Capriotti Emidio, Martelli Pier Luigi, Fariselli Piero, Casadio Rita

Affiliations

Biocomputing Group, BiGeA / Giorgio Prodi Interdepartmental Center for Cancer Research, University of Bologna, Bologna, Italy.

Department of Comparative Biomedicine and Food Science, University of Padova, Legnaro, Padova, Italy.

Publication information

Hum Mutat. 2017 Sep;38(9):1064-1071. doi: 10.1002/humu.23179. Epub 2017 May 2.

Abstract

SNPs&GO is a machine learning method for predicting the association of single amino acid variations (SAVs) with disease, taking protein functional annotation into account. The method is a binary classifier that implements a support vector machine algorithm to discriminate between disease-related and neutral SAVs. SNPs&GO combines information from protein sequence with functional annotation encoded by gene ontology (GO) terms. Tested in sequence mode on more than 38,000 SAVs from the SwissVar dataset, our method reached 81% overall accuracy and an area under the receiver operating characteristic curve of 0.88 with a low false-positive rate. In almost all the editions of the Critical Assessment of Genome Interpretation (CAGI) experiments, SNPs&GO ranked among the most accurate algorithms for predicting the effect of SAVs. In this paper, we summarize the best results obtained by SNPs&GO on disease-related variations in four CAGI challenges involving the following genes: CHEK2 (CAGI 2010), RAD50 (CAGI 2011), p16-INK (CAGI 2013), and NAGLU (CAGI 2016). Evaluating these results provides insights into the accuracy of our algorithm and the relevance of GO terms in annotating the effect of the variants. It also helps to define good practices for the detection of deleterious SAVs.
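The two headline metrics quoted in the abstract, overall accuracy and the area under the ROC curve, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and are not taken from SNPs&GO or the SwissVar dataset.

```python
from typing import Sequence


def overall_accuracy(labels: Sequence[int], scores: Sequence[float],
                     threshold: float = 0.5) -> float:
    """Fraction of variants whose thresholded score matches the true label.

    labels: 1 = disease-related, 0 = neutral.
    threshold: hypothetical cutoff; a score above it is called disease-related.
    """
    correct = sum((s > threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen disease-related variant
    scores higher than a randomly chosen neutral one (ties count one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT results from the paper.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

print(overall_accuracy(labels, scores))  # 0.75
print(roc_auc(labels, scores))           # 0.9375
```

Accuracy depends on the chosen threshold, while the AUC summarizes ranking quality across all thresholds, which is why the two numbers are commonly reported together.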

Cited by

Resources and tools for rare disease variant interpretation.
Front Mol Biosci. 2023 May 10;10:1169109. doi: 10.3389/fmolb.2023.1169109. eCollection 2023.
