Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information.

Authors

Bao Lei, Cui Yan

Affiliation

Department of Molecular Sciences, Center of Genomics and Bioinformatics, University of Tennessee Health Science Center, 858 Madison Avenue, Memphis, TN 38163, USA.

Publication information

Bioinformatics. 2005 May 15;21(10):2185-90. doi: 10.1093/bioinformatics/bti365. Epub 2005 Mar 3.

DOI:10.1093/bioinformatics/bti365

PMID:15746281

Abstract

MOTIVATION

There has been great expectation that the knowledge of an individual's genotype will provide a basis for assessing susceptibility to diseases and designing individualized therapy. Non-synonymous single nucleotide polymorphisms (nsSNPs) that lead to an amino acid change in the protein product are of particular interest because they account for nearly half of the known genetic variations related to human inherited diseases. To facilitate the identification of disease-associated nsSNPs from a large number of neutral nsSNPs, it is important to develop computational tools to predict the phenotypic effects of nsSNPs.

RESULTS

We prepared a training set based on the variant phenotypic annotation of the Swiss-Prot database and focused our analysis on nsSNPs having homologous 3D structures. Structural environment parameters derived from the homologous 3D structure, together with evolutionary information derived from the multiple sequence alignment, were used as predictors. Two machine learning methods, support vector machine and random forest, were trained and evaluated. We compared the performance of our method with that of the SIFT algorithm, which is one of the best predictive methods to date. An unbiased evaluation study shows that for nsSNPs with sufficient evolutionary information (at least 10 homologous sequences), the performance of our method is comparable to that of the SIFT algorithm, while for nsSNPs with insufficient evolutionary information (<10 homologous sequences), our method significantly outperforms the SIFT algorithm. These findings indicate that incorporating structural information is critical to achieving good prediction accuracy when sufficient evolutionary information is not available.
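The stratified comparison described above, in which predictions are scored separately for nsSNPs with at least 10 and with fewer than 10 homologous sequences, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `stratified_accuracy`, the record layout, and the toy records are all assumptions made for the example, and plain accuracy stands in for whatever evaluation measure the paper actually used. Only the cutoff of 10 homologous sequences comes from the abstract.

```python
from collections import defaultdict

# Cutoff taken from the abstract: >= 10 homologous sequences counts as
# "sufficient" evolutionary information.
HOMOLOG_THRESHOLD = 10


def stratified_accuracy(records, threshold=HOMOLOG_THRESHOLD):
    """Compute accuracy separately for the two evolutionary-information strata.

    records: iterable of (n_homologs, true_label, predicted_label) tuples.
    This record layout is a hypothetical choice for the sketch.
    """
    counts = defaultdict(lambda: [0, 0])  # stratum -> [correct, total]
    for n_homologs, truth, predicted in records:
        stratum = "sufficient" if n_homologs >= threshold else "insufficient"
        counts[stratum][0] += int(truth == predicted)
        counts[stratum][1] += 1
    return {s: correct / total for s, (correct, total) in counts.items()}


# Made-up toy records, for illustration only (not data from the paper).
toy_records = [
    (25, "disease", "disease"),
    (12, "neutral", "neutral"),
    (10, "disease", "neutral"),
    (3, "disease", "disease"),
    (7, "neutral", "disease"),
    (0, "neutral", "neutral"),
]

result = stratified_accuracy(toy_records)
print(result)
```

Running the same function on the predictions of two different classifiers would give a per-stratum comparison of the kind the abstract reports.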

AVAILABILITY

The code and curated dataset are available at http://compbio.utmem.edu/snp/dataset/

Similar articles

Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information.

Bioinformatics. 2005 May 15;21(10):2185-90. doi: 10.1093/bioinformatics/bti365. Epub 2005 Mar 3.

Statistical geometry based prediction of nonsynonymous SNP functional effects using random forest and neuro-fuzzy classifiers.

Proteins. 2008 Jun;71(4):1930-9. doi: 10.1002/prot.21838.

Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information.

Bioinformatics. 2006 Nov 15;22(22):2729-34. doi: 10.1093/bioinformatics/btl423. Epub 2006 Aug 7.

A bioinformatics approach for the phenotype prediction of nonsynonymous single nucleotide polymorphisms in human cytochromes P450.

Drug Metab Dispos. 2009 May;37(5):977-91. doi: 10.1124/dmd.108.026047. Epub 2009 Feb 9.

LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources.

Bioinformatics. 2005 Jun 15;21(12):2814-20. doi: 10.1093/bioinformatics/bti442. Epub 2005 Apr 12.

Computational prediction of the effects of non-synonymous single nucleotide polymorphisms in human DNA repair genes.

Neuroscience. 2007 Apr 14;145(4):1273-9. doi: 10.1016/j.neuroscience.2006.09.004. Epub 2006 Oct 19.

Structure SNP (StSNP): a web server for mapping and modeling nsSNPs on protein structures with linkage to metabolic pathways.

Nucleic Acids Res. 2007 Jul;35(Web Server issue):W384-92. doi: 10.1093/nar/gkm232. Epub 2007 May 30.

Knowledge-based computational mutagenesis for predicting the disease potential of human non-synonymous single nucleotide polymorphisms.

J Theor Biol. 2010 Oct 21;266(4):560-8. doi: 10.1016/j.jtbi.2010.07.026. Epub 2010 Jul 23.

Use of estimated evolutionary strength at the codon level improves the prediction of disease-related protein mutations in humans.

Hum Mutat. 2008 Jan;29(1):198-204. doi: 10.1002/humu.20628.

Predicting disulfide connectivity from protein sequence using multiple sequence feature vectors and secondary structure.

Bioinformatics. 2007 Dec 1;23(23):3147-54. doi: 10.1093/bioinformatics/btm505. Epub 2007 Oct 17.

Cited by

Rapid discrimination between deleterious and benign missense mutations in the CAGI 6 experiment.

Hum Genomics. 2024 Aug 27;18(1):89. doi: 10.1186/s40246-024-00655-z.

Functional and Structural Impact of Deleterious Missense Single Nucleotide Polymorphisms in the NR3C1, CYP3A5, and TNF-α Genes: An In Silico Analysis.

Biomolecules. 2022 Sep 16;12(9):1307. doi: 10.3390/biom12091307.

Immune Alterations in a Patient With Hyperornithinemia-Hyperammonemia-Homocitrullinuria Syndrome: A Case Report.

Front Immunol. 2022 May 27;13:861516. doi: 10.3389/fimmu.2022.861516. eCollection 2022.

LYRUS: a machine learning model for predicting the pathogenicity of missense variants.

Bioinform Adv. 2021 Dec 25;2(1):vbab045. doi: 10.1093/bioadv/vbab045. eCollection 2022.

Translating cancer genomics into precision medicine with artificial intelligence: applications, challenges and future perspectives.

Hum Genet. 2019 Feb;138(2):109-124. doi: 10.1007/s00439-019-01970-5. Epub 2019 Jan 22.

Accurate prediction of functional, structural, and stability changes in PITX2 mutations using in silico bioinformatics algorithms.

PLoS One. 2018 Apr 17;13(4):e0195971. doi: 10.1371/journal.pone.0195971. eCollection 2018.

Leucine to proline substitution by SNP at position 197 in Caspase-9 gene expression leads to neuroblastoma: a bioinformatics analysis.

3 Biotech. 2013 Jun;3(3):225-234. doi: 10.1007/s13205-012-0088-y. Epub 2012 Sep 18.

Assessment of the predictive accuracy of five in silico prediction tools, alone or in combination, and two metaservers to classify long QT syndrome gene mutations.

BMC Med Genet. 2015 May 13;16:34. doi: 10.1186/s12881-015-0176-z.

Single nucleotide variations: biological impact and theoretical interpretation.

Protein Sci. 2014 Dec;23(12):1650-66. doi: 10.1002/pro.2552. Epub 2014 Oct 20.

SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features.

J Mol Biol. 2014 Jul 15;426(14):2692-701. doi: 10.1016/j.jmb.2014.04.026. Epub 2014 May 5.
