EFIN: predicting the functional impact of nonsynonymous single nucleotide polymorphisms in human genome.

Affiliation

Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, 5 Sassoon Road, Hong Kong, China.

Publication information

BMC Genomics. 2014 Jun 10;15(1):455. doi: 10.1186/1471-2164-15-455.

DOI:10.1186/1471-2164-15-455
PMID:24916671
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4061446/
Abstract

BACKGROUND

Predicting the functional impact of amino acid substitutions (AAS) caused by nonsynonymous single nucleotide polymorphisms (nsSNPs) is becoming increasingly important as more and more novel variants are discovered. Each genome contains thousands of rare or private AAS, of which only a very small number are related to an underlying disease, so bioinformatics analysis is essential for predicting which AAS may cause or contribute to human disease and merit further analysis. Existing algorithms in this field still have a high false prediction rate, and new development is needed to take full advantage of the vast amount of genomic data.

RESULTS

Here we report a novel algorithm that features two innovative changes: (1) making better use of sequence conservation information by grouping the homologous protein sequences into six blocks according to their evolutionary distance to human and evaluating sequence conservation in each block independently, and (2) including as many such homologous sequences as possible in the analysis. Random forests are used to evaluate sequence conservation in each block and to predict the potential impact of an AAS on protein function. Testing this algorithm on a comprehensive dataset showed a significant improvement in prediction accuracy over currently widely used programs. The algorithm and EFIN (Evaluation of Functional Impact of Nonsynonymous SNPs), a web-based application implementing it, were made freely available to the public (http://paed.hku.hk/efin/).
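
The block-wise idea described above can be illustrated with a minimal Python sketch. It is not the EFIN implementation: the six distance cutoffs in `BLOCK_BOUNDS`, the function names, and the choice of conservation measures (identity to the human residue and Shannon entropy) are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows how homologous residues at one alignment column might be binned by evolutionary distance to human and scored per block; such per-block features could then be passed to a random forest classifier, which is the model family the abstract names.

```python
from collections import Counter
from math import log2

# Hypothetical upper bounds on evolutionary distance to human for six blocks.
# These cutoffs are illustrative assumptions, not the values used by EFIN.
BLOCK_BOUNDS = [0.05, 0.15, 0.30, 0.60, 1.00, float("inf")]


def assign_block(distance):
    """Return the index (0-5) of the first block whose bound covers the distance."""
    if distance < 0:
        raise ValueError(f"invalid distance: {distance!r}")
    for index, bound in enumerate(BLOCK_BOUNDS):
        if distance <= bound:
            return index
    raise ValueError(f"invalid distance: {distance!r}")


def block_features(human_residue, homologs):
    """Compute per-block conservation features for one alignment column.

    homologs is a list of (distance_to_human, residue) pairs. For each block
    the result holds (sequence count, fraction of residues identical to the
    human residue, Shannon entropy of the residues in bits). Empty blocks
    get (0, 0.0, 0.0).
    """
    blocks = [[] for _ in BLOCK_BOUNDS]
    for distance, residue in homologs:
        blocks[assign_block(distance)].append(residue)

    features = []
    for residues in blocks:
        n = len(residues)
        if n == 0:
            features.append((0, 0.0, 0.0))
            continue
        counts = Counter(residues)
        identity = counts[human_residue] / n
        entropy = sum((c / n) * log2(n / c) for c in counts.values())
        features.append((n, identity, entropy))
    return features


# Toy alignment column: (distance to human, residue observed in the homolog).
column = [(0.01, "R"), (0.02, "R"), (0.10, "R"), (0.12, "K"), (0.80, "Q")]
features = block_features("R", column)
```

Scoring each block separately keeps a substitution that is tolerated only in distant species from being averaged together with strict conservation among close relatives, which is the intuition the abstract gives for evaluating blocks independently.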

CONCLUSIONS

Grouping homologous sequences into different blocks according to the evolutionary distance of the species to human and evaluating sequence conservation in each group independently significantly improved prediction accuracy. This approach may help us better understand the roles of genetic variants in human disease and health.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf1d/4061446/0d3f28867748/12864_2013_6120_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf1d/4061446/28bcd6613438/12864_2013_6120_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf1d/4061446/50096951e0e6/12864_2013_6120_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf1d/4061446/cb890a61ab8e/12864_2013_6120_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bf1d/4061446/f335f49c2b5d/12864_2013_6120_Fig5_HTML.jpg

Similar articles

1
EFIN: predicting the functional impact of nonsynonymous single nucleotide polymorphisms in human genome.
BMC Genomics. 2014 Jun 10;15(1):455. doi: 10.1186/1471-2164-15-455.
2
Statistical geometry based prediction of nonsynonymous SNP functional effects using random forest and neuro-fuzzy classifiers.
Proteins. 2008 Jun;71(4):1930-9. doi: 10.1002/prot.21838.
3
Predicting the effects of amino acid substitutions on protein function.
Annu Rev Genomics Hum Genet. 2006;7:61-80. doi: 10.1146/annurev.genom.7.080505.115630.
4
Predicting the functional effect of amino acid substitutions and indels.
PLoS One. 2012;7(10):e46688. doi: 10.1371/journal.pone.0046688. Epub 2012 Oct 8.
5
SNPdryad: predicting deleterious non-synonymous human SNPs using only orthologous protein sequences.
Bioinformatics. 2014 Apr 15;30(8):1112-1119. doi: 10.1093/bioinformatics/btt769. Epub 2014 Jan 2.
6
Performance of In Silico Tools for the Evaluation of UGT1A1 Missense Variants.
Hum Mutat. 2015 Dec;36(12):1215-25. doi: 10.1002/humu.22903. Epub 2015 Oct 5.
7
Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information.
Bioinformatics. 2006 Nov 15;22(22):2729-34. doi: 10.1093/bioinformatics/btl423. Epub 2006 Aug 7.
8
A bioinformatics approach for the phenotype prediction of nonsynonymous single nucleotide polymorphisms in human cytochromes P450.
Drug Metab Dispos. 2009 May;37(5):977-91. doi: 10.1124/dmd.108.026047. Epub 2009 Feb 9.
9
Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models.
Hum Mutat. 2013 Jan;34(1):57-65. doi: 10.1002/humu.22225. Epub 2012 Nov 2.
10
WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation.
BMC Genomics. 2013;14 Suppl 3(Suppl 3):S6. doi: 10.1186/1471-2164-14-S3-S6. Epub 2013 May 28.

Cited by

1
Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors.
Hum Genomics. 2024 Aug 28;18(1):90. doi: 10.1186/s40246-024-00663-z.
2
Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors.
bioRxiv. 2024 Jun 28:2024.06.25.600283. doi: 10.1101/2024.06.25.600283.
3
APOGEE 2: multi-layer machine-learning model for the interpretable prediction of mitochondrial missense variants.
Nat Commun. 2023 Aug 19;14(1):5058. doi: 10.1038/s41467-023-40797-7.
4
Genome interpretation using in silico predictors of variant impact.
Hum Genet. 2022 Oct;141(10):1549-1577. doi: 10.1007/s00439-022-02457-6. Epub 2022 Apr 30.
5
Prediction of disease-associated nsSNPs by integrating multi-scale ResNet models with deep feature fusion.
Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab530.
6
In silico screening and analysis of nonsynonymous SNPs in human CYP1A2 to assess possible associations with pathogenicity and cancer susceptibility.
Sci Rep. 2021 Mar 2;11(1):4977. doi: 10.1038/s41598-021-83696-x.
7
Possible A2E Mutagenic Effects on RPE Mitochondrial DNA from Innovative RNA-Seq Bioinformatics Pipeline.
Antioxidants (Basel). 2020 Nov 20;9(11):1158. doi: 10.3390/antiox9111158.
8
VIPdb, a genetic Variant Impact Predictor Database.
Hum Mutat. 2019 Sep;40(9):1202-1214. doi: 10.1002/humu.23858. Epub 2019 Aug 17.
9
In silico analyses of deleterious missense SNPs of human apolipoprotein E3.
Sci Rep. 2017 May 30;7(1):2509. doi: 10.1038/s41598-017-01737-w.
10
Disease-associated mitochondrial mutations and the evolution of primate mitogenomes.
PLoS One. 2017 May 16;12(5):e0177403. doi: 10.1371/journal.pone.0177403. eCollection 2017.

References

1
PriVar: a toolkit for prioritizing SNVs and indels from next-generation sequencing data.
Bioinformatics. 2013 Jan 1;29(1):124-5. doi: 10.1093/bioinformatics/bts627. Epub 2012 Oct 25.
2
A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases.
Nucleic Acids Res. 2012 Apr;40(7):e53. doi: 10.1093/nar/gkr1257. Epub 2012 Jan 12.
3
dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions.
Hum Mutat. 2011 Aug;32(8):894-9. doi: 10.1002/humu.21517.
4
Improving the assessment of the outcome of nonsynonymous SNVs with a consensus deleteriousness score, Condel.
Am J Hum Genet. 2011 Apr 8;88(4):440-9. doi: 10.1016/j.ajhg.2011.03.004. Epub 2011 Mar 31.
5
Identifying a high fraction of the human genome to be under selective constraint using GERP++.
PLoS Comput Biol. 2010 Dec 2;6(12):e1001025. doi: 10.1371/journal.pcbi.1001025.
6
MutationTaster evaluates disease-causing potential of sequence alterations.
Nat Methods. 2010 Aug;7(8):575-6. doi: 10.1038/nmeth0810-575.
7
ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data.
Nucleic Acids Res. 2010 Sep;38(16):e164. doi: 10.1093/nar/gkq603. Epub 2010 Jul 3.
8
A method and server for predicting damaging missense mutations.
Nat Methods. 2010 Apr;7(4):248-9. doi: 10.1038/nmeth0410-248.
9
Detection of nonneutral substitution rates on mammalian phylogenies.
Genome Res. 2010 Jan;20(1):110-21. doi: 10.1101/gr.097857.109. Epub 2009 Oct 26.
10
Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm.
Nat Protoc. 2009;4(7):1073-81. doi: 10.1038/nprot.2009.86. Epub 2009 Jun 25.