VariBench: a benchmark database for variations.

Affiliation

Institute of Biomedical Technology, University of Tampere, Tampere, Finland.

Publication information

Hum Mutat. 2013 Jan;34(1):42-9. doi: 10.1002/humu.22204. Epub 2012 Oct 11.

DOI: 10.1002/humu.22204
PMID: 22903802

Abstract

Several computational methods have been developed for predicting the effects of rapidly expanding variation data. Comparison of the performance of tools has been very difficult as the methods have been trained and tested with different datasets. Until now, unbiased and representative benchmark datasets have been missing. We have developed a benchmark database suite, VariBench, to overcome this problem. VariBench contains datasets of experimentally verified high-quality variation data carefully chosen from literature and relevant databases. It provides the mapping of variation position to different levels (protein, RNA and DNA sequences, protein three-dimensional structure), along with identifier mapping to relevant databases. VariBench contains the first benchmark datasets for variation effect analysis, a field which is of high importance and where many developments are currently going on. VariBench datasets can be used, for example, to test performance of prediction tools as well as to train novel machine learning-based tools. New datasets will be included and the community is encouraged to submit high-quality datasets to the service. VariBench is freely available at http://structure.bmc.lu.se/VariBench.
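The abstract mentions using benchmark datasets to test the performance of prediction tools. As a minimal sketch of what such a test could look like, the Python below compares a tool's binary calls with benchmark labels and reports sensitivity, specificity, accuracy, and the Matthews correlation coefficient. The function names, the boolean label encoding (True = pathogenic), and the toy labels are all assumptions made for illustration; they are not the VariBench file format and not data from VariBench.

```python
from math import sqrt


def confusion_counts(truth, predicted):
    """Count TP, TN, FP, FN for paired boolean labels (True = pathogenic)."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)
    tn = sum((not t) and (not p) for t, p in pairs)
    fp = sum((not t) and p for t, p in pairs)
    fn = sum(t and (not p) for t, p in pairs)
    return tp, tn, fp, fn


def performance(truth, predicted):
    """Return common binary-classification measures as a dict."""
    tp, tn, fp, fn = confusion_counts(truth, predicted)
    total = tp + tn + fp + fn
    # The MCC denominator is zero when any row or column of the
    # confusion matrix is empty; report 0.0 in that case.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / total if total else float("nan"),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy labels invented for illustration only; not taken from VariBench.
truth = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, False, True]

scores = performance(truth, predicted)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Running every tool against the same labelled set is what makes the resulting scores comparable, which is the problem the abstract describes with tools trained and tested on different datasets.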
