Department of Biochemistry and Molecular Biology, George Washington University Medical Center, Washington, DC 20037, USA.
Genomics Proteomics Bioinformatics. 2013 Apr;11(2):122-6. doi: 10.1016/j.gpb.2012.10.003. Epub 2012 Dec 5.
Amino acid changes caused by non-synonymous variation are included as annotations for individual proteins in UniProtKB/Swiss-Prot and RefSeq, which present biological data in a protein- or gene-centric fashion. Unfortunately, proteome-wide analysis of non-synonymous single-nucleotide variations (nsSNVs) is difficult because information on nsSNVs and functionally important sites is not well integrated, either within or between databases and their search engines. We have developed SNVDis, a tool that allows evaluation of the proteome-wide distribution of nsSNVs across functional sites, domains and pathways. More specifically, we have integrated human-specific data from major variation databases (UniProtKB, dbSNP and COSMIC), comprehensive sequence feature annotation from UniProtKB, Pfam, RefSeq and the Conserved Domain Database (CDD), and pathway information from Protein ANalysis THrough Evolutionary Relationships (PANTHER), and mapped all of them in a uniform and comprehensive way to the human reference proteome provided by UniProtKB/Swiss-Prot. The integrated information on active sites, binding sites, domains and pathways, extracted from a number of different sources, provides a detailed overview of how nsSNVs are distributed over the human proteome and pathways and how they intersect with functional sites of proteins. Additionally, it is possible to determine whether nsSNVs are over- or under-represented in specific domains, pathways or user-defined protein lists. The underlying datasets are updated once every 3 months. SNVDis is freely available at http://hive.biochemistry.gwu.edu/tool/snvdis.
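The abstract does not state which statistic SNVDis uses to decide whether nsSNVs are over- or under-represented. As an illustration only, one common choice for this kind of question is a hypergeometric test: given the number of variant-carrying positions in the whole proteome, how surprising is the count seen in one domain, pathway or protein list? The sketch below assumes that model; the function names and the toy counts are hypothetical and are not taken from SNVDis.

```python
from math import comb


def hypergeom_pmf(k, total, total_hits, drawn):
    """P(X = k) when `drawn` positions are sampled without replacement
    from `total` positions, `total_hits` of which carry an nsSNV."""
    return comb(total_hits, k) * comb(total - total_hits, drawn - k) / comb(total, drawn)


def representation_pvalues(total_sites, total_hits, region_sites, region_hits):
    """Return (expected_hits, p_over, p_under) for one region.

    Assumption (not from the source): each residue position either carries
    an nsSNV or does not, and the region is treated as a random sample of
    positions from the proteome.
    """
    # Smallest and largest hit counts that are possible in the region.
    lowest = max(0, region_sites - (total_sites - total_hits))
    highest = min(region_sites, total_hits)
    pmf = {
        k: hypergeom_pmf(k, total_sites, total_hits, region_sites)
        for k in range(lowest, highest + 1)
    }
    expected = region_sites * total_hits / total_sites
    p_over = sum(p for k, p in pmf.items() if k >= region_hits)   # upper tail
    p_under = sum(p for k, p in pmf.items() if k <= region_hits)  # lower tail
    return expected, p_over, p_under


if __name__ == "__main__":
    # Toy numbers, purely illustrative: 1000 positions, 100 with an nsSNV,
    # and a 50-position domain containing 12 of them.
    expected, p_over, p_under = representation_pvalues(1000, 100, 50, 12)
    print(f"expected={expected:.1f} p_over={p_over:.4g} p_under={p_under:.4g}")
```

A small `p_over` suggests over-representation and a small `p_under` suggests under-representation. When many domains or pathways are tested at once, a multiple-testing correction would normally be applied on top of these raw p-values.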