UniProt: the universal protein knowledgebase.

Publication information

Nucleic Acids Res. 2017 Jan 4;45(D1):D158-D169. doi: 10.1093/nar/gkw1099. Epub 2016 Nov 29.

Abstract

The UniProt knowledgebase is a large resource of protein sequences and associated detailed annotation. The database contains over 60 million sequences, of which over half a million sequences have been curated by experts who critically review experimental and predicted data for each protein. The remainder are automatically annotated based on rule systems that rely on the expert curated knowledge. Since our last update in 2014, we have more than doubled the number of reference proteomes to 5631, giving a greater coverage of taxonomic diversity. We implemented a pipeline to remove redundant highly similar proteomes that were causing excessive redundancy in UniProt. The initial run of this pipeline reduced the number of sequences in UniProt by 47 million. For our users interested in the accessory proteomes, we have made available sets of pan proteome sequences that cover the diversity of sequences for each species that is found in its strains and sub-strains. To help interpretation of genomic variants, we provide tracks of detailed protein information for the major genome browsers. We provide a SPARQL endpoint that allows complex queries of the more than 22 billion triples of data in UniProt (http://sparql.uniprot.org/). UniProt resources can be accessed via the website at http://www.uniprot.org/.
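As an illustration of the SPARQL access mentioned in the abstract, the following is a minimal sketch of how a query might be sent to the endpoint from Python using only the standard library. Several details are assumptions not stated in the abstract: the `/sparql` path and `https` scheme of the endpoint URL, the `up:` core vocabulary prefix, and the `up:reviewed` property used to select expert-curated entries. The helper names `build_request` and `run_query` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the abstract only gives http://sparql.uniprot.org/;
# the exact scheme and path used here may differ.
ENDPOINT = "https://sparql.uniprot.org/sparql"

# Assumption: the UniProt core vocabulary prefix and the up:reviewed flag.
QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
SELECT ?protein
WHERE {
  ?protein a up:Protein ;
           up:reviewed true .
}
LIMIT 5
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request that asks for SPARQL results as JSON."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and return the IRIs bound to ?protein."""
    with urlopen(build_request(query, endpoint), timeout=timeout) as resp:
        data = json.load(resp)
    return [row["protein"]["value"] for row in data["results"]["bindings"]]
```

Calling `run_query(QUERY)` requires network access and would return up to five protein IRIs if the endpoint and vocabulary match the assumptions above. The `LIMIT` clause keeps the request small, which matters for a store holding billions of triples.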
