The Universal Protein Resource (UniProt): an expanding universe of protein information.

Author information

Wu Cathy H, Apweiler Rolf, Bairoch Amos, Natale Darren A, Barker Winona C, Boeckmann Brigitte, Ferro Serenella, Gasteiger Elisabeth, Huang Hongzhan, Lopez Rodrigo, Magrane Michele, Martin Maria J, Mazumder Raja, O'Donovan Claire, Redaschi Nicole, Suzek Baris

Affiliation

Department of Biochemistry and Molecular Biology, Georgetown University Medical Center, 3900 Reservoir Road, NW, Washington, DC 20057-1414, USA.

Publication information

Nucleic Acids Res. 2006 Jan 1;34(Database issue):D187-91. doi: 10.1093/nar/gkj161.

Abstract

The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/.
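To make the idea of sequence space compression concrete, the following is a minimal sketch of how merging sequences at 100%, 90% and 50% identity shrinks a collection into progressively fewer clusters. It is a toy illustration, not UniProt's production procedure: the `identity` function compares positions without any alignment, the greedy `cluster` routine is an assumed stand-in for whatever clustering method UniRef actually uses, and the accessions and sequences are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Toy identity: matching positions divided by the longer length.

    Assumption: no alignment is performed; real pipelines align first.
    """
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def cluster(seqs: dict, threshold: float) -> dict:
    """Greedy clustering sketch.

    Sequences are visited longest first; each one joins the first cluster
    whose representative meets the identity threshold, otherwise it becomes
    the representative of a new cluster.
    """
    clusters = {}
    for acc in sorted(seqs, key=lambda k: (-len(seqs[k]), k)):
        for rep in clusters:
            if identity(seqs[rep], seqs[acc]) >= threshold:
                clusters[rep].append(acc)
                break
        else:
            clusters[acc] = [acc]
    return clusters


# Hypothetical accessions and sequences, for illustration only.
toy = {
    "A1": "MKTAYIAKQR",
    "A2": "MKTAYIAKQR",  # identical to A1
    "A3": "MKTAYIAKQL",  # 9 of 10 positions match A1
    "A4": "MKTGGGGGQR",  # 5 of 10 positions match A1
    "A5": "WWWWWWWWWW",  # unrelated
}

for label, threshold in [("100%", 1.0), ("90%", 0.9), ("50%", 0.5)]:
    result = cluster(toy, threshold)
    print(label, len(result), "clusters:", result)
```

Lower thresholds yield fewer representatives, so a similarity search run against the representatives alone touches fewer sequences while still reaching every member through its cluster.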
