UniProt: a hub for protein information.

Publication information

Nucleic Acids Res. 2015 Jan;43(Database issue):D204-12. doi: 10.1093/nar/gku989. Epub 2014 Oct 27.

Abstract

UniProt is an important collection of protein sequences and their annotations, which has doubled in size to 80 million sequences during the past year. This growth in sequences has prompted an extension of the UniProt accession number space from 6 to 10 characters. An increasing fraction of new sequences are identical to a sequence that already exists in the database, with the majority of sequences coming from genome sequencing projects. We have created a new proteome identifier that uniquely identifies a particular assembly of a species and strain or subspecies to help users track the provenance of sequences. We present a new website that has been designed using a user-experience design process. We have introduced an annotation score for all entries in UniProt to represent the relative amount of knowledge available about each protein. These scores will be helpful in identifying which proteins are the best characterized and most informative for comparative analysis. All UniProt data is provided freely and is available on the web at http://www.uniprot.org/.
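
For readers whose tools validate accession numbers, the extension from 6 to 10 characters can be sketched as a single check. The abstract states only the two lengths; the regular expression below is the accession pattern documented in UniProt's help pages, and the function name `is_valid_accession` is a hypothetical helper chosen for this illustration.

```python
import re

# Accession pattern as documented in UniProt's help pages (an assumption here:
# the abstract itself only mentions the 6-to-10-character extension).
# First alternative: classic 6-character accessions starting with O, P or Q.
# Second alternative: 6 characters, or 10 characters when the
# four-character group repeats.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_valid_accession(accession: str) -> bool:
    """Return True if the whole string is a well-formed UniProt accession."""
    return ACCESSION_RE.fullmatch(accession) is not None


if __name__ == "__main__":
    # A 6-character accession, a 10-character accession, and a malformed one.
    for candidate in ("P12345", "A0A024R161", "A0A024R16"):
        print(candidate, is_valid_accession(candidate))
```

Code that stores accessions in fixed-width six-character fields would truncate the longer form, so a length-agnostic check of this kind is the safer default.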
