Nucleic Acids Res. 2021 Jan 8;49(D1):D480-D489. doi: 10.1093/nar/gkaa1100.
The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from the literature to add to reviewed entries and, in unreviewed entries, supplement these with annotations provided by automated systems such as the newly implemented Association-Rule-Based Annotator (ARBA). We have developed a credit-based publication submission interface to allow the community to contribute publications and annotations to UniProt entries. We describe how UniProtKB responded to the COVID-19 pandemic through expert curation of relevant entries that were rapidly made available to the research community through a dedicated portal. UniProt resources are available under a CC-BY (4.0) license via the web at https://www.uniprot.org/.
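The abstract names the Association-Rule-Based Annotator (ARBA) but does not describe how it works. As a rough illustration of the general idea behind association-rule-based annotation, and not of ARBA's actual implementation, the toy sketch below scores a rule of the form "if an entry has all of these features, propose this annotation" against reviewed entries and then applies it to an unreviewed one. The feature labels (`sig:A`, `taxon:Bacteria`), the annotation strings, the `Rule` class and both helper functions are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: if all antecedent features are present, propose the annotation."""
    antecedent: frozenset  # features that must all be present on an entry
    annotation: str        # annotation proposed when the rule fires


def rule_stats(rule, reviewed):
    """Return (support, confidence) of a rule over reviewed entries.

    reviewed is a list of (features, annotations) pairs, each a set of strings.
    support    = fraction of all reviewed entries that match the antecedent
                 and carry the annotation
    confidence = fraction of antecedent-matching entries that carry the annotation
    """
    matching = [anns for feats, anns in reviewed if rule.antecedent <= feats]
    if not matching:
        return 0.0, 0.0
    hits = sum(1 for anns in matching if rule.annotation in anns)
    return hits / len(reviewed), hits / len(matching)


def annotate(features, rules):
    """Return the sorted annotations proposed by every rule whose antecedent is satisfied."""
    return sorted({r.annotation for r in rules if r.antecedent <= features})


# Toy "reviewed" entries: (features, curated annotations). All values are made up.
reviewed = [
    ({"sig:A", "taxon:Bacteria"}, {"kinase activity"}),
    ({"sig:A", "taxon:Bacteria"}, {"kinase activity"}),
    ({"sig:A", "taxon:Eukaryota"}, {"transporter activity"}),
    ({"sig:B", "taxon:Bacteria"}, {"transporter activity"}),
]

rule = Rule(frozenset({"sig:A", "taxon:Bacteria"}), "kinase activity")
support, confidence = rule_stats(rule, reviewed)

# Keep only rules that are reliable enough on reviewed data (threshold is arbitrary here).
trusted = [rule] if confidence >= 0.9 else []

# Apply the trusted rules to a toy "unreviewed" entry.
proposed = annotate({"sig:A", "sig:C", "taxon:Bacteria"}, trusted)
```

In this sketch a rule is trusted only if it holds on nearly all reviewed entries that match its antecedent; a real system would also need to handle conflicting rules, rare annotations and changes in the training data between releases.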