Genome3D: integrating a collaborative data pipeline to expand the depth and breadth of consensus protein structure annotation.

Affiliations

Institute of Structural and Molecular Biology, UCL, Gower Street, London WC1E 6BT, UK.

MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK.

Publication information

Nucleic Acids Res. 2020 Jan 8;48(D1):D314-D319. doi: 10.1093/nar/gkz967.

Abstract

Genome3D (https://www.genome3d.eu) is a freely available resource that provides consensus structural annotations for representative protein sequences taken from a selection of model organisms. Since the last NAR update in 2015, the method of data submission has been overhauled, with annotations now being 'pushed' to the database via an API. As a result, contributing groups are now able to manage their own structural annotations, making the resource more flexible and maintainable. The new submission protocol brings a number of additional benefits, including instant validation of data and removal of the requirement to synchronise releases between resources. It also makes it possible to implement the submission of these structural annotations as an automated part of existing internal workflows. In turn, these improvements make it easier to open Genome3D up to new prediction algorithms and groups. For the latest release of Genome3D (v2.1), the underlying dataset of sequences used as prediction targets has been updated using the latest reference proteomes available in UniProtKB. A number of new reference proteomes of particular interest to the wider scientific community have also been added: cow, pig, wheat and Mycobacterium tuberculosis. These additions, along with improvements to the underlying predictions from contributing resources, have ensured that the number of annotations in Genome3D has nearly doubled since the last NAR update article. The new API has also been used to facilitate the dissemination of Genome3D data into InterPro, thereby widening the visibility of both the annotation data and annotation algorithms.
