AniProtDB: A Collection of Consistently Generated Metazoan Proteomes for Comparative Genomics Studies.

Affiliation

Computational and Statistical Genomics Branch, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.

Publication information

Mol Biol Evol. 2021 Sep 27;38(10):4628-4633. doi: 10.1093/molbev/msab165.

Abstract

To address the void in the availability of high-quality proteomic data traversing the animal tree, we have implemented a pipeline for generating de novo assemblies based on publicly available data from the NCBI Sequence Read Archive, yielding a comprehensive collection of proteomes from 100 species spanning 21 animal phyla. We have also created the Animal Proteome Database (AniProtDB), a resource providing open access to this collection of high-quality metazoan proteomes, along with information on predicted proteins and protein domains for each taxonomic classification and the ability to perform sequence similarity searches against all proteomes generated using this pipeline. This solution vastly increases the utility of these data by removing the barrier to access for research groups who do not have the expertise or resources to generate these data themselves and enables the use of data from nontraditional research organisms that have the potential to address key questions in biomedicine.


