National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892, USA.
Nucleic Acids Res. 2012 Jan;40(Database issue):D57-63. doi: 10.1093/nar/gkr1163. Epub 2011 Dec 1.
As the volume and complexity of data sets archived at NCBI grow rapidly, so does the need to gather and organize the associated metadata. Although metadata has been collected for some archival databases, there was previously no centralized approach at NCBI for collecting this information and using it across databases. The BioProject database was recently established to facilitate organization and classification of project data submitted to NCBI, EBI and DDBJ databases. It captures descriptive information about research projects that result in high-volume submissions to archival databases, ties together related data across multiple archives and serves as a central portal that informs users of data availability. Concomitantly, the BioSample database is being developed to capture descriptive information about the biological samples investigated in projects. BioProject and BioSample records link to corresponding data stored in archival repositories. Submissions are supported by a web-based Submission Portal that guides users through a series of forms for entering rich metadata describing their projects and samples. Together, these databases offer improved ways for users to query, locate, integrate and interpret the masses of data held in NCBI's archival repositories. The BioProject and BioSample databases are available at http://www.ncbi.nlm.nih.gov/bioproject and http://www.ncbi.nlm.nih.gov/biosample, respectively.
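The abstract describes querying the two databases and following links from their records to archived data, but gives no programmatic example. The sketch below shows one way this might be done through NCBI's Entrez E-utilities, which the abstract does not mention; using them here is an assumption. The helper names `esearch_url` and `elink_url`, the search term, and the record UID are hypothetical, and the code only builds request URLs without contacting NCBI.

```python
from urllib.parse import urlencode

# Assumption: NCBI's public E-utilities endpoint (not named in the abstract).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that searches one Entrez database for a term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def elink_url(dbfrom: str, db: str, uid: str) -> str:
    """Build an ELink URL that asks for records in `db` linked to one record in `dbfrom`."""
    params = {"dbfrom": dbfrom, "db": db, "id": uid}
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical search term, used only for illustration.
    print(esearch_url("bioproject", "Escherichia coli"))
    print(esearch_url("biosample", "Escherichia coli"))
    # Hypothetical UID; a real one would come from an ESearch result.
    print(elink_url("bioproject", "biosample", "12345"))
```

Fetching any of these URLs with an HTTP client would return the matching record identifiers; which link targets are actually populated for a given BioProject record depends on what its submitters deposited.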