Leibniz Institute DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany.
J Biotechnol. 2017 Nov 10;261:187-193. doi: 10.1016/j.jbiotec.2017.05.004. Epub 2017 May 6.
Microbial data and metadata are scattered across the scientific literature, databases, and unpublished lab notes, and are therefore often difficult to access. Hot spots of (meta)data are the internal descriptions held by culture collections and the initial descriptions of novel taxa in the primary literature. Here we describe three exemplary mobilization projects that yielded metadata published through the prokaryotic metadatabase BacDive. The first concerns the Reichenbach collection of myxobacteria, which comprises information on 12,535 typewritten index cards; these were digitized, and a total of 37,156 data points were extracted by text mining. The second project targeted Analytical Profile Index (API) tests recorded on paper forms: overall, 6,820 API tests were digitized, providing physiological data for 4,524 microbial strains. The third project extracted metadata from 523 new species descriptions in the International Journal of Systematic and Evolutionary Microbiology, yielding 35,651 data points. All data sets were integrated and published in BacDive. As a result, these metadata not only became accessible and searchable but were also linked to strain taxonomy, isolation source, cultivation conditions, and molecular biology data.