Finkelstein Joseph, Parvanova Irena, Zhang Frederick
Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
Center for Bioinformatics and Data Analytics, Columbia University, New York, NY, USA.
Stem Cells Cloning. 2020 Jan 28;13:1-20. doi: 10.2147/SCCAA.S237361. eCollection 2020.
As biomedical data integration and analytics play an increasing role in stem cell research, it becomes important to develop ways to standardize, aggregate, and share data among researchers. For this reason, many databases have been developed in recent years in an attempt to systematically warehouse data from different stem cell projects and experiments together. However, these databases vary widely in their implementation and structure. The aim of this scoping review is to characterize the main features of available stem cell databases in order to identify specifications useful for implementing future stem cell databases. We conducted a scoping review of peer-reviewed literature and online resources to identify and review available stem cell databases. To identify relevant databases, we performed a PubMed search using relevant MeSH terms, followed by a web search for databases that may not have an associated journal article. In total, we identified 16 databases to include in this review. The data elements reported in these databases spanned a broad spectrum of parameters, from basic socio-demographic variables to various cell characteristics, cell surface marker expression, and clinical trial results. We identified three broad sets of functional features that provide utility for future stem cell research and facilitate bioinformatics workflows: common data elements, data visualization and analysis tools, and biomedical ontologies for data integration. Stem cell bioinformatics is a quickly evolving field that generates a growing number of heterogeneous data sets. Further progress in stem cell research may be greatly facilitated by the development of applications for intelligent aggregation and sharing of stem cell data and for collaboration among researchers.