Musen Mark A, Bean Carol A, Cheung Kei-Hoi, Dumontier Michel, Durante Kim A, Gevaert Olivier, Gonzalez-Beltran Alejandra, Khatri Purvesh, Kleinstein Steven H, O'Connor Martin J, Pouliot Yannick, Rocca-Serra Philippe, Sansone Susanna-Assunta, Wiser Jeffrey A
Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA.
J Am Med Inform Assoc. 2015 Nov;22(6):1148-52. doi: 10.1093/jamia/ocv048. Epub 2015 Jun 25.
The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators assemble templates and fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository will not only capture annotations specified when experimental datasets are initially created, but also incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments.