The center for expanded data annotation and retrieval.

Authors

Musen Mark A, Bean Carol A, Cheung Kei-Hoi, Dumontier Michel, Durante Kim A, Gevaert Olivier, Gonzalez-Beltran Alejandra, Khatri Purvesh, Kleinstein Steven H, O'Connor Martin J, Pouliot Yannick, Rocca-Serra Philippe, Sansone Susanna-Assunta, Wiser Jeffrey A

Affiliation

Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA.

Publication information

J Am Med Inform Assoc. 2015 Nov;22(6):1148-52. doi: 10.1093/jamia/ocv048. Epub 2015 Jun 25.

DOI:10.1093/jamia/ocv048

PMID:26112029

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5009916/

Abstract

The Center for Expanded Data Annotation and Retrieval is studying the creation of comprehensive and expressive metadata for biomedical datasets to facilitate data discovery, data interpretation, and data reuse. We take advantage of emerging community-based standard templates for describing different kinds of biomedical datasets, and we investigate the use of computational techniques to help investigators to assemble templates and to fill in their values. We are creating a repository of metadata from which we plan to identify metadata patterns that will drive predictive data entry when filling in metadata templates. The metadata repository not only will capture annotations specified when experimental datasets are initially created, but also will incorporate links to the published literature, including secondary analyses and possible refinements or retractions of experimental interpretations. By working initially with the Human Immunology Project Consortium and the developers of the ImmPort data repository, we are developing and evaluating an end-to-end solution to the problems of metadata authoring and management that will generalize to other data-management environments.
