Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description.

Author information

Sahoo Satya S, Valdez Joshua, Rueschman Michael

Affiliations

Division of Medical Informatics, School of Medicine, Case Western Reserve University, Cleveland, OH.

Department of Medicine, Brigham and Women's Hospital and Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA.

Publication information

AMIA Annu Symp Proc. 2017 Feb 10;2016:1070-1079. eCollection 2016.

PMID:28269904

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5333253/

Abstract

Scientific reproducibility is key to scientific progress as it allows the research community to build on validated results, protect patients from potentially harmful trial drugs derived from incorrect results, and reduce wastage of valuable resources. The National Institutes of Health (NIH) recently published a systematic guideline titled "Rigor and Reproducibility" for supporting reproducible research studies, which has also been accepted by several scientific journals. These journals will require published articles to conform to these new guidelines. Provenance metadata describes the history or origin of data, and it has long been used in computer science to capture metadata information for ensuring data quality and supporting scientific reproducibility. In this paper, we describe the development of the Provenance for Clinical and healthcare Research (ProvCaRe) framework together with a provenance ontology to support scientific reproducibility by formally modeling a core set of data elements representing details of a research study. We extend the PROV Ontology (PROV-O), which has been recommended as the provenance representation model by the World Wide Web Consortium (W3C), to represent both (a) data provenance and (b) process provenance. We use 124 study variables from 6 clinical research studies from the National Sleep Research Resource (NSRR) to evaluate the coverage of the provenance ontology. NSRR is the largest repository of NIH-funded sleep datasets with 50,000 studies from 36,000 participants. The provenance ontology reuses ontology concepts from existing biomedical ontologies, for example the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), to model the provenance information of research studies. The ProvCaRe framework is being developed as part of the Big Data to Knowledge (BD2K) data provenance project.
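
The distinction between data provenance and process provenance, and the idea of measuring ontology coverage over study variables, can be illustrated with a minimal sketch. This is not the ProvCaRe implementation. It uses only the standard PROV-O terms prov:Entity, prov:Activity, prov:Agent, prov:wasGeneratedBy, and prov:wasAssociatedWith. The example.org namespace, the resource names, the variable names, and the `coverage` helper are all hypothetical and chosen for illustration.

```python
# Illustrative sketch only: PROV-O-style annotation of a study description,
# built from plain tuples so it runs without third-party libraries.
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/study#"  # hypothetical namespace, not from the paper


def build_annotation():
    """Return (subject, predicate, object) triples for one hypothetical study."""
    return [
        # Data provenance: the dataset is a prov:Entity.
        (EX + "sleep_recording", RDF_TYPE, PROV + "Entity"),
        # Process provenance: the procedure is a prov:Activity run by a prov:Agent.
        (EX + "polysomnography", RDF_TYPE, PROV + "Activity"),
        (EX + "sleep_technician", RDF_TYPE, PROV + "Agent"),
        # Links between data and process.
        (EX + "sleep_recording", PROV + "wasGeneratedBy", EX + "polysomnography"),
        (EX + "polysomnography", PROV + "wasAssociatedWith", EX + "sleep_technician"),
    ]


def to_ntriples(triples):
    """Serialize IRI-only triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def coverage(variables, mapped):
    """Fraction of study variables that have a matching ontology term.

    `variables` is a list of variable names; `mapped` is the set of names for
    which a term was found. Both are hypothetical inputs for this sketch.
    """
    if not variables:
        return 0.0
    return sum(1 for v in variables if v in mapped) / len(variables)


if __name__ == "__main__":
    print(to_ntriples(build_annotation()))
    # Hypothetical variable names; the paper's 124 variables are not listed here.
    print(coverage(["ahi", "bmi", "age", "sex"], {"ahi", "age"}))
```

Plain tuples keep the sketch dependency-free. A real annotation pipeline would more likely use an RDF library and the ontology's own class hierarchy.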
