ProvCaRe Semantic Provenance Knowledgebase: Evaluating Scientific Reproducibility of Research Studies.

Authors

Valdez Joshua, Kim Matthew, Rueschman Michael, Socrates Vimig, Redline Susan, Sahoo Satya S

Affiliations

Division of Medical Informatics, School of Medicine, Case Western Reserve University, Cleveland, OH.

Department of Medicine, Brigham and Women's Hospital and Beth Israel Deaconess Medical Center, Harvard Medical School, Harvard University, Boston, MA.

Publication information

AMIA Annu Symp Proc. 2018 Apr 16;2017:1705-1714. eCollection 2017.

Abstract

Scientific reproducibility is critical for biomedical research as it enables us to advance science by building on previous results, helps ensure the success of increasingly expensive drug trials, and allows funding agencies to make informed decisions. However, there is a growing "crisis" of reproducibility, as evidenced by a recent Nature journal survey of more than 1500 researchers that found that 70% of researchers were not able to replicate results from other research groups and more than 50% of researchers were not able to reproduce their own research results. In 2016, the National Institutes of Health (NIH) announced the "Rigor and Reproducibility" guidelines to support reproducibility in biomedical research. A key component of the NIH Rigor and Reproducibility guidelines is the recording and analysis of "provenance" information, which describes the origin or history of data and plays a central role in ensuring scientific reproducibility. As part of the NIH Big Data to Knowledge (BD2K)-funded data provenance project, we have developed a new informatics framework called Provenance for Clinical and Healthcare Research (ProvCaRe) to extract, model, and analyze provenance information from published literature describing research studies. Using sleep medicine research studies that have made their data available through the National Sleep Research Resource (NSRR), we have developed an automated pipeline to identify and extract provenance metadata from published literature that is made available for analysis in the ProvCaRe knowledgebase. NSRR is the largest repository of sleep data from over 40,000 studies involving 36,000 participants, and we used 75 published articles describing 6 research studies to populate the ProvCaRe knowledgebase. We evaluated the ProvCaRe knowledgebase with 28,474 "provenance triples" using hypothesis-driven queries to identify and rank research studies based on the provenance information extracted from published articles.
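The abstract does not specify how provenance triples are stored or how the hypothesis-driven queries rank studies. As a rough illustration only, the following Python sketch assumes a triple is a (subject, predicate, object) statement tagged with the study it was extracted from, and ranks studies by how many of their triples mention any query term. The `ProvTriple` type, the `rank_studies` function, the predicate names, and all sample triples are hypothetical and are not taken from ProvCaRe.

```python
from collections import Counter
from typing import NamedTuple


class ProvTriple(NamedTuple):
    """One provenance statement; the field layout is an assumption."""
    study: str      # hypothetical study identifier
    subject: str
    predicate: str
    obj: str


# Made-up triples for illustration only; not data from the ProvCaRe knowledgebase.
TRIPLES = [
    ProvTriple("study_A", "participants", "hadAgeRange", "40 years and older"),
    ProvTriple("study_A", "sleep_recording", "usedInstrument", "polysomnography"),
    ProvTriple("study_B", "sleep_recording", "usedInstrument", "polysomnography"),
    ProvTriple("study_B", "apnea_severity", "measuredBy", "apnea-hypopnea index"),
    ProvTriple("study_B", "participants", "hadExclusionCriterion", "prior treatment"),
    ProvTriple("study_C", "participants", "hadAgeRange", "children"),
]


def rank_studies(triples, query_terms):
    """Rank studies by the number of their triples that mention any query term."""
    terms = [term.lower() for term in query_terms]
    counts = Counter()
    for triple in triples:
        text = " ".join((triple.subject, triple.predicate, triple.obj)).lower()
        if any(term in text for term in terms):
            counts[triple.study] += 1
    # Sort by descending match count, then by study name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_studies(TRIPLES, ["polysomnography", "apnea"])
for study, matches in ranking:
    print(study, matches)
```

Simple substring counting is a stand-in chosen for brevity. A knowledgebase built on semantic provenance would more plausibly be queried through its ontology terms than through raw text matching.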
