
A collaborative semantic-based provenance management platform for reproducibility.

Authors

Samuel Sheeba, König-Ries Birgitta

Affiliations

Michael Stifel Center Jena, Jena, Germany.

Heinz Nixdorf Chair for Distributed Information Systems, Friedrich-Schiller Universität Jena, Jena, Thuringia, Germany.

Publication information

PeerJ Comput Sci. 2022 Mar 10;8:e921. doi: 10.7717/peerj-cs.921. eCollection 2022.

DOI: 10.7717/peerj-cs.921
PMID: 35494870
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9044346/

Abstract

Scientific data management plays a key role in the reproducibility of scientific results. To reproduce results, not only the results but also the data and steps of scientific experiments must be made findable, accessible, interoperable, and reusable. Tracking, managing, describing, and visualizing provenance helps in the understandability, reproducibility, and reuse of experiments for the scientific community. Current systems lack a link between the data, steps, and results from the computational and non-computational processes of an experiment. Such a link, however, is vital for the reproducibility of results. We present a novel solution for the end-to-end provenance management of scientific experiments. We provide a framework, CAESAR (CollAborative Environment for Scientific Analysis with Reproducibility), which allows scientists to capture, manage, query and visualize the complete path of a scientific experiment consisting of computational and non-computational data and steps in an interoperable way. CAESAR integrates the REPRODUCE-ME provenance model, extended from existing semantic web standards, to represent the whole picture of an experiment describing the path it took from its design to its result. ProvBook, an extension for Jupyter Notebooks, is developed and integrated into CAESAR to support computational reproducibility. We have applied and evaluated our contributions to a set of scientific experiments in microscopy research projects.
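
The link between data, steps, and results that the abstract calls for can be pictured as a small provenance graph. The sketch below is illustrative only: it is not CAESAR's API, nor the REPRODUCE-ME model itself. It borrows two property names from the W3C PROV-O vocabulary (`prov:used`, `prov:wasGeneratedBy`), and every `ex:` identifier, along with the helper names `build_index` and `trace_back`, is a hypothetical placeholder. It walks backwards from a computed result to the non-computational material it depends on.

```python
from collections import defaultdict

# Hypothetical provenance statements as (subject, predicate, object) triples.
# The predicates are W3C PROV-O terms; the ex: names are invented for
# illustration and do not come from the paper.
TRIPLES = [
    ("ex:image_acquisition", "prov:used", "ex:sample_42"),            # lab step uses a sample
    ("ex:raw_image", "prov:wasGeneratedBy", "ex:image_acquisition"),  # lab step yields data
    ("ex:notebook_cell_3", "prov:used", "ex:raw_image"),              # computational step uses data
    ("ex:cell_count", "prov:wasGeneratedBy", "ex:notebook_cell_3"),   # computation yields a result
]


def build_index(triples):
    """Map each node to the nodes it directly depends on."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate in ("prov:used", "prov:wasGeneratedBy"):
            index[subject].append(obj)
    return index


def trace_back(node, index):
    """Return every upstream node of `node`, nearest dependency first."""
    seen = []
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in index.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


lineage = trace_back("ex:cell_count", build_index(TRIPLES))
print(lineage)
```

Under these assumptions, the trace for `ex:cell_count` passes through the notebook cell, the raw image, and the acquisition step before reaching the physical sample, which is the kind of end-to-end path across computational and non-computational steps that the abstract describes.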
