Gierend Kerstin, Krüger Frank, Waltemath Dagmar, Fünfgeld Maximilian, Ganslandt Thomas, Zeleke Atinkut Alamirrew
Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany.
Department of Communications Engineering, University of Rostock, Rostock, Germany.
JMIR Res Protoc. 2021 Nov 22;10(11):e31750. doi: 10.2196/31750.
Provenance supports the understanding of data genesis and is a key factor in ensuring the trustworthiness of digital objects containing (sensitive) scientific data. Provenance information contributes to a better understanding of scientific results and fosters collaboration on existing data as well as data sharing. Providing it requires comprehensive concepts and standards for transparency, traceability, reproducibility, validity, and quality assurance throughout clinical and scientific data workflows and research.
The aim of this scoping review is to investigate existing evidence regarding approaches and criteria for provenance tracking and to identify current knowledge gaps in the biomedical domain. The review covers modeling aspects as well as metadata frameworks for meaningful and usable provenance information during the creation, collection, and processing of (sensitive) scientific biomedical data. It also examines quality aspects of provenance criteria.
This scoping review will follow the methodological framework of Arksey and O'Malley. Relevant publications will be obtained by querying PubMed and Web of Science. All English-language papers published between January 1, 2006, and March 23, 2021, will be included. Database retrieval will be supplemented by a manual search for grey literature. Candidate publications will then be exported into reference management software, and duplicates will be removed. Afterwards, the resulting set of papers will be transferred into a systematic review management tool. All publications will be screened, extracted, and analyzed. Title and abstract screening will be carried out by 4 independent reviewers; a majority vote is required to deem a paper eligible under the defined inclusion and exclusion criteria. Full-text reading will be performed independently by 2 reviewers, and in the last step, key information will be extracted using a pretested template. If agreement cannot be reached, the conflict will be resolved by a domain expert. Charted data will be analyzed by categorizing and summarizing the individual data items according to the research questions. Tabular or graphical overviews will be provided where applicable.
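The deduplication and majority-vote screening steps described above can be illustrated with a minimal sketch. It is not the authors' tooling: the record fields (`doi`, `title`), the function names, and the handling of a 2-2 tie as a conflict to be escalated are all assumptions made for illustration, since the protocol does not specify them.

```python
from typing import Dict, List, Optional


def deduplicate(records: List[Dict[str, Optional[str]]]) -> List[Dict[str, Optional[str]]]:
    """Keep the first record for each DOI (or title, when no DOI is present).

    Assumption: matching on a lowercased DOI/title is a stand-in for whatever
    the reference management software actually does.
    """
    seen = set()
    unique = []
    for rec in records:
        key = (rec.get("doi") or rec["title"]).strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def screen(votes: List[bool]) -> str:
    """Decide one paper from independent reviewer votes (True = eligible).

    A strict majority decides. A tie is reported as "conflict"; escalating
    ties to a domain expert is an assumption of this sketch.
    """
    yes = sum(votes)
    no = len(votes) - yes
    if yes > no:
        return "include"
    if no > yes:
        return "exclude"
    return "conflict"


if __name__ == "__main__":
    # Hypothetical records, invented for the example.
    papers = [
        {"doi": "10.1000/ABC", "title": "Paper A"},
        {"doi": "10.1000/abc", "title": "Paper A (duplicate export)"},
        {"doi": None, "title": "Paper B"},
    ]
    print(len(deduplicate(papers)))
    print(screen([True, True, True, False]))
    print(screen([True, True, False, False]))
```

With 4 reviewers, a strict majority means at least 3 matching votes, so a tie-handling rule of some kind is needed in practice.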
The reporting follows the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR). Electronic database searches in PubMed and Web of Science yielded 469 matches after deduplication. As of September 2021, the scoping review is in the full-text screening stage. Data extraction using the pretested charting template will follow the full-text screening stage. We expect the scoping review report to be completed by February 2022.
Information about the origin of health care data has a major impact on the quality and reusability of scientific results as well as on follow-up activities. This protocol outlines plans for a scoping review that will provide information about current approaches, challenges, or knowledge gaps in provenance tracking in the biomedical sciences.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): DERR1-10.2196/31750.