Quality assessment of real-world data repositories across the data life cycle: A literature review.

Affiliations

WHO Collaborating Centre on eHealth, School of Population Health, Faculty of Medicine, UNSW Sydney, Sydney, New South Wales, Australia.

Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, United Kingdom.

Publication information

J Am Med Inform Assoc. 2021 Jul 14;28(7):1591-1599. doi: 10.1093/jamia/ocaa340.

Abstract

OBJECTIVE

Data quality (DQ) must be consistently defined in context. The attributes, metadata, and context of longitudinal real-world data (RWD) have not been formalized for quality improvement across the data production and curation life cycle. We sought to complete a literature review on DQ assessment frameworks, indicators and tools for research, public health, service, and quality improvement across the data life cycle.

MATERIALS AND METHODS

The review followed PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Databases from the health, physical, and social sciences were searched: CINAHL, Embase, Scopus, ProQuest, Emcare, PsycINFO, Compendex, and Inspec. Embase was used instead of PubMed (an interface to search MEDLINE) because it includes all MeSH (Medical Subject Headings) terms and journals used in MEDLINE, as well as additional unique journals and conference abstracts. A combined data life cycle and quality framework guided the search of published and gray literature for DQ frameworks, indicators, and tools. At least 2 authors independently identified articles for inclusion and extracted and categorized DQ concepts and constructs. All authors discussed findings iteratively until consensus was reached.

RESULTS

The 120 included articles yielded concepts related to contextual (data source, custodian, and user) and technical (interoperability) factors across the data life cycle. Contextual DQ subcategories included relevance, usability, accessibility, timeliness, and trust. Well-tested computable DQ indicators and assessment tools were also found.

CONCLUSIONS

A DQ assessment framework that covers intrinsic, technical, and contextual categories across the data life cycle enables assessment and management of RWD repositories to ensure fitness for purpose. Balancing security, privacy, and FAIR principles requires trust and reciprocity, transparent governance, and organizational cultures that value good documentation.
