Towards a content agnostic computable knowledge repository for data quality assessment.

Affiliation

Department of Biomedical Informatics, Center for Clinical and Translational Sciences (CCTS) Biomedical Informatics Core, University of Utah, 421 Wakara Way, Suite 140, Salt Lake City, UT 84108-3514, USA.

Publication information

Comput Methods Programs Biomed. 2019 Aug;177:193-201. doi: 10.1016/j.cmpb.2019.05.017. Epub 2019 May 24.

DOI:10.1016/j.cmpb.2019.05.017
PMID:31319948
Abstract

BACKGROUND AND OBJECTIVE

In recent years, several conceptual frameworks for assessing data quality have been proposed across the Data Quality and Information Quality domains. These frameworks are diverse, ranging from simple lists of concepts to complex ontological and taxonomical representations of data quality concepts. The goal of this study is to design, develop and implement a platform-agnostic, computable data quality knowledge repository for data quality assessments.

METHODS

We identified computable data quality concepts through a comprehensive literature review of articles indexed in three major bibliographic data sources. From this corpus, we extracted data quality concepts, their definitions, their applicable measures and their computability, and identified the conceptual relationships among them. We used these relationships to design and develop a data quality meta-model and implemented it in a quality knowledge repository.

RESULTS

We identified three primitives for programmatically performing data quality assessments: data quality concept, its definition, its measure or rule for data quality assessment, and their associations. We modeled a computable data quality meta-data repository and extended this framework to adapt, store, retrieve and automate assessment of other existing data quality assessment models.
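The primitives named above (a concept, its definition, a measure or rule, and the associations between them) can be illustrated with a minimal sketch. This is not the authors' schema or implementation: the class names `Concept`, `Measure` and `Repository`, the rule `non_missing_fraction`, and the sample column `dob` are all hypothetical and chosen only for illustration. Rules take generic records and a column name, which is one assumed way of keeping them independent of the content being assessed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A record is a generic mapping of column name to value; rules never
# depend on what the columns mean (hypothetical design choice).
Record = Dict[str, Optional[str]]


@dataclass(frozen=True)
class Concept:
    """A data quality concept together with its definition."""
    name: str
    definition: str


@dataclass(frozen=True)
class Measure:
    """A computable rule that scores one column of a record set."""
    name: str
    rule: Callable[[List[Record], str], float]


@dataclass
class Repository:
    """Stores concepts, measures, and concept-to-measure associations."""
    concepts: Dict[str, Concept] = field(default_factory=dict)
    measures: Dict[str, Measure] = field(default_factory=dict)
    associations: Dict[str, List[str]] = field(default_factory=dict)

    def register(self, concept: Concept, measure: Measure) -> None:
        # Store both primitives and record the association between them.
        self.concepts[concept.name] = concept
        self.measures[measure.name] = measure
        self.associations.setdefault(concept.name, []).append(measure.name)

    def assess(self, concept_name: str, records: List[Record],
               column: str) -> Dict[str, float]:
        # Run every measure associated with the concept on the column.
        return {
            name: self.measures[name].rule(records, column)
            for name in self.associations.get(concept_name, [])
        }


def non_missing_fraction(records: List[Record], column: str) -> float:
    """Fraction of records whose value for `column` is neither None nor empty."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(column) not in (None, ""))
    return present / len(records)


repo = Repository()
repo.register(
    Concept("completeness", "Extent to which expected values are present."),
    Measure("non_missing_fraction", non_missing_fraction),
)

rows = [{"dob": "1980-01-01"}, {"dob": ""}, {"dob": None}, {"dob": "1975-06-30"}]
result = repo.assess("completeness", rows, "dob")
print(result)
```

In this sketch, assessing a different data source only requires passing different records and column names; the stored concepts and rules stay unchanged.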

CONCLUSION

We identified research gaps in the data quality literature concerning the automation of data quality assessment methods. In this process, we designed, developed and implemented a computable data quality knowledge repository for assessing quality and characterizing data in health data repositories. We leverage this knowledge repository in a service-oriented architecture to provide a scalable and reproducible framework for data quality assessments in disparate biomedical data sources.

Cited by

1
Development and initial validation of a data quality evaluation tool in obstetrics real-world data through HL7-FHIR interoperable Bayesian networks and expert rules.
JAMIA Open. 2024 Jul 27;7(3):ooae062. doi: 10.1093/jamiaopen/ooae062. eCollection 2024 Oct.
2
Frameworks, Dimensions, Definitions of Aspects, and Assessment Methods for the Appraisal of Quality of Health Data for Secondary Use: Comprehensive Overview of Reviews.
JMIR Med Inform. 2024 Mar 6;12:e51560. doi: 10.2196/51560.
3
The IHI Rochester Report 2022 on Healthcare Informatics Research: Resuming After the CoViD-19.
J Healthc Inform Res. 2023 May 1;7(2):169-202. doi: 10.1007/s41666-023-00126-5. eCollection 2023 Jun.
4
Recent Trends in Patient Registries for Health Services Research.
Methods Inf Med. 2021 Jun;60(S 01):e1-e8. doi: 10.1055/s-0041-1724104. Epub 2021 Apr 16.
5
Quality assessment of real-world data repositories across the data life cycle: A literature review.
J Am Med Inform Assoc. 2021 Jul 14;28(7):1591-1599. doi: 10.1093/jamia/ocaa340.