National Institutes of Health, All of Us Research Program, Bethesda, MD.
Leidos, Frederick, MD.
JCO Clin Cancer Inform. 2024 Aug;8:e2400052. doi: 10.1200/CCI.24.00052.
The specific aims of this paper are to (1) develop and operationalize an electronic health record (EHR) data quality framework, (2) apply the dimensions of the framework to the phenotype and treatment pathways of ductal carcinoma in situ (DCIS) using All of Us Research Program data, and (3) propose and apply a checklist to evaluate the application of the framework.
We developed a framework of five data quality dimensions (DQDs): completeness, concordance, conformance, plausibility, and temporality. Participants signed a consent form and a Health Insurance Portability and Accountability Act authorization to share EHR data, and answered demographic questions in the Basics questionnaire. We evaluated the internal characteristics of the data and compared the data with external benchmarks using descriptive and inferential statistics. We developed a DQD checklist to evaluate concept selection, internal verification, and external validity for each DQD. Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) concept IDs for DCIS were used to select a cohort of 2,209 females aged 18 years and older.
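The cohort selection described above can be illustrated with a minimal sketch. It is not the authors' code. The field names (`person_id`, `gender_concept_id`, `year_of_birth`, `condition_concept_id`) follow the OMOP CDM `person` and `condition_occurrence` tables. The DCIS concept IDs are placeholders, because the abstract does not list them. The reference year used to compute age and the toy rows are assumptions made for illustration only.

```python
# Illustrative sketch only: selects females aged 18+ with a DCIS condition record.
FEMALE_CONCEPT_ID = 8532  # OMOP standard gender concept for FEMALE
DCIS_CONCEPT_IDS = {1111111, 2222222}  # PLACEHOLDERS, not the real DCIS concept IDs


def select_cohort(persons, condition_occurrences, as_of_year, min_age=18):
    """Return sorted person_ids of females aged >= min_age with a DCIS record."""
    # Collect everyone who has at least one condition row with a DCIS concept.
    with_dcis = {
        row["person_id"]
        for row in condition_occurrences
        if row["condition_concept_id"] in DCIS_CONCEPT_IDS
    }
    # Keep only females who meet the age threshold in the reference year.
    return sorted(
        p["person_id"]
        for p in persons
        if p["person_id"] in with_dcis
        and p["gender_concept_id"] == FEMALE_CONCEPT_ID
        and as_of_year - p["year_of_birth"] >= min_age
    )


# Toy rows (hypothetical data, for demonstration only).
persons = [
    {"person_id": 1, "gender_concept_id": 8532, "year_of_birth": 1960},
    {"person_id": 2, "gender_concept_id": 8507, "year_of_birth": 1955},
    {"person_id": 3, "gender_concept_id": 8532, "year_of_birth": 2010},
    {"person_id": 4, "gender_concept_id": 8532, "year_of_birth": 1970},
]
condition_occurrences = [
    {"person_id": 1, "condition_concept_id": 1111111},
    {"person_id": 2, "condition_concept_id": 2222222},
    {"person_id": 3, "condition_concept_id": 1111111},
    {"person_id": 4, "condition_concept_id": 9999999},
]

cohort = select_cohort(persons, condition_occurrences, as_of_year=2024)
print(cohort)
```

In this toy data only person 1 qualifies: person 2 is not female, person 3 is under 18 in the reference year, and person 4 has no DCIS record.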
Using the proposed DQD checklist criteria, (1) concepts were selected and internally verified for conformance; (2) concepts were selected and internally verified for completeness; (3) concepts were selected, internally verified, and externally validated for concordance; (4) concepts were selected, internally verified, and externally validated for plausibility; and (5) concepts were selected, internally verified, and externally validated for temporality.
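The checklist outcome reported above can be written down as a small data structure. This is a sketch for illustration, not the authors' instrument. The class and field names are hypothetical. A `False` under `externally_validated` means only that the results above do not report external validation for that dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChecklistEntry:
    """One row of a DQD checklist (hypothetical structure)."""
    dimension: str
    concepts_selected: bool
    internally_verified: bool
    externally_validated: bool  # False = not reported in the results above


# Entries in the order the results list them.
CHECKLIST = [
    ChecklistEntry("conformance", True, True, False),
    ChecklistEntry("completeness", True, True, False),
    ChecklistEntry("concordance", True, True, True),
    ChecklistEntry("plausibility", True, True, True),
    ChecklistEntry("temporality", True, True, True),
]


def dimensions_with_external_validation(entries):
    """Return the names of dimensions reported as externally validated."""
    return [e.dimension for e in entries if e.externally_validated]


validated = dimensions_with_external_validation(CHECKLIST)
print(validated)
```

Keeping the three criteria as separate fields makes it easy to see which dimensions were checked only against the data itself and which were also compared with external benchmarks.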
This assessment and evaluation provided insights into data quality for the DCIS phenotype using EHR data from the All of Us Research Program. The review demonstrates that salient clinical measures can be selected, applied, and operationalized within a conceptual framework, and that their fitness for use can be evaluated by applying the proposed checklist.