Data quality considerations for evaluating COVID-19 treatments using real world data: learnings from the National COVID Cohort Collaborative (N3C).

Affiliations

National Center for Advancing Translational Sciences, National Institutes of Health, Bethesda, MD, USA.

Axle Research and Technologies, Rockville, MD, USA.

Publication information

BMC Med Res Methodol. 2023 Feb 17;23(1):46. doi: 10.1186/s12874-023-01839-2.

Abstract

BACKGROUND

Multi-institution electronic health records (EHR) are a rich source of real-world data (RWD) for generating real-world evidence (RWE) regarding the utilization, benefits, and harms of medical interventions. They provide access to clinical data from large pooled patient populations, as well as laboratory measurements that are unavailable in insurance claims-based data. However, secondary use of these data for research requires specialized knowledge and careful evaluation of data quality and completeness. We discuss data quality assessments undertaken during prep-to-research work, focusing on the investigation of treatment safety and effectiveness.

METHODS

Using the National COVID Cohort Collaborative (N3C) enclave, we defined a patient population using criteria typical in non-interventional inpatient drug effectiveness studies. We present the challenges encountered when constructing this dataset, beginning with an examination of data quality across data partners. We then discuss the methods and best practices used to operationalize several important study elements: exposure to treatment, baseline health comorbidities, and key outcomes of interest.

RESULTS

We share our experiences and lessons learned from working with heterogeneous EHR data from more than 65 healthcare institutions and four common data models. We discuss six key areas of data variability and quality. (1) The specific EHR data elements captured from a site can vary depending on the source data model and local practice. (2) Data missingness remains a significant issue. (3) Drug exposures can be recorded at different levels of granularity and may not include route of administration or dosage information. (4) Reconstruction of continuous drug exposure intervals may not always be possible. (5) EHR discontinuity is a major concern for capturing history of prior treatment and comorbidities. Lastly, (6) access to EHR data alone limits the potential outcomes that can be used in studies.
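To make point (4) concrete, the following is a minimal sketch of one common way to reconstruct continuous exposure intervals from individual drug exposure records. It is not the method used by the authors. The function name `build_exposure_eras`, the `max_gap_days` parameter, and the handling of missing or implausible end dates are all illustrative assumptions.

```python
from datetime import date
from typing import List, Optional, Tuple

# One record = (start_date, end_date); end_date may be missing (None).
Record = Tuple[date, Optional[date]]
Interval = Tuple[date, date]


def build_exposure_eras(records: List[Record], max_gap_days: int = 1) -> List[Interval]:
    """Merge drug exposure records into continuous intervals ("eras").

    Assumptions (hypothetical, chosen for illustration only):
    - A record with a missing end date, or an end date before its start
      date, is treated as a single-day exposure.
    - Two records belong to the same era when the next start date is at
      most `max_gap_days` days after the current era's end date.
    """
    cleaned: List[Interval] = []
    for start, end in records:
        if end is None or end < start:
            end = start  # fall back to a single-day exposure
        cleaned.append((start, end))

    cleaned.sort()  # order by start date, then end date

    eras: List[Interval] = []
    for start, end in cleaned:
        if eras and (start - eras[-1][1]).days <= max_gap_days:
            # Overlapping or adjacent record: extend the current era.
            prev_start, prev_end = eras[-1]
            eras[-1] = (prev_start, max(prev_end, end))
        else:
            # Gap is too large: start a new era.
            eras.append((start, end))
    return eras
```

The choice of `max_gap_days` and of the fallback for missing end dates changes the resulting intervals, which is one reason such reconstruction may not be reliable when sites record only order dates or omit end dates.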

CONCLUSIONS

The creation of large-scale, centralized, multi-site EHR databases such as N3C enables a wide range of research aimed at better understanding the treatments and health impacts of many conditions, including COVID-19. As with all observational research, it is important that research teams engage with appropriate domain experts to understand the data, so that they can define research questions that are both clinically important and feasible to address using these real-world data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b517/9936772/5de52d9a8be3/12874_2023_1839_Fig1_HTML.jpg
