Biases introduced by filtering electronic health records for patients with "complete data".

Authors

Weber Griffin M, Adams William G, Bernstam Elmer V, Bickel Jonathan P, Fox Kathe P, Marsolo Keith, Raghavan Vijay A, Turchin Alexander, Zhou Xiaobo, Murphy Shawn N, Mandl Kenneth D

Affiliations

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.

Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.

Publication information

J Am Med Inform Assoc. 2017 Nov 1;24(6):1134-1141. doi: 10.1093/jamia/ocx071.

DOI: 10.1093/jamia/ocx071

PMID: 29016972

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6080680/

Abstract

OBJECTIVE

One promise of nationwide adoption of electronic health records (EHRs) is the availability of data for large-scale clinical research studies. However, because the same patient could be treated at multiple health care institutions, data from only a single site might not contain the complete medical history for that patient, meaning that critical events could be missing. In this study, we evaluate how simple heuristic checks for data "completeness" affect the number of patients in the resulting cohort and introduce potential biases.

MATERIALS AND METHODS

We began with a set of 16 filters that check for the presence of demographics, laboratory tests, and other types of data, and then systematically applied all 2^16 (65,536) possible combinations of these filters to the EHR data for 12 million patients at 7 health care systems and a separate payor claims database of 7 million members.
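
To make the enumeration concrete, here is a minimal sketch of applying every on/off combination of a set of presence filters to a patient table. It is an illustration only, not the authors' implementation: the filter names, the toy records, and the `cohort_sizes` helper are all hypothetical, and only 3 filters are used so that the 2^3 = 8 combinations can be inspected by hand (16 filters would give 2^16 = 65,536).

```python
from itertools import product

# Hypothetical filter names; the study uses 16 filters, only 3 are shown here.
FILTERS = ["has_demographics", "has_lab_test", "has_medication"]

# Toy records (assumption): for each patient, the set of filters they pass.
PATIENTS = {
    "p1": {"has_demographics", "has_lab_test", "has_medication"},
    "p2": {"has_demographics"},
    "p3": {"has_demographics", "has_medication"},
    "p4": set(),
}


def cohort_sizes(patients, filters):
    """Map every filter combination to the number of patients passing all of it."""
    sizes = {}
    # Each combination is one on/off choice per filter: 2 ** len(filters) in total.
    for mask in product([False, True], repeat=len(filters)):
        required = frozenset(f for f, on in zip(filters, mask) if on)
        # A patient stays in the cohort only if every required filter is passed.
        sizes[required] = sum(
            1 for passed in patients.values() if required <= passed
        )
    return sizes


sizes = cohort_sizes(PATIENTS, FILTERS)
for required, n in sorted(sizes.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(required), n)
```

The empty combination keeps every patient, and each added requirement can only shrink the cohort, which is why cohort size is tracked per combination.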

RESULTS

EHR data showed considerable variability in data completeness across sites and high correlation between data types. For example, the fraction of patients with diagnoses increased from 35.0% in all patients to 90.9% in those with at least 1 medication. An unrelated claims dataset independently showed that most filters select members who are older and more likely female and can eliminate large portions of the population whose data are actually complete.
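
The correlation between data types described above amounts to comparing a fraction computed over all patients with the same fraction computed over a filtered subset. The sketch below shows that comparison on invented toy data; the labels, the records, and the `fraction_with` helper are assumptions for illustration and do not reproduce the reported 35.0% and 90.9% figures.

```python
def fraction_with(patients, target, required=None):
    """Fraction of patients having `target`, among those having `required` (or all)."""
    pool = [p for p in patients if required is None or required in p]
    if not pool:
        return float("nan")  # no patient satisfies the requirement
    return sum(1 for p in pool if target in p) / len(pool)


# Toy records (assumption): each patient is the set of data types on file.
toy = [
    {"diagnosis", "medication"},
    {"diagnosis", "medication"},
    {"medication"},
    {"diagnosis"},
    set(),
    set(),
    set(),
    set(),
]

overall = fraction_with(toy, "diagnosis")
given_medication = fraction_with(toy, "diagnosis", required="medication")
print(f"all patients: {overall:.1%}")
print(f"patients with a medication: {given_medication:.1%}")
```

In this toy table, requiring a medication record raises the share of patients with a diagnosis, mirroring the direction of the effect the abstract reports.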

DISCUSSION AND CONCLUSION

As investigators design studies, they need to balance their confidence in the completeness of the data with the effects of placing requirements on the data on the resulting patient cohort.
