Verheij Robert A, Curcin Vasa, Delaney Brendan C, McGilchrist Mark M
Netherlands Institute for Health Services Research, Utrecht, Netherlands.
King's College London, London, United Kingdom.
J Med Internet Res. 2018 May 29;20(5):e185. doi: 10.2196/jmir.9134.
Enormous amounts of data are recorded routinely in health care as part of the care process, primarily for managing individual patient care. There are significant opportunities to use these data for other purposes, many of which would contribute to establishing a learning health system. This is particularly true for data recorded in primary care settings, as in many countries, these are the first place patients turn to for most health problems.
In this paper, we discuss whether data that are recorded routinely as part of the health care process in primary care are actually fit to use for other purposes, such as research and health care quality indicators; how the original purpose may affect the extent to which the data are fit for another purpose; and the mechanisms behind these effects. In doing so, we want to identify possible sources of bias that are relevant for the use and reuse of this type of data.
This paper is based on the authors' experience as users of electronic health records data, as general practitioners, health informatics experts, and health services researchers. It is a product of the discussions they had during the Translational Research and Patient Safety in Europe (TRANSFoRm) project, which was funded by the European Commission and sought to develop, pilot, and evaluate a core information architecture for the learning health system in Europe, based on primary care electronic health records.
We first describe the different stages in the processing of electronic health record data, as well as the different purposes for which these data are used. Given the different data processing steps and purposes, we then discuss the possible mechanisms for each individual data processing step that can generate biased outcomes. We identified 13 possible sources of bias. Four of them are related to the organization of a health care system, whereas some are of a more technical nature.
There are a substantial number of possible sources of bias, and very little is known about the size and direction of their impact. However, anyone who uses or reuses data that were recorded as part of the health care process (such as researchers and clinicians) should be aware of the associated data collection process and the environmental influences that can affect the quality of the data. Our stepwise, actor- and purpose-oriented approach may help to identify these possible sources of bias. Unless data quality issues are better understood and adequate controls are embedded throughout the data lifecycle, data-driven health care will not live up to its expectations. We need a data quality research agenda to devise the instruments needed to assess the magnitude of each of the possible sources of bias, and then start measuring their impact. The possible sources of bias described in this paper serve as a starting point for this research agenda.