Department of Women and Children's Health, School of Life Course Sciences, Faculty of Life Sciences and Medicine, King's College London, 10th Floor North Wing, St. Thomas' Hospital, Westminster Bridge Road, London, SE1 7EH, UK.
School of Population Health and Environmental Sciences, Faculty of Life Sciences and Medicine, King's College London, 4th Floor, Addison House, Guy's Campus, London, SE1 1UL, UK.
Trials. 2021 Mar 8;22(1):195. doi: 10.1186/s13063-021-05141-8.
The use of electronic patient records for assessing outcomes in clinical trials is a methodological strategy intended to drive faster and more cost-efficient acquisition of results. The aim of this manuscript was to outline the data collection and management considerations of a maternity and perinatal clinical trial using data from electronic patient records, with the DESiGN Trial as a case study.
The DESiGN Trial is a cluster randomised controlled trial assessing the effect of a complex intervention versus standard care for identifying small for gestational age foetuses. Data on maternal and perinatal characteristics and outcomes (including infants admitted to neonatal care), parameters from foetal ultrasound, and details of hospital activity for health-economic evaluation were collected at two time points from four types of electronic patient records held in 22 different electronic record systems at the 13 research clusters. Data were pseudonymised on site using a bespoke Microsoft Excel macro and securely transferred to the central data store. Data quality checks were undertaken. Rules for harmonisation of the raw data were developed and a data dictionary was produced, along with rules and assumptions for linkage of the datasets. The dictionary included descriptions of the rationale and assumptions for data harmonisation and quality checks.
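The on-site pseudonymisation step can be illustrated with a minimal Python sketch. This is not the trial's method: the trial used a bespoke Microsoft Excel macro whose internals are not described here. The sketch assumes a keyed hash (HMAC-SHA256) with a secret held only at the research site; the function name, key, field names and record values are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for a patient identifier (assumed approach: HMAC-SHA256)."""
    # Normalise formatting so the same identifier typed differently in two
    # record systems still maps to the same pseudonym.
    normalised = identifier.strip().upper().replace(" ", "")
    return hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key that would never leave the research site.
SITE_KEY = b"example-key-held-only-at-the-research-site"

# Hypothetical extracts from two different record systems for the same woman.
maternity = {"patient_id": "943 476 5919", "gestation_weeks": 39}
ultrasound = {"patient_id": "9434765919", "estimated_fetal_weight_g": 3100}

for record in (maternity, ultrasound):
    # Replace the direct identifier with its pseudonym before any transfer.
    record["pseudo_id"] = pseudonymise(record.pop("patient_id"), SITE_KEY)
```

Because the same key and normalisation are applied to every extract, records from different systems can still be linked centrally on `pseudo_id` without the central data store ever holding the original identifier.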
Data were collected on 182,052 babies from 178,350 pregnancies in 165,397 unique women. Data availability and completeness varied across research sites; each of the eight variables that were key to calculation of the primary outcome was completely missing in a median of 3 (range 1-4) clusters at the time of the first data download. This improved by the second data download, following clarification of instructions to the research sites (each of the eight key variables was completely missing in a median of 1 (range 0-1) cluster at the second time point). Common data management challenges were harmonising a single variable from multiple sources and categorising free-text data; solutions were developed for this trial.
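The completeness summary reported above (the number of clusters in which a key variable is completely missing, summarised across variables as a median) could be computed along the following lines. This is a sketch under stated assumptions, not the trial's code: the variable names, cluster labels and records are invented for illustration, and "missing" is assumed to mean `None` or an empty string.

```python
from statistics import median


def is_missing(value) -> bool:
    """Treat None and blank strings as missing (an assumption for this sketch)."""
    return value is None or str(value).strip() == ""


def clusters_completely_missing(records, variable):
    """Return the clusters in which `variable` has no recorded value at all."""
    by_cluster = {}
    for row in records:
        by_cluster.setdefault(row["cluster"], []).append(row.get(variable))
    return sorted(c for c, values in by_cluster.items()
                  if all(is_missing(v) for v in values))


# Hypothetical harmonised extract: one row per baby.
records = [
    {"cluster": "A", "birthweight_g": 3200, "gestation_days": None},
    {"cluster": "A", "birthweight_g": None, "gestation_days": ""},
    {"cluster": "B", "birthweight_g": None, "gestation_days": 280},
    {"cluster": "B", "birthweight_g": "", "gestation_days": 275},
    {"cluster": "C", "birthweight_g": 2950, "gestation_days": 266},
]

key_variables = ["birthweight_g", "gestation_days"]  # hypothetical key variables

# Number of clusters with each key variable completely missing.
missing_counts = {v: len(clusters_completely_missing(records, v)) for v in key_variables}

# Median and range across the key variables.
summary = (median(missing_counts.values()),
           min(missing_counts.values()),
           max(missing_counts.values()))
```

Running the same check on each data download makes it straightforward to see whether clarified extraction instructions reduced the number of clusters with a variable entirely absent.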
Clinical trials that use electronic patient records to assess outcomes can be time- and cost-effective, but they still require appropriate time and resources to maximise data quality. A difficulty for pregnancy and perinatal research in the UK is the wide variety of systems used to collect patient data across maternity units. In this manuscript, we describe how we managed this and provide a detailed data dictionary covering the harmonisation of variable names and values, which will be helpful for other researchers working with these data.
Primary registry and trial identifying number: ISRCTN 67698474. Registered on 02/11/16.