Department of Psychiatry, University of Oxford, Oxford, United Kingdom.
Swansea University Medical School, Swansea University, Swansea, United Kingdom.
Eur J Epidemiol. 2023 Feb;38(2):179-187. doi: 10.1007/s10654-022-00916-y. Epub 2023 Jan 7.
Research-ready data (data curated to a defined standard) increase scientific opportunity and rigour by integrating the data environment. The development of research platforms has highlighted the value of research-ready data, particularly for multi-cohort analyses. Following stakeholder consultation, a standard data model (C-Surv), optimised for data discovery, was developed using data from 5 population and clinical cohort studies. The model uses a four-tier nested structure based on 18 data themes selected according to user behaviour or technology. Standard variable naming conventions are applied to uniquely identify variables within the context of longitudinal studies. The data model was used to develop a harmonised dataset for 11 cohorts. This dataset populated the Cohort Explorer data discovery tool, which is used to assess the feasibility of an analysis prior to making a data access request. Data preparation times were compared between cohort-specific data models and C-Surv. It was concluded that adopting a common data model as a data standard for the discovery and analysis of research cohort data offers multiple benefits.
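As an illustration only, the idea of a four-tier nested structure combined with a naming convention that keeps variables unique across data collection waves can be sketched as follows. The abstract does not specify the tier labels, the theme names, or the naming pattern used by C-Surv, so everything in this sketch (the `VariableKey` class, the tier field names, the `_w<wave>` suffix, and the example values) is a hypothetical assumption and not the published model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableKey:
    """Hypothetical four-tier address of a variable plus its collection wave."""
    theme: str    # tier 1 (assumed label, e.g. one of the 18 data themes)
    domain: str   # tier 2 (assumed label)
    family: str   # tier 3 (assumed label)
    item: str     # tier 4 (assumed label)
    wave: int     # data collection wave in a longitudinal study

    def name(self) -> str:
        # Assumed convention: lower-cased tiers joined by "_", then a wave suffix.
        tiers = [self.theme, self.domain, self.family, self.item]
        return "_".join(t.lower() for t in tiers) + f"_w{self.wave}"


def build_catalogue(keys):
    """Nest variable names under tier 1 > tier 2 > tier 3 > tier 4."""
    catalogue = {}
    for key in keys:
        node = catalogue
        for tier in (key.theme, key.domain, key.family):
            node = node.setdefault(tier, {})
        node.setdefault(key.item, []).append(key.name())
    return catalogue


# Illustrative values only; these are not taken from any real cohort.
keys = [
    VariableKey("Cognition", "Memory", "Recall", "Immediate", wave=1),
    VariableKey("Cognition", "Memory", "Recall", "Immediate", wave=2),
]
catalogue = build_catalogue(keys)
names = catalogue["Cognition"]["Memory"]["Recall"]["Immediate"]
```

Under these assumptions, the same measure collected at two waves receives two distinct names, while both remain discoverable under a single path in the nested catalogue.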