Research Institute - McGill University Health Centre, Montreal, Quebec, Canada.
Int J Epidemiol. 2011 Oct;40(5):1314-28. doi: 10.1093/ije/dyr106. Epub 2011 Jul 30.
Proper understanding of the roles of, and interactions between, genetic, lifestyle, environmental and psycho-social factors in determining the risk of development and/or progression of chronic diseases requires access to very large high-quality databases. Because of the financial, technical and time burdens related to developing and maintaining very large studies, the scientific community is increasingly synthesizing data from multiple studies to construct large databases. However, the data items collected by individual studies must be inferentially equivalent to be meaningfully synthesized. The DataSchema and Harmonization Platform for Epidemiological Research (DataSHaPER; http://www.datashaper.org) was developed to enable the rigorous assessment of the inferential equivalence, i.e. the potential for harmonization, of selected information from individual studies.
This article examines the value of using the DataSHaPER for retrospective harmonization of established studies. Using the DataSHaPER approach, the potential to generate 148 harmonized variables from the questionnaires and physical measures collected in 53 large population-based studies (6.9 million participants) was assessed. Variable and study characteristics that might influence the potential for data synthesis were also explored.
Of all assessment items evaluated (148 variables for each of the 53 studies), 38% could be harmonized. Certain characteristics of variables (i.e. relative importance, individual targeted, reference period) and of studies (i.e. observational units, data collection start date and mode of questionnaire administration) were associated with the potential for harmonization. For example, for variables deemed to be essential, 62% of the paired assessment items could be harmonized.
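The arithmetic behind these percentages can be illustrated with a minimal sketch. It assumes each assessment item is one (study, variable) pair carrying a yes/no harmonizability judgement; the function name, the dictionary layout and the toy data are hypothetical and are not taken from the DataSHaPER itself. Only the counts 148 and 53 come from the text.

```python
from typing import Dict


def harmonization_potential(status: Dict[str, Dict[str, bool]]) -> float:
    """Return the share of (study, variable) assessment items flagged as harmonizable.

    `status` maps a study name to a mapping of variable name -> judgement.
    This layout is an illustrative assumption, not the DataSHaPER data model.
    """
    total = sum(len(variables) for variables in status.values())
    if total == 0:
        raise ValueError("no assessment items to evaluate")
    harmonizable = sum(
        flag for variables in status.values() for flag in variables.values()
    )
    return harmonizable / total


# Made-up toy input: two studies, three variables each (six assessment items).
toy_status = {
    "study_A": {"smoking_status": True, "height": True, "diet_recall": False},
    "study_B": {"smoking_status": True, "height": False, "diet_recall": False},
}
toy_share = harmonization_potential(toy_status)  # 3 of 6 items

# Number of assessment items implied by the text: 148 variables x 53 studies.
TOTAL_ITEMS = 148 * 53
```

With 148 variables assessed in each of 53 studies, the denominator is 148 × 53 = 7844 assessment items; the reported 38% is a proportion of that total.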
The current article shows that the DataSHaPER provides an effective and flexible approach for the retrospective harmonization of information across studies. To implement data synthesis, some additional scientific, ethico-legal and technical considerations must be addressed. The success of the DataSHaPER as a harmonization approach will depend on its continuing development and on the rigour and extent of its use. The DataSHaPER has the potential to take us closer to a truly collaborative epidemiology and offers the promise of enhanced research potential generated through synthesized databases.