Carlin John B, Gurrin Lyle C, Sterne Jonathan AC, Morley Ruth, Dwyer Terry
Clinical Epidemiology and Biostatistics Unit, Murdoch Children's Research Institute, Royal Children's Hospital, Melbourne, Australia.
Int J Epidemiol. 2005 Oct;34(5):1089-99. doi: 10.1093/ije/dyi153. Epub 2005 Aug 8.
Twin studies have long been recognized for their value in learning about the aetiology of disease and specifically for their potential for separating genetic effects from environmental effects. The recent upsurge of interest in life-course epidemiology and the study of developmental influences on later health has provided a new impetus to study twins as a source of unique insights. Twins are of special interest because they provide naturally matched pairs where the confounding effects of a large number of potentially causal factors (such as maternal nutrition or gestation length) may be removed by comparisons between twins who share them. The traditional tool of epidemiological 'risk factor analysis' is the regression model, but it is not straightforward to transfer standard regression methods to twin data, because the analysis needs to reflect the paired structure of the data, which induces correlation between twins. This paper reviews the use of more specialized regression methods for twin data, based on generalized least squares or linear mixed models, and explains the relationship between these methods and the commonly used approach of analysing within-twin-pair difference values. Methods and issues of interpretation are illustrated using an example from a recent study of the association between birth weight and cord blood erythropoietin. We focus on the analysis of continuous outcome measures but review additional complexities that arise with binary outcomes. We recommend the use of a general model that includes separate regression coefficients for within-twin-pair and between-pair effects, and provide guidelines for the interpretation of estimates obtained under this model.
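As an illustration of the recommended model with separate within-pair and between-pair coefficients, the following minimal sketch fits such a model by ordinary least squares on simulated data and compares the within-pair coefficient with the slope from a regression of within-pair differences. Everything here is an assumption made for illustration: the function names, the simulated values, and the use of plain least squares are not taken from the paper. The sketch gives point estimates only; valid standard errors would require a method that accounts for the correlation between twins, such as generalized least squares or a linear mixed model.

```python
import numpy as np


def within_between_fit(x, y):
    """Fit y = b0 + bW * (x - pair mean of x) + bB * (pair mean of x) by OLS.

    x and y are arrays of shape (n_pairs, 2), one row per twin pair.
    Hypothetical helper for illustration; returns point estimates only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    pair_mean = x.mean(axis=1, keepdims=True)        # shared within each pair
    within = x - pair_mean                           # deviation from pair mean
    between = np.broadcast_to(pair_mean, x.shape)    # pair mean, repeated per twin
    design = np.column_stack(
        [np.ones(x.size), within.ravel(), between.ravel()]
    )
    coef, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)
    return {"intercept": coef[0], "within": coef[1], "between": coef[2]}


def difference_slope(x, y):
    """Slope of within-pair outcome differences on exposure differences,
    fitted through the origin."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x[:, 0] - x[:, 1]
    dy = y[:, 0] - y[:, 1]
    return float(dx @ dy / (dx @ dx))


# Simulated data (assumed values): within-pair effect 0.5, between-pair effect 2.0.
rng = np.random.default_rng(0)
n_pairs = 500
x = rng.normal(size=(n_pairs, 1)) + rng.normal(scale=0.5, size=(n_pairs, 2))
x_bar = x.mean(axis=1, keepdims=True)
pair_effect = rng.normal(size=(n_pairs, 1))          # shared by both twins
noise = rng.normal(scale=0.3, size=(n_pairs, 2))     # individual-level error
y = 1.0 + 0.5 * (x - x_bar) + 2.0 * x_bar + pair_effect + noise

fit = within_between_fit(x, y)
diff = difference_slope(x, y)
print(fit, diff)
```

In this balanced two-twin setting the within-pair coefficient coincides with the difference-based slope, because the within-pair deviation column is orthogonal to both the intercept and the pair-mean column. The shared pair effect cancels out of the within-pair comparison but not out of the between-pair one.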