O'Malley A James, Elwert Felix, Rosenquist J Niels, Zaslavsky Alan M, Christakis Nicholas A
The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth, Lebanon, New Hampshire 03766, U.S.A.
Department of Sociology, Center for Demography and Ecology, University of Wisconsin-Madison, Madison, Wisconsin 53706, U.S.A.
Biometrics. 2014 Sep;70(3):506-15. doi: 10.1111/biom.12172. Epub 2014 Apr 29.
The identification of causal peer effects (also known as social contagion or induction) from observational data in social networks is challenged by two distinct sources of bias: latent homophily and unobserved confounding. In this paper, we investigate how causal peer effects of traits and behaviors can be identified using genes (or other structurally isomorphic variables) as instrumental variables (IV) in a large set of data-generating models with homophily and confounding. We represent these models with directed acyclic graphs, employ multiple IV strategies, and report three main identification results. First, using a single fixed gene (or allele) as an IV will generally fail to identify peer effects if the gene affects past values of the treatment. Second, multiple fixed genes/alleles, or, more promisingly, time-varying gene expression, can identify peer effects if we instrument the exclusion violations as well as the focal treatment. Third, we show that IV identification of peer effects remains possible even under multiple complications often regarded as lethal for IV identification of intra-individual effects, such as pleiotropy on observables and unobservables, homophily on past phenotype, past and ongoing homophily on genotype, inter-phenotype peer effects, population stratification, gene expression that is endogenous to past phenotype and past gene expression, and others. We apply our identification results to estimating peer effects of body mass index (BMI) among friends and spouses in the Framingham Heart Study. The results suggest a positive causal peer effect of BMI between friends.
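The basic logic of using a gene as an instrument for a peer's trait can be illustrated with a small simulation. The following is a minimal sketch and not the authors' model or estimator: it assumes a single time period, a single allele count for the alter that affects the ego's BMI only through the alter's BMI (so the exclusion restriction holds by construction), and one shared unobserved confounder standing in for latent homophily or a common environment. All names (`simulate`, `true_peer_effect`, the coefficient values) are hypothetical and chosen only for illustration. It compares a naive regression slope with the simple one-instrument IV (Wald) ratio.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def simulate(n=20000, true_peer_effect=0.3, seed=0):
    """Simulate ego-alter pairs and return (naive OLS slope, IV estimate).

    Assumed (hypothetical) data-generating model, one time period:
      gene  : alter's allele count in {0, 1, 2}, independent of the confounder
      u     : unobserved confounder shared by ego and alter
      alter : alter's BMI = gene + u + noise
      ego   : ego's BMI   = true_peer_effect * alter + u + noise
    """
    rng = random.Random(seed)
    genes, alter_bmi, ego_bmi = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)
        # Two independent allele draws, each present with probability 0.3.
        gene = int(rng.random() < 0.3) + int(rng.random() < 0.3)
        alter = 1.0 * gene + 1.0 * u + rng.gauss(0, 1)
        ego = true_peer_effect * alter + 1.0 * u + rng.gauss(0, 1)
        genes.append(gene)
        alter_bmi.append(alter)
        ego_bmi.append(ego)

    # Naive slope of ego BMI on alter BMI: biased because u drives both.
    ols = cov(alter_bmi, ego_bmi) / cov(alter_bmi, alter_bmi)
    # One-instrument IV (Wald) ratio: reduced form over first stage.
    iv = cov(genes, ego_bmi) / cov(genes, alter_bmi)
    return ols, iv


if __name__ == "__main__":
    ols_estimate, iv_estimate = simulate()
    print(f"naive OLS slope: {ols_estimate:.3f}")
    print(f"IV estimate:     {iv_estimate:.3f}")
```

In this sketch the naive slope absorbs the confounder and overstates the peer effect, while the IV ratio stays close to the simulated value. The sketch deliberately omits the complications the abstract addresses: once the gene also affects past values of the alter's BMI, a single fixed gene no longer satisfies the exclusion restriction, which is the situation covered by the first and second identification results.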