Hecker Julian, Prokopenko Dmitry, Lange Christoph, Fier Heide Loehlein
Department of Biostatistics, Harvard T.H. Chan School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA and Department of Genomic Mathematics, University of Bonn, Sigmund-Freud-Strasse 25, 53127 Bonn, Germany.
Channing Division of Network Medicine, Brigham and Women's Hospital, 181 Longwood Avenue, Boston, MA 02115, USA.
Biostatistics. 2018 Jul 1;19(3):295-306. doi: 10.1093/biostatistics/kxx040.
To quantify polygenic effects, i.e. undetected genetic effects, in large-scale association studies, we propose a generalized estimating equation (GEE) based estimation framework. We develop a marginal model for single-variant association test statistics of complex diseases that generalizes existing approaches such as LD Score regression and that is applicable to population-based designs, to family-based designs or to arbitrary combinations of both. We extend the standard GEE approach so that the parameters of the proposed marginal model can be estimated based on working-correlation/linkage-disequilibrium (LD) matrices from external reference panels. Our method achieves substantial efficiency gains over standard approaches, while it is robust against misspecification of the LD structure, i.e. the LD structure of the reference panel can differ substantially from the true LD structure in the study population. In simulation studies and in applications to population-based and family-based studies, we illustrate the features of the proposed GEE framework. Our results suggest that our approach can be up to 100% more efficient than existing methodology.
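The abstract names LD Score regression as a special case of the proposed marginal model but does not state the model itself. As a minimal sketch of that baseline only, and not of the authors' GEE estimator, the commonly cited LD Score regression mean model takes the expected test statistic of variant j to be 1 + N·h²·ℓ_j/M (N samples, M variants, LD score ℓ_j, with no confounding term). The function names, the unweighted least-squares fit, and the numbers below are illustrative assumptions, not taken from the paper.

```python
def expected_chi2(ld_score, n_samples, h2, n_variants):
    """Mean of a single-variant chi-square statistic under the assumed
    LD Score regression model: 1 + N * h2 * l_j / M."""
    return 1.0 + n_samples * h2 * ld_score / n_variants


def fit_ld_score_regression(ld_scores, chi2_stats, n_samples, n_variants):
    """Unweighted least-squares fit of chi-square statistics on LD scores.

    Returns (intercept, h2_estimate). The slope estimates N * h2 / M,
    so it is rescaled by M / N. This ignores the correlation between
    statistics of variants in LD, which is what a working-correlation
    (GEE-type) approach would try to exploit.
    """
    m = len(ld_scores)
    mean_l = sum(ld_scores) / m
    mean_c = sum(chi2_stats) / m
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    sxy = sum((l - mean_l) * (c - mean_c)
              for l, c in zip(ld_scores, chi2_stats))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    return intercept, slope * n_variants / n_samples


# Hypothetical toy input: noise-free statistics generated from the model.
toy_ld_scores = [1.0, 5.0, 20.0, 80.0]
toy_chi2 = [expected_chi2(l, 50_000, 0.4, 1_000_000) for l in toy_ld_scores]
toy_intercept, toy_h2 = fit_ld_score_regression(
    toy_ld_scores, toy_chi2, 50_000, 1_000_000
)
```

On real summary statistics the values are noisy and mutually correlated, so an unweighted fit like this one is inefficient; the abstract's reported efficiency gain refers to the authors' GEE extension, which this sketch does not implement.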