Yin Zhaoyu, Xia Kai, Chung Wonil, Sullivan Patrick F, Zou Fei
Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina, United States of America.
Department of Psychiatry, University of North Carolina, Chapel Hill, North Carolina, United States of America.
Genet Epidemiol. 2015 Jul;39(5):357-65. doi: 10.1002/gepi.21900. Epub 2015 Apr 10.
Twin data are commonly used for studying complex psychiatric disorders, and mixed effects models are one of the most popular tools for modeling dependence structures between twin pairs. However, for eQTL (expression quantitative trait loci) data, where associations between thousands of transcripts and millions of single nucleotide polymorphisms need to be tested, mixed effects models are computationally inefficient and often impractical. In this paper, we propose a fast eQTL analysis approach for twin eQTL data: we randomly split twin pairs into two groups, so that within each group the samples are unrelated, and then apply a multiple linear regression analysis separately to each group. A score statistic that automatically adjusts for the (hidden) correlation between the two groups is constructed to combine the results from the two groups. The proposed method has well-controlled type I error. Compared to mixed effects models, the proposed method has similar power but drastically improved computational efficiency. We demonstrate the computational advantage of the proposed method via extensive simulations. The proposed method is also applied to a large twin eQTL dataset from the Netherlands Twin Register.
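The split-then-combine idea can be illustrated with a small simulation. The sketch below is not the score statistic of the paper, whose exact form the abstract does not give. As a stand-in it uses a generic rule for combining two correlated z-statistics, (z_A + z_B) / sqrt(2 + 2r), with r estimated as the empirical correlation of the two groups' statistics across SNPs. Further simplifying assumptions: monozygotic twins (identical genotypes within a pair), a single transcript, simple rather than multiple regression, and no true eQTL effects. The function names `split_pairs`, `slope_z`, `combine_z`, and `simulate_null` are hypothetical.

```python
import numpy as np


def split_pairs(y_pairs, rng):
    """Randomly send one twin of each pair to group A and the co-twin to group B."""
    n_pairs = y_pairs.shape[0]
    pick = rng.integers(0, 2, size=n_pairs)  # 0 or 1 for each pair
    rows = np.arange(n_pairs)
    return y_pairs[rows, pick], y_pairs[rows, 1 - pick]


def slope_z(y, G):
    """t-type statistic for the slope of y on each SNP (one row of G per SNP).

    Simple regression with an intercept; the paper uses multiple regression,
    so covariates are omitted here for brevity.
    """
    n = y.shape[0]
    Gc = G - G.mean(axis=1, keepdims=True)
    yc = y - y.mean()
    sxx = (Gc ** 2).sum(axis=1)
    beta = (Gc @ yc) / sxx
    rss = yc @ yc - beta ** 2 * sxx          # residual sum of squares per SNP
    sigma2 = rss / (n - 2)
    return beta / np.sqrt(sigma2 / sxx)


def combine_z(z_a, z_b, r):
    """Combine two correlated z-statistics; r = 0 gives the naive combination."""
    return (z_a + z_b) / np.sqrt(2.0 + 2.0 * r)


def simulate_null(seed=0, n_pairs=200, n_snps=2000, rho=0.6, maf=0.3):
    """Null simulation: expression is correlated within pairs, no SNP effect."""
    rng = np.random.default_rng(seed)
    # Assumption: MZ twins, so both groups share the same genotype matrix.
    G = rng.binomial(2, maf, size=(n_snps, n_pairs))
    shared = rng.standard_normal(n_pairs)            # pair-level effect
    noise = rng.standard_normal((n_pairs, 2))        # twin-level noise
    y_pairs = np.sqrt(rho) * shared[:, None] + np.sqrt(1 - rho) * noise
    y_a, y_b = split_pairs(y_pairs, rng)
    z_a, z_b = slope_z(y_a, G), slope_z(y_b, G)
    r_hat = np.corrcoef(z_a, z_b)[0, 1]              # assumed estimator of r
    return combine_z(z_a, z_b, 0.0), combine_z(z_a, z_b, r_hat), r_hat


if __name__ == "__main__":
    z_naive, z_adj, r_hat = simulate_null()
    print("estimated correlation:", r_hat)
    print("variance, naive combination:", z_naive.var())
    print("variance, adjusted combination:", z_adj.var())
```

Under these assumptions the naive combination, which ignores the between-group correlation, should show a null variance well above 1 (inflated type I error), while the adjusted combination should stay close to 1. Each group needs only ordinary least squares, which is where the computational saving over a mixed effects model comes from.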