Verhulst Brad, Maes Hermine H, Neale Michael C
Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA, USA.
Behav Genet. 2017 May;47(3):345-359. doi: 10.1007/s10519-017-9842-6. Epub 2017 Mar 15.
Improving the accuracy of phenotyping through the use of advanced psychometric tools will increase the power to find significant associations with genetic variants and expand the range of possible hypotheses that can be tested on a genome-wide scale. Multivariate methods, such as structural equation modeling (SEM), are valuable in the phenotypic analysis of psychiatric and substance use phenotypes, but these methods have not been integrated into standard genome-wide association analyses because fitting a SEM at each single nucleotide polymorphism (SNP) along the genome was hitherto considered too computationally demanding. Developing a method that fits SEMs efficiently makes it possible to expand the set of models that can be tested. This is particularly necessary in psychiatric and behavioral genetics, where statistical methods are often handicapped by phenotypes with large components of stochastic variance. Because genome-wide scans produce an enormous amount of data, the statistical methods used to analyze them are relatively elementary: they do not directly correspond with the rich theoretical development, and they lack the potential to test more complex hypotheses about the measurement of, and interaction between, comorbid traits. In this paper, we present a method to test the association of a SNP with multiple phenotypes or a latent construct on a genome-wide basis using a diagonally weighted least squares (DWLS) estimator for four common SEMs: a one-factor model, a one-factor residuals model, a two-factor model, and a latent growth model. We demonstrate that the DWLS parameter estimates and p-values correspond closely with the more traditional full information maximum likelihood parameter estimates and p-values. We also present the timing of simulations and power analyses, and a comparison with an existing multivariate genome-wide association study (GWAS) software package.
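For orientation, a DWLS estimator is conventionally defined by the fit function below. This is the standard textbook form, given as a sketch only: the abstract does not state the exact weight matrix or parameterization used, so the symbols s, \sigma(\theta), and \hat{\Gamma} are assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the abstract):
%   s               vector of sample statistics (e.g. variances and covariances
%                   of the phenotypes and the SNP)
%   \sigma(\theta)  the same statistics as implied by the SEM with parameters \theta
%   \hat{\Gamma}    estimated asymptotic covariance matrix of s
F_{\mathrm{DWLS}}(\theta)
  = \bigl(s - \sigma(\theta)\bigr)^{\top}
    \bigl[\operatorname{diag}(\hat{\Gamma})\bigr]^{-1}
    \bigl(s - \sigma(\theta)\bigr)
  = \sum_{i} \frac{\bigl(s_i - \sigma_i(\theta)\bigr)^{2}}{\hat{\Gamma}_{ii}}
```

Because only the diagonal of \hat{\Gamma} is used, each evaluation reduces to a weighted sum of squared residuals and requires no inversion of a full weight matrix, which is what makes this family of estimators cheaper than full weighted least squares.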