The Department of Psychiatry, Virginia Commonwealth University, Richmond, USA.
The Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, USA.
Behav Genet. 2021 May;51(3):343-357. doi: 10.1007/s10519-021-10043-1. Epub 2021 Feb 19.
Most genome-wide association study (GWAS) analyses test the association between single-nucleotide polymorphisms (SNPs) and a single trait or outcome. While valuable second-step analyses of these associations (e.g., calculating genetic correlations between traits) are common, single-step multivariate analyses of GWAS data are rarely performed. This is unfortunate because multivariate analyses can reveal information that is irrevocably obscured in multi-step analyses. One simple example is the distinction between variance common to a set of measures and variance specific to each. Neither GWAS of sum- or factor-scores nor GWAS of the individual measures will deliver a clean picture of the loci associated with each measure's specific variance. While multivariate GWAS opens up a broad new landscape of feasible and informative analyses, its adoption has been slow, likely because of its heavy computational demands and the difficulty of specifying the models it requires. Here we describe GW-SEM 2.0, which is designed to simplify model specification and overcome the inherent computational challenges associated with multivariate GWAS. In addition, GW-SEM 2.0 allows users to accurately model ordinal items, which are common in behavioral and psychological research, within a GWAS context. This new release enhances computational efficiency, allows users to select the fit function that is appropriate for their analyses, expands compatibility with standard genomic data formats, and outputs results that can be read directly into other standard post-GWAS processing software. To demonstrate GW-SEM's utility, we conducted (1) a series of GWAS using three substance use frequency items from the UK Biobank, (2) a timing study for several predefined GWAS functions, and (3) a Type I Error rate study.
Our multivariate GWAS analyses emphasize the utility of GW-SEM for identifying novel patterns of associations that vary considerably between genomic loci for specific substances, highlighting the importance of differentiating between substance-specific use behaviors and polysubstance use. The timing studies demonstrate that the analyses take a reasonable amount of time and show the cost of including additional items. The Type I Error rate study demonstrates that p-values from hypothesis tests of genetic associations in latent variable models follow the uniform distribution expected under the null hypothesis. Taken together, we suggest that GW-SEM may provide substantially deeper insights into the underlying genomic architecture of multivariate behavioral and psychological systems than is currently possible with standard GWAS methods. The current release of GW-SEM 2.0 is available on CRAN (stable release) and GitHub (beta release), and tutorials are available on our GitHub wiki (https://jpritikin.github.io/gwsem/).
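The logic of a Type I Error rate check can be illustrated with a minimal Python sketch. It is not GW-SEM's code or the study design used here: it assumes a plain single-trait linear regression of a simulated phenotype on each simulated SNP, with a normal approximation to the test statistic, rather than a latent variable model. The function names, sample sizes, and allele frequency are hypothetical choices made for illustration. Because the phenotype is generated independently of every genotype, the resulting p-values should be roughly uniform, and about 5% should fall below 0.05.

```python
import math
import random


def null_pvalues(n_people=200, n_snps=2000, maf=0.3, seed=1):
    """Simulate SNPs with no true effect and return one p-value per SNP.

    Illustrative assumptions: additive genotype coding (0/1/2), a standard
    normal phenotype, and a normal approximation to the t statistic.
    """
    rng = random.Random(seed)
    pvals = []
    for _ in range(n_snps):
        # Genotype = number of minor alleles carried (0, 1, or 2).
        g = [(rng.random() < maf) + (rng.random() < maf) for _ in range(n_people)]
        # Phenotype is drawn independently of the genotype (the null is true).
        y = [rng.gauss(0.0, 1.0) for _ in range(n_people)]

        mean_g = sum(g) / n_people
        mean_y = sum(y) / n_people
        sxx = sum((a - mean_g) ** 2 for a in g)
        syy = sum((b - mean_y) ** 2 for b in y)
        if sxx == 0.0 or syy == 0.0:
            continue  # skip monomorphic SNPs or constant phenotypes
        sxy = sum((a - mean_g) * (b - mean_y) for a, b in zip(g, y))

        # The test of the regression slope is equivalent to a test of r.
        r = sxy / math.sqrt(sxx * syy)
        t = r * math.sqrt((n_people - 2) / (1.0 - r * r))
        # Two-sided p-value under the normal approximation.
        pvals.append(math.erfc(abs(t) / math.sqrt(2.0)))
    return pvals


def false_positive_rate(pvals, alpha=0.05):
    """Fraction of null tests declared significant at level alpha."""
    return sum(p < alpha for p in pvals) / len(pvals)


if __name__ == "__main__":
    pvals = null_pvalues()
    print("false-positive rate at 0.05:", false_positive_rate(pvals))
    print("mean p-value:", sum(pvals) / len(pvals))
```

A false-positive rate well above the nominal level, or a mean p-value far from 0.5, would indicate a miscalibrated test statistic.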