National Heart, Lung, and Blood Institute's Framingham Heart Study, Framingham, MA 01702, USA.
Bioinformatics. 2011 May 1;27(9):1201-6. doi: 10.1093/bioinformatics/btr116. Epub 2011 Mar 12.
The concept of pleiotropy was proposed a century ago, yet few efforts have so far been made to design robust statistics and software for visualizing and evaluating pleiotropy at a regional level. The Pleiotropic Region Identification Method (PRIMe) was developed to evaluate potentially pleiotropic loci based upon data from multiple genome-wide association studies (GWAS).
We first provide a software tool to systematically identify and characterize genomic regions where low association P-values are observed with multiple traits. We use the term Pleiotropy Index to denote the number of traits with low association P-values at a particular genomic region. For GWAS assumed to be uncorrelated, we adopted the binomial distribution to approximate the statistical significance of the Pleiotropy Index. For GWAS conducted on traits with known correlation coefficients, simulations are performed to derive the statistical distribution of the Pleiotropy Index under the null hypothesis of no genotype-phenotype association. For six hematologic and three blood pressure traits where full GWAS results were available from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium, we estimated the trait correlations and applied the simulation approach to examine genomic regions with statistical evidence of pleiotropy. We then applied the approximation approach to explore GWAS summarized in the National Human Genome Research Institute (NHGRI) GWAS Catalog.
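The two significance calculations described above can be illustrated with a minimal Python sketch. It is not the PRIMe implementation (the published scripts are in Perl and R), and it simplifies the regional Pleiotropy Index to a single test per trait. The function names, the per-trait probability `p`, the threshold `alpha`, and the use of correlated standard-normal z-scores to represent the null are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def binomial_tail(n_traits, k, p):
    """P(X >= k) for X ~ Binomial(n_traits, p).

    Assumed reading of the approximation for uncorrelated GWAS: p is the
    chance that any one trait shows a low P-value in a region under the null.
    """
    return sum(
        math.comb(n_traits, i) * p**i * (1 - p) ** (n_traits - i)
        for i in range(k, n_traits + 1)
    )


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][m] * lower[j][m] for m in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def simulate_null_index(corr, alpha, n_sims, seed=0):
    """Simulate the index under the null for correlated traits.

    Draws z-scores with correlation matrix `corr` and counts, per draw, how
    many traits have a two-sided P-value below `alpha`. Returns a list where
    counts[k] is the number of draws with index exactly k.
    """
    rng = random.Random(seed)
    n = len(corr)
    lower = cholesky(corr)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    counts = [0] * (n + 1)
    for _ in range(n_sims):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Multiply by the Cholesky factor to induce the trait correlations.
        z = [sum(lower[i][m] * noise[m] for m in range(i + 1)) for i in range(n)]
        index = sum(1 for value in z if abs(value) > z_crit)
        counts[index] += 1
    return counts


def empirical_tail(counts, k):
    """Empirical P(index >= k) from simulated counts."""
    return sum(counts[k:]) / sum(counts)
```

With an identity correlation matrix the simulated tail should approach the binomial tail. As the off-diagonal correlations grow, high index values become more frequent under the null, which is the reason a simulation is used instead of the binomial approximation for correlated traits.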
By simulation, we identified pleiotropic regions including SH2B3 and BRAP (12q24.12) for hematologic and blood pressure traits. By approximation, we confirmed the genome-wide significant pleiotropy of these two regions based on the GWAS Catalog data, and an exploration of other regions highlighted the FTO, GCKR and ABO regions.
The Perl and R scripts are available at http://www.framinghamheartstudy.org/research/gwas_pleiotropictool.html.