Majumdar Arunabha, Pasaniuc Bogdan
Department of Mathematics, Indian Institute of Technology Hyderabad, Kandi, Telangana, India.
Department of Pathology and Laboratory Medicine, University of California, Los Angeles, Los Angeles, California.
Stat Med. 2023 Nov 20;42(26):4867-4885. doi: 10.1002/sim.9892. Epub 2023 Aug 29.
Polygenicity refers to the phenomenon that multiple genetic variants have a nonzero effect on a complex trait. It is defined as the proportion of genetic variants with a nonzero effect on the trait. Evaluation of polygenicity can provide valuable insights into the genetic architecture of the trait. Several recent works have attempted to estimate polygenicity at the single nucleotide polymorphism level. However, evaluating polygenicity at the gene level can be biologically more meaningful. We propose the notion of gene-level polygenicity, defined as the proportion of genes having a nonzero effect on the trait under the framework of a transcriptome-wide association study. We introduce a Bayesian approach, genepoly, to estimate this quantity for a trait. The method is based on a spike-and-slab prior and simultaneously estimates the subset of non-null genes. Our simulation study shows that genepoly efficiently estimates gene-level polygenicity. The method produces a downward bias when the trait heritability due to a non-null gene is small, but this bias diminishes rapidly as the genome-wide association study (GWAS) sample size increases. While identifying the subset of non-null genes, genepoly offers a high level of specificity and an overall good level of sensitivity; the sensitivity increases as the sample sizes of the reference panel expression data and the GWAS data increase. We applied the method to seven phenotypes in the UK Biobank, integrating expression data. Among these, we find height to be the most polygenic and asthma to be the least polygenic.
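The definition of gene-level polygenicity and the role of a spike-and-slab prior can be illustrated with a toy simulation. The sketch below is not the genepoly method and performs no inference; it only draws gene effects from a two-component prior (a point mass at zero, the "spike", and a normal distribution, the "slab") and computes the realized proportion of non-null genes. The function names, the number of genes, the mixing proportion of 0.05, and the slab standard deviation are all illustrative assumptions, not values from the paper.

```python
import random


def simulate_gene_effects(n_genes, polygenicity, slab_sd, seed=0):
    """Draw gene effects from a toy spike-and-slab prior (illustrative only).

    With probability `polygenicity` an effect comes from the slab,
    Normal(0, slab_sd^2); otherwise it is exactly zero (the spike).
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(n_genes):
        if rng.random() < polygenicity:
            effects.append(rng.gauss(0.0, slab_sd))  # non-null gene
        else:
            effects.append(0.0)  # null gene
    return effects


def gene_level_polygenicity(effects):
    """Proportion of genes whose effect on the trait is nonzero."""
    non_null = sum(1 for e in effects if e != 0.0)
    return non_null / len(effects)


# Hypothetical settings chosen only for illustration.
effects = simulate_gene_effects(n_genes=10000, polygenicity=0.05, slab_sd=0.1, seed=1)
realized = gene_level_polygenicity(effects)
print(f"realized proportion of non-null genes: {realized:.4f}")
```

In a real analysis the effects are unobserved, so this proportion has to be estimated from GWAS summary data and reference-panel expression data, which is the problem the abstract describes.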