Institute of Statistical Science, Academia Sinica, No.128, Academia Road, Section 2, Nankang, Taipei 11529, Taiwan.
Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae327.
Identifying the causal relationship between genotype and phenotype is essential to expanding our understanding of the gene regulatory network that spans from the molecular level to perceptible traits. A pleiotropic gene can act as a central hub in the network, influencing multiple outcomes. Identifying such a gene involves testing under a composite null hypothesis in which the gene is associated with at most one trait. Traditional methods, such as meta-analyses of top-hit $P$-values and sequential testing of multiple traits, have been proposed, but they fail to consider the background of genome-wide signals. Since Huang's composite test produces uniformly distributed $P$-values for genome-wide variants under the composite null, we propose a gene-level pleiotropy test that combines Huang's composite test with the aggregated Cauchy association test. A polygenic trait involves multiple genes with different functions that co-regulate mechanisms. We show that polygenicity should be considered when identifying pleiotropic genes; otherwise, associations driven by polygenic traits give rise to false positives. In this study, we constructed gene-trait functional modules using the results of the proposed pleiotropy tests. Our analysis suite was implemented as the R package PGCtest. We demonstrated the proposed method with an application study of the Taiwan Biobank database and identified functional modules comprising specific genes and their co-regulated traits.
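The aggregation step named in the abstract can be illustrated with a minimal sketch of the aggregated Cauchy association test, which combines variant-level $P$-values within a gene into one gene-level $P$-value. This is not the PGCtest API. The function name `acat_combine`, the equal default weights, and the numerical cut-offs are assumptions made for illustration, and the input $P$-values are assumed to have been computed already by a variant-level composite null test.

```python
import math


def acat_combine(p_values, weights=None):
    """Combine P-values with the aggregated Cauchy association test (ACAT).

    Hypothetical helper for illustration only; it is not part of PGCtest.
    Each P-value must lie strictly between 0 and 1.
    """
    if not p_values:
        raise ValueError("at least one P-value is required")
    n = len(p_values)
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0 / n] * n
    if len(weights) != n:
        raise ValueError("weights and p_values must have the same length")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")

    stat = 0.0
    for p, w in zip(p_values, weights):
        if not 0.0 < p < 1.0:
            raise ValueError("P-values must lie strictly between 0 and 1")
        w = w / total  # normalise so the weights sum to 1
        if p < 1e-16:
            # For tiny p, tan((0.5 - p) * pi) is approximately 1 / (p * pi).
            stat += w / (p * math.pi)
        else:
            stat += w * math.tan((0.5 - p) * math.pi)

    # Under the null, the statistic is approximately standard Cauchy.
    if stat > 1e15:
        # Tail approximation avoids cancellation in 0.5 - atan(stat) / pi.
        return 1.0 / (stat * math.pi)
    return 0.5 - math.atan(stat) / math.pi


# Usage: four variant-level P-values mapped to one hypothetical gene.
gene_p = acat_combine([1e-8, 0.5, 0.5, 0.5])
print(f"gene-level P-value: {gene_p:.3g}")
```

The Cauchy transform lets one strong variant dominate the gene-level result without a model of the correlation between variants, which is why it suits signals that are sparse within a gene.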