Department of Mathematical Sciences, Michigan Technological University, Houghton, MI 49931, United States.
Bioinformatics. 2023 Dec 1;39(12). doi: 10.1093/bioinformatics/btad707.
Genome-wide association studies (GWAS) are an essential tool for analyzing associations between phenotypes and single nucleotide polymorphisms (SNPs). Most binary phenotypes in large biobanks are extremely unbalanced, which leads to inflated type I error rates for many widely used association tests for the joint analysis of multiple phenotypes. In this article, we first propose a novel method to construct a Multi-Layer Network (MLN) using individuals with at least one case status among all phenotypes. Then, we introduce a computationally efficient community detection method to group phenotypes into disjoint clusters based on the MLN. Finally, we propose a novel approach, MLN with Omnibus (MLN-O), to jointly analyze the association between phenotypes and a SNP. MLN-O uses the score test to test the association between the merged phenotype of each cluster and a SNP, then uses the Omnibus test to obtain an overall test statistic for the association between all phenotypes and the SNP.
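The final step (one score test per merged phenotype, then an omnibus combination) can be illustrated with a minimal sketch. It is not the authors' implementation, which is available at the repository linked below. It assumes that a merged phenotype marks an individual as a case if they are a case for any phenotype in the cluster, that the score test uses an intercept-only logistic null model without covariates, and that the omnibus statistic is the quadratic form Z'Σ⁻¹Z with Σ estimated by the sample correlation of the merged phenotypes. All function names are hypothetical, and the cluster assignment is taken as given rather than derived from the MLN.

```python
import math

def merge_cluster(phenos, cluster):
    """Assumed merge rule: case (1) if the individual is a case
    for any phenotype whose index is listed in `cluster`."""
    n = len(phenos[0])
    return [int(any(phenos[k][i] for k in cluster)) for i in range(n)]

def score_z(y, g):
    """Signed score statistic for binary y against genotype g
    under an intercept-only logistic null (no covariates)."""
    n = len(y)
    ybar, gbar = sum(y) / n, sum(g) / n
    u = sum((gi - gbar) * (yi - ybar) for gi, yi in zip(g, y))
    v = ybar * (1 - ybar) * sum((gi - gbar) ** 2 for gi in g)
    return u / math.sqrt(v)

def correlation(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def omnibus(phenos, clusters, g):
    """Return (statistic, degrees of freedom) for the assumed
    quadratic-form omnibus test over the merged phenotypes."""
    merged = [merge_cluster(phenos, c) for c in clusters]
    z = [score_z(y, g) for y in merged]
    sigma = [[correlation(a, b) for b in merged] for a in merged]
    w = solve(sigma, z)  # w = sigma^{-1} z
    stat = sum(zi * wi for zi, wi in zip(z, w))
    return stat, len(clusters)

# Toy data: 3 phenotypes, 8 individuals, genotypes coded 0/1/2.
g = [0, 1, 2, 0, 1, 2, 0, 1]
phenos = [
    [1, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 1],
]
stat, df = omnibus(phenos, [[0, 1], [2]], g)
```

Under these assumptions the statistic would be compared against a chi-square distribution with `df` degrees of freedom. With a single cluster it reduces to the squared score statistic of that cluster's merged phenotype.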
We conduct extensive simulation studies showing that the proposed approach controls type I error rates and is more powerful than some existing methods. We also apply the proposed method to a real data set from the UK Biobank. Using phenotypes in Chapter XIII (Diseases of the musculoskeletal system and connective tissue) of the UK Biobank, we find that MLN-O identifies more significant SNPs than the other methods in our comparison.
https://github.com/Hongjing-Xie/Multi-Layer-Network-with-Omnibus-MLN-O.