Pividori Milton, Sadeeq Suraju, Krishnan Arjun, Stranger Barbara E, Gignoux Christopher R
Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, USA; Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO 80045, USA.
bioRxiv. 2024 Nov 11:2024.11.08.622657. doi: 10.1101/2024.11.08.622657.
The growing availability of genome-wide association studies (GWAS) and large-scale biobanks provides an unprecedented opportunity to explore the genetic basis of complex traits and diseases. However, with this vast amount of data comes the challenge of interpreting numerous associations across thousands of traits, especially given the high polygenicity and pleiotropy underlying complex phenotypes. Traditional clustering methods, which identify global patterns in data, lack the resolution to capture overlapping associations relevant to subsets of traits or genes. Consequently, there is a critical need for innovative analytic approaches capable of revealing local, biologically meaningful patterns that could advance our understanding of trait comorbidities and gene-trait interactions. Here, we applied BiBit, a biclustering algorithm, to transcriptome-wide association study (TWAS) results from PhenomeXcan, a large resource of gene-trait associations derived from the UK Biobank. BiBit allows simultaneous grouping of traits and genes, identifying biclusters that represent local, overlapping associations. Our analyses uncovered biologically interpretable patterns, including asthma-related biclusters enriched for immune-related gene sets, connections between eye traits and blood pressure, and associations between dietary traits, high cholesterol, and specific loci on chromosome 19. These biclusters highlight gene-trait relationships and patterns of trait co-occurrence that may otherwise be obscured by traditional methods. Our findings demonstrate that biclustering can provide a nuanced view of the genetic architecture of complex traits, offering insights into pleiotropy and disease mechanisms. By enabling the exploration of complex, overlapping patterns within biobank-scale datasets, this approach provides a valuable framework for advancing research on genetic associations, comorbidities, and polygenic traits.
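To make the biclustering idea concrete, the following is a minimal sketch of a BiBit-style search on a binarized trait-by-gene matrix. It is a simplified illustration, not the pipeline used in the study: the function name `bibit_sketch`, the toy z-scores, the cutoff of 3.0, and the `min_rows`/`min_cols` defaults are all assumptions made for the example. Each pair of rows seeds a column pattern through an element-wise AND, and every row that contains that pattern joins the bicluster, which is how one trait can end up in more than one bicluster.

```python
from itertools import combinations

import numpy as np


def bibit_sketch(binary, min_rows=2, min_cols=2):
    """Simplified BiBit-style search on a binary (rows x columns) matrix.

    Hypothetical helper for illustration only. Each pair of rows seeds a
    column pattern via element-wise AND; every row containing that pattern
    is added to the bicluster.
    """
    binary = np.asarray(binary, dtype=bool)
    seen = set()
    biclusters = []
    for i, j in combinations(range(binary.shape[0]), 2):
        pattern = binary[i] & binary[j]
        if pattern.sum() < min_cols:
            continue  # seed pattern covers too few columns
        key = pattern.tobytes()
        if key in seen:
            continue  # this column pattern was already expanded
        seen.add(key)
        # Keep rows that have a 1 everywhere the pattern has a 1.
        rows = np.flatnonzero(binary[:, pattern].all(axis=1))
        if rows.size >= min_rows:
            cols = np.flatnonzero(pattern)
            biclusters.append((rows.tolist(), cols.tolist()))
    return biclusters


# Toy z-scores (made-up values): rows are traits, columns are genes.
z = np.array([
    [4.1, 3.8, 0.2, 0.1],
    [3.9, 4.4, 3.5, 3.7],
    [0.1, 0.4, 5.0, 4.2],
    [0.2, 0.1, 4.6, 3.9],
])
binary = np.abs(z) > 3.0  # assumed cutoff, chosen only for this example
result = bibit_sketch(binary)
print(result)
```

In this toy input, trait 1 is associated with all four genes, so it appears both in the bicluster over genes 0 and 1 and in the bicluster over genes 2 and 3. A global clustering that assigns each trait to exactly one group could not represent that overlap.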