Centre for Biostatistics, School of Health Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK.
Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK.
BMC Med Res Methodol. 2024 Aug 2;24(1):168. doi: 10.1186/s12874-024-02229-y.
Understanding the complex interactions between genes and their causal effects on diseases is crucial for developing targeted treatments and gaining insight into biological mechanisms. However, the analysis of molecular networks, especially in the context of high-dimensional data, presents significant challenges.
This study introduces MRdualPC, a computationally tractable algorithm based on the MRPC approach, to infer large-scale causal molecular networks. We apply MRdualPC to a comprehensive dataset of kidney genome and transcriptome data to investigate the upstream transcriptomic causes of hypertension.
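MRPC-style methods combine conditional-independence testing, as in the PC algorithm, with the principle of Mendelian randomization, using genetic variants as anchors to orient edges between genes. The following is a minimal sketch of that orientation idea on simulated data, not the MRdualPC algorithm itself. The variable names (`variant`, `gene_a`, `gene_b`), the simulated effect sizes, and the use of a first-order partial correlation are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical data simulated under the chain variant -> gene_a -> gene_b.
rng = random.Random(0)
n = 2000
variant = [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n)]  # genotype 0/1/2
gene_a = [v + rng.gauss(0, 1) for v in variant]  # expression driven by the variant
gene_b = [a + rng.gauss(0, 1) for a in gene_a]   # expression driven by gene_a

# The variant is marginally associated with the downstream gene ...
r_marginal = pearson(variant, gene_b)
# ... but that association vanishes once the upstream gene is conditioned on,
r_given_a = partial_corr(variant, gene_b, gene_a)
# ... whereas conditioning on the downstream gene does not remove variant-gene_a.
r_given_b = partial_corr(variant, gene_a, gene_b)

print(f"corr(variant, gene_b)          = {r_marginal:.3f}")
print(f"corr(variant, gene_b | gene_a) = {r_given_a:.3f}")
print(f"corr(variant, gene_a | gene_b) = {r_given_b:.3f}")
```

The asymmetry between the two partial correlations is what supports the direction gene_a to gene_b in this toy setting. A real analysis would use formal tests with multiple-testing control rather than raw correlation values.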
In identifying transcriptomic drivers of hypertension, our algorithm was on average 100 times faster than MRPC. Through clustering, we identify 63 modules with causal driver genes, including 17 modules with extensive causal networks. Notably, genes within one of the causal networks are associated with the electron transport chain and oxidative phosphorylation, both previously linked to hypertension. Moreover, the identified causal ancestor genes show an over-representation of blood pressure-related genes.
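Over-representation of a gene set among selected genes is commonly assessed with a one-sided hypergeometric test. The sketch below shows that calculation. The abstract does not state which test the authors used, and the function name and the counts in the usage example are hypothetical, not figures from the study.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `annotated` belong to the gene set."""
    total = comb(population, selected)
    upper = min(annotated, selected)
    return sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    ) / total


# Hypothetical counts for illustration only: 20000 genes tested, 500 of them
# annotated as blood pressure-related, 200 selected genes, 15 in the overlap.
p_value = hypergeom_upper_tail(20000, 500, 200, 15)
print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value indicates that the overlap is larger than expected if the selected genes were drawn at random from the tested background.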
MRdualPC has potential applications beyond gene expression data, including multi-omics integration. It has limitations, such as the need for clustering in large gene expression datasets, but our study represents a significant advance in building causal molecular networks and offers researchers a valuable tool for analyzing big data and investigating complex diseases.