Department of Computer Science and Technology, University of Cambridge, Cambridge, UK.
Clare Hall, University of Cambridge, Cambridge, UK.
NPJ Syst Biol Appl. 2021 May 27;7(1):24. doi: 10.1038/s41540-021-00186-6.
Here, we performed a comprehensive intra-tissue and inter-tissue multilayer network analysis of the human transcriptome. We generated an atlas of communities in gene co-expression networks in 49 tissues (GTEx v8), evaluated their tissue specificity, and investigated their methodological implications. UMAP embeddings of gene expression from the communities (representing nearly 18% of all genes) robustly identified biologically meaningful clusters. Notably, new gene expression data can be embedded into our algorithmically derived models to accelerate discoveries in high-dimensional molecular datasets and downstream diagnostic or prognostic applications. We demonstrate the generalisability of our approach through systematic testing in external genomic and transcriptomic datasets. Methodologically, prioritisation of the communities in a transcriptome-wide association study of the biomarker C-reactive protein (CRP) in 361,194 individuals in the UK Biobank identified genetically determined expression changes associated with CRP and led to considerably improved performance. Furthermore, a deep learning framework applied to the communities in nearly 11,000 tumors profiled by The Cancer Genome Atlas across 33 different cancer types learned biologically meaningful latent spaces, representing metastasis (p < 2.2 × 10⁻¹⁶) and stemness (p < 2.2 × 10⁻¹⁶). Our study provides a rich genomic resource to catalyse research into inter-tissue regulatory mechanisms and their downstream consequences on human disease.