Liu Peng, Page David, Ahlquist Paul, Ong Irene M, Gitter Anthony
Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.
Carbone Cancer Center, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.
bioRxiv. 2025 Mar 31:2024.06.15.599113. doi: 10.1101/2024.06.15.599113.
Fully capturing cellular state requires genomic, epigenomic, transcriptomic, proteomic, and other assays of a biological sample, together with comprehensive computational modeling to reason about the complex and sometimes conflicting measurements. Modeling these so-called multi-omic data is especially beneficial in disease analysis, where observations across omic data types may reveal unexpected patient groupings and inform clinical outcomes and treatments. We present Multi-omic Pathway Analysis of Cells (MPAC), a computational framework that interprets multi-omic data through prior knowledge from biological pathways. MPAC represents the network relationships encoded in pathways as a factor graph to infer consensus activity levels for proteins and associated pathway entities from multi-omic data, runs permutation testing to eliminate spurious activity predictions, and groups biological samples by pathway activities to prioritize proteins with potential clinical relevance. Using DNA copy number alteration and RNA-seq data from head and neck squamous cell carcinoma patients in The Cancer Genome Atlas as an example, we demonstrate that MPAC predicts a patient subgroup related to immune responses that is not identified by analysis of either input omic data type alone. Key proteins identified via this subgroup have pathway activities related to clinical outcome as well as to immune cell composition. Our MPAC R package, available at https://bioconductor.org/packages/MPAC, enables similar multi-omic analyses on new datasets.
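The permutation-testing step can be illustrated with a minimal Python sketch. It is not MPAC's implementation and does not use the MPAC R package API. Everything in it is an assumption made for illustration: the toy `consensus_activity` function (a plain average standing in for factor-graph inference), the ternary input states in {-1, 0, 1}, the function and parameter names, and the add-one empirical p-value threshold. It shows only the general idea of keeping an inferred activity when it is extreme relative to activities computed from permuted inputs.

```python
import random


def consensus_activity(cna_state, rna_state):
    """Toy stand-in for pathway inference: average the two omic states.

    Hypothetical helper; the real method infers activity on a factor graph.
    """
    return (cna_state + rna_state) / 2.0


def permutation_filter(cna, rna, n_perm=200, alpha=0.05, seed=0):
    """Zero out activities that are not extreme relative to a permuted null.

    cna, rna: dicts mapping gene name -> state in {-1, 0, 1} for one sample.
    Returns a dict mapping gene name -> filtered activity.
    """
    rng = random.Random(seed)
    genes = sorted(cna)
    observed = {g: consensus_activity(cna[g], rna[g]) for g in genes}

    # Count, per gene, how often a permuted activity is at least as extreme.
    exceed = {g: 0 for g in genes}
    for _ in range(n_perm):
        cna_vals = [cna[g] for g in genes]
        rna_vals = [rna[g] for g in genes]
        # Shuffle each data type independently to break the gene-to-state link.
        rng.shuffle(cna_vals)
        rng.shuffle(rna_vals)
        for g, c, r in zip(genes, cna_vals, rna_vals):
            if abs(consensus_activity(c, r)) >= abs(observed[g]):
                exceed[g] += 1

    filtered = {}
    for g in genes:
        # Add-one empirical p-value, so it is never exactly zero.
        p_value = (exceed[g] + 1) / (n_perm + 1)
        filtered[g] = observed[g] if p_value < alpha else 0.0
    return filtered


if __name__ == "__main__":
    # Hypothetical sample: one gene amplified and over-expressed, the rest neutral.
    gene_names = [f"G{i}" for i in range(40)]
    cna = {g: 0 for g in gene_names}
    rna = {g: 0 for g in gene_names}
    cna["G0"] = 1
    rna["G0"] = 1
    result = permutation_filter(cna, rna)
    print({g: a for g, a in result.items() if a != 0.0})
```

In this sketch a gene whose two data types agree keeps its activity only when permuted inputs rarely reproduce a value that extreme, while neutral genes are always set to zero. A real analysis would permute inputs and rerun the full pathway inference rather than a per-gene average.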