IEEE/ACM Trans Comput Biol Bioinform. 2023 Sep-Oct;20(5):2920-2932. doi: 10.1109/TCBB.2023.3282028. Epub 2023 Oct 9.
In this paper, we study the problem of inferring spatially-varying Gaussian Markov random fields (SV-GMRFs), where the goal is to learn a network of sparse, context-specific GMRFs representing network relationships between genes. An important application of SV-GMRFs is the inference of gene regulatory networks from spatially resolved transcriptomics datasets. Current methods for inferring SV-GMRFs are based on regularized maximum likelihood estimation (MLE) and suffer from prohibitively high computational cost due to their highly nonlinear nature. To alleviate this challenge, we propose a simple and efficient optimization problem in lieu of MLE that comes equipped with strong statistical and computational guarantees. Our proposed optimization problem is extremely efficient in practice: we can solve instances of SV-GMRFs with more than 2 million variables in less than 2 minutes. We apply the developed framework to study how gene regulatory networks in glioblastoma are spatially rewired within tissue, and identify prominent activity of the transcription factor HES4 and ribosomal proteins as characterizing the gene expression network in the tumor perivascular niche, which is known to harbor treatment-resistant stem cells.