Wang Zeya, Baladandayuthapani Veerabhadran, Kaseb Ahmed O, Amin Hesham M, Hassan Manal M, Wang Wenyi, Morris Jeffrey S
Department of Statistics, Rice University; Department of Bioinformatics and Computational Biology, University of Texas MD Anderson Cancer Center; Department of Biostatistics, University of Michigan.
Department of Gastrointestinal Medical Oncology, The University of Texas MD Anderson Cancer Center.
J Am Stat Assoc. 2022;117(538):533-546. doi: 10.1080/01621459.2021.2000866. Epub 2022 Jan 5.
It is well established that interpatient heterogeneity in cancer may significantly affect genomic data analyses and, in particular, network topologies. Most existing graphical model methods estimate a single population-level graph for a genomic or proteomic network. In many investigations, however, these networks depend on patient-specific indicators, so that individual networks vary across subjects with subject-level covariates. Examples include assessments of how the network varies with patient-specific prognostic scores, or comparisons of tumor and normal graphs while accounting for tumor purity as a continuous predictor. In this paper, we propose a novel edge regression model for undirected graphs, which estimates conditional dependencies as a function of subject-level covariates. We evaluate the model's performance through simulation studies focused on comparing tumor and normal graphs while adjusting for tumor purity. In an application to a dataset of proteomic measurements on plasma samples from patients with hepatocellular carcinoma (HCC), we ascertain how blood protein networks vary with disease severity, as measured by HepatoScore, a novel biomarker signature. Our case study shows that network connectivity increases with HepatoScore and identifies a set of hub genes and important gene connections at different HepatoScore levels, which may provide important biological insights for the development of precision therapies for HCC.