Systems Biology Ireland, School of Medicine, University College Dublin, Belfield, Dublin 4, Ireland.
Int J Mol Sci. 2021 Sep 15;22(18):9970. doi: 10.3390/ijms22189970.
Gaining insight into the mechanisms of signal transduction networks (STNs) by using critical features from patient-specific mathematical models can improve patient stratification and help to identify potential drug targets. To achieve this, these models should focus on the critical STNs for each cancer, include prognostic genes and proteins, and correctly predict patient-specific differences in STN activity. Focussing on colorectal cancer and the WNT STN, we used mechanism-based machine learning models to identify genes and proteins that are significantly associated with event-free patient survival and have predictive power for explaining patient-specific differences in STN activity. First, we identified the WNT pathway as the pathway most significantly associated with event-free survival. Second, we built linear regression models that incorporated both genes and proteins from established mechanistic models in the literature and novel genes significantly associated with event-free patient survival. Data from The Cancer Genome Atlas and Clinical Proteomic Tumour Analysis Consortium were used, and patient-specific STN activity scores were computed using PROGENy. Three linear regression models were built, based on: (1) the gene set of a state-of-the-art mechanistic model in the literature, (2) the novel genes identified, and (3) the novel proteins identified. The novel genes and proteins were genes and proteins of the extant WNT pathway whose expression was significantly associated with event-free survival. The results show that a model incorporating the novel genes associated with event-free survival has better predictive power than a model restricted to the genes of a current state-of-the-art mechanistic model. Several significant genes that should be integrated into future mechanistic models of the WNT pathway are DVL3, FZD5, RAC1, ROCK2, GSK3B, CTB2, CBT1, and PRKCA. Thus, the study demonstrates that combining mechanistic information with machine learning can identify novel features (genes and proteins) that are important for explaining STN heterogeneity between patients and its association with clinical outcomes.
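The model comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the gene names (`GENE_A` to `GENE_D`), the function names, and the use of in-sample R² as the measure of predictive power are all assumptions made for illustration. The activity vector stands in for a per-patient pathway activity score such as one computed with PROGENy.

```python
import numpy as np


def fit_linear_model(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    X1 = np.column_stack([np.ones(len(X)), X])       # prepend intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)    # least-squares solution
    residuals = y - X1 @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot


def compare_gene_sets(expression, activity, gene_sets):
    """Fit one linear model per gene set and report its in-sample R^2.

    expression: dict mapping gene name -> per-patient expression array
    activity:   per-patient pathway activity score (hypothetical stand-in)
    gene_sets:  dict mapping model name -> list of gene names
    """
    results = {}
    for name, genes in gene_sets.items():
        X = np.column_stack([expression[g] for g in genes])
        _, r2 = fit_linear_model(X, activity)
        results[name] = r2
    return results


# Synthetic data only: placeholder genes, not real patient measurements.
rng = np.random.default_rng(0)
n_patients = 200
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
expression = {g: rng.normal(size=n_patients) for g in genes}

# By construction, the activity score depends on GENE_C and GENE_D only.
activity = (
    1.5 * expression["GENE_C"]
    - 0.8 * expression["GENE_D"]
    + rng.normal(scale=0.5, size=n_patients)
)

gene_sets = {
    "reference_model": ["GENE_A", "GENE_B"],        # stands in for model (1)
    "survival_associated": ["GENE_C", "GENE_D"],    # stands in for model (2)
}
r2_by_model = compare_gene_sets(expression, activity, gene_sets)
for name, r2 in r2_by_model.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

Because the synthetic score is generated from `GENE_C` and `GENE_D`, the second gene set explains more variance by design. On real data, a held-out or cross-validated estimate would be a safer basis for comparing models than in-sample R².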