School of Biotechnology, East China University of Science and Technology, Shanghai 200237, China.
Shanghai Center for Bioinformation Technology, Shanghai 201203, China.
J Mol Cell Biol. 2020 Jul 27;12(11):881-893. doi: 10.1093/jmcb/mjaa041.
The implementation of cancer precision medicine requires biomarkers or signatures for predicting prognosis and therapeutic benefits. Most current efforts in this field pay far more attention to predictive accuracy than to molecular mechanistic interpretability. A mechanism-driven strategy has recently emerged, aiming to build signatures with both predictive and explanatory power. Driven by this strategy, we developed a robust gene dysregulation analysis framework that uses machine learning algorithms. The framework explores gene dysregulations underlying carcinogenesis from high-dimensional data, taking into consideration cooperativity and synergy between regulators along with several other transcriptional regulation rules. We then applied the framework to a colorectal cancer (CRC) cohort from The Cancer Genome Atlas. The identified CRC-related dysregulations significantly covered known carcinogenic processes and showed good prognostic value. By selecting dysregulations with a greedy strategy, we built a four-dysregulation (4-DysReg) signature that is capable of predicting prognosis and adjuvant chemotherapy benefit. 4-DysReg has the potential to explain carcinogenesis in terms of dysfunctional transcriptional regulation. These results demonstrate that our gene dysregulation analysis framework could be used to develop predictive signatures with mechanistic interpretability for cancer precision medicine and, furthermore, to elucidate the mechanisms of carcinogenesis.