Dept. of Biomedical Engineering, Georgia Tech and Emory University, Atlanta, USA.
Dept. of Biomedical Engineering, Georgia Tech, Atlanta, USA.
Methods. 2021 May;189:74-85. doi: 10.1016/j.ymeth.2020.07.008. Epub 2020 Aug 5.
Breast and ovarian cancers are, respectively, the second and fifth leading causes of cancer death among women. Predicting the overall survival of breast and ovarian cancer patients can facilitate therapeutic evaluation and treatment decision making. Multi-scale multi-omics data such as gene expression, DNA methylation, miRNA expression, and copy number variations can provide insights into personalized survival. However, effectively integrating multi-omics data remains a challenging task. In this paper, we develop multi-omics integration methods to improve the prediction of overall survival for breast cancer and ovarian cancer patients. Because multi-omics data for the same patient jointly impact that patient's survival, features from different omics modalities are related and can be modeled by either association or causal relationships (e.g., pathways). By extracting these relationships among modalities, we can discard irrelevant information from high-throughput multi-omics data. However, it is infeasible to capture all possible multi-omics interactions by brute-force enumeration. Thus, we use deep neural networks with a novel divergence-based consensus regularization to capture multi-omics interactions implicitly by extracting modality-invariant representations. Compared with concatenation-based integration networks, our new divergence-based consensus networks improve the breast cancer overall survival C-index from 0.655±0.062 to 0.671±0.046 when combining DNA methylation and miRNA expression, and from 0.627±0.062 to 0.667±0.073 when combining miRNA expression and copy number variations. In summary, our novel deep consensus neural network improves the prediction of overall survival for breast cancer and ovarian cancer patients by implicitly learning the multi-omics interactions.
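The results above are reported as a C-index (concordance index), which measures how often a model ranks patients in the right order of risk. The following is a minimal sketch of one common formulation (Harrell's C-index) and is not the authors' implementation. The function name `concordance_index` and the toy data are assumptions made for illustration. As a simplification, pairs with tied survival times are skipped, and tied risk scores count as half concordant.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : predicted risk score (higher means shorter expected survival)
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("times, events, and risks must have equal length")

    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the patient with the shorter
        # observed time actually experienced the event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # correctly ranked pair
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied prediction counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Toy data (hypothetical): four patients, the second one is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.4, 0.1, 0.6]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a sense of scale for the reported improvements (for example, from 0.655 to 0.671).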