Takahashi Satoshi, Asada Ken, Takasawa Ken, Shimoyama Ryo, Sakai Akira, Bolatkan Amina, Shinkai Norio, Kobayashi Kazuma, Komatsu Masaaki, Kaneko Syuzo, Sese Jun, Hamamoto Ryuji
Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan.
Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan.
Biomolecules. 2020 Oct 19;10(10):1460. doi: 10.3390/biom10101460.
Mortality attributed to lung cancer accounts for a large fraction of cancer deaths worldwide. With increasing mortality figures, the accurate prediction of prognosis has become essential. In recent years, multi-omics analysis has emerged as a useful survival prediction tool. However, the methodology relevant to multi-omics analysis has not yet been fully established and further improvements are required for clinical applications. In this study, we developed a novel method to accurately predict the survival of patients with lung cancer using multi-omics data. With unsupervised learning techniques, survival-associated subtypes in non-small cell lung cancer were first detected using the multi-omics datasets from six categories in The Cancer Genome Atlas (TCGA). The new subtypes, referred to as integration survival subtypes, clearly divided patients into longer- and shorter-surviving groups (log-rank test: p = 0.003), and we confirmed that this division is independent of histopathological classification (Chi-square test of independence: p = 0.94). Next, an attempt was made to detect the integration survival subtypes using only one categorical dataset. Our machine learning model trained only on the reverse phase protein array (RPPA) data could accurately predict the integration survival subtypes (AUC = 0.99). The predicted subtypes could also distinguish between high- and low-risk patients (log-rank test: p = 0.012). Overall, this study explores the potential of multi-omics analysis to accurately predict the prognosis of patients with lung cancer.
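The log-rank comparisons reported in the abstract can be illustrated with a minimal sketch of the standard two-group log-rank test. This is not the authors' code: the function name `logrank_test` and the toy follow-up times are hypothetical, and the p-value uses the chi-square approximation with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test (illustrative sketch, not the paper's code).

    times_*  : follow-up time for each patient
    events_* : 1 if death was observed, 0 if the patient was censored
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    # Pool both groups, tagging each record with its group (0 = A, 1 = B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    # Distinct times at which at least one death was observed.
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # sum over times of (observed - expected) deaths in A
    variance = 0.0       # sum of hypergeometric variances
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)               # at risk, total
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e)         # deaths, total
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    stat = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical toy data: group A dies early, group B later, no censoring.
stat, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

A small p-value indicates that the two survival curves differ more than chance alone would explain, which is how the subtype separation is assessed in the abstract.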