The College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, 211106, China.
The Department of Computer Science, Indiana University Bloomington, Bloomington, 47405, USA.
Med Image Anal. 2020 Oct;65:101795. doi: 10.1016/j.media.2020.101795. Epub 2020 Jul 23.
With the tremendous development of artificial intelligence, many machine learning algorithms have been applied to the diagnosis of human cancers. Recently, rather than predicting categorical variables (e.g., stages and subtypes) as in cancer diagnosis, several prognosis prediction models based on patients' survival information have been adopted to estimate the clinical outcome of cancer patients. However, most existing studies treat the diagnosis and prognosis tasks separately. In fact, the diagnosis information (e.g., TNM stages) indicates the extent of disease severity, which is highly correlated with patients' survival. While the diagnosis is largely made based on histopathological images, recent studies have also demonstrated that integrative analysis of histopathological images and genomic data holds great promise for improving the diagnosis and prognosis of cancers. However, directly combining these two types of data may introduce redundant features that negatively affect prediction performance. Therefore, it is necessary to select informative features from the derived multi-modal data. Based on the above considerations, we propose a multi-task multi-modal feature selection method for joint diagnosis and prognosis of cancers. Specifically, we make use of the task relationship learning framework to automatically discover the relationships between the diagnosis and prognosis tasks, through which we can identify image and genomic features that are important for both tasks. In addition, we add a regularization term to ensure that the correlation within the multi-modal data is captured. We evaluate our method on three cancer datasets from The Cancer Genome Atlas project, and the experimental results verify that our method achieves better performance on both diagnosis and prognosis tasks than the related methods.
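The abstract does not give the optimization details, but the core idea of multi-task feature selection can be illustrated with a standard formulation: a joint coefficient matrix over all tasks penalized by an ℓ2,1 norm, so that features are kept or discarded jointly across tasks. The sketch below is a minimal, hypothetical example using proximal gradient descent on least-squares losses; it is not the paper's actual method (which additionally learns task relationships and a multi-modal correlation regularizer), and all function names and parameters are illustrative.

```python
import numpy as np

def l21_prox(W, t):
    # Proximal operator of the l2,1 norm: row-wise soft-thresholding.
    # Rows (features) whose l2 norm across tasks falls below t are
    # zeroed out jointly, which is what produces feature selection.
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return W * scale

def multitask_feature_selection(Xs, ys, lam=0.1, lr=0.01, iters=500):
    # Xs, ys: per-task design matrices and targets sharing one feature space.
    # Minimizes  sum_t (1/n_t)||X_t w_t - y_t||^2  +  lam * ||W||_{2,1}
    # over W of shape (n_features, n_tasks) via proximal gradient descent.
    d, T = Xs[0].shape[1], len(Xs)
    W = np.zeros((d, T))
    for _ in range(iters):
        G = np.zeros_like(W)
        for t in range(T):
            G[:, t] = Xs[t].T @ (Xs[t] @ W[:, t] - ys[t]) / len(ys[t])
        W = l21_prox(W - lr * G, lr * lam)  # gradient step, then prox step
    return W
```

In a usage scenario analogous to the paper's setting, each "task" (e.g., diagnosis and prognosis) contributes one column of `W`, and rows with nonzero norm correspond to the selected image or genomic features shared by both tasks.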