Department of Computer and Information Sciences, University of Northumbria, Newcastle Upon Tyne, United Kingdom.
Information School, University of Sheffield, Sheffield, United Kingdom.
PLoS One. 2024 Sep 3;19(9):e0305268. doi: 10.1371/journal.pone.0305268. eCollection 2024.
Substantial unexplained variation remains within the predefined colon cancer stages when features from either genomics or histopathological whole slide images alone are used as prognostic factors. Unraveling this variation will lead to improved staging and treatment outcomes. Motivated by advances in Deep Neural Network (DNN) libraries and by complementary factors within some genomics datasets, we aggregate atypia patterns in histopathological images with diverse carcinogenic expression from mRNA, miRNA and DNA methylation as an integrative input to a deep neural network for colon cancer stage classification and for stratification of samples into low- or high-risk survival groups.
Both the genomics-only and the integrated input features yield an area under the receiver operating characteristic curve (AUC-ROC) of 0.97, compared with an AUC-ROC of 0.78 when only image features are used for stage classification. Further analysis of prediction accuracy using the confusion matrix shows that the integrated features give a marginal accuracy improvement of 0.08% over the genomics-only features. The extracted features were also used to split the patients into low- or high-risk survival groups. Among the 2,700 fused features, 1,836 (68%) showed statistically significant differences in survival probability between the two risk groups. Availability and Implementation: https://github.com/Ogundipe-L/EDCNN.
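The abstract does not state how samples were split on each feature or which test was used to assess significance. As an illustration only, the sketch below assumes a median split on a single feature followed by a two-group log-rank test, a common way to compare survival between risk groups. The function names `median_split` and `logrank_p` and the toy data are hypothetical and are not taken from the EDCNN repository.

```python
import math
from statistics import median


def median_split(values):
    """Label a sample 'high' if its feature value exceeds the median, else 'low'.

    Assumption: a median cut-off; the source does not specify the threshold.
    """
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


def logrank_p(times, events, groups):
    """Two-group log-rank test; returns (chi2, p) with 1 degree of freedom.

    times  : follow-up time per sample
    events : 1 if the event (death) was observed, 0 if censored
    groups : 'low' / 'high' label per sample
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        # Samples still at risk just before time t (overall and in 'high').
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == "high")
        # Events occurring exactly at time t (overall and in 'high').
        d = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        d1 = sum(
            1
            for ti, e, g in zip(times, events, groups)
            if ti == t and e == 1 and g == "high"
        )
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0  # no information to separate the groups
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Toy data (invented for illustration): one feature, ten patients.
feature = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
months = [50, 55, 60, 65, 70, 5, 8, 10, 12, 15]
died = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

labels = median_split(feature)
chi2, p = logrank_p(months, died, labels)
print(f"chi2={chi2:.2f}, p={p:.4f}, significant={p < 0.05}")
```

Repeating such a test over every fused feature and counting those below the chosen significance level would give a proportion comparable in kind to the 1,836 of 2,700 (68%) reported above, although the actual procedure may differ.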