Lee Sugi, Jung Jaeeun, Park Ilkyu, Park Kunhyang, Kim Dae-Soo
Department of Bioinformatics, KRIBB School of Bioscience, Korea University of Science and Technology (UST), 217 Gajeong-ro, Yuseong-gu, Daejeon, Republic of Korea.
Department of Environmental Disease Research Centers, Korea Research Institute of Bioscience & Biotechnology (KRIBB), 125 Gwahak-ro, Yuseong-gu, Daejeon, Republic of Korea.
Comput Struct Biotechnol J. 2020 Sep 24;18:2639-2646. doi: 10.1016/j.csbj.2020.09.029. eCollection 2020.
Papillary renal cell carcinoma (pRCC) accounts for 10-15% of renal cell carcinomas and is the second most frequent form of the disease. Classifying pRCC patients is difficult because of disease heterogeneity, histologic subtypes, and variation in both disease progression and patient outcomes. Nevertheless, symptom-based patient classification is indispensable for deciding treatment options. Here we introduce a method for predicting the pathological tumour stage of pRCC using deep learning and similarity-based hierarchical clustering. Differentially expressed genes (DEGs) were identified from gene expression data of pRCC patients retrieved from The Cancer Genome Atlas (TCGA). Thirty-three of these genes were selected because their expression distinguished early-stage from late-stage pRCC, using the Wilcoxon rank sum test, confidence intervals, and LASSO regression. A deep learning model was then constructed to predict tumour progression, achieving an accuracy of 0.942 and an area under the curve (AUC) of 0.933. Furthermore, pathological sub-stage information was obtained with an accuracy of 0.857 via similarity-based hierarchical clustering, using 18 DEGs between stages I and II and 11 DEGs between stages III and IV that were identified through the Wilcoxon rank sum test and a quantile approach. We also provide this classification process as an R function. This is the first report of a model distinguishing the pathological tumour stages of pRCC using deep learning and similarity-based hierarchical clustering methods. Our findings are potentially applicable to improving early detection and treatment of pRCC and to establishing a clearer classification of the pathological stages in other tumours.
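The first screening step named in the abstract, a Wilcoxon rank sum test comparing early-stage and late-stage expression gene by gene, can be illustrated with a minimal Python sketch. This is not the authors' R function. The helper names (`rank_sum_p`, `select_degs`), the toy expression values, the gene labels, and the 0.05 cutoff are all assumptions made for illustration, and the p-value uses the normal approximation without tie or continuity correction. The confidence interval, LASSO, deep learning, and clustering steps are not reproduced here.

```python
import math


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank sum p-value for samples x and y.

    Uses the normal approximation with no tie or continuity correction,
    so it is only a rough stand-in for an exact test on small samples.
    """
    # Pool both groups, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the mean of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - mean) / sd
    # Two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))


def select_degs(expr, labels, alpha=0.05):
    """Return genes whose early vs. late expression differs at level alpha.

    expr   : dict mapping gene name -> list of expression values per sample
    labels : list of "early" / "late", one per sample (hypothetical encoding)
    alpha  : hypothetical significance cutoff, not taken from the paper
    """
    selected = []
    for gene, values in expr.items():
        early = [v for v, lab in zip(values, labels) if lab == "early"]
        late = [v for v, lab in zip(values, labels) if lab == "late"]
        if rank_sum_p(early, late) < alpha:
            selected.append(gene)
    return selected


# Toy data (invented): GENE_A separates the groups, GENE_B does not.
toy_labels = ["early"] * 5 + ["late"] * 5
toy_expr = {
    "GENE_A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "GENE_B": [1, 3, 5, 7, 9, 2, 4, 6, 8, 10],
}

print(select_degs(toy_expr, toy_labels))
```

In a real analysis the p-values would normally be adjusted for multiple testing across all genes before any further filtering; that adjustment is omitted here to keep the sketch short.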