Xie Ping, Batur Jesur, An Xin, Yasen Musha, Fu Xuefeng, Jia Lin, Luo Yun
Department of Urology, The Third Affiliated Hospital of Sun Yat-Sen University, Guangzhou, Guangdong, China.
Department of Urology, The First People's Hospital of Kashi Prefecture, Kashi, Xinjiang, China.
Front Oncol. 2023 Jan 13;12:1084403. doi: 10.3389/fonc.2022.1084403. eCollection 2022.
The presence of lymph node metastasis (LNM) leads to a poor prognosis in prostate cancer (PCa). Many recent studies have indicated that gene signatures may be able to predict lymph node status. The purpose of this study was to develop and validate a new tool for predicting LNM based on alternative splicing (AS).
Gene expression profiles and clinical information for the prostate adenocarcinoma cohort were retrieved from The Cancer Genome Atlas (TCGA) database, and the corresponding RNA-seq splicing event profiles were obtained from TCGA SpliceSeq. The limma package was used to identify differentially expressed alternative splicing (DEAS) events between the LNM and non-LNM groups. Eight machine learning classifiers were trained with stratified five-fold cross-validation. SHAP values were used to explain the model.
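The stratified five-fold cross-validation step can be illustrated with a minimal sketch. It assumes scikit-learn, uses a synthetic feature matrix in place of the TCGA SpliceSeq data, and compares only two classifiers rather than eight; the data dimensions, class imbalance, and classifier choices are illustrative assumptions, not the study's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for a samples x AS-events matrix (assumption: not TCGA data).
# weights=[0.8, 0.2] mimics a minority positive (LNM-like) class.
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=10,
    n_clusters_per_class=1, weights=[0.8, 0.2], random_state=0,
)

# Two example classifiers; the choice of these two is an assumption.
classifiers = {
    "LDA": LinearDiscriminantAnalysis(),
    "LogReg": LogisticRegression(max_iter=1000),
}

# Stratification keeps the class ratio roughly constant in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

cv_auc = {}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
    cv_auc[name] = (scores.mean(), scores.std())
    print(f"{name}: AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```

Reporting the mean and standard deviation across folds corresponds to the "mean ± SD" form in which a cross-validated AUC is usually stated.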
A total of 333 DEAS events were identified. Using a correlation filter and the least absolute shrinkage and selection operator (LASSO) method, a 96-AS-event signature was identified that showed favorable discrimination in the training set and was confirmed in the validation set. Linear discriminant analysis (LDA) was the best classifier after 100 iterations of training. The LDA classifier distinguished LNM from non-LNM with an area under the receiver operating characteristic curve (AUC) of 0.962 ± 0.026 in the training set (D1 = 351) and 0.953 in the validation set (D2 = 62). Decision curve analysis supported the clinical utility of the AS-based model.
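The feature-selection and classification sequence (correlation filter, LASSO, LDA, validation AUC) might look roughly like the sketch below. It assumes scikit-learn and synthetic data; the helper `correlation_filter`, the 0.9 correlation threshold, the LASSO penalty `alpha=0.01`, and the 85/15 split are hypothetical choices for illustration, since the abstract does not state the settings used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def correlation_filter(X, threshold=0.9):
    """Hypothetical helper: return indices of the columns to keep.

    A column is dropped when its absolute Pearson correlation with
    any earlier column exceeds `threshold`.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    upper = np.triu(corr, k=1)  # upper[i, j] is nonzero only for i < j
    drop = (upper > threshold).any(axis=0)
    return np.flatnonzero(~drop)


# Synthetic data with 3 exactly repeated columns so the filter has work to do.
X, y = make_classification(
    n_samples=400, n_features=40, n_informative=10, n_redundant=2,
    n_repeated=3, n_clusters_per_class=1, weights=[0.8, 0.2], random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0
)

# The filter is computed on the training split only, then applied to both splits.
keep = correlation_filter(X_train, threshold=0.9)

# LASSO keeps features with nonzero coefficients; LDA classifies on them.
model = make_pipeline(
    StandardScaler(),
    SelectFromModel(Lasso(alpha=0.01)),
    LinearDiscriminantAnalysis(),
)
model.fit(X_train[:, keep], y_train)

n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())
val_auc = roc_auc_score(y_val, model.predict_proba(X_val[:, keep])[:, 1])
print(f"kept {len(keep)} of {X.shape[1]} features after the correlation filter")
print(f"LASSO retained {n_selected}; validation AUC = {val_auc:.3f}")
```

Wrapping scaling, selection, and LDA in one pipeline means the selection step is fitted only on training data, so the held-out AUC is not inflated by information leaking from the validation split.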
Machine learning combined with AS data could robustly distinguish LNM from non-LNM in PCa.