Szincsak Sara, Király Péter, Szegvari Gabor, Horváth Mátyás, Dora David, Lohinai Zoltan
Translational Medicine Institute, Semmelweis University, 1094 Budapest, Hungary.
Department of Anatomy, Histology, and Embryology, Semmelweis University, 1094 Budapest, Hungary.
Int J Mol Sci. 2025 Jun 20;26(13):5937. doi: 10.3390/ijms26135937.
Machine learning (ML) algorithms have the potential to improve the selection of patients for immune checkpoint inhibitor (ICI) immunotherapy beyond what previous biomarker studies have achieved. We analyzed the predictive performance of ML models and compared them to traditional clinical biomarkers (TCBs) in the field of gastrointestinal (GI) cancers. The study has been registered in PROSPERO (number: CRD42023465917). A systematic search of PubMed was conducted to identify studies applying different ML algorithms to GI cancer patients treated with ICIs using tumor RNA gene expression profiles. The outcomes included were response to immunotherapy (ITR) or survival. Additionally, we compared the ML methodology details and the predictive power inherent in the published gene sets using 5-fold cross-validation and logistic regression (LR) on an available, well-defined ICI-treated metastatic gastric cancer (GC) cohort (n = 45). A set of standard clinical ICI biomarkers (MLH, MSH, and CD8 genes, plus PMS2 and PD-L1) and de novo calculated principal components (PCs) of the original datasets were also included as additional points of comparison. Nine articles met the inclusion criteria. Three were pan-cancer studies, five assessed GC, and one studied colorectal cancer (CRC). Classification and regression models were used to predict ICI efficacy. Next, using LR, we validated the predictive power of the published RNA signatures on the same ICI-treated GC dataset (n = 45) and compared the results with the receiver operating characteristic (ROC) area under the curve (AUC) values reported for the original ML algorithms. In two cases our method outperformed the published results (reported/LR comparison: 0.74/0.831, 0.67/0.735). Besides the published studies, we included two benchmarks: a set of TCBs and principal components computed on the whole dataset (PCA, 99% explained variance, 40 components). 
Interestingly, a study using a selected gene set (immuno-oncology panel) with AUC = 0.83 was the only one that outperformed the TCB (AUC = 0.8) and the PCA (AUC = 0.81) results. Cross-validation of the predictive performance of these genes on the same GC dataset, together with an investigation of their prognostic role in a collated multi-cohort GC dataset of n = 375 resected or chemotherapy-treated patients, revealed that the genes mannose-6-phosphate receptor (M6PR), indoleamine 2,3-dioxygenase 1 (IDO1), neuropilin-1 (NRP1), and MAGEA3 performed similarly to, or better than, established biomarkers such as PD-L1 and MSI. We found an immuno-oncology panel with an AUC = 0.83 that outperformed both the clinical benchmark and the PC results. We recommend further investigation and experimental validation of M6PR, IDO1, NRP1, and MAGEA3 expression based on their strong predictive power in GC ITR. Well-designed studies with larger sample sizes and nonlinear ML models might help improve biomarker selection.
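The evaluation scheme named in the abstract (logistic regression scored by ROC AUC under 5-fold cross-validation) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' pipeline: the data are synthetic (45 simulated patients, 4 simulated genes), and the function names, learning rate, epoch count, and regularisation strength are all assumptions chosen for illustration.

```python
import math
import random

def auc(labels, scores):
    """ROC AUC as the Mann-Whitney statistic; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logreg(X, y, lr=0.1, epochs=300, l2=0.01):
    """L2-regularised logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j, xj in enumerate(xi):
                gw[j] += err * xj
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def stratified_folds(y, k, rng):
    """Deal shuffled indices of each class round-robin into k folds."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, yi in enumerate(y) if yi == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def cv_auc(X, y, k=5, seed=0):
    """Pooled out-of-fold AUC; scaling uses training-fold statistics only."""
    folds = stratified_folds(y, k, random.Random(seed))
    oof = [0.0] * len(y)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        d = len(X[0])
        mu = [sum(X[i][j] for i in train) / len(train) for j in range(d)]
        sd = [math.sqrt(sum((X[i][j] - mu[j]) ** 2 for i in train) / len(train)) or 1.0
              for j in range(d)]
        scale = lambda row: [(v - m) / s for v, m, s in zip(row, mu, sd)]
        w, b = fit_logreg([scale(X[i]) for i in train], [y[i] for i in train])
        for i in held_out:
            oof[i] = sigmoid(b + sum(wj * xj for wj, xj in zip(w, scale(X[i]))))
    return auc(y, oof)

# Synthetic stand-in data (NOT from the study): 15 responders, 30 non-responders,
# with simulated gene 0 shifted upward in responders.
rng = random.Random(0)
y = [1 if i < 15 else 0 for i in range(45)]
X = [[rng.gauss(1.5 * yi if j == 0 else 0.0, 1.0) for j in range(4)] for yi in y]

oof_auc = cv_auc(X, y, k=5, seed=0)
print(f"5-fold out-of-fold AUC on synthetic data: {oof_auc:.3f}")
```

Scaling inside each fold keeps held-out patients from influencing the fitted model, which matters in a cohort as small as n = 45. A TCB or PCA benchmark would follow the same loop with a different feature matrix.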