Department of Pharmacy, Huashan Hospital, Fudan University, Shanghai, China.
Department of Pharmacy, Baoshan Campus of Huashan Hospital, Fudan University, Shanghai, China.
Sci Rep. 2024 Nov 12;14(1):27700. doi: 10.1038/s41598-024-79036-4.
Responses to fluorouracil-based chemotherapy vary widely among colorectal cancer (CRC) patients, highlighting the role of pharmacogenomics in developing better predictive models. We analyzed 379 CRC patients receiving fluorouracil-based chemotherapy, collecting data on fluorouracil metabolism-related SNPs (TYMS, MTHFR, DPYD, RRM1), blood inflammatory markers, and clinical status. Six machine learning models (K-nearest neighbors, support vector machine, gradient boosting decision trees (GBDT), eXtreme Gradient Boosting (XGBoost), LightGBM, and random forest) were compared against multivariate logistic regression and a deep learning model, a multilayer perceptron (MLP). Feature importance analysis highlighted seven predictors: histological grade, N and M staging, monocyte count, platelet-to-lymphocyte ratio, MTHFR rs1801131, and RRM1 rs11030918. In five-fold cross-validation, XGBoost and GBDT exhibited superior performance, with an area under the curve (AUC) of 0.88 ± 0.02. XGBoost excelled in identifying favorable prognosis (recall = 0.939). GBDT was balanced in recognizing both categories, with a recall for favorable prognosis of 0.908 and a precision for unfavorable prognosis of 0.863. MLP had a similar AUC (0.87) and a high recall for favorable prognosis (0.946). In external validation, the XGBoost model achieved an accuracy of 0.79. An online prognostic tool based on XGBoost was developed, integrating metabolism-related SNPs and inflammatory markers, enhancing CRC treatment precision and supporting tailored chemotherapy.
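The per-class metrics and the fold-level summary reported above can be illustrated with a minimal sketch. It is not the authors' pipeline: the labels, predictions, and per-fold AUC values below are invented toy data, the function names (`recall`, `precision`, `k_fold_sizes`) are hypothetical, and the use of a sample standard deviation for the "±" term is an assumption, since the abstract does not say how the spread was computed. Only the cohort size of 379 and the choice of five folds come from the text.

```python
from statistics import mean, stdev


def recall(y_true, y_pred, positive):
    """Share of cases truly in class `positive` that were predicted as `positive`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    actual = sum(1 for t in y_true if t == positive)
    return hits / actual if actual else 0.0


def precision(y_true, y_pred, positive):
    """Share of cases predicted as `positive` that truly belong to `positive`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    predicted = sum(1 for p in y_pred if p == positive)
    return hits / predicted if predicted else 0.0


def k_fold_sizes(n_samples, k=5):
    """Fold sizes when n_samples are split into k folds differing by at most one."""
    base, extra = divmod(n_samples, k)
    return [base + 1 if i < extra else base for i in range(k)]


# Toy labels (assumption, not study data): ten patients, two prognosis classes.
y_true = ["favorable", "favorable", "favorable", "favorable", "unfavorable",
          "unfavorable", "unfavorable", "favorable", "unfavorable", "favorable"]
y_pred = ["favorable", "favorable", "favorable", "unfavorable", "unfavorable",
          "unfavorable", "favorable", "favorable", "unfavorable", "favorable"]

recall_favorable = recall(y_true, y_pred, "favorable")
precision_unfavorable = precision(y_true, y_pred, "unfavorable")

# Hypothetical per-fold AUCs, summarized as mean and sample standard deviation.
fold_aucs = [0.80, 0.85, 0.90, 0.75, 0.70]
auc_mean, auc_sd = mean(fold_aucs), stdev(fold_aucs)

print(f"recall (favorable): {recall_favorable:.3f}")
print(f"precision (unfavorable): {precision_unfavorable:.3f}")
print(f"AUC across folds: {auc_mean:.2f} +/- {auc_sd:.2f}")
print(f"fold sizes for 379 patients: {k_fold_sizes(379)}")
```

Reporting recall for one class alongside precision for the other, as the abstract does for GBDT, shows how a model handles both outcomes, which a single accuracy figure can hide when the classes are imbalanced.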