Machine learning algorithms for outcome prediction in (chemo)radiotherapy: An empirical comparison of classifiers.

Author information

The D-lab: Decision Support for Precision Medicine, GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre+, Universiteitssingel 40, 6229 ER, Maastricht, The Netherlands.

Department of Radiation Oncology, GROW, School for Oncology and Developmental Biology, Maastricht University Medical Center, Maastricht, The Netherlands.

Publication information

Med Phys. 2018 Jul;45(7):3449-3459. doi: 10.1002/mp.12967. Epub 2018 Jun 13.

DOI:10.1002/mp.12967

PMID:29763967

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6095141/

Abstract

PURPOSE

Machine learning classification algorithms (classifiers) for prediction of treatment response are becoming more popular in the radiotherapy literature. The general machine learning literature provides evidence in favor of some classifier families (random forest, support vector machine, gradient boosting) in terms of classification performance. The purpose of this study is to compare such classifiers specifically for (chemo)radiotherapy datasets and to estimate their average discriminative performance for radiation treatment outcome prediction.

METHODS

We collected 12 datasets (3496 patients) from prior studies on post-(chemo)radiotherapy toxicity, survival, or tumor control with clinical, dosimetric, or blood biomarker features from multiple institutions and for different tumor sites, that is, (non-)small-cell lung cancer, head and neck cancer, and meningioma. Six common classification algorithms with built-in feature selection (decision tree, random forest, neural network, support vector machine, elastic net logistic regression, LogitBoost) were applied to each dataset using the popular open-source R package caret. The R code and documentation for the analysis are available online (https://github.com/timodeist/classifier_selection_code). All classifiers were run on each dataset in a 100-times repeated nested fivefold cross-validation with hyperparameter tuning. Performance metrics (AUC, calibration slope and intercept, accuracy, Cohen's kappa, and Brier score) were computed. We ranked classifiers by AUC to determine which classifier is likely to also perform well in future studies. We simulated the benefit to future investigators of selecting a classifier for a new dataset based on our study (preselection based on other datasets) or of estimating the best classifier for that dataset (set-specific selection based on information from the new dataset), compared with uninformed classifier selection (random selection).
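The study's own analysis was done in R with caret and is available at the repository linked above. Purely as an illustration of the evaluation design, the following Python sketch shows how repeated nested k-fold splits can be generated so that hyperparameters are tuned only on inner folds, together with two of the reported metrics (AUC and Brier score). All function names (`k_fold_indices`, `nested_cv_splits`, `auc`, `brier_score`) are hypothetical and are not taken from the authors' code.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle positions 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nested_cv_splits(n, k_outer=5, k_inner=5, repeats=1, seed=0):
    """Yield (repeat, outer_train, outer_test, inner_splits).

    inner_splits holds (inner_train, inner_val) pairs drawn only from
    outer_train, so tuning never sees the outer test fold.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        outer_folds = k_fold_indices(n, k_outer, rng)
        for i, outer_test in enumerate(outer_folds):
            outer_train = [j for f, fold in enumerate(outer_folds)
                           if f != i for j in fold]
            inner_folds = k_fold_indices(len(outer_train), k_inner, rng)
            inner_splits = []
            for v, val_pos in enumerate(inner_folds):
                inner_val = [outer_train[p] for p in val_pos]
                inner_train = [outer_train[p]
                               for f, fold in enumerate(inner_folds)
                               if f != v for p in fold]
                inner_splits.append((inner_train, inner_val))
            yield r, outer_train, outer_test, inner_splits

def auc(y_true, scores):
    """AUC: chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)
```

With `repeats=100`, `k_outer=5` and `k_inner=5`, the generator mirrors the repeat and fold counts stated in the abstract. A classifier would be tuned on each `inner_splits` list, refitted on `outer_train`, and scored on `outer_test`.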

RESULTS

Random forest (best in 6/12 datasets) and elastic net logistic regression (best in 4/12 datasets) showed the overall best discrimination, but there was no single best classifier across datasets. Both classifiers had a median AUC rank of 2. Preselection and set-specific selection each yielded a significant average AUC improvement of 0.02 over random selection, with average AUC rank improvements of 0.42 and 0.66, respectively.
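To make the ranking and preselection comparison concrete, here is a small Python sketch. It ranks classifiers by AUC within each dataset, takes the median rank per classifier, and estimates the average AUC gain of preselection (choosing the classifier with the best mean rank on the other datasets) over random selection (the mean AUC across classifiers). The AUC table is invented for illustration and does not reproduce the study's results. The names `ranks`, `median_ranks` and `preselection_gain` are hypothetical, and the leave-one-dataset-out scheme is an assumption about how such a simulation could be set up, not the authors' exact procedure.

```python
from statistics import mean, median

# Hypothetical AUCs per dataset and classifier; NOT the study's results.
AUCS = {
    "set_A": {"rf": 0.70, "glmnet": 0.68, "svm": 0.62, "tree": 0.60},
    "set_B": {"rf": 0.66, "glmnet": 0.71, "svm": 0.64, "tree": 0.58},
    "set_C": {"rf": 0.75, "glmnet": 0.73, "svm": 0.74, "tree": 0.65},
}

def ranks(aucs):
    """Rank classifiers in one dataset: 1 = highest AUC, ties share the average rank."""
    ordered = sorted(aucs, key=aucs.get, reverse=True)
    out = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over all classifiers tied with position i.
        while j + 1 < len(ordered) and aucs[ordered[j + 1]] == aucs[ordered[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for name in ordered[i:j + 1]:
            out[name] = avg
        i = j + 1
    return out

def median_ranks(table):
    """Median AUC rank of each classifier across all datasets."""
    per_set = [ranks(a) for a in table.values()]
    names = next(iter(table.values())).keys()
    return {c: median(r[c] for r in per_set) for c in names}

def preselection_gain(table):
    """Average AUC gain of leave-one-dataset-out preselection over random selection."""
    gains = []
    for held_out, aucs in table.items():
        others = [a for name, a in table.items() if name != held_out]
        mean_rank = {c: mean(ranks(a)[c] for a in others) for c in aucs}
        # Ties in mean rank go to the first classifier in dict order.
        chosen = min(mean_rank, key=mean_rank.get)
        random_expected = mean(aucs.values())
        gains.append(aucs[chosen] - random_expected)
    return mean(gains)
```

Calling `median_ranks(AUCS)` and `preselection_gain(AUCS)` on a real table of cross-validated AUCs would give the kind of summary reported here; set-specific selection could be sketched analogously by choosing the classifier with the best inner-fold AUC on the new dataset itself.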

CONCLUSION

Random forest and elastic net logistic regression yield higher discriminative performance in (chemo)radiotherapy outcome and toxicity prediction than the other classifiers studied. Thus, one of these two classifiers should be the first choice for investigators when building classification models, or the benchmark against which they compare their own modeling results. Our results also show that an informed preselection of classifiers based on existing datasets can improve discrimination over random selection.
