Combining handcrafted features with latent variables in machine learning for prediction of radiation-induced lung damage.

Affiliations

Applied Physics Program, University of Michigan, Ann Arbor, MI, USA.

Department of Radiation Oncology, University of Michigan, Ann Arbor, MI, USA.

Publication information

Med Phys. 2019 May;46(5):2497-2511. doi: 10.1002/mp.13497. Epub 2019 Apr 8.

DOI:10.1002/mp.13497
PMID:30891794
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6510637/
Abstract

PURPOSE

There has been burgeoning interest in applying machine learning methods to predict radiotherapy outcomes. However, the large number of variables relative to the limited sample sizes typical of radiation oncology is a major challenge, which makes dimensionality reduction methods a key ingredient. This study investigates and contrasts traditional machine learning methods and deep learning approaches for outcome modeling in radiotherapy. In particular, new joint architectures based on a variational autoencoder (VAE) for dimensionality reduction are presented, and their application is demonstrated for the prediction of lung radiation pneumonitis (RP) from a large-scale heterogeneous dataset.

METHODS

A large-scale heterogeneous dataset of 230 variables, including clinical factors (e.g., dose, KPS, stage) and biomarkers (e.g., single nucleotide polymorphisms (SNPs), cytokines, and micro-RNAs), from 106 non-small cell lung cancer (NSCLC) patients who received radiotherapy was used to model RP. Twenty-two patients had grade 2 or higher RP. Four approaches were investigated: feature selection (case A) and feature extraction (case B) with traditional machine learning methods, a VAE-MLP joint architecture (case C) with deep learning, and the combination of feature selection and the joint architecture (case D). For feature selection, random forest (RF), support vector machine (SVM), and multilayer perceptron (MLP) models were implemented to select relevant features. Specifically, each method was run multiple times to rank features within several cross-validated (CV) resampled sets. The resulting ranking lists were then aggregated by the top 5% and Kemeny graph methods to identify the final ranking for prediction. A synthetic minority oversampling technique was applied to correct for class imbalance during this process. For deep learning, a VAE-MLP joint architecture was developed in which a VAE performed dimensionality reduction and an MLP performed classification. In this architecture, reconstruction loss and prediction loss were combined into a single loss function to enable simultaneous training, and weights were assigned to the classes to mitigate class imbalance. To evaluate and compare prediction performance, areas under the receiver operating characteristic curve (AUCs) were computed with nested CV for both the handcrafted feature selection methods and the deep learning approach. The significance of differences in AUCs was assessed using the DeLong test of U-statistics.
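
The joint objective described in the Methods can be illustrated with a small sketch. This is not the authors' implementation. The abstract only states that reconstruction and prediction losses were combined and that classes were weighted, so the squared-error reconstruction term, the KL regularizer of a standard VAE, the trade-off weight `alpha`, and the inverse-frequency weighting in `class_weights` are all assumptions made for illustration.

```python
import math

def class_weights(labels):
    """Inverse-frequency class weights (assumed scheme): n / (2 * n_class)."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}

def joint_loss(x, x_recon, mu, log_var, y, p, weights, alpha=1.0):
    """Per-patient joint loss: VAE terms plus class-weighted cross-entropy.

    x, x_recon  : input features and their reconstruction by the decoder
    mu, log_var : mean and log-variance of the latent Gaussian
    y, p        : true label (0 or 1) and predicted probability of RP
    alpha       : hypothetical trade-off between the VAE and the classifier
    """
    # Reconstruction term (squared error is an assumption)
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    # KL divergence between N(mu, exp(log_var)) and N(0, 1)
    kl = -0.5 * sum(1 + lv - m ** 2 - math.exp(lv) for m, lv in zip(mu, log_var))
    # Binary cross-entropy, scaled by the weight of the true class
    eps = 1e-12
    bce = -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return recon + kl + alpha * weights[y] * bce

# Class counts taken from the abstract: 22 of 106 patients had RP
labels = [1] * 22 + [0] * 84
w = class_weights(labels)

# Perfect reconstruction and a standard-normal latent leave only the
# weighted prediction term
loss = joint_loss([0.5, 1.0], [0.5, 1.0], [0.0, 0.0], [0.0, 0.0], 1, 0.5, w)
```

Because all three terms are summed into one scalar, a single backward pass would update the encoder, decoder, and classifier together, which is what simultaneous training means here. With these counts the minority (RP) class receives a weight roughly four times that of the majority class.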

RESULTS

Among the handcrafted feature selection methods (case A), an MLP-based method using weight pruning (WP) performed best, reaching an AUC of 0.804 (95% CI: 0.761-0.823) with the top 29 features. The VAE-MLP joint architecture (case C) achieved a comparable but slightly lower AUC of 0.781 (95% CI: 0.737-0.808) with a latent dimension of 2. Combining handcrafted features with the latent representation (case D) and an MLP classifier significantly improved the AUC to 0.831 (95% CI: 0.805-0.863) with 22 features (P = 0.000642 versus handcrafted features only (case A); P = 0.000453 versus the VAE alone (case C)).
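
As background for reading these AUCs: an AUC equals the probability that a randomly chosen patient with RP receives a higher predicted risk than a randomly chosen patient without RP, which is the Mann-Whitney U statistic that the DeLong test builds on. The sketch below computes that quantity directly. The function name `auc_mann_whitney` and the scores are hypothetical and are not study data.

```python
def auc_mann_whitney(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted risks, for illustration only
rp_scores = [0.9, 0.7, 0.4]          # patients with RP
no_rp_scores = [0.8, 0.3, 0.2, 0.1]  # patients without RP
auc = auc_mann_whitney(rp_scores, no_rp_scores)
```

Here 10 of the 12 pairs are ranked correctly, giving an AUC of about 0.83. The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of about a hundred; a rank-based formula would be preferable for larger datasets.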

CONCLUSION

This study demonstrates the potential of combining traditional machine learning methods with deep learning VAE techniques to handle limited datasets when modeling radiotherapy toxicities. Specifically, latent variables from a VAE-MLP joint architecture can complement handcrafted features for the prediction of RP and improve prediction over either method alone.
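
One simple way to picture the combined input of case D is to concatenate each patient's selected handcrafted features with the latent variables produced by the VAE encoder before passing them to the classifier. The abstract does not specify how the two representations were merged, so concatenation is an assumption, and `combine_features` and the values below are hypothetical.

```python
def combine_features(handcrafted, latent):
    """Concatenate each patient's selected features with latent variables."""
    if len(handcrafted) != len(latent):
        raise ValueError("both inputs need one row per patient")
    return [list(h) + list(z) for h, z in zip(handcrafted, latent)]

# Hypothetical values for two patients: three selected handcrafted
# features each, plus a two-dimensional latent representation
selected = [[0.2, 1.0, 0.0], [0.5, 0.0, 1.0]]
latent = [[-0.3, 1.1], [0.7, -0.4]]
combined = combine_features(selected, latent)
```

Each row of `combined` then has five entries and could be fed to any classifier, such as the MLP used in the study.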

