Stenhouse Kailyn, McGeachy Philip, Spampinato Sofia, Tanderup Kari, Kirchheiner Kathrin, Martell Kevin, Quirk Sarah, Viswanathan Akila N, Roumeliotis Michael
Department of Physics and Astronomy, University of Calgary, Calgary, Alberta, Canada.
Department of Medical Physics, Arthur J.E. Child Comprehensive Cancer Centre, Calgary, Alberta, Canada.
Med Phys. 2025 Jun;52(6):5051-5063. doi: 10.1002/mp.17881. Epub 2025 May 21.
Bayesian networks are seeing increased usage in healthcare, particularly for modeling complex treatment decisions under uncertainty. Bayesian networks offer significant advantages over classical machine learning and deep learning techniques due to their interpretability, with the network visualized through a directed acyclic graph outlining conditional relationships. Prior clinical knowledge can also be incorporated into these networks to enhance their clarity and facilitate integration into clinical workflows. However, out-of-box optimization techniques may produce networks that are not logically coherent or reflective of clinical understanding and may focus solely on optimizing information-based metrics without consideration for performance metrics crucial for developing predictive models. In late morbidity modeling, where the risk factors surrounding an outcome may be complex, intercorrelated, and not yet fully identified, it is important to have a customizable optimization approach to automatically produce logical, interpretable Bayesian networks that outline these complex outcomes.
The purpose of this work is to develop a simulated annealing-based framework for constructing Bayesian network structures for late morbidity prediction in cervical cancer patients, addressing limitations of traditional optimization techniques and prioritizing interpretability.
This study uses the multi-center EMBRACE I cervical cancer dataset (n = 1153) to develop Bayesian network structures for predicting late moderate-to-severe (grade ≥2) cystitis (CTCAE v3.0). The dataset was split into training/validation data (80%) and holdout test data (20%). A process of 10 × 5-fold cross-validation was integrated into the optimization framework. A simulated annealing-based optimization method was developed incorporating information-theoretic measures, predictive performance measures, and complexity measures. The network structures produced by this framework were compared, in terms of complexity, interpretability, and predictive performance, with those produced by optimization methods available out-of-box in the PyAgrum package for Python (Greedy Hill Climbing, Tree-Augmented Naïve Bayes, and Chow-Liu Optimization). Bayesian networks were also compared to conventional machine learning classifiers in terms of feature importance and predictive performance. Differences in model predictions arising from structure differences were assessed with Cochran's Q-test, with significance set at p < 0.05.
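As an illustration of the general idea only, simulated annealing over network structures can be sketched as a search over sets of directed arcs: each step proposes adding, removing, or reversing one arc, rejects proposals that create a cycle, and accepts worse structures with a probability that shrinks as the temperature cools. Everything named here is an assumption made for the sketch and is not taken from the study: the node labels, the geometric cooling schedule, the step count, and above all `toy_score`, which stands in for an objective combining fit, predictive performance, and a complexity penalty. It does not use PyAgrum.

```python
import math
import random

def is_acyclic(nodes, arcs):
    """Kahn's algorithm: True if the directed graph contains no cycle."""
    indegree = {n: 0 for n in nodes}
    for _, child in arcs:
        indegree[child] += 1
    frontier = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while frontier:
        node = frontier.pop()
        visited += 1
        for parent, child in arcs:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    frontier.append(child)
    return visited == len(nodes)

def propose(nodes, arcs, rng):
    """Neighbouring structure: add, remove, or reverse a single arc."""
    a, b = rng.sample(nodes, 2)
    new = set(arcs)
    if (a, b) in new:
        new.remove((a, b))            # remove the arc ...
        if rng.random() < 0.5:
            new.add((b, a))           # ... or reverse it half of the time
    else:
        new.discard((b, a))           # avoid an immediate two-node cycle
        new.add((a, b))
    return frozenset(new)

def anneal(nodes, score, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximise `score` over DAGs with a geometric cooling schedule."""
    rng = random.Random(seed)
    current = frozenset()
    current_score = score(current)
    best, best_score = current, current_score
    cooling = (t_end / t_start) ** (1.0 / steps)
    temperature = t_start
    for _ in range(steps):
        candidate = propose(nodes, current, rng)
        if is_acyclic(nodes, candidate):
            candidate_score = score(candidate)
            delta = candidate_score - current_score
            # Metropolis rule: always accept improvements, sometimes accept losses.
            if delta >= 0 or rng.random() < math.exp(delta / temperature):
                current, current_score = candidate, candidate_score
                if current_score > best_score:
                    best, best_score = current, current_score
        temperature *= cooling
    return best, best_score

# Hypothetical placeholders: generic node labels and a made-up objective.
NODES = ["x1", "x2", "x3", "x4", "outcome"]
INFORMATIVE = {("x1", "outcome"), ("x2", "outcome")}

def toy_score(arcs, w_fit=1.0, w_complexity=0.3):
    """Stand-in objective: reward informative arcs, penalise arc count."""
    return w_fit * len(arcs & INFORMATIVE) - w_complexity * len(arcs)

best_arcs, best_value = anneal(NODES, toy_score)
print(sorted(best_arcs), round(best_value, 3))
```

In a real application the placeholder objective would be replaced by a score evaluated on data, for example a cross-validated performance metric combined with an information-theoretic fit term and a penalty on the number of arcs and nodes; the relative weights are a design choice that this sketch does not try to reproduce.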
The simulated annealing framework demonstrated the ability to produce Bayesian network structures with comparable or superior predictive performance compared to out-of-box models. A statistically significant performance difference was identified between the simulated annealing and out-of-box methods with Cochran's Q-test (p = 0.03). The simulated annealing approach equalled or outperformed out-of-box models on a bootstrapped holdout test set, with a balanced accuracy of 64.1%, an F1 macro score of 55.9%, and an ROC-AUC of 0.66. Simulated annealing models also featured fewer arcs and nodes, with this simplification resulting in networks that were easier to interpret without compromising on predictive performance, highlighting the effectiveness of simulated annealing in creating highly interpretable models for clinical use.
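For readers unfamiliar with Cochran's Q-test, the sketch below shows how the statistic can be computed when several models are evaluated on the same patients. Each row is a patient, each column a model, and an entry of 1 means the model classified that patient correctly. With k models, column totals C_j, row totals R_i, and grand total N, the statistic is Q = (k - 1)(k ΣC_j² - N²) / (kN - ΣR_i²), compared against a chi-square distribution with k - 1 degrees of freedom. The table in the example is invented purely for illustration and is unrelated to the study's data; the function names are likewise hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable for integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: complementary error function plus a finite series.
    series = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                 for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series

def cochran_q(table):
    """Return (Q, p-value) for a patients x models table of 0/1 outcomes."""
    k = len(table[0])
    column_totals = [sum(row[j] for row in table) for j in range(k)]
    row_totals = [sum(row) for row in table]
    grand_total = sum(row_totals)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when every model agrees on every patient")
    numerator = (k - 1) * (k * sum(c * c for c in column_totals) - grand_total ** 2)
    q = numerator / denominator
    return q, chi2_sf(q, k - 1)

# Invented example: 8 patients, 3 models, 1 = correct prediction.
example = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]
q_stat, p_value = cochran_q(example)
print(f"Q = {q_stat:.3f}, p = {p_value:.4f}")
```

Patients on whom all models agree contribute nothing to the comparison, which is why the statistic is undefined when every row is all zeros or all ones.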
The proposed simulated annealing-based framework represents a novel method for automatically generating Bayesian network structures for cervical cancer late morbidity modeling. Compared to out-of-box optimization techniques, the simulated annealing Bayesian networks provide comparable or superior predictive performance while yielding simpler, more interpretable networks suited to clinical implementation.