Division of Transdisciplinary Sciences, Graduate School of Frontier Science Initiative, Kanazawa University, Ishikawa, Japan.
Department of Neurosurgery, Kanazawa University, Ishikawa, Japan.
Cardiovasc Eng Technol. 2024 Aug;15(4):394-404. doi: 10.1007/s13239-024-00721-6. Epub 2024 May 23.
To improve machine learning (ML) models that predict post-embolization recanalization of cerebral aneurysms, we evaluated the impact of hemodynamic feature derivation and selection methods on six ML algorithms.
We used computational fluid dynamics (CFD) to simulate hemodynamics in 66 cerebral aneurysms from 65 patients, comprising 57 stable and nine recanalized aneurysms. We derived 107 features for each aneurysm: four clinical features, 12 morphological features, and 91 hemodynamic features. To investigate the influence of feature derivation and selection on the ML models, we combined two derivation methods (simplified and fully derived) with four selection methods: all features, statistical significance analysis, stepwise multivariate logistic regression analysis (stepwise-LR), and recursive feature elimination (RFE). Model performance was assessed using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) on both the training and testing datasets.
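As an illustration of how RFE selection, an MLP classifier, and the two reported metrics fit together, the following is a minimal sketch using scikit-learn on synthetic data. It is not the authors' pipeline: the random data (shaped to mimic 66 cases, 107 features, and nine positives), the logistic-regression estimator inside RFE, the number of retained features, the 70/30 split, and the MLP layer size are all assumptions chosen only for demonstration. AUPRC is approximated here by average precision.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 57 "stable" (0) and 9 "recanalized" (1)
# cases with 107 features, of which the first five carry some signal.
rng = np.random.default_rng(0)
y = np.array([0] * 57 + [1] * 9)
X = rng.normal(size=(66, 107))
X[:, :5] += 1.5 * y[:, None]

# Stratified split so both classes appear in the training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scale, select features with RFE (estimator and feature count are assumptions),
# then fit a small multi-layer perceptron on the selected features.
model = make_pipeline(
    StandardScaler(),
    RFE(LogisticRegression(max_iter=1000), n_features_to_select=10, step=5),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)

# Score the held-out set with both metrics.
scores = model.predict_proba(X_test)[:, 1]
auroc = roc_auc_score(y_test, scores)
auprc = average_precision_score(y_test, scores)
print(f"AUROC = {auroc:.3f}, AUPRC (average precision) = {auprc:.3f}")
```

Placing RFE inside the pipeline means the selection is fit on the training split only, which avoids leaking test-set information into the choice of features.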
AUROC values on the testing dataset ranged widely, from 0.373 to 0.863. Fully derived features and the RFE selection method demonstrated superior performance in intra-model comparisons. The multi-layer perceptron (MLP) model trained with RFE-selected, fully derived features achieved the best performance on the testing dataset, with an AUROC of 0.863 (95% CI: 0.684-1.000).
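The abstract does not state how the 95% confidence interval was computed. One common option for small test sets is a percentile bootstrap, sketched below under that assumption. The function name `bootstrap_auroc_ci`, the number of resamples, and the demo labels and scores are all hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def bootstrap_auroc_ci(y_true, scores, n_boot=2000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) for AUROC via percentile bootstrap."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample cases with replacement.
        idx = rng.integers(0, n, size=n)
        # AUROC is undefined when a resample contains only one class; skip it.
        if np.unique(y_true[idx]).size < 2:
            continue
        stats.append(roc_auc_score(y_true[idx], scores[idx]))
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return roc_auc_score(y_true, scores), lower, upper


# Hypothetical test-set labels and model scores, for demonstration only.
demo_rng = np.random.default_rng(1)
y_demo = np.array([0] * 17 + [1] * 3)
s_demo = demo_rng.normal(size=20) + 1.5 * y_demo

point, lower, upper = bootstrap_auroc_ci(y_demo, s_demo)
print(f"AUROC = {point:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

With only a handful of positive cases, intervals of this kind tend to be wide and can reach the upper bound of 1.000, as in the reported result.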
Our study demonstrated the importance of feature derivation and selection in determining the performance of ML models. Appropriate derivation and selection enabled the development of accurate decision-making models without invasive procedures on the patient.