Stabilizing machine learning for reproducible and explainable results: A novel validation approach to subject-specific insights.

Authors

Vos Gideon, van Eijk Liza, Sarnyai Zoltan, Rahimi Azghadi Mostafa

Affiliations

College of Science and Engineering, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia.

College of Health Care Sciences, James Cook University, James Cook Dr, Townsville, 4811, QLD, Australia.

Publication information

Comput Methods Programs Biomed. 2025 Jun 21;269:108899. doi: 10.1016/j.cmpb.2025.108899.

PMID: 40570739
Abstract

INTRODUCTION: Machine Learning (ML) is transforming medical research by enhancing diagnostic accuracy, predicting disease progression, and personalizing treatments. While general models trained on large datasets identify broad patterns across populations, the diversity of human biology, shaped by genetics, environment, and lifestyle, often limits their effectiveness. This has driven a shift towards subject-specific models that incorporate individual biological and clinical data for more precise predictions and personalized care. However, developing these models presents significant practical and financial challenges. Additionally, ML models initialized through stochastic processes with random seeds can suffer from reproducibility issues when those seeds are changed, leading to variations in predictive performance and feature importance. To address this, this study introduces a novel validation approach to enhance model interpretability, stabilizing predictive performance and feature importance at both the group and subject-specific levels.

METHODS: We conducted initial experiments using a single Random Forest (RF) model initialized with a random seed for key stochastic processes, on nine datasets that varied in domain problems, sample size, and demographics. Different validation techniques were applied to assess model accuracy and reproducibility while evaluating feature importance consistency. Next, the experiment was repeated for each dataset for up to 400 trials per subject, randomly seeding the machine learning algorithm between each trial. This introduced variability in the initialization of model parameters, thus providing a more comprehensive evaluation of the machine learning model's features and performance consistency. The repeated trials generated up to 400 feature sets per subject. By aggregating feature importance rankings across trials, our method identified the most consistently important features, reducing the impact of noise and random variation in feature selection. The top subject-specific feature importance set across all trials was then identified. Finally, using all subject-specific feature sets, the top group-specific feature importance set was also created. This process resulted in stable, reproducible feature rankings, enhancing both subject-level and group-level model explainability.

RESULTS: We found that machine learning models with stochastic initialization were particularly susceptible to variations in reproducibility, predictive accuracy, and feature importance due to random seed selection and validation techniques during training. Changes in random seeds altered weight initialization, optimization paths, and feature rankings, leading to fluctuations in test accuracy and interpretability. These findings align with prior research on the sensitivity of stochastic models to initialization randomness. This study builds on that understanding by introducing a novel repeated trials validation approach with random seed variation, significantly reducing variability in feature rankings and improving the consistency of model performance metrics. The method enabled robust identification of key features for each subject using a single, generic machine learning model, making predictions more interpretable and stable across experiments.

CONCLUSION: Subject-specific models improve generalization by addressing variability in human biology but are often costly and impractical for clinical trials. In this study, we introduce a novel validation technique for determining both group- and subject-specific feature importance within a general machine learning model, achieving greater stability in feature selection, higher predictive accuracy, and improved model interpretability. Our proposed approach ensures reproducible accuracy metrics and reliable feature rankings when using models incorporating stochastic processes, making machine learning models more robust and clinically applicable.
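The repeated-trials idea described in the Methods can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how rankings are aggregated, so mean rank across trials is an assumption here, and a seeded bootstrap correlation score stands in for Random Forest feature importance so that the example needs only the Python standard library. The data are synthetic, and every function name below is hypothetical.

```python
import random
from statistics import mean

def make_toy_data(n_rows=200, seed=0):
    """Synthetic subject data (assumption): feature 0 is strong, 1 weak, 2 noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_rows):
        row = [rng.gauss(0, 1) for _ in range(3)]
        X.append(row)
        y.append(3.0 * row[0] + 1.0 * row[1] + rng.gauss(0, 1))
    return X, y

def abs_corr(xs, ys):
    """Absolute Pearson correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return abs(cov) / ((vx * vy) ** 0.5) if vx > 0 and vy > 0 else 0.0

def seeded_importance(X, y, seed, sample_size=30):
    """Stand-in for a seeded stochastic model: scores from one bootstrap sample."""
    rng = random.Random(seed)
    idx = [rng.randrange(len(X)) for _ in range(sample_size)]
    ys = [y[i] for i in idx]
    return [abs_corr([X[i][j] for i in idx], ys) for j in range(len(X[0]))]

def ranks(scores):
    """Convert scores to ranks, where rank 1 is the most important feature."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    result = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        result[j] = position
    return result

def subject_mean_ranks(X, y, n_trials=400):
    """Re-seed on every trial and average each feature's rank across trials."""
    per_trial = [ranks(seeded_importance(X, y, seed)) for seed in range(n_trials)]
    return [mean(trial[j] for trial in per_trial) for j in range(len(X[0]))]

def order_by_rank(mean_ranks):
    """Feature indices sorted from most to least consistently important."""
    return sorted(range(len(mean_ranks)), key=lambda j: mean_ranks[j])

# Three hypothetical subjects, each with its own synthetic data.
subjects = {name: make_toy_data(seed=s) for s, name in enumerate(["s1", "s2", "s3"])}

# Subject-level: one aggregated ranking per subject.
subject_ranks = {name: subject_mean_ranks(X, y) for name, (X, y) in subjects.items()}
subject_order = {name: order_by_rank(r) for name, r in subject_ranks.items()}

# Group-level: average the subject-level mean ranks (also an assumption).
group_ranks = [mean(r[j] for r in subject_ranks.values()) for j in range(3)]
group_order = order_by_rank(group_ranks)

print("subject-level order:", subject_order)
print("group-level order:", group_order)
```

A single seeded trial can reorder the weaker features, whereas the averaged ranks are far less sensitive to any one seed. With a real model, `seeded_importance` would be replaced by a fit of the chosen algorithm using a per-trial seed.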
