Department of Computer Science, Northwestern University, Evanston, IL, United States.
Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, United States.
J Med Internet Res. 2023 Sep 6;25:e42047. doi: 10.2196/42047.
Predicting the likelihood of success of weight loss interventions using machine learning (ML) models may enhance intervention effectiveness by enabling timely and dynamic modification of intervention components for nonresponders to treatment. However, a lack of understanding of and trust in these ML models impedes adoption among weight management experts. Recent advances in the field of explainable artificial intelligence enable the interpretation of ML models, yet it is unknown whether they enhance model understanding, trust, and adoption among weight management experts.
This study aimed to build and evaluate an ML model that predicts 6-month weight loss success (ie, ≥7% weight loss) from 5 engagement- and diet-related features collected over the first 2 weeks of an intervention; to assess whether providing ML-based explanations increases weight management experts' agreement with the model's predictions; and to identify factors that influence experts' understanding and trust of ML models, to advance explainability in the early prediction of weight loss.
We trained an ML model using the random forest (RF) algorithm and data from a 6-month weight loss intervention (N=419). We leveraged findings from existing explainability metrics to develop Prime Implicant Maintenance of Outcome (PRIMO), an interactive tool to understand predictions made by the RF model. We asked 14 weight management experts to predict hypothetical participants' weight loss success before and after using PRIMO. We compared PRIMO with 2 other explainability methods, one based on feature ranking and the other based on conditional probability. We used generalized linear mixed-effects models to evaluate participants' agreement with ML predictions and conducted likelihood ratio tests to examine the relationship between explainability methods and outcomes for nested models. We conducted guided interviews and thematic analysis to study the impact of our tool on experts' understanding and trust in the model.
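The study's dataset and exact feature definitions are not public, so the early-prediction setup can only be sketched. The following is a minimal illustration, assuming 5 synthetic 2-week features and a binary ≥7% weight-loss label generated from a hypothetical logistic relationship; it shows the random forest workflow, not the authors' actual model.

```python
# Hypothetical sketch: 5 simulated engagement/diet features and a binary
# weight-loss-success label; the coefficients below are invented for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 419                                   # sample size reported in the study
X = rng.normal(size=(n, 5))               # 5 early (2-week) features
logits = X @ np.array([1.2, -0.8, 0.5, 0.0, 0.9])
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
acc = accuracy_score(y_te, rf.predict(X_te))
print(f"held-out accuracy: {acc:.2f}")
```

On real intervention data, the same pipeline would be evaluated with proper cross-validation rather than a single split.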
Our RF model achieved 81% accuracy in the early prediction of weight loss success. Weight management experts were significantly more likely to agree with the model when using PRIMO (χ²=7.9; P=.02) than with the other 2 methods, with odds ratios of 2.52 (95% CI 0.91-7.69) and 3.95 (95% CI 1.50-11.76). From our study, we inferred that our software influenced not only experts' understanding and trust but also their decision-making. Several themes were identified through interviews: a preference for multiple explanation types, a need to visualize uncertainty in the explanations provided by PRIMO, and a need for model performance metrics on similar participant test instances.
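The study compared explainability methods with generalized linear mixed-effects models and likelihood ratio tests on nested models. As a self-contained illustration of the nested-model likelihood ratio test (swapping in ordinary logistic regression for the mixed-effects model, and using invented data), one can compare an intercept-only model against one with a 3-level "explanation method" factor:

```python
# Hypothetical sketch of a likelihood ratio test between nested models.
# Simulated data: agreement with the ML model under 3 explainability methods.
import numpy as np
from scipy.stats import chi2
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 300
method = rng.integers(0, 3, size=n)        # 3 explainability methods
x_method = np.eye(3)[method][:, 1:]        # dummy-code 2 contrasts
p = 1 / (1 + np.exp(-(-0.3 + 0.9 * (method == 2))))
agree = (rng.random(n) < p).astype(int)    # invented agreement outcome

def loglik(X, y):
    """Log-likelihood of a (near-)MLE logistic fit; intercept-only if X is empty."""
    if X.shape[1] == 0:
        p_hat = np.full(len(y), y.mean())  # exact MLE for intercept-only model
    else:
        m = LogisticRegression(C=1e6).fit(X, y)  # large C ~ unpenalized MLE
        p_hat = m.predict_proba(X)[:, 1]
    return np.sum(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))

ll_null = loglik(np.empty((n, 0)), agree)
ll_full = loglik(x_method, agree)
stat = 2 * (ll_full - ll_null)             # ~ chi-square with 2 df under H0
pval = chi2.sf(stat, df=2)
print(f"LR chi2(2) = {stat:.1f}, p = {pval:.3f}")
```

The study's actual analysis additionally included random effects for repeated judgments by the same expert, which this sketch omits.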
Our results show the potential for weight management experts to agree with the ML-based early prediction of success in weight loss treatment programs, enabling timely and dynamic modification of intervention components to enhance intervention effectiveness. Our findings provide methods for advancing the understandability and trust of ML models among weight management experts.