Huang Alexander A, Huang Samuel Y
Surgery, Northwestern University Feinberg School of Medicine, Chicago, USA.
Internal Medicine, Icahn School of Medicine at Mount Sinai South Nassau, Oceanside, USA.
Cureus. 2023 Oct 5;15(10):e46549. doi: 10.7759/cureus.46549. eCollection 2023 Oct.
Machine-learning techniques have grown in popularity within medicine over the past decade. However, these computational techniques are not presented in statistics lectures during medical school and are perceived to have a high barrier to entry. The objective of this report is to develop a concise pipeline, built on publicly available data, that shortens the time needed to learn machine learning for medical research and quality-improvement initiatives. The report uses a publicly available machine-learning data package in R (MLDataR) and a computational package (XGBoost) to demonstrate techniques for model development and for visualization with SHapley Additive exPlanations (SHAP). A simple six-step process, with example code, was constructed for building and visualizing machine-learning models, and a concrete set of three steps was developed to help with interpretation. Further teaching of these methods could benefit researchers by providing alternative methods for data analysis in medical studies, and could help researchers without computational experience get a feel for machine learning so that they can better understand the literature and the technique.
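To make the SHAP idea in the abstract concrete, the following is a minimal sketch of how a Shapley-value attribution can be computed for a single prediction. It is not the authors' R code and does not use MLDataR or XGBoost: the model `toy_risk`, the feature names, the patient values, and the baseline are all hypothetical, chosen only for illustration. It also computes exact Shapley values by brute-force enumeration of feature coalitions, which is feasible only for a handful of features; SHAP tooling for tree models such as XGBoost relies on a faster tree-specific algorithm instead.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Every coalition of features is enumerated; features outside the
    coalition are replaced by their baseline value.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        # Model output when only the features in `coalition` take the
        # patient's values and all others stay at baseline.
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_risk(p):
    # Hypothetical risk score (an assumption, not a fitted model):
    # two linear terms plus an age-angina interaction.
    score = 0.02 * p["age"] + 0.01 * p["resting_bp"]
    if p["age"] > 60 and p["angina"] == 1:
        score += 0.5
    return score


# Hypothetical patient and reference ("average") patient.
patient = {"age": 70, "resting_bp": 150, "angina": 1}
baseline = {"age": 50, "resting_bp": 120, "angina": 0}

phi = shapley_values(toy_risk, patient, baseline)

# List features from largest to smallest absolute contribution,
# which is the same ordering a SHAP summary plot uses.
for name, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {contribution:+.3f}")
```

In this toy case the contributions sum to the difference between the patient's score and the baseline score, and the 0.5 interaction term is split evenly between `age` and `angina`. That additivity is the property that allows each bar or point in a SHAP plot to be read as a feature's share of a prediction.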