Mohammed VI Center For Research and Innovation, Rabat, Morocco.
International School of Public Health, Mohammed VI University of Sciences and Health, Casablanca, Morocco.
BMC Med Res Methodol. 2024 Aug 30;24(1):191. doi: 10.1186/s12874-024-02305-3.
Handling missing data in clinical prognostic studies is an essential yet challenging task. This study aimed to provide a comprehensive assessment of the effectiveness and reliability of different machine learning (ML) imputation methods across various analytical perspectives. Specifically, it focused on three distinct classes of performance metrics used to evaluate ML imputation methods: post-imputation bias of regression estimates, post-imputation predictive accuracy, and substantive model-free metrics. As an illustration, we used data from a real-world breast cancer survival study. A simulated dataset with 30% Missing At Random (MAR) values was used. Six single imputation (SI) methods (KNN, missMDA, CART, missForest, missRanger, and missCforest) and two multiple imputation (MI) methods (miceCART and miceRF) were evaluated. The performance metrics were Gower's distance, estimation bias, empirical standard error, coverage rate, length of confidence interval, predictive accuracy, proportion of falsely classified (PFC), normalized root mean squared error (NRMSE), AUC, and C-index scores. In terms of Gower's distance, CART and missForest were the most accurate. missMDA and CART excelled for binary covariates, while missForest and miceCART were superior for continuous covariates. When assessing bias and accuracy in regression estimates, miceCART and miceRF exhibited the least bias. Overall, the various imputation methods demonstrated greater efficiency than complete-case analysis (CCA), with MICE methods providing optimal confidence interval coverage. In terms of predictive accuracy for Cox models, missMDA and missForest had superior AUC and C-index scores.
Although SI methods offered better predictive accuracy, they introduced more bias into the regression coefficients than MI methods. This study underlines the importance of selecting imputation methods based on study goals and data types in time-to-event research. The varying effectiveness of methods across the performance metrics studied highlights the value of using advanced machine learning algorithms within a multiple imputation framework to enhance research integrity and the robustness of findings.
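Two of the model-free metrics named in the abstract, NRMSE (for continuous covariates) and PFC (for categorical covariates), can be illustrated with a minimal sketch. It assumes the commonly used definitions in which both are computed only over the entries that were originally missing, with NRMSE normalized by the variance of the true values; the study's exact formulas may differ. The function names, variable names, and toy values below are hypothetical and are not taken from the study.

```python
import math

def nrmse(true_vals, imputed_vals, missing):
    """Normalized RMSE over originally missing entries of a continuous covariate.

    Assumed definition: sqrt(mean((true - imputed)^2) / var(true)),
    using the population variance of the true values at the missing positions.
    """
    pairs = [(t, i) for t, i, m in zip(true_vals, imputed_vals, missing) if m]
    truths = [t for t, _ in pairs]
    mean_t = sum(truths) / len(truths)
    var_t = sum((t - mean_t) ** 2 for t in truths) / len(truths)
    mse = sum((t - i) ** 2 for t, i in pairs) / len(pairs)
    return math.sqrt(mse / var_t)

def pfc(true_vals, imputed_vals, missing):
    """Proportion of falsely classified entries among originally missing ones."""
    pairs = [(t, i) for t, i, m in zip(true_vals, imputed_vals, missing) if m]
    return sum(1 for t, i in pairs if t != i) / len(pairs)

# Hypothetical toy data: True marks an entry that was masked and then imputed.
missing = [False, True, False, True]
age_true, age_imputed = [50, 60, 70, 80], [50, 62, 70, 74]
status_true, status_imputed = [1, 0, 1, 1], [1, 1, 1, 1]

print(round(nrmse(age_true, age_imputed, missing), 3))  # 0.447
print(pfc(status_true, status_imputed, missing))        # 0.5
```

For both metrics, lower values indicate imputed values closer to the truth. They can only be computed in simulation settings such as the one described above, where the true values behind the masked entries are known.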