Applying machine learning-based multiple imputation methods to nonparametric multiple comparisons in longitudinal clinical studies.

Authors

Yanarateş Tuncay, Karabulut Erdem

Affiliation

Department of Biostatistics, School of Medicine, Hacettepe University, Ankara, Turkey.

Publication

J Biopharm Stat. 2024 Dec 21:1-12. doi: 10.1080/10543406.2024.2444243.

Abstract

Dependent samples, in which repeated measurements are made on the same subjects, eliminate potential differences among the subjects. In k-dependent samples, missing data can occur for various reasons. For non-normally distributed k-dependent samples with missing observations, the Skillings-Mack test is used instead of the Friedman test. If a significant difference exists among groups, nonparametric multiple comparisons need to be performed. In this study, we propose an innovative approach by applying four methods to nonparametric multiple comparisons of incomplete, non-normally distributed k-dependent samples. The four methods are two machine learning-based nonparametric multiple imputation methods, namely multiple imputation by chained equations using classification and regression trees (MICE-CART) and random forest (MICE-RF); one nonparametric imputation method (random hot deck imputation); and the listwise deletion method. We compare the four methods under two missing data mechanisms, four correlation coefficients, two sample sizes, and three percentages of missingness. In a simulation study covering these scenarios, the listwise deletion method is inferior to the other methods. MICE-CART and MICE-RF are superior to the other methods for moderate and small sample sizes, with well-controlled type I error. The two machine learning-based nonparametric multiple imputation methods can be applied to nonparametric multiple comparisons. Therefore, we propose machine learning-based multiple imputation methods for nonparametric multiple comparisons of k-dependent samples with missing observations. The approach is also illustrated with a longitudinal dentistry clinical trial.
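
To make the contrast between the two simplest strategies concrete, the following is a minimal Python sketch of listwise deletion and random hot deck imputation on k-dependent samples. It is an illustration under stated assumptions, not the authors' implementation: the toy data are invented, the hot deck variant shown draws each donor at random from the observed values at the same time point, and the Friedman statistic (without tie correction) is computed on the completed data only as a simple stand-in. The Skillings-Mack test, MICE-CART, MICE-RF, and the multiple comparison procedures studied in the paper are not reproduced here.

```python
import random

# Toy data (invented): rows are subjects, columns are k = 3 repeated
# measurements. None marks a missing observation.
data = [
    [5.1, 6.0, 7.2],
    [4.8, None, 6.9],
    [5.5, 6.3, None],
    [4.9, 5.7, 6.5],
    [None, 6.1, 7.0],
    [5.2, 5.9, 6.8],
]

def listwise_delete(rows):
    """Keep only subjects with no missing measurement."""
    return [row[:] for row in rows if all(v is not None for v in row)]

def random_hot_deck(rows, rng):
    """Fill each missing value with a randomly chosen observed value
    (a donor) from the same time point. Assumed variant, for illustration."""
    k = len(rows[0])
    donors = [[row[j] for row in rows if row[j] is not None] for j in range(k)]
    return [
        [v if v is not None else rng.choice(donors[j]) for j, v in enumerate(row)]
        for row in rows
    ]

def friedman_statistic(rows):
    """Friedman chi-square statistic for complete data (no tie correction)."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        order = sorted(range(k), key=lambda j: row[j])
        i = 0
        while i < k:
            # Group tied values and assign them their average rank.
            t = i
            while t + 1 < k and row[order[t + 1]] == row[order[i]]:
                t += 1
            avg_rank = (i + t) / 2 + 1
            for pos in range(i, t + 1):
                rank_sums[order[pos]] += avg_rank
            i = t + 1
    total = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * total - 3.0 * n * (k + 1)

complete = listwise_delete(data)
imputed = random_hot_deck(data, random.Random(0))
print(len(complete), "subjects after listwise deletion")
print(len(imputed), "subjects after random hot deck imputation")
print("Friedman statistic (complete cases):", friedman_statistic(complete))
print("Friedman statistic (hot deck):", friedman_statistic(imputed))
```

On this toy data, listwise deletion keeps only 3 of the 6 subjects, whereas hot deck imputation retains all 6, which illustrates the loss of information that deletion incurs. A single imputed data set is shown for brevity; multiple imputation would repeat the fill-in step several times and combine the results.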
