A random forest dynamic threshold imputation method for handling missing data in cognitive diagnosis assessments.

Authors

You Xiaofeng, Yang Jianqin, Xu Xinai

Affiliations

School of Mathematics and Information Science, Nanchang Normal University, Nanchang, China.

Department of Educational Psychology, Faculty of Education, East China Normal University, Shanghai, China.

Publication information

Front Psychol. 2025 Aug 5;16:1487111. doi: 10.3389/fpsyg.2025.1487111. eCollection 2025.

DOI: 10.3389/fpsyg.2025.1487111
PMID: 40837258
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12362312/
Abstract

The handling of missing data in cognitive diagnostic assessment is an important issue. The Random Forest Threshold Imputation (RFTI) method proposed by You et al. in 2023 is specifically designed for cognitive diagnostic models (CDMs) and built on random forest imputation. However, RFTI fixes the threshold for setting an imputed value to 0 at 0.5, which may introduce uncertainty into the imputation. To address this issue, we propose an improved method, Random Forest Dynamic Threshold Imputation (RFDTI), which uses two dynamic thresholds for dichotomous imputed values. In a simulation study, the classification of attribute profiles after imputing missing data with RFDTI was always better than with four commonly used traditional methods (person mean imputation, two-way imputation, the expectation-maximization algorithm, and multiple imputation). Compared with RFTI, RFDTI was slightly better for data missing at random (MAR) or missing completely at random (MCAR), but slightly worse for data missing not at random (MNAR) or MIXED data, especially at larger missingness proportions. An empirical example with MNAR data demonstrates the applicability of RFDTI, which performed similarly to RFTI and much better than the four traditional methods. An R package is provided to facilitate the application of the proposed method.
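The difference between one fixed cut-off and a pair of thresholds can be illustrated with a small sketch. It is not the authors' implementation or the API of their R package: the abstract does not say how the two thresholds are chosen or how values between them are treated. The function names, the example threshold values 0.3 and 0.7, and the choice to leave in-between values missing (NaN) are all assumptions made for illustration. The input `scores` stands in for continuous imputed values, such as those a random forest imputation might produce for dichotomous item responses.

```python
import numpy as np


def impute_fixed(scores, threshold=0.5):
    """Dichotomize continuous imputed scores with one fixed cut-off.

    Assumption: scores at or above the cut-off become 1, the rest 0.
    """
    scores = np.asarray(scores, dtype=float)
    return (scores >= threshold).astype(int)


def impute_two_thresholds(scores, lower, upper):
    """Dichotomize scores with a lower and an upper threshold.

    Assumption: scores <= lower become 0, scores >= upper become 1,
    and anything strictly in between stays missing (NaN). The thresholds
    are plain parameters here; how they would be chosen dynamically is
    not specified in the abstract.
    """
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    scores = np.asarray(scores, dtype=float)
    out = np.full(scores.shape, np.nan)
    out[scores <= lower] = 0.0
    out[scores >= upper] = 1.0
    return out


# Hypothetical continuous imputed values for four missing responses.
scores = np.array([0.10, 0.45, 0.55, 0.90])

fixed = impute_fixed(scores)
dynamic = impute_two_thresholds(scores, lower=0.3, upper=0.7)

print(fixed)
print(dynamic)
```

Under these assumptions, the two scores near 0.5 are forced to 0 or 1 by the fixed rule but stay unresolved under the two-threshold rule, which is one way to read the uncertainty the abstract attributes to a fixed 0.5 cut-off.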

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a7c2/12362312/929814fe2c38/fpsyg-16-1487111-g001.jpg
