Transformers deep learning models for missing data imputation: an application of the ReMasker model on a psychometric scale.

Authors

Casella Monica, Milano Nicola, Dolce Pasquale, Marocco Davide

Affiliations

Natural and Artificial Cognition Laboratory, Department of Humanistic Studies, University of Naples "Federico II", Naples, Italy.

Department of Translational Medical Science, University of Naples "Federico II", Naples, Italy.

Publication information

Front Psychol. 2024 Dec 17;15:1449272. doi: 10.3389/fpsyg.2024.1449272. eCollection 2024.

DOI:10.3389/fpsyg.2024.1449272
PMID:39744035
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11688576/
Abstract

INTRODUCTION

Missing data in psychometric research presents a substantial challenge, impacting the reliability and validity of study outcomes. Various factors contribute to this issue, including participant non-response, dropout, or technical errors during data collection. Traditional methods commonly used to handle missing data, such as mean imputation or regression, rely on assumptions that may not hold for psychological data and can lead to distorted results.

METHODS

This study aims to evaluate the effectiveness of transformer-based deep learning for missing data imputation, comparing ReMasker, a masking autoencoding transformer model, with conventional imputation techniques (mean and median imputation, Expectation-Maximization algorithm) and machine learning approaches (K-nearest neighbors, MissForest, and an Artificial Neural Network). A psychometric dataset from the COVID distress repository was used, with imputation performance assessed through the Root Mean Squared Error (RMSE) between the original and imputed data matrices.
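
The evaluation described above (hide some entries, impute them, then compare against the original matrix) can be illustrated with a minimal sketch. It is not the authors' code: the simulated 1-5 Likert responses, the 20% missing-completely-at-random rate, and the helper names `mask_completely_at_random`, `impute_columns`, and `rmse` are all assumptions made for illustration, and only the two simplest baselines (mean and median imputation) are shown. RMSE is taken over every cell of the matrix, following the abstract's wording; the abstract does not say whether the authors restricted it to the masked cells.

```python
import math
import random
import statistics


def mask_completely_at_random(data, rate, rng):
    """Return a copy of data with roughly `rate` of the cells set to None."""
    return [[None if rng.random() < rate else v for v in row] for row in data]


def impute_columns(masked, strategy):
    """Fill each None with `strategy` (e.g. mean) of the observed column values."""
    n_cols = len(masked[0])
    fills = []
    for j in range(n_cols):
        observed = [row[j] for row in masked if row[j] is not None]
        fills.append(strategy(observed))
    return [[fills[j] if v is None else v for j, v in enumerate(row)]
            for row in masked]


def rmse(original, imputed):
    """Root mean squared error over all cells of two equally shaped matrices."""
    squared = [(o - i) ** 2
               for row_o, row_i in zip(original, imputed)
               for o, i in zip(row_o, row_i)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)
# Hypothetical data: 200 respondents answering 10 items on a 1-5 Likert scale.
original = [[rng.randint(1, 5) for _ in range(10)] for _ in range(200)]
masked = mask_completely_at_random(original, rate=0.2, rng=rng)

for name, strategy in [("mean", statistics.mean), ("median", statistics.median)]:
    imputed = impute_columns(masked, strategy)
    print(f"{name:>6} imputation RMSE: {rmse(original, imputed):.3f}")
```

Because the simulated items are independent of one another, this toy data cannot show the advantage of methods that exploit relationships between items, such as K-nearest neighbors, MissForest, or ReMasker; it only demonstrates how the reconstruction error is computed.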

RESULTS

Results indicate that machine learning techniques, particularly ReMasker, achieve superior performance in terms of reconstruction error compared to conventional imputation techniques across all tested scenarios.

DISCUSSION

This finding underscores the potential of transformer-based models to provide robust imputation in psychometric research, enhancing data integrity and generalizability.

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c63d/11688576/5045931cfbaf/fpsyg-15-1449272-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c63d/11688576/e7c3b76ee7f5/fpsyg-15-1449272-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c63d/11688576/70006f31eacb/fpsyg-15-1449272-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c63d/11688576/02b19cc12319/fpsyg-15-1449272-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c63d/11688576/fffe2abe9a51/fpsyg-15-1449272-g005.jpg
