Discrete Missing Data Imputation Using Multilayer Perceptron and Momentum Gradient Descent.

Affiliations

School of Computer Science, Hubei University of Technology, Wuhan 430068, China.

Fujian Provincial Key Laboratory of Data Intensive Computing, Quanzhou 362000, China.

Publication information

Sensors (Basel). 2022 Jul 28;22(15):5645. doi: 10.3390/s22155645.

DOI:10.3390/s22155645
PMID:35957197
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9371018/
Abstract

Data are a strategic resource for industrial production, and an efficient data-mining process will increase productivity. However, there exist many missing values in data collected in real life due to various problems. Because the missing data may reduce productivity, missing value imputation is an important research topic in data mining. At present, most studies mainly focus on imputation methods for continuous missing data, while a few concentrate on discrete missing data. In this paper, a discrete missing value imputation method based on a multilayer perceptron (MLP) is proposed, which employs a momentum gradient descent algorithm, and some prefilling strategies are utilized to improve the convergence speed of the MLP. To verify the effectiveness of the method, experiments are conducted to compare the classification accuracy with eight common imputation methods, such as the mode, random, hot-deck, KNN, autoencoder, and MLP, under different missing mechanisms and missing proportions. Experimental results verify that the improved MLP model (IMLP) can effectively impute discrete missing values in most situations under three missing patterns.
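The pipeline the abstract describes (prefill the gaps, then train an MLP with momentum gradient descent to predict the missing discrete values) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' IMLP: the function names, the mode prefill, the single tanh hidden layer, the softmax output, and all hyperparameters are hypothetical choices, and missing cells are assumed to be flagged by a boolean mask over integer-coded categories.

```python
import numpy as np

def mode_prefill(X, mask):
    """Fill masked cells with the mode of each column's observed values."""
    X = X.copy()
    for j in range(X.shape[1]):
        values, counts = np.unique(X[~mask[:, j], j], return_counts=True)
        X[mask[:, j], j] = values[np.argmax(counts)]
    return X

def one_hot(X, n_levels):
    """One-hot encode integer-coded columns; n_levels[j] is the level count of column j."""
    return np.concatenate([np.eye(k)[X[:, j]] for j, k in enumerate(n_levels)], axis=1)

def train_mlp_momentum(A, y, n_classes, hidden=16, lr=0.05, beta=0.9, epochs=500, seed=0):
    """Train a one-hidden-layer MLP with full-batch momentum gradient descent."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0, 0.5, (A.shape[1], hidden)), np.zeros(hidden),
              rng.normal(0, 0.5, (hidden, n_classes)), np.zeros(n_classes)]
    velocity = [np.zeros_like(p) for p in params]
    Y = np.eye(n_classes)[y]
    for _ in range(epochs):
        W1, b1, W2, b2 = params
        H = np.tanh(A @ W1 + b1)                      # hidden activations
        logits = H @ W2 + b2
        logits = logits - logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P = P / P.sum(axis=1, keepdims=True)          # softmax probabilities
        d_logits = (P - Y) / len(y)                   # gradient of mean cross-entropy
        d_hidden = (d_logits @ W2.T) * (1.0 - H ** 2) # backprop through tanh
        grads = [A.T @ d_hidden, d_hidden.sum(axis=0),
                 H.T @ d_logits, d_logits.sum(axis=0)]
        for i, g in enumerate(grads):
            # Momentum update: v <- beta * v - lr * grad, then w <- w + v
            velocity[i] = beta * velocity[i] - lr * g
            params[i] = params[i] + velocity[i]
    return params

def predict(params, A):
    W1, b1, W2, b2 = params
    return np.argmax(np.tanh(A @ W1 + b1) @ W2 + b2, axis=1)

def impute_column(X, mask, target, n_levels):
    """Impute one discrete column from the (prefilled) remaining columns."""
    filled = mode_prefill(X, mask)
    others = [j for j in range(X.shape[1]) if j != target]
    A = one_hot(filled[:, others], [n_levels[j] for j in others])
    observed = ~mask[:, target]
    params = train_mlp_momentum(A[observed], X[observed, target], n_levels[target])
    out = filled.copy()
    out[~observed, target] = predict(params, A[~observed])
    return out
```

In this sketch the prefill only supplies usable inputs for the other columns; the target column's missing cells are overwritten by the network's predictions, and observed cells are never changed.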

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/3dec0077ea80/sensors-22-05645-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/a1572adfc7ef/sensors-22-05645-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/b5c1ba722348/sensors-22-05645-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/08a35b09413b/sensors-22-05645-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/3f7a72551c27/sensors-22-05645-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/2a7d8c885def/sensors-22-05645-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/9a72fc93449f/sensors-22-05645-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/17a0fc57b0db/sensors-22-05645-g008a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/89a8995916c9/sensors-22-05645-g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7683/9371018/2eca1c900d66/sensors-22-05645-g010.jpg

Similar articles

1
Discrete Missing Data Imputation Using Multilayer Perceptron and Momentum Gradient Descent.
Sensors (Basel). 2022 Jul 28;22(15):5645. doi: 10.3390/s22155645.
2
Missing value imputation on missing completely at random data using multilayer perceptrons.
Neural Netw. 2011 Jan;24(1):121-9. doi: 10.1016/j.neunet.2010.09.008. Epub 2010 Sep 17.
3
A fractional gradient descent algorithm robust to the initial weights of multilayer perceptron.
Neural Netw. 2023 Jan;158:154-170. doi: 10.1016/j.neunet.2022.11.018. Epub 2022 Nov 17.
4
Advanced methods for missing values imputation based on similarity learning.
PeerJ Comput Sci. 2021 Jul 21;7:e619. doi: 10.7717/peerj-cs.619. eCollection 2021.
5
Missing value imputation in high-dimensional phenomic data: imputable or not, and how?
BMC Bioinformatics. 2014 Nov 5;15(1):346. doi: 10.1186/s12859-014-0346-6.
6
GGA-MLP: A Greedy Genetic Algorithm to Optimize Weights and Biases in Multilayer Perceptron.
Contrast Media Mol Imaging. 2022 Feb 24;2022:4036035. doi: 10.1155/2022/4036035. eCollection 2022.
7
A novel discrete learning-based intelligent methodology for breast cancer classification purposes.
Artif Intell Med. 2023 May;139:102492. doi: 10.1016/j.artmed.2023.102492. Epub 2023 Jan 19.
8
Self-Training With Quantile Errors for Multivariate Missing Data Imputation for Regression Problems in Electronic Medical Records: Algorithm Development Study.
JMIR Public Health Surveill. 2021 Oct 13;7(10):e30824. doi: 10.2196/30824.
9
[Simulation study on missing data imputation methods for longitudinal data in cohort studies].
Zhonghua Liu Xing Bing Xue Za Zhi. 2021 Oct 10;42(10):1889-1894. doi: 10.3760/cma.j.cn112338-20201130-01363.
10
Data classification based on fractional order gradient descent with momentum for RBF neural network.
Network. 2020 Feb-Nov;31(1-4):166-185. doi: 10.1080/0954898X.2020.1849842. Epub 2020 Dec 6.
