
A hybrid of whale optimization and late acceptance hill climbing based imputation to enhance classification performance in electronic health records.

Affiliations

School of Information Technology and Engineering, VIT University, India.

Publication information

J Biomed Inform. 2019 Jun;94:103190. doi: 10.1016/j.jbi.2019.103190. Epub 2019 May 2.

DOI:10.1016/j.jbi.2019.103190
PMID:31054960
Abstract

Electronic health records (EHR) are a major source of information in biomedical informatics, yet missing values are a prominent characteristic of EHR data. Prediction on datasets with missing values results in inaccurate inferences. Nearest neighbour imputation, based on a lazy learning approach, is a proven technique for missing data imputation and is recognized as one of the top ten data mining algorithms owing to its simplicity and understandability. However, its performance deteriorates under the curse of dimensionality, because unimportant features are likely to dominate. We address this problem by proposing a novel feature weighting approach for nearest neighbour imputation, based on a hybrid of the metaheuristic whale optimization algorithm (WOA) and the local search late acceptance hill climbing algorithm (LAHCA). Our proposed approach, Metaheuristic and Local Search based Feature Weighted Nearest Neighbour Imputation (kNN+LAHCAWOA), also learns different k values for different test points. The approach is tested on benchmark EHR datasets with three proven classifiers: Support Vector Machines (SVM), Random Forest (RF) and Deep Neural Networks (DNN). The results show that kNN+LAHCAWOA is an effective imputation strategy and helps improve classification performance compared with competing methods.
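The abstract does not give implementation details, so the following is only a minimal sketch of the two ingredients it names: a feature-weighted nearest neighbour imputer and a generic late acceptance hill climbing loop that could tune the weights. The function names, the Gaussian perturbation step, the clipping of weights to [0, 1], and the toy data are all assumptions made for illustration. The whale optimization stage and the per-point choice of k are not sketched, and this should not be read as the authors' kNN+LAHCAWOA implementation.

```python
import numpy as np

def weighted_knn_impute(X, weights, k=3):
    """Fill NaNs with the mean of the k nearest donors under a
    feature-weighted Euclidean distance (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    filled = X.copy()
    for i in np.where(np.isnan(X).any(axis=1))[0]:
        obs = ~np.isnan(X[i])                  # features observed in row i
        for j in np.where(~obs)[0]:            # each missing feature of row i
            # Donors have feature j and every feature that row i has.
            donors = [r for r in range(len(X))
                      if r != i and not np.isnan(X[r, j])
                      and not np.isnan(X[r, obs]).any()]
            if not donors:
                continue                       # nothing to borrow from; keep NaN
            diffs = X[donors][:, obs] - X[i, obs]
            dist = np.sqrt((w[obs] * diffs ** 2).sum(axis=1))
            nearest = np.argsort(dist, kind="stable")[:k]
            filled[i, j] = X[donors][nearest, j].mean()
    return filled

def lahc_minimize(cost, x0, history_len=20, iters=500, step=0.1, seed=0):
    """Generic late acceptance hill climbing: a candidate is accepted if it is
    no worse than the current cost or the cost recorded history_len steps ago."""
    rng = np.random.default_rng(seed)
    current = np.asarray(x0, dtype=float)
    cur_cost = cost(current)
    best, best_cost = current.copy(), cur_cost
    history = [cur_cost] * history_len
    for it in range(iters):
        # Assumed move operator: Gaussian perturbation, weights kept in [0, 1].
        cand = np.clip(current + rng.normal(0.0, step, size=current.shape), 0.0, 1.0)
        cand_cost = cost(cand)
        slot = it % history_len
        if cand_cost <= history[slot] or cand_cost <= cur_cost:
            current, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = current.copy(), cur_cost
        history[slot] = cur_cost
    return best, best_cost

# Toy data (made up): feature 0 is informative, feature 1 is noise.
X = np.array([[1.0, 9.5, 10.0],
              [1.1, 0.0, 12.0],
              [5.0, 5.0, 50.0],
              [1.0, 5.0, np.nan]])
uniform = weighted_knn_impute(X, [1.0, 1.0, 1.0], k=2)[3, 2]
weighted = weighted_knn_impute(X, [1.0, 0.0, 1.0], k=2)[3, 2]
print(uniform, weighted)
```

In this toy example the noisy feature pulls an unrelated row into the neighbourhood under uniform weights, while setting its weight to zero restores the two rows that agree on the informative feature. In practice the cost passed to `lahc_minimize` would be something like the imputation error on artificially masked entries, which is again an assumption rather than a detail taken from the abstract.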


Similar articles

1
A hybrid of whale optimization and late acceptance hill climbing based imputation to enhance classification performance in electronic health records.
J Biomed Inform. 2019 Jun;94:103190. doi: 10.1016/j.jbi.2019.103190. Epub 2019 May 2.
2
R-Ensembler: A greedy rough set based ensemble attribute selection algorithm with kNN imputation for classification of medical data.
Comput Methods Programs Biomed. 2020 Feb;184:105122. doi: 10.1016/j.cmpb.2019.105122. Epub 2019 Oct 8.
3
Advanced methods for missing values imputation based on similarity learning.
PeerJ Comput Sci. 2021 Jul 21;7:e619. doi: 10.7717/peerj-cs.619. eCollection 2021.
4
A novel missing data imputation approach based on clinical conditional Generative Adversarial Networks applied to EHR datasets.
Comput Biol Med. 2023 Sep;163:107188. doi: 10.1016/j.compbiomed.2023.107188. Epub 2023 Jun 22.
5
On mining incomplete medical datasets: Ordering imputation and classification.
Technol Health Care. 2015;23(5):619-25. doi: 10.3233/THC-151018.
6
NS-kNN: a modified k-nearest neighbors approach for imputing metabolomics data.
Metabolomics. 2018 Nov 23;14(12):153. doi: 10.1007/s11306-018-1451-8.
7
Extremely missing numerical data in Electronic Health Records for machine learning can be managed through simple imputation methods considering informative missingness: A comparative of solutions in a COVID-19 mortality case study.
Comput Methods Programs Biomed. 2023 Dec;242:107803. doi: 10.1016/j.cmpb.2023.107803. Epub 2023 Sep 7.
8
Comparison of the effects of imputation methods for missing data in predictive modelling of cohort study datasets.
BMC Med Res Methodol. 2024 Feb 16;24(1):41. doi: 10.1186/s12874-024-02173-x.
9
A comparative analysis of feature selection models for spatial analysis of floods using hybrid metaheuristic and machine learning models.
Environ Sci Pollut Res Int. 2024 May;31(23):33495-33514. doi: 10.1007/s11356-024-33389-5. Epub 2024 Apr 29.
10
Exploiting mutual information for the imputation of static and dynamic mixed-type clinical data with an adaptive k-nearest neighbours approach.
BMC Med Inform Decis Mak. 2020 Aug 20;20(Suppl 5):174. doi: 10.1186/s12911-020-01166-2.

Cited by

1
Moving Beyond Medical Statistics: A Systematic Review on Missing Data Handling in Electronic Health Records.
Health Data Sci. 2024 Dec 4;4:0176. doi: 10.34133/hds.0176. eCollection 2024.
2
A Sequential Machine Learning-cum-Attention Mechanism for Effective Segmentation of Brain Tumor.
Front Oncol. 2022 Jun 1;12:873268. doi: 10.3389/fonc.2022.873268. eCollection 2022.
3
A hybrid feature selection model based on improved squirrel search algorithm and rank aggregation using fuzzy techniques for biomedical data classification.
Netw Model Anal Health Inform Bioinform. 2021;10(1):39. doi: 10.1007/s13721-021-00313-7. Epub 2021 Jun 2.