A generalized fuzzy clustering framework for incomplete data by integrating feature weighted and kernel learning.

Authors

Yang Ying, Chen Haoyu, Wu Haoshen

Affiliations

College of Information and Intelligence, Hunan Agricultural University, Changsha, China.

New Energy College, Xi'an Shiyou University, Xi'an, China.

Publication information

PeerJ Comput Sci. 2023 Oct 5;9:e1600. doi: 10.7717/peerj-cs.1600. eCollection 2023.

DOI:10.7717/peerj-cs.1600
PMID:37869452
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10588703/
Abstract

Missing data poses a challenge for clustering algorithms, because traditional methods typically fill in (pad) incomplete data before clustering. To combine padding and clustering into a single process and improve clustering accuracy, a generalized fuzzy clustering framework is proposed based on the optimal completion strategy (OCS) and the nearest prototype strategy (NPS), and four improved algorithms are developed from it. Feature weights are introduced to reduce the influence of outliers on the cluster centers, and kernel functions are used to handle data that are not linearly separable. The proposed algorithms are evaluated on correct clustering rate, number of iterations, and external evaluation indexes using nine datasets from the UCI (University of California, Irvine) Machine Learning Repository. The experimental results indicate that, under varying missing rates, the feature weighted kernel fuzzy C-means algorithm with NPS (NPS-WKFCM) and the feature weighted kernel fuzzy C-means algorithm with OCS (OCS-WKFCM) achieve higher clustering accuracy than seven conventional algorithms.
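
To make the idea of combining padding and clustering concrete, the following is a minimal sketch of plain fuzzy C-means with the optimal completion strategy, in which missing entries are treated as extra variables and re-estimated on every iteration. It is not the paper's NPS-WKFCM or OCS-WKFCM: feature weights and kernel functions are left out, and the function name `ocs_fcm`, its parameters, and the column-mean initialization are assumptions made for illustration.

```python
import numpy as np


def ocs_fcm(X, n_clusters, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Fuzzy C-means with an optimal-completion-style update (illustrative).

    X is an (n_samples, n_features) array with missing entries as np.nan.
    Returns (centers, memberships, completed_X).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Assumption: start from column means of the observed values.
    col_mean = np.nanmean(X, axis=0)
    Xc = np.where(missing, col_mean, X)

    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Cluster centers: membership-weighted means of the completed data.
        W = U ** m
        V = (W.T @ Xc) / W.sum(axis=0)[:, None]

        # Memberships from squared Euclidean distances to the centers.
        d2 = ((Xc[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        # Completion step: each missing entry becomes the membership-weighted
        # mean of the matching center coordinates; observed entries are kept.
        W_new = U_new ** m
        estimate = (W_new @ V) / W_new.sum(axis=1, keepdims=True)
        Xc = np.where(missing, estimate, X)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return V, U, Xc
```

A nearest prototype variant would change only the completion step, copying the missing coordinates from the closest center instead of averaging over all centers.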


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcff/10588703/23e8124efbd9/peerj-cs-09-1600-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcff/10588703/a069c7bd9fda/peerj-cs-09-1600-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcff/10588703/346aa6f9da16/peerj-cs-09-1600-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bcff/10588703/538c899a270e/peerj-cs-09-1600-g004.jpg

Similar articles

1. A generalized fuzzy clustering framework for incomplete data by integrating feature weighted and kernel learning.
   PeerJ Comput Sci. 2023 Oct 5;9:e1600. doi: 10.7717/peerj-cs.1600. eCollection 2023.
2. Adaptive kernel fuzzy clustering for missing data.
   PLoS One. 2021 Nov 12;16(11):e0259266. doi: 10.1371/journal.pone.0259266. eCollection 2021.
3. A multiple kernel density clustering algorithm for incomplete datasets in bioinformatics.
   BMC Syst Biol. 2018 Nov 22;12(Suppl 6):111. doi: 10.1186/s12918-018-0630-6.
4. Apache Spark based kernelized fuzzy clustering framework for single nucleotide polymorphism sequence analysis.
   Comput Biol Chem. 2021 Jun;92:107454. doi: 10.1016/j.compbiolchem.2021.107454. Epub 2021 Feb 10.
5. Hybrid fuzzy cluster ensemble framework for tumor clustering from biomolecular data.
   IEEE/ACM Trans Comput Biol Bioinform. 2013 May-Jun;10(3):657-70. doi: 10.1109/TCBB.2013.59.
6. Retinal Blood-Vessel Extraction Using Weighted Kernel Fuzzy C-Means Clustering and Dilation-Based Functions.
   Diagnostics (Basel). 2023 Jan 17;13(3):342. doi: 10.3390/diagnostics13030342.
7. An improved fuzzy C-means clustering algorithm for assisted therapy of chronic bronchitis.
   Technol Health Care. 2015;23(6):699-713. doi: 10.3233/THC-151023.
8. A New Validity Index Based on Fuzzy Energy and Fuzzy Entropy Measures in Fuzzy Clustering Problems.
   Entropy (Basel). 2020 Oct 23;22(11):1200. doi: 10.3390/e22111200.
9. An effective fuzzy kernel clustering analysis approach for gene expression data.
   Biomed Mater Eng. 2015;26 Suppl 1:S1863-9. doi: 10.3233/BME-151489.
10. The Optimally Designed Variational Autoencoder Networks for Clustering and Recovery of Incomplete Multimedia Data.
    Sensors (Basel). 2019 Feb 16;19(4):809. doi: 10.3390/s19040809.
