Multiple Kernel k-Means with Incomplete Kernels

Authors

Liu Xinwang, Zhu Xinzhong, Li Miaomiao, Wang Lei, Zhu En, Liu Tongliang, Kloft Marius, Shen Dinggang, Yin Jianping, Gao Wen

Publication information

IEEE Trans Pattern Anal Mach Intell. 2020 May;42(5):1191-1204. doi: 10.1109/TPAMI.2019.2892416. Epub 2019 Jan 14.

DOI: 10.1109/TPAMI.2019.2892416

PMID: 30640600

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6626696/

Abstract

Multiple kernel clustering (MKC) algorithms optimally combine a group of pre-specified base kernel matrices to improve clustering performance. However, existing MKC algorithms cannot efficiently address the situation where some rows and columns of base kernel matrices are absent. This paper proposes two simple yet effective algorithms to address this issue. Different from existing approaches where incomplete kernel matrices are first imputed and a standard MKC algorithm is applied to the imputed kernel matrices, our first algorithm integrates imputation and clustering into a unified learning procedure. Specifically, we perform multiple kernel clustering directly in the presence of incomplete kernel matrices, which are treated as auxiliary variables to be jointly optimized. Our algorithm does not require that there be at least one complete base kernel matrix over all the samples. Also, it adaptively imputes incomplete kernel matrices and combines them to best serve clustering. Moreover, we further improve this algorithm by encouraging these incomplete kernel matrices to mutually complete each other. A three-step iterative algorithm is designed to solve the resultant optimization problems. After that, we theoretically study the generalization bound of the proposed algorithms. Extensive experiments are conducted on 13 benchmark data sets to compare the proposed algorithms with existing imputation-based methods. Our algorithms consistently achieve superior performance and the improvement becomes more significant with increasing missing ratio, verifying the effectiveness and advantages of the proposed joint imputation and clustering.
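The abstract does not give the update rules, so the following is only a sketch of what a three-step alternating scheme of this kind could look like, not the authors' implementation. It assumes the common multiple kernel k-means objective trace(K_beta (I - H H^T)) with K_beta = sum_p beta_p^2 K_p, a spectral relaxation for the cluster indicator matrix H, and a parametrization of each missing block through the observed block (K_cm = K_cc Z, K_mm = Z^T K_cc Z), which keeps every imputed kernel positive semidefinite. All function and variable names are hypothetical.

```python
import numpy as np

def top_k_eigvecs(K, k):
    # Step 1 (assumed spectral relaxation of kernel k-means):
    # H is formed by the k leading eigenvectors of the combined kernel.
    _, vecs = np.linalg.eigh((K + K.T) / 2)
    return vecs[:, -k:]

def impute_kernel(K, observed, U):
    # Step 2 (assumed parametrization): keep the observed block K_cc fixed and
    # choose Z to minimise trace(K @ U), which gives Z = -U_cm @ pinv(U_mm).
    c = np.flatnonzero(observed)
    m = np.flatnonzero(~observed)
    if m.size == 0:
        return K.copy()
    Kcc = K[np.ix_(c, c)]
    Z = -U[np.ix_(c, m)] @ np.linalg.pinv(U[np.ix_(m, m)])
    out = K.copy()
    out[np.ix_(c, m)] = Kcc @ Z
    out[np.ix_(m, c)] = (Kcc @ Z).T
    out[np.ix_(m, m)] = Z.T @ Kcc @ Z
    return out

def mkkm_incomplete(kernels, observed, k, n_iter=20):
    # kernels: list of (n, n) arrays; entries of unobserved samples are ignored.
    # observed: list of boolean masks, True where a sample is present in a view.
    observed = [np.asarray(o, dtype=bool) for o in observed]
    n = kernels[0].shape[0]
    # Zero-fill the missing rows and columns as a starting point.
    Ks = [np.where(o[:, None] & o[None, :], K, 0.0)
          for K, o in zip(kernels, observed)]
    beta = np.full(len(Ks), 1.0 / len(Ks))
    history = []
    for _ in range(n_iter):
        K_beta = sum(b ** 2 * K for b, K in zip(beta, Ks))
        H = top_k_eigvecs(K_beta, k)                      # step 1: update H
        U = np.eye(n) - H @ H.T
        Ks = [impute_kernel(K, o, U) for K, o in zip(Ks, observed)]  # step 2
        a = np.array([np.trace(K @ U) for K in Ks])
        inv = 1.0 / np.maximum(a, 1e-12)
        beta = inv / inv.sum()                            # step 3: update weights
        history.append(float(np.sum(beta ** 2 * a)))
    return H, beta, Ks, history
```

Discrete cluster labels would then be obtained by running ordinary k-means on the rows of H. Because no view needs to be fully observed in this sketch, it mirrors the abstract's point that no complete base kernel is required, but the mutual-completion variant mentioned in the abstract is not covered here.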

Similar articles

Efficient and Effective Regularized Incomplete Multi-View Clustering.

IEEE Trans Pattern Anal Mach Intell. 2021 Aug;43(8):2634-2646. doi: 10.1109/TPAMI.2020.2974828. Epub 2021 Jul 1.

Late Fusion Incomplete Multi-View Clustering.

IEEE Trans Pattern Anal Mach Intell. 2019 Oct;41(10):2410-2423. doi: 10.1109/TPAMI.2018.2879108. Epub 2018 Nov 1.

Incomplete Multiple Kernel Alignment Maximization for Clustering.

IEEE Trans Pattern Anal Mach Intell. 2024 Mar;46(3):1412-1424. doi: 10.1109/TPAMI.2021.3116948. Epub 2024 Feb 6.

Absent Multiple Kernel Learning Algorithms.

IEEE Trans Pattern Anal Mach Intell. 2020 Jun;42(6):1303-1316. doi: 10.1109/TPAMI.2019.2895608. Epub 2019 Jan 28.

Localized Incomplete Multiple Kernel k-Means With Matrix-Induced Regularization.

IEEE Trans Cybern. 2023 Jun;53(6):3479-3492. doi: 10.1109/TCYB.2021.3126727. Epub 2023 May 17.

Multiple Kernel Clustering With Neighbor-Kernel Subspace Segmentation.

IEEE Trans Neural Netw Learn Syst. 2020 Apr;31(4):1351-1362. doi: 10.1109/TNNLS.2019.2919900. Epub 2019 Jun 28.

Late Fusion Multiple Kernel Clustering With Proxy Graph Refinement.

IEEE Trans Neural Netw Learn Syst. 2023 Aug;34(8):4359-4370. doi: 10.1109/TNNLS.2021.3117403. Epub 2023 Aug 4.

Regularized Simple Multiple Kernel k-Means With Kernel Average Alignment.

IEEE Trans Neural Netw Learn Syst. 2024 Nov;35(11):15910-15919. doi: 10.1109/TNNLS.2023.3290219. Epub 2024 Oct 29.

A multiple kernel density clustering algorithm for incomplete datasets in bioinformatics.

BMC Syst Biol. 2018 Nov 22;12(Suppl 6):111. doi: 10.1186/s12918-018-0630-6.

Cited by

Dynamic graph structure evolution for node classification with missing attributes.

Sci Rep. 2025 Jul 16;15(1):25687. doi: 10.1038/s41598-025-09840-z.

Exploring feature sparsity for out-of-distribution detection.

Sci Rep. 2024 Nov 18;14(1):28444. doi: 10.1038/s41598-024-79934-7.

[An MRI multi-sequence feature imputation and fusion mutual-aid model based on sequence deletion for differentiation of high-grade from low-grade glioma].

Nan Fang Yi Ke Da Xue Xue Bao. 2024 Aug 20;44(8):1561-1570. doi: 10.12122/j.issn.1673-4254.2024.08.15.

A simulation study on missing data imputation for dichotomous variables using statistical and machine learning methods.

Sci Rep. 2023 Jun 9;13(1):9432. doi: 10.1038/s41598-023-36509-2.

Community Detection in Semantic Networks: A Multi-View Approach.

Entropy (Basel). 2022 Aug 17;24(8):1141. doi: 10.3390/e24081141.

Kernel Probabilistic K-Means Clustering.

Sensors (Basel). 2021 Mar 8;21(5):1892. doi: 10.3390/s21051892.

A Novel Model on Reinforce K-Means Using Location Division Model and Outlier of Initial Value for Lowering Data Cost.

Entropy (Basel). 2020 Aug 17;22(8):902. doi: 10.3390/e22080902.

Real-Time Pattern-Recognition of GPR Images with YOLO v3 Implemented by Tensorflow.

Sensors (Basel). 2020 Nov 12;20(22):6476. doi: 10.3390/s20226476.

An Adaptive Ellipse Distance Density Peak Fuzzy Clustering Algorithm Based on the Multi-target Traffic Radar.

Sensors (Basel). 2020 Aug 31;20(17):4920. doi: 10.3390/s20174920.
