Sparse Twin Support Vector Clustering Using Pinball Loss.

Publication information

IEEE J Biomed Health Inform. 2021 Oct;25(10):3776-3783. doi: 10.1109/JBHI.2021.3059910. Epub 2021 Oct 5.

Abstract

Clustering is a widely used machine learning technique for unlabelled data. One recently proposed technique is the twin support vector clustering (TWSVC) algorithm. The idea of TWSVC is to generate a hyperplane for each cluster. TWSVC uses the hinge loss function to penalize misclassification. However, the hinge loss relies on the shortest distance between different clusters, and is therefore unstable on noise-corrupted datasets and under re-sampling. In this paper, we propose a novel Sparse Pinball loss Twin Support Vector Clustering (SPTSVC). The proposed SPTSVC uses the ϵ-insensitive pinball loss function to obtain a sparse solution. The pinball loss function provides noise insensitivity and re-sampling stability. The ϵ-insensitive zone makes the model sparse and reduces testing time. Numerical experiments on synthetic as well as real-world benchmark datasets are performed to show the efficacy of the proposed model. An analysis of the sparsity of various clustering algorithms is also presented. To show the feasibility and applicability of the proposed SPTSVC on biomedical data, experiments have been performed on epilepsy and breast cancer datasets.
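The contrast between the hinge loss and the ϵ-insensitive pinball loss can be illustrated with a minimal sketch. The abstract does not state the loss formulas, so the piecewise form below is an assumption: it follows one commonly used definition of the ϵ-insensitive pinball loss from the pinball-loss SVM literature and may differ in detail from the paper's formulation. The function names and the parameters `tau` and `eps` are illustrative, and `u` stands for the margin violation (for example `1 - y * f(x)` in a classification setting).

```python
def hinge_loss(u):
    """Hinge loss: penalizes only positive margin violations u."""
    return max(0.0, u)


def pinball_loss(u, tau):
    """Pinball loss: also penalizes u < 0, with slope tau (assumed 0 < tau <= 1)."""
    return u if u >= 0 else -tau * u


def eps_insensitive_pinball_loss(u, tau, eps):
    """Assumed form of the eps-insensitive pinball loss.

    Zero inside the insensitive zone [-eps / tau, eps], linear outside it.
    Points falling in the zero zone contribute no loss, which is the
    source of sparsity in models built on this loss.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    if u > eps:
        return u - eps
    if u < -eps / tau:
        return -tau * (u + eps / tau)
    return 0.0


if __name__ == "__main__":
    # Compare the three losses on a few sample values of u.
    for u in (-1.0, -0.3, 0.0, 0.1, 1.0):
        print(u, hinge_loss(u), pinball_loss(u, 0.5),
              eps_insensitive_pinball_loss(u, 0.5, 0.2))
```

Under this assumed form, the hinge loss ignores every point with `u <= 0`, the pinball loss penalizes points on both sides of zero, and the ϵ-insensitive variant restores a zero-loss band around zero so that fewer points influence the solution.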
