Akhtar Mushir, Tanveer M, Arshad Mohd
IEEE Trans Pattern Anal Mach Intell. 2025 Jan;47(1):149-160. doi: 10.1109/TPAMI.2024.3465535. Epub 2024 Dec 4.
In the domain of machine learning, the significance of the loss function is paramount, especially in supervised learning tasks. It serves as a fundamental pillar that profoundly influences the behavior and efficacy of supervised learning algorithms. Traditional loss functions, though widely used, often struggle to handle outlier-prone and high-dimensional data, resulting in suboptimal outcomes and slow convergence during training. In this paper, we address the aforementioned constraints by proposing a novel robust, bounded, sparse, and smooth (RoBoSS) loss function for supervised learning. Further, we incorporate the RoBoSS loss within the framework of support vector machine (SVM) and introduce a new robust algorithm named $\mathcal{L}_{rbss}$-SVM. For the theoretical analysis, the classification-calibrated property and generalization ability are also presented. These investigations are crucial for gaining deeper insights into the robustness of the RoBoSS loss function in classification problems and its potential to generalize well to unseen data. To validate the potency of the proposed $\mathcal{L}_{rbss}$-SVM, we assess it on 88 benchmark datasets from KEEL and UCI repositories. Further, to rigorously evaluate its performance in challenging scenarios, we conducted an assessment using datasets intentionally infused with outliers and label noise. Additionally, to exemplify the effectiveness of $\mathcal{L}_{rbss}$-SVM within the biomedical domain, we evaluated it on two medical datasets: the electroencephalogram (EEG) signal dataset and the breast cancer (BreaKHis) dataset. The numerical results substantiate the superiority of the proposed $\mathcal{L}_{rbss}$-SVM model, both in terms of its remarkable generalization performance and its efficiency in training time.