Chitra Pka, Balasubramanian Saravana Balaji, Khattab Omar, Al-Kadri Mhd Omar
Department of Information Technology, Rathinam Group of Institutions, Coimbatore, Tamil Nadu, India.
Department of Computing, De Montfort University Kazakhstan, Almaty, Kazakhstan.
PeerJ Comput Sci. 2024 Dec 10;10:e2093. doi: 10.7717/peerj-cs.2093. eCollection 2024.
In multi-label learning, each instance may carry several labels at once, unlike the single-label setting of conventional datasets. Multi-label techniques often use the same feature space to build the classification model for every label. Labels, however, typically convey distinct semantic information and should have their own specific features, and several approaches have been proposed to identify label-specific features for building separate classification models. Our proposed methodology instead captures and systematically represents label correlations within the learning framework. The innovation of the improved multi-label Naïve Bayes lies in its expansion of the input space: meta information derived from the label space is added to the original features, producing a composite input domain that contains both continuous and categorical variables. To accommodate this heterogeneous input space, we refine the likelihood parameters of the improved model using a joint density function that can handle the mixture of data types. We evaluate the enhanced model empirically on six benchmark datasets. Its performance is compared against the traditional multi-label Naïve Bayes algorithm and quantified through a suite of evaluation metrics. The empirical results show that the proposed method outperforms the conventional algorithm across these metrics. This underscores the value of modeling label dependencies in multi-label learning and positions our approach as a contribution to the field.
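The input-space expansion and mixed-type likelihood described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the exact form of the joint density or say how label-derived meta information is obtained at prediction time, so the sketch assumes a Gaussian likelihood for continuous features, a Laplace-smoothed Bernoulli likelihood for binary label-derived features, and that the caller supplies the other labels' values (for example, from a first-stage classifier). All function names (`fit_mixed_nb`, `log_joint`, `fit_expanded`, `predict_label`) are hypothetical.

```python
import numpy as np

def fit_mixed_nb(X_cont, X_cat, y, alpha=1.0, eps=1e-9):
    """Fit a binary Naive Bayes model over mixed inputs.

    Assumption: Gaussian likelihood for continuous columns, Bernoulli
    likelihood (Laplace-smoothed) for binary columns. Both classes must
    occur in y, otherwise the per-class means are undefined.
    """
    params = {}
    for c in (0, 1):
        mask = (y == c)
        n_c = mask.sum()
        params[c] = {
            "log_prior": np.log((n_c + alpha) / (len(y) + 2 * alpha)),
            "mean": X_cont[mask].mean(axis=0),
            "var": X_cont[mask].var(axis=0) + eps,  # eps avoids zero variance
            "p1": (X_cat[mask].sum(axis=0) + alpha) / (n_c + 2 * alpha),
        }
    return params

def log_joint(params, x_cont, x_cat):
    """Return [log p(x, y=0), log p(x, y=1)] under the naive factorization."""
    scores = []
    for c in (0, 1):
        p = params[c]
        log_gauss = -0.5 * (np.log(2 * np.pi * p["var"])
                            + (x_cont - p["mean"]) ** 2 / p["var"])
        log_bern = (x_cat * np.log(p["p1"])
                    + (1 - x_cat) * np.log(1 - p["p1"]))
        scores.append(p["log_prior"] + log_gauss.sum() + log_bern.sum())
    return np.array(scores)

def fit_expanded(X, Y):
    """Fit one model per label; the remaining label columns are appended
    to the input space as categorical (binary) features."""
    models = []
    for j in range(Y.shape[1]):
        other_labels = np.delete(Y, j, axis=1)
        models.append(fit_mixed_nb(X, other_labels, Y[:, j]))
    return models

def predict_label(models, j, x_cont, other_labels):
    """Predict label j given continuous features and the other labels'
    values (assumed to be provided by the caller)."""
    return int(np.argmax(log_joint(models[j], x_cont, other_labels)))

# Toy data: two continuous features, two perfectly correlated labels.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [2.0, 2.1], [2.2, 1.9], [1.9, 2.0]])
Y = np.array([[0, 0], [0, 0], [0, 0],
              [1, 1], [1, 1], [1, 1]])
models = fit_expanded(X, Y)
```

Calling `predict_label(models, 0, np.array([0.1, 0.1]), np.array([0]))` scores the first label using both the continuous features and the value of the second label. Because each per-label model conditions on the other labels, correlated labels reinforce each other, which is the dependency effect the abstract attributes to the expanded input space.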