Diaz-Garcia Jose A, Ruiz M Dolores, Martin-Bautista Maria J
Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain.
Artif Intell Rev. 2023;56(2):1175-1200. doi: 10.1007/s10462-022-10196-3. Epub 2022 May 12.
The presence of social media in our lives has grown markedly over the last decade, leading to a proliferation of data mining tools aimed at extracting knowledge from these data sources. One of the greatest challenges in this area is obtaining this knowledge without training processes, which require structured information and pre-labelled datasets. This is where unsupervised data mining techniques come in: they can extract value from unstructured and unlabelled data, providing solutions that enhance the decision-making process. In this paper, we first address the problem of social media mining and the need for unsupervised techniques, in particular association rules, to tackle it. We then give a broad overview of the applications of association rules in social media mining, specifically their application to mining textual entities such as tweets. We also discuss the strengths and weaknesses of using association rules to solve different tasks in textual social media. Finally, the paper offers a forward-looking overview of the challenges that association rules must face in the next decade within the field of social media mining.