Singh Jagrati, Singh Anil Kumar
CSED, Motilal Nehru National Institute of Technology Prayagraj, Prayagraj, India.
Ann Math Artif Intell. 2021;89(3-4):371-407. doi: 10.1007/s10472-020-09709-z. Epub 2020 Sep 24.
Social networks such as Twitter and Facebook have recently become the most widely used communication platforms for propagating information rapidly. This fast diffusion of information creates accuracy and scalability challenges for topic detection. Most existing approaches can detect the most popular topics at large scale, but they are not effective at detecting topics quickly. This article proposes a novel topic detection approach, the Node Significance based Label Propagation Community Detection (NSLPCD) algorithm, which detects topics faster without compromising accuracy. The proposed algorithm analyzes the frequency distribution of keywords in a collection of tweets and identifies two types of keywords, topic-identifying and topic-describing, which play an important role in topic detection. A keyword co-occurrence graph is built from these keywords, and the NSLPCD algorithm is then applied to obtain topic clusters in the form of communities. Experimental results on real Twitter data show that the proposed method is effective in both quality and run-time performance compared with existing methods.
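The pipeline outlined in the abstract (select keywords by frequency, build a keyword co-occurrence graph, then run label propagation to obtain communities as topics) can be illustrated with a minimal sketch. This is not the authors' NSLPCD algorithm: the abstract does not define node significance or the rule for separating topic-identifying from topic-describing keywords. The sketch assumes weighted degree as a stand-in for node significance and a single hypothetical `min_freq` threshold as a stand-in for keyword selection. The function names and the sample tweets are likewise illustrative only.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence_graph(tweets, min_freq=2):
    """Build a weighted keyword co-occurrence graph from tokenized tweets.

    Assumption: a keyword is any token appearing in at least `min_freq`
    tweets. The paper's actual keyword selection is not reproduced here.
    """
    # Document frequency: count each token once per tweet.
    freq = Counter(word for tweet in tweets for word in set(tweet))
    keywords = {w for w, c in freq.items() if c >= min_freq}

    graph = defaultdict(lambda: defaultdict(int))
    for tweet in tweets:
        present = sorted(set(tweet) & keywords)
        for a, b in combinations(present, 2):
            graph[a][b] += 1  # edge weight = number of shared tweets
            graph[b][a] += 1
    return graph


def label_propagation(graph, max_iters=20):
    """Asynchronous label propagation, visiting high-significance nodes first.

    Assumption: node significance is approximated by weighted degree.
    """
    significance = {n: sum(nbrs.values()) for n, nbrs in graph.items()}
    order = sorted(graph, key=lambda n: (-significance[n], n))
    labels = {n: n for n in graph}  # every node starts in its own community

    for _ in range(max_iters):
        changed = False
        for node in order:
            # Each neighbor votes for its current label, weighted by the edge.
            votes = defaultdict(int)
            for nbr, weight in graph[node].items():
                votes[labels[nbr]] += weight
            # Ties are broken by the significance of the node that originated
            # the label, then alphabetically, so the result is deterministic.
            best = max(votes, key=lambda lab: (votes[lab], significance[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break

    communities = defaultdict(set)
    for node, label in labels.items():
        communities[label].add(node)
    return list(communities.values())


def detect_topics(tweets, min_freq=2):
    """Return topic clusters as sorted keyword lists."""
    graph = build_cooccurrence_graph(tweets, min_freq)
    return sorted(sorted(c) for c in label_propagation(graph))


if __name__ == "__main__":
    sample = [
        ["earthquake", "japan", "tsunami"],
        ["earthquake", "japan", "rescue"],
        ["earthquake", "tsunami", "japan"],
        ["election", "vote", "president"],
        ["election", "president", "debate"],
        ["election", "vote", "president"],
    ]
    for topic in detect_topics(sample):
        print(topic)
```

Visiting nodes in descending significance order, rather than in random order as in plain label propagation, makes the result reproducible and lets strongly connected keywords seed the communities. Whether this matches the ordering used by NSLPCD is an assumption.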