NSLPCD: Topic based tweets clustering using Node significance based label propagation community detection algorithm.

Authors

Singh Jagrati, Singh Anil Kumar

Affiliation

CSED, Motilal Nehru National Institute of Technology Prayagraj, Prayagraj, India.

Publication information

Ann Math Artif Intell. 2021;89(3-4):371-407. doi: 10.1007/s10472-020-09709-z. Epub 2020 Sep 24.

Abstract

Social networks such as Twitter and Facebook have recently become the most widely used communication platforms for propagating information rapidly. This fast diffusion of information creates accuracy and scalability problems for topic detection. Most existing approaches can detect the most popular topics on a large scale, but they are not effective for fast detection. This article proposes a novel topic detection approach, the Node Significance based Label Propagation Community Detection (NSLPCD) algorithm, which detects topics faster without compromising accuracy. The proposed algorithm analyzes the frequency distribution of keywords in a collection of tweets and finds two types of keywords, topic-identifying and topic-describing keywords, which play an important role in topic detection. A keyword co-occurrence graph is built from these keywords, and the NSLPCD algorithm is then applied to it to obtain topic clusters in the form of communities. Experimental results on real Twitter data show that the proposed method is effective in both quality and run-time performance compared with other existing methods.
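The pipeline described in the abstract (keyword selection, co-occurrence graph, label propagation into communities) can be illustrated with a minimal Python sketch. This is not the NSLPCD algorithm itself: the abstract does not define the node significance measure or the rule separating topic-identifying from topic-describing keywords. The sketch therefore assumes a plain frequency threshold (`min_count`) for keyword selection and uses weighted degree as a stand-in for node significance. The function names and the sample tweets are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence_graph(tweets, min_count=2):
    """Build a weighted keyword co-occurrence graph.

    Edge weight = number of tweets in which both keywords appear.
    Assumption: keywords are kept by a simple frequency threshold.
    """
    token_sets = [set(t.lower().split()) for t in tweets]
    freq = Counter(w for s in token_sets for w in s)
    keep = {w for w, c in freq.items() if c >= min_count}
    graph = defaultdict(lambda: defaultdict(int))
    for s in token_sets:
        for a, b in combinations(sorted(s & keep), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def label_propagation(graph, max_iter=20):
    """Asynchronous label propagation with significance-weighted votes.

    Assumption: node significance is approximated by weighted degree.
    """
    significance = {n: sum(nbrs.values()) for n, nbrs in graph.items()}
    # Visit the most significant nodes first; ties are broken by name
    # so that the result is deterministic.
    order = sorted(graph, key=lambda n: (-significance[n], n))
    labels = {n: n for n in graph}  # every node starts in its own community
    for _ in range(max_iter):
        changed = False
        for node in order:
            votes = defaultdict(float)
            for nbr, weight in graph[node].items():
                votes[labels[nbr]] += weight * significance[nbr]
            # Highest vote wins; equal votes fall back to label order.
            best = max(votes, key=lambda lab: (votes[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    communities = defaultdict(set)
    for node, lab in labels.items():
        communities[lab].add(node)
    return sorted(sorted(c) for c in communities.values())


tweets = [
    "earthquake japan tsunami warning",
    "japan earthquake tsunami",
    "election vote ballot today",
    "vote election ballot",
]
topics = label_propagation(build_cooccurrence_graph(tweets))
print(topics)
```

On this toy input the two keyword groups never co-occur, so each ends up as its own community, and the one-off words "warning" and "today" are dropped by the frequency threshold. Real tweet collections would also need tokenization, stop-word removal, and the paper's own keyword and significance definitions.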



