A hashtag recommendation system for twitter data streams.

Authors

Otsuka Eriko, Wallace Scott A, Chiu David

Affiliations

School of Engineering and Computer Science, Washington State University, Vancouver, USA.

Department of Mathematics and Computer Science, University of Puget Sound, Tacoma, USA.

Publication information

Comput Soc Netw. 2016;3(1):3. doi: 10.1186/s40649-016-0028-9. Epub 2016 May 31.

DOI: 10.1186/s40649-016-0028-9
PMID: 29355223
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5749337/

Abstract

BACKGROUND

Twitter has evolved into a powerful communication and information sharing tool used by millions of people around the world to post what is happening now. A hashtag, a keyword prefixed with a hash symbol (#), is a Twitter feature for organizing tweets and facilitating effective search among a massive volume of data. In this paper, we propose an automatic hashtag recommendation system that helps users find new hashtags related to their interests on demand.

METHODS

For hashtag ranking, we propose the Hashtag Frequency-Inverse Hashtag Ubiquity (HF-IHU) ranking scheme, a variation of the well-known TF-IDF that considers hashtag relevancy as well as data sparseness, one of the key challenges in analyzing microblog data. Our system is built on top of Hadoop, a leading platform for distributed computing, to provide scalable performance using Map-Reduce. Experiments on a large Twitter data set demonstrate that our method successfully yields hashtags relevant to a user's interests and that its recommendations are more stable and reliable than ranking tags based on tweet content similarity.
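
The abstract does not give the HF-IHU formula, so the following is only an illustrative sketch of a TF-IDF-style hashtag ranker, not the authors' definition. It assumes that "hashtag frequency" is the share of a term's co-occurrences that fall on a given hashtag, and that "inverse hashtag ubiquity" is a log-scaled penalty for hashtags that co-occur with many distinct terms. The function names `build_index` and `recommend` and the toy data are hypothetical, and the sketch runs in a single process rather than on Hadoop.

```python
import math
from collections import Counter, defaultdict


def build_index(tweets):
    """Count term/hashtag co-occurrences.

    tweets: iterable of (terms, hashtags) pairs, each a list of strings.
    Returns (cooc, tag_terms):
      cooc[term][hashtag] -> number of tweets containing both
      tag_terms[hashtag]  -> set of distinct terms seen with that hashtag
    """
    cooc = defaultdict(Counter)
    tag_terms = defaultdict(set)
    for terms, hashtags in tweets:
        for term in set(terms):
            for tag in set(hashtags):
                cooc[term][tag] += 1
                tag_terms[tag].add(term)
    return cooc, tag_terms


def recommend(query_terms, cooc, tag_terms, k=10):
    """Rank hashtags for a query with an assumed TF-IDF-like score."""
    vocab_size = len(cooc)
    scores = Counter()
    for term in set(query_terms):
        by_tag = cooc.get(term)
        if not by_tag:
            continue  # unseen term contributes nothing
        total = sum(by_tag.values())
        for tag, count in by_tag.items():
            hf = count / total  # assumed "hashtag frequency"
            # assumed "inverse hashtag ubiquity": 0 for a tag seen with every term
            ihu = math.log(vocab_size / len(tag_terms[tag]))
            scores[tag] += hf * ihu
    return [tag for tag, _ in scores.most_common(k)]


# Toy data, invented for illustration only.
tweets = [
    (["python", "code"], ["#python"]),
    (["python", "data"], ["#python", "#news"]),
    (["news", "data"], ["#news"]),
    (["news", "code"], ["#news"]),
]
cooc, tag_terms = build_index(tweets)
top = recommend(["python"], cooc, tag_terms, k=1)
```

In this toy data `#news` co-occurs with every term, so its ubiquity penalty drives its score to zero and `#python` ranks first for the query `python`. This mirrors how IDF demotes words that appear in every document.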

RESULTS AND CONCLUSIONS

Our results show that HF-IHU can achieve over 30% hashtag recall when asked to identify the top 10 relevant hashtags for a particular tweet. Furthermore, our method outperforms kNN, k-popularity, and Naïve Bayes by 69, 54, and 17%, respectively, on recall of the top 200 hashtags.
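
For readers unfamiliar with the metric, here is a minimal sketch of recall over the top k recommendations, assuming recall means the fraction of a tweet's actual hashtags that appear among the first k recommended ones. The abstract does not describe the authors' exact evaluation protocol, so the helper `recall_at_k` and the example tags are hypothetical.

```python
def recall_at_k(recommended, actual, k):
    """Fraction of the actual hashtags found in the top-k recommendations."""
    actual = set(actual)
    if not actual:
        return 0.0  # no ground-truth hashtags to recover
    hits = set(recommended[:k]) & actual
    return len(hits) / len(actual)


# Invented example: one of the two true hashtags is in the top 2.
score = recall_at_k(["#a", "#b", "#c"], ["#a", "#d"], k=2)
```

Under this definition, a figure such as "over 30% recall at the top 10" would mean that, on average, roughly a third of a tweet's true hashtags appear among the ten recommendations.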
