Cancer-related Keywords in 2023: Insights from Text Mining of a Major Consumer Portal.

Author information

Jeong Wonjeong, Song Eunkyoung, Jeong Eunzi, Oh Kyoung Hee, Lee Hye-Sun, Jun Jae Kwan

Affiliation

Cancer Knowledge & Information Center, National Cancer Control Institute, National Cancer Center, Goyang, Korea.

Publication information

Healthc Inform Res. 2024 Oct;30(4):398-408. doi: 10.4258/hir.2024.30.4.398. Epub 2024 Oct 31.

Abstract

OBJECTIVES

With the growing importance of monitoring cancer patients' internet usage, there is an increasing need for technology that expands access to relevant information through text mining. This study analyzed internet articles from portal sites in 2023 to identify trends in the information available to cancer patients and to derive meaningful insights.

METHODS

This study analyzed 19,578 news articles published on Naver, a major Korean portal site, from January 1, 2023, to December 31, 2023. Natural language processing, text mining, network analysis, and word cloud analysis were employed. The search term "암" ("am," Korean for "cancer") was used to identify keywords related to cancer.
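As a rough illustration of how keyword lists can feed a network analysis, the sketch below counts how often two keywords appear in the same article. The abstract does not say how the study built its network, so the co-occurrence weighting, the function name `cooccurrence_edges`, and the toy articles are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(docs):
    """Count, for each keyword pair, the number of articles containing both.

    docs: list of keyword lists, one per article (toy data, not the study's).
    Pairs are stored alphabetically so (a, b) and (b, a) share one edge.
    """
    edges = Counter()
    for doc in docs:
        # Deduplicate within an article so each article counts a pair once.
        for a, b in combinations(sorted(set(doc)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical keyword lists for three articles.
docs = [
    ["treatment", "lung cancer", "smoking"],
    ["treatment", "lung cancer"],
    ["celebrity", "struggle"],
]
edges = cooccurrence_edges(docs)
```

The resulting edge weights could then be passed to a graph library for clustering; which clustering method the authors used is not stated in the abstract.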

RESULTS

In 2023, an average of 1,631 cancer-related articles were published per month, with a peak of 1,946 in September and a low of 1,371 in February. A total of 132,456 keywords were extracted, with "cure" (2,218 occurrences), "lung cancer" (1,652), and "breast cancer" (1,235) being the most frequent. Term frequency-inverse document frequency (TF-IDF) analysis ranked "struggle" (score 1064.172) as the most significant keyword, followed by "lung cancer" (839.988) and "breast cancer" (744.840). Network analysis revealed four distinct clusters focusing on treatment, celebrity-related issues, major cancer types, and cancer-causing factors.
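The gap between the raw-frequency ranking and the TF-IDF ranking can be illustrated with a small sketch. It assumes one common formulation, summing tf × log(N / df) over all articles; the abstract does not state which TF-IDF variant the study used, and the function name `tfidf_scores` and the toy articles are hypothetical.

```python
import math
from collections import Counter

def tfidf_scores(docs):
    """Sum tf * log(N / df) over all articles for each keyword.

    docs: list of keyword lists, one per article (toy data, not the study's).
    This is one common TF-IDF variant, assumed here for illustration.
    """
    n_docs = len(docs)
    # Document frequency: number of articles containing each keyword.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    scores = Counter()
    for doc in docs:
        for word, tf in Counter(doc).items():
            scores[word] += tf * math.log(n_docs / df[word])
    return dict(scores)

# Hypothetical articles: "cure" appears in every one, "struggle" in only one.
docs = [
    ["cure", "lung cancer", "struggle", "struggle"],
    ["cure", "breast cancer"],
    ["cure", "lung cancer"],
]
scores = tfidf_scores(docs)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy data "cure" is the most frequent keyword but scores zero because it appears in every article, while "struggle" ranks first because it is concentrated in a single article. Under this assumed variant, that is the kind of effect that can push a less frequent keyword above the most frequent one.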

CONCLUSIONS

The analysis of cancer-related keywords in 2023 indicates that news articles often prioritize gossip over essential information. These findings provide foundational data for future policy directions and strategies to address misinformation. This study underscores the importance of understanding the nature of cancer-related information consumed by the public and offers insights to guide official policies and healthcare practices.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7f1e/11570664/33671307d0d8/hir-2024-30-4-398f1.jpg
