Using Reports of Symptoms and Diagnoses on Social Media to Predict COVID-19 Case Counts in Mainland China: Observational Infoveillance Study.

Authors

Shen Cuihua, Chen Anfan, Luo Chen, Zhang Jingwen, Feng Bo, Liao Wang

Affiliations

Department of Communication, University of California, Davis, Davis, CA, United States.

Department of Science Communication and Science Policy, University of Science and Technology of China, Hefei, China.

Publication information

J Med Internet Res. 2020 May 28;22(5):e19421. doi: 10.2196/19421.

DOI:10.2196/19421
PMID:32452804
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7257484/
Abstract

BACKGROUND

Coronavirus disease (COVID-19) has affected more than 200 countries and territories worldwide. This disease poses an extraordinary challenge for public health systems because screening and surveillance capacity is often severely limited, especially during the beginning of the outbreak; this can fuel the outbreak, as many patients can unknowingly infect other people.

OBJECTIVE

The aim of this study was to collect and analyze posts related to COVID-19 on Weibo, a popular Twitter-like social media site in China. To our knowledge, this infoveillance study employs the largest, most comprehensive, and most fine-grained social media data to date to predict COVID-19 case counts in mainland China.

METHODS

We built a Weibo user pool of 250 million people, approximately half the entire monthly active Weibo user population. Using a comprehensive list of 167 keywords, we retrieved and analyzed around 15 million COVID-19-related posts from our user pool from November 1, 2019 to March 31, 2020. We developed a machine learning classifier to identify "sick posts," in which users report their own or other people's symptoms and diagnoses related to COVID-19. Using officially reported case counts as the outcome, we then estimated the Granger causality of sick posts and other COVID-19 posts on daily case counts. For a subset of geotagged posts (3.10% of all retrieved posts), we also ran separate predictive models for Hubei province, the epicenter of the initial outbreak, and the rest of mainland China.
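The Granger causality step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the names `sick_posts`, `cases`, `ols_rss`, and `granger_f` are hypothetical, and the lag length of 3 is an arbitrary choice for illustration. The idea is to fit two least-squares models for the daily case series, one using only its own lags and one that also includes lags of the post series, and to compare their residual sums of squares with an F statistic.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, k):
            factor = a[row][col] / a[col][col]
            for c in range(col, k + 1):
                a[row][c] -= factor * a[col][c]
    # Back substitution for the coefficients.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (a[i][k] - tail) / a[i][i]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))


def granger_f(x, y, lags):
    """F statistic for the hypothesis that lags of x help predict y."""
    restricted, unrestricted, targets = [], [], []
    for t in range(lags, len(y)):
        own = [y[t - k] for k in range(1, lags + 1)]
        other = [x[t - k] for k in range(1, lags + 1)]
        restricted.append([1.0] + own)            # intercept + own lags
        unrestricted.append([1.0] + own + other)  # ... plus lags of x
        targets.append(y[t])
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lags + 1)
    return ((rss_r - rss_u) / lags) / (rss_u / dof)


# Synthetic example (assumption): "cases" follow "sick_posts" with a 2-day delay.
rng = random.Random(0)
n = 150
sick_posts = [rng.gauss(0, 1) for _ in range(n)]
cases = [0.0, 0.0] + [0.8 * sick_posts[t - 2] + 0.1 * rng.gauss(0, 1)
                      for t in range(2, n)]

f_forward = granger_f(sick_posts, cases, lags=3)  # posts -> cases
f_reverse = granger_f(cases, sick_posts, lags=3)  # cases -> posts
print(f"posts -> cases F = {f_forward:.1f}")
print(f"cases -> posts F = {f_reverse:.1f}")
```

A large F statistic in the forward direction and a small one in the reverse direction is the pattern that would suggest the post series carries leading information about case counts. In practice the series would usually be checked for stationarity first, and an established routine such as the `grangercausalitytests` function in statsmodels would be a more robust choice than this hand-rolled version.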

RESULTS

We found that reports of symptoms and diagnosis of COVID-19 significantly predicted daily case counts up to 14 days ahead of official statistics, whereas other COVID-19 posts did not have similar predictive power. For the subset of geotagged posts, we found that the predictive pattern held true for both Hubei province and the rest of mainland China regardless of the unequal distribution of health care resources and the outbreak timeline.

CONCLUSIONS

Public social media data can be usefully harnessed to predict infection cases and inform timely responses. Researchers and disease control agencies should pay close attention to the social media infosphere regarding COVID-19. In addition to monitoring overall search and posting activities, leveraging machine learning approaches and theoretical understanding of information sharing behaviors is a promising approach to identify true disease signals and improve the effectiveness of infoveillance.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/81e1/7257484/8a53064e3687/jmir_v22i5e19421_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/81e1/7257484/d87e2c63691c/jmir_v22i5e19421_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/81e1/7257484/288f24f178b2/jmir_v22i5e19421_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/81e1/7257484/c7521eb462d9/jmir_v22i5e19421_fig4.jpg

Similar articles

1
Using Reports of Symptoms and Diagnoses on Social Media to Predict COVID-19 Case Counts in Mainland China: Observational Infoveillance Study.
J Med Internet Res. 2020 May 28;22(5):e19421. doi: 10.2196/19421.
2
Data Mining and Content Analysis of the Chinese Social Media Platform Weibo During the Early COVID-19 Outbreak: Retrospective Observational Infoveillance Study.
JMIR Public Health Surveill. 2020 Apr 21;6(2):e18700. doi: 10.2196/18700.
3
Public Engagement and Government Responsiveness in the Communications About COVID-19 During the Early Epidemic Stage in China: Infodemiology Study on Social Media Data.
J Med Internet Res. 2020 May 26;22(5):e18796. doi: 10.2196/18796.
4
Machine Learning to Detect Self-Reporting of Symptoms, Testing Access, and Recovery Associated With COVID-19 on Twitter: Retrospective Big Data Infoveillance Study.
JMIR Public Health Surveill. 2020 Jun 8;6(2):e19509. doi: 10.2196/19509.
5
Mining the Characteristics of COVID-19 Patients in China: Analysis of Social Media Posts.
J Med Internet Res. 2020 May 17;22(5):e19087. doi: 10.2196/19087.
6
Using WeChat, a Chinese Social Media App, for Early Detection of the COVID-19 Outbreak in December 2019: Retrospective Study.
JMIR Mhealth Uhealth. 2020 Oct 5;8(10):e19589. doi: 10.2196/19589.
7
Temporal and Location Variations, and Link Categories for the Dissemination of COVID-19-Related Information on Twitter During the SARS-CoV-2 Outbreak in Europe: Infoveillance Study.
J Med Internet Res. 2020 Aug 28;22(8):e19629. doi: 10.2196/19629.
8
Grappling With the COVID-19 Health Crisis: Content Analysis of Communication Strategies and Their Effects on Public Engagement on Social Media.
J Med Internet Res. 2020 Aug 24;22(8):e21360. doi: 10.2196/21360.
9
Concerns Expressed by Chinese Social Media Users During the COVID-19 Pandemic: Content Analysis of Sina Weibo Microblogging Data.
J Med Internet Res. 2020 Nov 26;22(11):e22152. doi: 10.2196/22152.
10
Top Concerns of Tweeters During the COVID-19 Pandemic: Infoveillance Study.
J Med Internet Res. 2020 Apr 21;22(4):e19016. doi: 10.2196/19016.

Cited by

1
Analyzing Health Care Professionals' Resilience and Emotional Responses to COVID-19 via Twitter: Retrospective Cohort and Matched Comparison Group Study.
J Med Internet Res. 2025 Sep 3;27:e72521. doi: 10.2196/72521.
2
A Forecast Model for COVID-19 Spread Trends Using Blog and GPS Data from Smartphones.
Entropy (Basel). 2025 Jun 26;27(7):686. doi: 10.3390/e27070686.
3
Early Warning of Infectious Disease Outbreaks Using Social Media and Digital Data: A Scoping Review.
Int J Environ Res Public Health. 2025 Jul 13;22(7):1104. doi: 10.3390/ijerph22071104.
4
Extracting circumstances of Covid-19 transmission from free text with large language models.
Nat Commun. 2025 Jul 1;16(1):5836. doi: 10.1038/s41467-025-60762-w.
5
Descriptive analysis of TikTok content on vaccination in Arabic.
AIMS Public Health. 2025 Jan 17;12(1):137-161. doi: 10.3934/publichealth.2025010. eCollection 2025.
6
Quality of cerebral palsy videos on Chinese social media platforms.
Sci Rep. 2025 Apr 17;15(1):13323. doi: 10.1038/s41598-024-84845-8.
7
Natural Language Processing for Digital Health in the Era of Large Language Models.
Yearb Med Inform. 2024 Aug;33(1):229-240. doi: 10.1055/s-0044-1800750. Epub 2025 Apr 8.
8
An explainable GeoAI approach for the multimodal analysis of urban human dynamics: a case study for the COVID-19 pandemic in Rio de Janeiro.
Comput Urban Sci. 2025;5(1):13. doi: 10.1007/s43762-025-00172-2. Epub 2025 Mar 3.
9
To live or to stay alive? A thematic and sentiment analysis of public posts on social media during the 2022 Shanghai COVID-19 outbreak.
Digit Health. 2024 Nov 10;10:20552076241288731. doi: 10.1177/20552076241288731. eCollection 2024 Jan-Dec.
10
Early warning and predicting of COVID-19 using zero-inflated negative binomial regression model and negative binomial regression model.
BMC Infect Dis. 2024 Sep 19;24(1):1006. doi: 10.1186/s12879-024-09940-7.

References

1
Improving epidemic surveillance and response: big data is dead, long live big data.
Lancet Digit Health. 2020 May;2(5):e218-e220. doi: 10.1016/S2589-7500(20)30059-5. Epub 2020 Mar 17.
2
Mining the Characteristics of COVID-19 Patients in China: Analysis of Social Media Posts.
J Med Internet Res. 2020 May 17;22(5):e19087. doi: 10.2196/19087.
3
Chinese Public's Attention to the COVID-19 Epidemic on Social Media: Observational Descriptive Study.
J Med Internet Res. 2020 May 4;22(5):e18825. doi: 10.2196/18825.
4
Early epidemiological analysis of the coronavirus disease 2019 outbreak based on crowdsourced data: a population-level observational study.
Lancet Digit Health. 2020 Apr;2(4):e201-e208. doi: 10.1016/S2589-7500(20)30026-1. Epub 2020 Feb 20.
5
Crowdsourcing data to mitigate epidemics.
Lancet Digit Health. 2020 Apr;2(4):e156-e157. doi: 10.1016/S2589-7500(20)30055-8. Epub 2020 Feb 20.
6
Data Mining and Content Analysis of the Chinese Social Media Platform Weibo During the Early COVID-19 Outbreak: Retrospective Observational Infoveillance Study.
JMIR Public Health Surveill. 2020 Apr 21;6(2):e18700. doi: 10.2196/18700.
7
Limited Early Warnings and Public Attention to Coronavirus Disease 2019 in China, January-February, 2020: A Longitudinal Cohort of Randomly Sampled Weibo Users.
Disaster Med Public Health Prep. 2020 Oct;14(5):e24-e27. doi: 10.1017/dmp.2020.68. Epub 2020 Apr 3.
8
Corona Virus (COVID-19) "Infodemic" and Emerging Issues through a Data Lens: The Case of China.
Int J Environ Res Public Health. 2020 Mar 30;17(7):2309. doi: 10.3390/ijerph17072309.
9
Retrospective analysis of the possibility of predicting the COVID-19 outbreak from Internet searches and social media data, China, 2020.
Euro Surveill. 2020 Mar;25(10). doi: 10.2807/1560-7917.ES.2020.25.10.2000199.
10
Early dynamics of transmission and control of COVID-19: a mathematical modelling study.
Lancet Infect Dis. 2020 May;20(5):553-558. doi: 10.1016/S1473-3099(20)30144-4. Epub 2020 Mar 11.