Pacific Northwest National Laboratory, 902 Battelle Blvd., Richland, WA 99352, USA.
Int J Environ Res Public Health. 2010 Feb;7(2):596-615. doi: 10.3390/ijerph7020596. Epub 2010 Feb 22.
Text and structural data mining of web and social media (WSM) provides a novel disease surveillance resource and can identify online communities for targeted public health communications (PHC), helping to ensure wide dissemination of pertinent information. WSM that mention influenza were harvested over a 24-week period, from 5 October 2008 to 21 March 2009. Link analysis reveals communities for targeted PHC. Text mining is shown to identify trends in flu posts that correlate with real-world influenza-like illness patient report data. We also apply a graph-based data mining technique to detect anomalies among flu blogs connected by publisher type, links, and user tags.
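The trend comparison described in the abstract, weekly flu-post volume against influenza-like illness (ILI) patient report data, can be illustrated with a simple Pearson correlation. The sketch below is only an illustration under stated assumptions: the abstract does not say which correlation measure the authors used, and the function name `pearson`, the variable names, and all numbers are hypothetical placeholders rather than data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Hypothetical weekly values, invented for illustration only
# (NOT taken from the study): number of flu-related posts and
# percentage of patient visits reported as ILI, one entry per week.
weekly_flu_posts = [120, 135, 160, 210, 260, 240, 190, 150]
weekly_ili_percent = [1.1, 1.3, 1.6, 2.2, 2.9, 2.6, 2.0, 1.5]

r = pearson(weekly_flu_posts, weekly_ili_percent)
print(f"Pearson r between post counts and ILI rate: {r:.3f}")
```

A coefficient near 1 would indicate that post volume rises and falls with reported ILI activity. In practice one might also compare the series at different week lags, since online discussion could lead or trail clinical reports.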