Takahashi Yoshimitsu, Miyaki Koichi, Nakayama Takeo
Department of Health Informatics, Kyoto University School of Public Health, Yoshida Konoe, Sakyo, Kyoto, Japan 606-8501.
J Public Health (Oxf). 2007 Mar;29(1):62-9. doi: 10.1093/pubmed/fdl081. Epub 2007 Jan 16.
Asbestos-linked public health problems were widely reported in Japan in 2005. The objective of this study was to characterize these problems by applying text mining with network analysis.
Text mining with network analysis was conducted on newspaper headlines containing the word 'asbestos' that were published in 1987 and 2005. The outcome measures were the occurrence of individual words and the co-occurrence of two words within the same headline.
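The two outcome measures can be illustrated with a minimal sketch. It assumes that each headline has already been split into words and that percentages are taken over the number of headlines; the abstract does not describe the tokenization or software actually used, so the function name `headline_stats` and the toy headlines below are hypothetical and are not the study data.

```python
from collections import Counter
from itertools import combinations


def headline_stats(headlines):
    """Return (word_share, pair_share) for a list of tokenized headlines.

    word_share maps each word to the fraction of headlines containing it.
    pair_share maps each alphabetically ordered word pair to the fraction
    of headlines in which both words appear together.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for words in headlines:
        # Count each word at most once per headline.
        unique = sorted(set(words))
        word_counts.update(unique)
        # Every unordered pair of distinct words in the same headline.
        pair_counts.update(combinations(unique, 2))
    n = len(headlines)
    if n == 0:
        return {}, {}
    word_share = {word: count / n for word, count in word_counts.items()}
    pair_share = {pair: count / n for pair, count in pair_counts.items()}
    return word_share, pair_share


# Invented toy headlines (assumption: already tokenized), not the study data.
toy_headlines = [
    ["asbestos", "pollution", "school"],
    ["asbestos", "removal", "campaign"],
    ["asbestos", "pollution", "removal"],
    ["asbestos", "campaign", "expulsion"],
]

word_share, pair_share = headline_stats(toy_headlines)
```

In a network view, the keys of `pair_share` can be read as edges between word nodes and the values as edge weights, which is one common way to draw a co-occurrence network.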
Of the 36 headlines containing the word 'asbestos' in 1987, the word 'pollution' (40%) appeared most frequently, followed by 'removal' (31%) and 'campaign' (29%). The most frequent word combinations were 'campaign and expulsion' (26%), followed by 'removal and campaign' (14%). Of the 293 headlines in 2005, the most frequent words were 'hazard' (31%), 'person' (16%) and 'death' (13%), and the most frequent combination was 'person and death' (9%). Thus, asbestos pollution and removal campaigns were reported in 1987, whereas the deaths of citizens were reported in 2005.
Text mining with network analysis, one of the methods for visualizing text data, suggests the following insight. Insufficient steps were taken against asbestos over a period of about 20 years, an interval that is compatible with the latency period. This resulted in widespread exposure to asbestos and in more severe asbestos-related public health problems among citizens. The results suggest that analyzing text data in this way can serve future surveillance and the efficient use of epidemiological knowledge.