Ministry of Employment and Social Insurance, General Secretariat of Social Security, Department of National Security Registries and Internet, Athens.
Inform Health Soc Care. 2012 Mar;37(2):106-24. doi: 10.3109/17538157.2011.647934.
Recent research has shown the potential of Web queries as a source for syndromic surveillance. Existing studies show that these queries can serve as a basis for estimating and predicting the development of a syndromic disease, such as influenza, using log-linear (logit) statistical models. In this paper, we apply two alternative models to the relationship between cases and Web queries. We examine the applicability of statistical methods for relating search engine queries to scarlet fever cases in the UK, using tools to acquire the appropriate data from Google and an alternative statistical method based on gamma distributions. The results show that, with logit models, the Pearson correlation coefficient between Web queries and the data obtained from official agencies must exceed 0.90; otherwise, the predicted peak and spread of the distributions deviate significantly. We describe the gamma distribution model and show that gamma transformations give better results in all cases, especially in those with a smaller correlation coefficient.
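The two quantities the abstract relies on, the Pearson correlation between the query and case series and a gamma-shaped description of each series, can be illustrated with a minimal sketch. The abstract does not specify how the gamma model is fitted, so the method-of-moments fit below is an assumption, not the authors' procedure. The function names (`pearson`, `gamma_pdf`, `fit_gamma_moments`, `peak_week`) are hypothetical, and the weekly series are synthetic, not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def gamma_pdf(t, shape, scale):
    """Gamma density at t, computed in log space for numerical stability."""
    if t <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def fit_gamma_moments(weeks, counts):
    """Method-of-moments fit (an assumption): weekly counts act as weights
    on the week numbers, and shape/scale follow from the weighted mean and
    variance."""
    total = sum(counts)
    mean = sum(w * c for w, c in zip(weeks, counts)) / total
    var = sum(c * (w - mean) ** 2 for w, c in zip(weeks, counts)) / total
    return mean ** 2 / var, var / mean  # (shape, scale)


def peak_week(shape, scale):
    """Mode of the gamma density, read here as the fitted peak week."""
    return (shape - 1) * scale if shape > 1 else 0.0


# Synthetic weekly series for illustration only, NOT data from the paper.
weeks = list(range(1, 53))
cases = [2000 * gamma_pdf(w, 4.0, 3.0) for w in weeks]
queries = [1000 * gamma_pdf(w, 3.5, 3.5) for w in weeks]

r = pearson(cases, queries)
case_shape, case_scale = fit_gamma_moments(weeks, cases)
query_shape, query_scale = fit_gamma_moments(weeks, queries)

print(f"Pearson r = {r:.3f}")
print(f"cases peak week ~ {peak_week(case_shape, case_scale):.1f}")
print(f"queries peak week ~ {peak_week(query_shape, query_scale):.1f}")
```

Comparing the fitted peak week and spread of the two series gives a rough sense of how far a query-based curve may drift from the reported cases when the correlation is below the 0.90 threshold mentioned in the abstract.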