A methodology for combining multiple commercial data sources to improve measurement of the food and alcohol environment: applications of geographical information systems.

Author information

Mendez Dara D, Duell Jessica, Reiser Sarah, Martin Deborah, Gradeck Robert, Fabio Anthony

Affiliations

University of Pittsburgh, Graduate School of Public Health, Department of Epidemiology, Pittsburgh, USA.

University of Pittsburgh, Center for Social and Urban Research, Pittsburgh, USA.

Publication information

Geospat Health. 2014 Nov;9(1):71-96. doi: 10.4081/gh.2014.7.

Abstract

Commercial data sources are increasingly used to measure and locate community resources. We describe a methodology for combining commercial data on the food and alcohol environment and for comparing differences between sources. We obtained information on food and alcohol establishments from two commercial databases (InfoUSA and Dun & Bradstreet) for 2003 and 2009. We then developed a matching process that combined computer algorithms with manual review, using ArcGIS to geocode addresses, Standard Industrial Classification and North American Industry Classification System codes to identify the type of establishment, and establishment names. We constructed population- and area-based density measures (e.g. for grocery stores), assessed differences across data sources, and used ArcGIS to map the densities. The matching process resulted in 8,705 and 7,078 unique establishments for 2003 and 2009, respectively. The combined dataset captured more establishments than either data source alone; the additional establishments captured ranged from 1,255 to 2,752 in 2009. The correlations between the two data sources for the density measures were highest for alcohol outlets (r = 0.75 and 0.79 for per capita and area, respectively) and lowest for grocery stores/supermarkets (r = 0.32 for both). Applying geographical information systems to combine multiple commercial data sources and develop measures of the food and alcohol environment thus captured more establishments than relying on one data source alone. This replicable methodology was found to be useful for understanding the food and alcohol environment when local or public data are limited.
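
The combining step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract reports ArcGIS geocoding, industry classification codes, and manual review, whereas this sketch substitutes plain string comparison of addresses and names. The record fields, the 0.85 similarity threshold, the helper names, and the sample establishments are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())


def same_establishment(a, b, name_threshold=0.85):
    """Assumed rule: identical normalized address and a similar name."""
    if normalize(a["address"]) != normalize(b["address"]):
        return False
    ratio = SequenceMatcher(
        None, normalize(a["name"]), normalize(b["name"])
    ).ratio()
    return ratio >= name_threshold


def combine(source_a, source_b):
    """Union of both sources, keeping one record per matched pair."""
    combined = list(source_a)
    for rec_b in source_b:
        if not any(same_establishment(rec_a, rec_b) for rec_a in source_a):
            combined.append(rec_b)
    return combined


def density_per_1000(count, population):
    """Population-based density: establishments per 1,000 residents."""
    return 1000.0 * count / population


# Made-up records standing in for the two commercial databases.
source_a = [
    {"name": "Main Street Grocery", "address": "100 Main St."},
    {"name": "Oak Tavern", "address": "5 Oak Ave"},
]
source_b = [
    {"name": "MAIN STREET GROCERY", "address": "100 main st"},
    {"name": "Corner Market", "address": "77 Elm St"},
]

combined = combine(source_a, source_b)
print(len(combined), density_per_1000(len(combined), 1500))
```

In this toy input the grocery appears in both sources and is counted once, so the combined list holds more establishments than either source alone. Records that fail an automated rule like this one would be the natural candidates for manual review.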

