University of Pittsburgh, USA.
Health Informatics J. 2019 Dec;25(4):1314-1324. doi: 10.1177/1460458217754242. Epub 2018 Feb 5.
Waterpipe tobacco smoking has grown in popularity among US college students and is associated with serious health risks. Much of this smoking takes place in establishments such as "hookah bars" or in lounge settings. Web-based data platforms such as Yelp have demonstrated utility in locating these establishments but are prone to over- and underestimation. The purpose of this study was to optimize strategies for algorithmically estimating the prevalence of waterpipe tobacco smoking establishments. We conducted searches for potential waterpipe tobacco smoking establishments near highly residential US universities (n = 41). Of 521 potential establishments, independent coders confirmed 257 as permitting waterpipe tobacco smoking. We compared four strategies for using Yelp metadata to estimate the number of confirmed waterpipe tobacco smoking establishments by location. An accuracy-weighted approach generated estimates that closely matched confirmed data without significant over- or underestimation. The use of algorithms such as these may dramatically improve the feasibility and efficacy of future research linking environmental data and health outcomes.
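The abstract does not spell out how the accuracy-weighted strategy was computed. As a rough illustration only, one plausible reading is that each Yelp listing contributes a fractional count equal to the share of coder-confirmed establishments among listings with the same metadata. Note that only 257 of 521 candidates (about 49%) were confirmed, so a raw count of search hits would overestimate roughly twofold. In the sketch below, the category labels, the function names `fit_category_weights` and `estimate_count`, and the sample data are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def fit_category_weights(coded_listings):
    """Return, per metadata category, the fraction of listings coders confirmed.

    coded_listings: iterable of (category, confirmed) pairs, where
    `confirmed` is True if coders verified waterpipe smoking is permitted.
    Assumption: a single category label per listing stands in for the
    Yelp metadata used in the study.
    """
    totals = defaultdict(int)
    confirmed_counts = defaultdict(int)
    for category, confirmed in coded_listings:
        totals[category] += 1
        confirmed_counts[category] += int(confirmed)
    return {c: confirmed_counts[c] / totals[c] for c in totals}


def estimate_count(categories, weights, default_weight=0.0):
    """Estimate confirmed establishments at one location.

    Each listing adds its category's confirmation rate instead of 1,
    which is what makes the estimate "accuracy-weighted" in this sketch.
    Unseen categories fall back to `default_weight`.
    """
    return sum(weights.get(c, default_weight) for c in categories)


# Hypothetical coded sample (not the study's data).
coded = [
    ("hookah_bars", True), ("hookah_bars", True),
    ("hookah_bars", True), ("hookah_bars", False),
    ("lounges", True), ("lounges", False),
    ("lounges", False), ("lounges", False),
]
weights = fit_category_weights(coded)

# Hypothetical search results near one campus.
campus_listings = ["hookah_bars", "hookah_bars", "lounges", "lounges"]
estimate = estimate_count(campus_listings, weights)
print(weights, estimate)
```

With these made-up inputs, four raw search hits shrink to an estimate of two establishments, because lounge listings are confirmed far less often than hookah bar listings in the hypothetical sample.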