Miaou Shaw-Pin, Song Joon Jin
Texas Transportation Institute, Texas A&M University System, 3135 TAMU, College Station, TX 77843-3135, USA.
Accid Anal Prev. 2005 Jul;37(4):699-720. doi: 10.1016/j.aap.2005.03.012. Epub 2005 Apr 12.
In recent years, there has been renewed interest in applying statistical ranking criteria to identify sites on a road network that potentially present high traffic crash risks, or are over-represented in certain types of crashes, for further engineering evaluation and safety improvement. This requires good estimates of the ranks of crash risks at individual intersections, road segments, or analysis zones. The site ranking problem in roadway safety is related to two well-established statistical problems: the small area (or domain) estimation problem and the disease mapping problem. The former arises when sample survey data are used to provide estimates for a small geographical area or a small socio-demographic group within a large area, while the latter stems from estimating rare disease incidences for typically small geographical areas. The statistical problem is that direct estimates of certain parameters associated with a site (or a group of sites) cannot be produced with adequate precision because of a small available sample size, the rareness of the event of interest, and/or a small exposed population or sub-population. Model-based approaches offer several advantages for these estimation problems, including increased precision by "borrowing strength" across sites based on available auxiliary variables, including their relative locations in space. Within the model-based approach, generalized linear mixed models (GLMM) have played key roles in addressing these problems for many years. The objective of the study on which this paper is based was to explore some of the issues raised in recent roadway safety studies regarding ranking methodologies in light of recent statistical developments in space-time GLMM. First, general ranking approaches are reviewed, including naïve or raw crash-risk ranking, scan-based ranking, and model-based ranking.
Through simulations, the limitation of using the naïve approach in ranking is illustrated. Second, following the model-based approach, the choice of decision parameters and the consideration of treatability are discussed. Third, several statistical ranking criteria that have been used in biomedical, health, and other scientific studies are presented from a Bayesian perspective. Their applications in roadway safety are then demonstrated using two data sets: one for individual urban intersections and one for rural two-lane roads at the county level. As part of the demonstration, it is shown how multivariate spatial GLMM can be used to model traffic crashes of several injury severity types simultaneously, and how the model can be used within a Bayesian framework to rank sites by crash cost per vehicle-mile traveled (instead of by crash frequency rate). Finally, the significant impact of spatial effects on overall model goodness-of-fit and site ranking performance is discussed for the two data sets examined. The paper concludes with a discussion of possible directions in which the study can be extended.
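The contrast between naïve ranking and model-based Bayesian ranking can be illustrated with a small simulation. The sketch below is not the paper's space-time or multivariate GLMM: it assumes a much simpler, non-spatial gamma-Poisson model with known prior parameters, and every function name, parameter value, and exposure range in it is hypothetical. It shows only the general idea: sites with little exposure produce noisy raw rates, while ranking on posterior draws (here, by posterior expected rank) borrows strength across sites.

```python
import numpy as np

def simulate_sites(n_sites=200, shape=2.0, scale=0.5, seed=0):
    """Simulate true crash rates, exposures, and observed crash counts.

    Assumed setup (illustrative only): rates ~ Gamma(shape, scale),
    exposure ~ Uniform(0.2, 20), crashes ~ Poisson(rate * exposure).
    """
    rng = np.random.default_rng(seed)
    true_rate = rng.gamma(shape, scale, size=n_sites)
    exposure = rng.uniform(0.2, 20.0, size=n_sites)
    crashes = rng.poisson(true_rate * exposure)
    return true_rate, exposure, crashes

def naive_rate(crashes, exposure):
    """Raw crash rate: observed crashes divided by exposure."""
    return crashes / exposure

def posterior_draws(crashes, exposure, shape, scale, n_draws=2000, seed=1):
    """Draw from each site's posterior rate under the conjugate model.

    A Gamma(shape, rate=1/scale) prior with a Poisson likelihood gives a
    Gamma(shape + y, rate=1/scale + exposure) posterior. The prior
    parameters are treated as known, which is a simplifying assumption.
    """
    rng = np.random.default_rng(seed)
    post_shape = shape + crashes
    post_scale = 1.0 / (1.0 / scale + exposure)
    return rng.gamma(post_shape, post_scale, size=(n_draws, crashes.size))

def posterior_expected_rank(draws):
    """Average rank of each site across draws (rank 1 = lowest rate)."""
    ranks = draws.argsort(axis=1).argsort(axis=1) + 1
    return ranks.mean(axis=0)

def top_k_hits(score, true_rate, k):
    """Number of the k truly worst sites found in the top k by score."""
    top_true = set(np.argsort(-true_rate)[:k].tolist())
    top_score = set(np.argsort(-score)[:k].tolist())
    return len(top_true & top_score)

if __name__ == "__main__":
    true_rate, exposure, crashes = simulate_sites()
    draws = posterior_draws(crashes, exposure, shape=2.0, scale=0.5)
    per = posterior_expected_rank(draws)
    k = 20
    print("naive top-k hits:", top_k_hits(naive_rate(crashes, exposure), true_rate, k))
    print("posterior expected rank top-k hits:", top_k_hits(per, true_rate, k))
```

Ranking on the posterior expected rank, rather than on a point estimate, is one of several possible Bayesian criteria; the probability that a site falls among the worst k could be computed from the same draws.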