Sartin Emma B, Metzger Kristina B, Pfeiffer Melissa R, Myers Rachel K, Curry Allison E
Center for Injury Research and Prevention, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania.
Division of Emergency Medicine, Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania.
Traffic Inj Prev. 2021;22(sup1):S32-S37. doi: 10.1080/15389588.2021.1955109. Epub 2021 Aug 17.
Racial and ethnic disparities and/or inequities have been documented in traffic safety research. However, race/ethnicity data are often not captured in population-level traffic safety databases, limiting the field's ability to comprehensively study racial/ethnic differences in transportation outcomes, as well as our ability to mitigate them. To overcome this limitation, we explored the utility of estimating race and ethnicity for drivers in the New Jersey Safety and Health Outcomes (NJ-SHO) data warehouse using the Bayesian Improved Surname Geocoding (BISG) algorithm. In addition, we summarize important recommendations established to guide researchers developing and implementing racial and ethnic disparity research.
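As a rough illustration of the idea behind BISG, the sketch below applies Bayes' rule to combine a surname-based prior with the share of each group living in a geographic unit. The function name, category labels, and all numbers are hypothetical placeholders, not values from NJ-SHO or any census file, and the abstract does not describe the authors' actual implementation.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine a surname prior with geography via Bayes' rule.

    p_race_given_surname: P(category | surname), e.g. from a surname list.
    p_geo_given_race: P(geographic unit | category), i.e. the share of each
        category's population that lives in the driver's geographic unit.
    Returns P(category | surname, geographic unit), normalized to sum to 1.
    """
    unnormalized = {
        cat: p_race_given_surname[cat] * p_geo_given_race[cat]
        for cat in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no category has nonzero support")
    return {cat: value / total for cat, value in unnormalized.items()}


# Hypothetical inputs for two placeholder categories "A" and "B".
surname_prior = {"A": 0.5, "B": 0.5}
geo_share = {"A": 0.001, "B": 0.003}
posterior = bisg_posterior(surname_prior, geo_share)
```

The output is a probability distribution over categories for each driver rather than a single assigned label, which is why the evaluation compares reported values against probabilities.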
We applied BISG to estimate population-level race/ethnicity for New Jersey drivers in 2017. We then evaluated the concordance between reported values available in integrated administrative sources (e.g., hospital records) and BISG probability distributions using the area under the receiver operating characteristic curve (AUC) within each race/ethnicity category. Overall AUC was calculated by weighting each category's AUC by the population count in that reported category. In an exemplar analysis, we calculated average monthly police-reported crash rates in 2017 by race/ethnicity using both the NJ-SHO reported values and the BISG estimates, and compared the results.
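The concordance evaluation described above can be sketched as follows. This is a minimal illustration, assuming a one-vs-rest AUC per category computed from the BISG probability for that category, followed by a count-weighted average. The function names and toy data are hypothetical, and the pairwise loop is quadratic, so a rank-based routine would be needed at population scale.

```python
from collections import Counter


def auc_one_vs_rest(scores, is_positive):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def weighted_overall_auc(reported, probs):
    """Per-category AUC plus an overall AUC weighted by reported counts."""
    counts = Counter(reported)
    per_category = {}
    for cat in counts:
        scores = [p[cat] for p in probs]
        labels = [r == cat for r in reported]
        per_category[cat] = auc_one_vs_rest(scores, labels)
    total = sum(counts.values())
    overall = sum(per_category[c] * counts[c] for c in counts) / total
    return per_category, overall


# Toy data: reported category and estimated probabilities per driver.
reported = ["A", "A", "B", "C"]
probs = [
    {"A": 0.80, "B": 0.10, "C": 0.10},
    {"A": 0.25, "B": 0.45, "C": 0.30},
    {"A": 0.20, "B": 0.60, "C": 0.20},
    {"A": 0.30, "B": 0.10, "C": 0.60},
]
per_category, overall = weighted_overall_auc(reported, probs)
```

Weighting by reported counts means the overall value is dominated by the largest categories, so it should be read alongside the per-category values.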
We found excellent or outstanding concordance (AUC ≥ 0.86) between reported race/ethnicity and BISG probabilities for White, Hispanic, Black, and Asian/Pacific Islander drivers. Concordance was poor for American Indian/Alaskan Native drivers (AUC = 0.65) and no better than random assignment for Multiracial drivers (AUC = 0.52). Among White, Hispanic, Asian/Pacific Islander, and American Indian/Alaskan Native drivers, monthly crash rates calculated from NJ-SHO reported race/ethnicity values were similar to those calculated from BISG probabilities. Monthly crash rates differed by 11% for Black drivers and by more than 200% for Multiracial drivers.
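One way such a comparison could be carried out is sketched below. It assumes, as an illustration only, that the BISG-based rate weights each driver by the estimated probability of belonging to the category; the abstract does not state how the authors computed rates from the probabilities. All names and numbers are hypothetical.

```python
def rate_from_reported(drivers, category):
    """Crashes per driver among drivers whose reported value is `category`."""
    group = [d for d in drivers if d["reported"] == category]
    if not group:
        raise ValueError("no drivers reported in this category")
    return sum(d["crashes"] for d in group) / len(group)


def rate_from_probabilities(drivers, category):
    """Crashes per driver, weighting every driver by P(category)."""
    weight = sum(d["probs"][category] for d in drivers)
    if weight == 0:
        raise ValueError("no probability mass in this category")
    crashes = sum(d["probs"][category] * d["crashes"] for d in drivers)
    return crashes / weight


def percent_difference(reference, estimate):
    """Relative difference of `estimate` from `reference`, in percent."""
    return 100.0 * (estimate - reference) / reference


# Toy month of data for two placeholder categories "A" and "B".
drivers = [
    {"reported": "A", "probs": {"A": 0.9, "B": 0.1}, "crashes": 1},
    {"reported": "A", "probs": {"A": 0.6, "B": 0.4}, "crashes": 0},
    {"reported": "B", "probs": {"A": 0.3, "B": 0.7}, "crashes": 1},
    {"reported": "B", "probs": {"A": 0.2, "B": 0.8}, "crashes": 0},
]
reported_rate = rate_from_reported(drivers, "A")
bisg_rate = rate_from_probabilities(drivers, "A")
difference = percent_difference(reported_rate, bisg_rate)
```

Under this weighting, a category with diffuse probabilities borrows crashes from drivers who are mostly likely to belong elsewhere, which is one plausible reason rates can diverge sharply where concordance is low.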
For White, Hispanic, Black, and Asian/Pacific Islander drivers (98.9% of all drivers in this sample), reported race/ethnicity and BISG probabilities showed excellent or outstanding concordance and yielded mostly similar crash rates. These findings demonstrate the potential utility of BISG in enabling research on transportation disparities and inequities. Concordance was not acceptable for American Indian/Alaskan Native and Multiracial drivers, which is consistent with previous applications and evaluations of BISG. Future work is needed to determine the extent to which BISG may be applied to traffic safety contexts.