Facilitating research on racial and ethnic disparities and inequities in transportation: Application and evaluation of the Bayesian Improved Surname Geocoding (BISG) algorithm.

Authors

Sartin Emma B, Metzger Kristina B, Pfeiffer Melissa R, Myers Rachel K, Curry Allison E

Affiliations

Center for Injury Research and Prevention, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania.

Division of Emergency Medicine, Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, Pennsylvania.

Publication information

Traffic Inj Prev. 2021;22(sup1):S32-S37. doi: 10.1080/15389588.2021.1955109. Epub 2021 Aug 17.

DOI:10.1080/15389588.2021.1955109

PMID:34402327

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8792156/

Abstract

OBJECTIVE

Racial and ethnic disparities and/or inequities have been documented in traffic safety research. However, race/ethnicity data are often not captured in population-level traffic safety databases, limiting the field's ability to comprehensively study racial/ethnic differences in transportation outcomes, as well as our ability to mitigate them. To overcome this limitation, we explored the utility of estimating race and ethnicity for drivers in the New Jersey Safety and Health Outcomes (NJ-SHO) data warehouse using the Bayesian Improved Surname Geocoding (BISG) algorithm. In addition, we summarize important recommendations established to guide researchers developing and implementing racial and ethnic disparity research.
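As a rough illustration of the idea behind BISG, the sketch below applies Bayes' rule to combine a surname-based prior with a geography-based likelihood. It is a minimal sketch, not the authors' implementation: the function name `bisg_posterior`, the category labels, and every probability are hypothetical values chosen for illustration only.

```python
# Minimal sketch of the BISG Bayes update. All numbers below are
# hypothetical and do not come from Census data or from the study.

def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Posterior P(race | surname, geography), proportional to
    P(race | surname) * P(geography | race)."""
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no category has a nonzero joint probability")
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical prior from a surname list: P(race | surname).
surname_prior = {
    "White": 0.05, "Hispanic": 0.90, "Black": 0.01,
    "API": 0.02, "AIAN": 0.01, "Multiracial": 0.01,
}
# Hypothetical share of each group's population living in the
# driver's census area: P(geography | race).
geo_likelihood = {
    "White": 0.0002, "Hispanic": 0.0010, "Black": 0.0004,
    "API": 0.0001, "AIAN": 0.0001, "Multiracial": 0.0003,
}

post = bisg_posterior(surname_prior, geo_likelihood)
for race, prob in sorted(post.items(), key=lambda kv: -kv[1]):
    print(f"{race}: {prob:.4f}")
```

The output is a probability distribution over categories for each driver rather than a single assigned label, which is why the evaluation compares reported values against probabilities.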

METHODS

We applied BISG to estimate population-level race/ethnicity for New Jersey drivers in 2017. We evaluated concordance between reported values available in integrated administrative sources (e.g., hospital records) and BISG probability distributions using the area under the receiver operating characteristic curve (AUC) within each race/ethnicity category. Overall AUC was calculated by weighting each category's AUC by the population count in that reported category. In an exemplar analysis, we calculated average monthly police-reported crash rates in 2017 by race/ethnicity using both the NJ-SHO reported values and the BISG probabilities, and compared the results.
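The concordance evaluation described here can be sketched as a one-vs-rest AUC per category, followed by a count-weighted average. This is a minimal sketch under stated assumptions: the helper names `auc` and `weighted_overall_auc` and the four toy drivers are hypothetical, and the AUC is computed with the pairwise (Mann-Whitney) formulation rather than whatever software the authors used.

```python
# Sketch: per-category AUC and population-weighted overall AUC.
# The toy drivers and probabilities are hypothetical.

def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative,
    counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_overall_auc(reported, probs):
    """reported: list of reported categories, one per driver.
    probs: list of dicts mapping category -> BISG probability."""
    per_category, counts = {}, {}
    for cat in sorted(set(reported)):
        labels = [r == cat for r in reported]      # one-vs-rest truth
        scores = [p[cat] for p in probs]           # BISG probability of cat
        per_category[cat] = auc(scores, labels)
        counts[cat] = sum(labels)
    # Weight each category's AUC by its reported population count.
    overall = sum(per_category[c] * counts[c] for c in counts) / len(reported)
    return per_category, overall


reported = ["White", "White", "Black", "Hispanic"]
probs = [
    {"White": 0.80, "Black": 0.10, "Hispanic": 0.10},
    {"White": 0.25, "Black": 0.65, "Hispanic": 0.10},
    {"White": 0.30, "Black": 0.60, "Hispanic": 0.10},
    {"White": 0.20, "Black": 0.10, "Hispanic": 0.70},
]
per_category, overall = weighted_overall_auc(reported, probs)
print(per_category, overall)
```

Because the weights are the reported category counts, the overall value is dominated by the largest groups, so poor concordance in a small category has little effect on it.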

RESULTS

We found excellent or outstanding concordance (AUC ≥0.86) between reported race/ethnicity and BISG probabilities for White, Hispanic, Black, and Asian/Pacific Islander drivers. We found poor concordance for American Indian/Alaskan Native drivers (AUC = 0.65), and concordance was no better than random assignment for Multiracial drivers (AUC = 0.52). Among White, Hispanic, Asian/Pacific Islander, and American Indian/Alaskan Native drivers, monthly crash rates calculated using NJ-SHO reported race/ethnicity values and using BISG probabilities were similar. Monthly crash rates differed by 11% for Black drivers, and by more than 200% for Multiracial drivers.
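One way such a comparison could be carried out is sketched below: a crash rate computed within the reported category is set against a rate in which every driver contributes in proportion to their BISG probability for that category. This probability-weighting scheme is an assumption made for illustration; the abstract does not state how the BISG-based rates were computed, and the function names and toy drivers are hypothetical.

```python
# Sketch: crash rate by reported category vs. by BISG probability weight.
# The weighting scheme and all data are assumptions for illustration.

def crash_rate_by_reported(drivers, category):
    """Crashes per driver among drivers whose reported value is category."""
    group = [d for d in drivers if d["reported"] == category]
    return sum(d["crashes"] for d in group) / len(group)


def crash_rate_by_probability(drivers, category):
    """Crashes per driver, weighting each driver by P(category)."""
    weight = sum(d["probs"][category] for d in drivers)
    weighted_crashes = sum(d["probs"][category] * d["crashes"] for d in drivers)
    return weighted_crashes / weight


def percent_difference(reference, estimate):
    """Relative difference of estimate from reference, in percent."""
    return 100.0 * (estimate - reference) / reference


drivers = [
    {"reported": "Black", "probs": {"Black": 0.9, "White": 0.1}, "crashes": 1},
    {"reported": "Black", "probs": {"Black": 0.5, "White": 0.5}, "crashes": 0},
    {"reported": "White", "probs": {"Black": 0.1, "White": 0.9}, "crashes": 0},
    {"reported": "White", "probs": {"Black": 0.1, "White": 0.9}, "crashes": 1},
]

rate_reported = crash_rate_by_reported(drivers, "Black")
rate_bisg = crash_rate_by_probability(drivers, "Black")
diff = percent_difference(rate_reported, rate_bisg)
print(rate_reported, rate_bisg, diff)
```

In a sketch like this, the two rates diverge when drivers with a high probability for a category differ in crash involvement from drivers who actually report it, which may help explain larger gaps for categories with poor concordance.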

CONCLUSION

For White, Hispanic, Black, and Asian/Pacific Islander drivers (98.9% of all drivers in this sample), reported race/ethnicity and BISG probabilities showed excellent or outstanding concordance and yielded mostly similar crash rates, demonstrating the potential utility of BISG in enabling research on transportation disparities and inequities. Concordance between race/ethnicity values was not acceptable for American Indian/Alaskan Native and Multiracial drivers, which is consistent with previous applications and evaluations of BISG. Future work is needed to determine the extent to which BISG may be applied to traffic safety contexts.
