Tsai Thomas C, Arik Sercan, Jacobson Benjamin H, Yoon Jinsung, Yoder Nate, Sava Dario, Mitchell Margaret, Graham Garth, Pfister Tomas
Department of Health Policy and Management, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
Department of Surgery, Brigham and Women's Hospital, Boston, MA, USA.
NPJ Digit Med. 2022 May 10;5(1):59. doi: 10.1038/s41746-022-00602-z.
Racial and ethnic minorities have borne a particularly acute burden of the COVID-19 pandemic in the United States. There is a growing awareness among both researchers and public health leaders of the critical need to ensure fairness in forecast results. Without careful and deliberate bias mitigation, inequities embedded in data can be transferred to model predictions, perpetuating disparities and exacerbating the disproportionate harms of the COVID-19 pandemic. These biases in data and forecasts can be viewed through both statistical and sociological lenses, and the challenges of both building hierarchical models with limited data availability and drawing on data that reflect structural inequities must be confronted. We present an outline of key modeling domains in which unfairness may be introduced and draw on our experience building and testing the Google-Harvard COVID-19 Public Forecasting model to illustrate these challenges and offer strategies to address them. While targeted toward pandemic forecasting, these domains of potentially biased modeling and concurrent approaches to pursuing fairness present important considerations for equitable machine-learning innovation.