Idaikkadar Nimi, Bodin Eva, Cholli Preetam, Navon Livia, Ortmann Leonard, Banja John, Waller Lance A, Alic Alen, Yuan Keming, Law Royal
Division of Injury Prevention, National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, Atlanta, GA, USA.
Office of Readiness and Response, Immediate Office of the Director, Centers for Disease Control and Prevention, Atlanta, GA, USA.
Public Health Rep. 2025 Jan 20:333549241312055. doi: 10.1177/00333549241312055.
Data science is an emerging field that provides new analytical methods. It incorporates novel data sources (eg, internet data) and methods (eg, machine learning) that offer valuable and timely insights into public health issues, including injury and violence prevention. The objective of this research was to describe ethical considerations for public health data scientists conducting injury and violence prevention-related data science projects to prevent unintended ethical, legal, and social consequences, such as loss of privacy or loss of public trust. We first reviewed foundational bioethics and public health ethics literature to identify key ethical concepts relevant to public health data science. After identifying these ethics concepts, we held a series of discussions to organize them under broad ethical domains. Within each domain, we examined relevant ethics concepts from our review of the primary literature. Lastly, we developed questions for each ethical domain to facilitate the early conceptualization stage of the ethical analysis of injury and violence prevention projects. We identified 4 ethical domains: privacy, responsible stewardship, justice as fairness, and inclusivity and engagement. We determined that each domain carries equal weight, with no consideration bearing more importance than the others. Examples of ethical considerations are clearly identifying project goals, determining whether people included in projects are at risk of reidentification through external sources or linkages, and evaluating and minimizing the potential for bias in data sources used. As data science methodologies are incorporated into public health research to work toward reducing the effect of injury and violence on individuals, families, and communities in the United States, we recommend that relevant ethical issues be identified, considered, and addressed.