Jackson Latifa, Kuhlman Caitlin, Jackson Fatimah, Fox P Keolu
Department of Pediatrics and Child Health, College of Medicine, Howard University, Washington, DC, United States.
W. Montague Cobb Research Laboratory, College of Arts and Sciences, Howard University, Washington, DC, United States.
Front Big Data. 2019 Jun 28;2:19. doi: 10.3389/fdata.2019.00019. eCollection 2019.
Data science has made great strides in harnessing the power of big data to improve human life across a broad spectrum of disciplines. Unfortunately, this informational richness is not equitably spread across human populations. Vulnerable populations remain both under-studied and under-consulted on the use of data derived from their communities. This lack of inclusion of vulnerable populations as data collectors, data analyzers, and data beneficiaries significantly restrains the utility of big data applications that contribute to human wellness. Here we present three case studies: (1) describing a novel genomic dataset being developed with clinical and ethnographic insights in African Americans, (2) demonstrating how a tutorial enables data scientists from vulnerable populations to better understand criminal justice bias using the COMPAS dataset, and (3) investigating how Indigenous genomic diversity contributes to future biomedical interventions. These cases represent some of the outstanding challenges that big data science presents when addressing vulnerable populations, as well as the innovative solutions that expanding science participation brings.