Conscientious Classification: A Data Scientist's Guide to Discrimination-Aware Classification.

Affiliations

1 Department of Data Science, Zocdoc, New York, New York.

2 NYU Center for Data Science, New York, New York.

Publication information

Big Data. 2017 Jun;5(2):120-134. doi: 10.1089/big.2016.0048.

Abstract

Recent research has helped to cultivate growing awareness that machine-learning systems fueled by big data can create or exacerbate troubling disparities in society. Much of this research comes from outside of the practicing data science community, leaving its members with little concrete guidance to proactively address these concerns. This article introduces issues of discrimination to the data science community on its own terms. In it, we tour the familiar data-mining process while providing a taxonomy of common practices that have the potential to produce unintended discrimination. We also survey how discrimination is commonly measured, and suggest how familiar development processes can be augmented to mitigate systems' discriminatory potential. We advocate that data scientists should be intentional about modeling and reducing discriminatory outcomes. Without doing so, their efforts will result in perpetuating any systemic discrimination that may exist, but under a misleading veil of data-driven objectivity.
