Kamikubo Rie, Dwivedi Utkarsh, Kacorri Hernisa
College of Information Studies, University of Maryland, College Park.
ASSETS. 2021;1. doi: 10.1145/3441852.3471208.
Datasets sourced from people with disabilities and older adults play an important role in innovation, benchmarking, and mitigating bias for both assistive and inclusive AI-infused applications. However, they are scarce. We conduct a systematic review of 137 accessibility datasets manually located across different disciplines over the last 35 years. Our analysis highlights how researchers navigate tensions between benefits and risks in data collection and sharing. We uncover patterns in data collection purpose, terminology, sample size, data types, and data sharing practices across communities of focus. We conclude by critically reflecting on challenges and opportunities related to locating and sharing accessibility datasets, and we call for technical, legal, and institutional privacy frameworks that are more attuned to concerns from these communities.