Hong Jonggi, Gandhi Jaina, Mensah Ernest Essuah, Zeraati Farnaz Zamiri, Jarjue Ebrima Haddy, Lee Kyungjun, Kacorri Hernisa
Smith-Kettlewell Eye Research Institute, San Francisco, United States.
University of Maryland, College Park, United States.
ASSETS. 2022 Oct;2022. doi: 10.1145/3517428.3544824. Epub 2022 Oct 22.
Teachable object recognizers provide a solution for a very practical need for blind people: instance-level object recognition. However, they assume that users can visually inspect the photos they provide for training, a critical step that is inaccessible to those who are blind. In this work, we engineer data descriptors that address this challenge. They indicate in real time whether the object in the photo is cropped or too small, whether a hand is included, whether the photo is blurred, and how much the photos vary from each other. Our descriptors are built into an open-source testbed iOS app called MYCam. In a remote user study conducted in the homes of blind participants (N = 12), we show how descriptors, even when error-prone, support experimentation and have a positive impact on the quality of the training set that can translate to model performance, though this gain is not uniform. Participants found the app simple to use, indicating that they could effectively train it and that the descriptors were useful. However, many found the training tedious, opening discussions around the need for balance between information, time, and cognitive load.