Training AI to Recognize Objects of Interest to the Blind and Low Vision Community.

Sankarnarayanan Tharangini, Paciorkowski Lev, Parikh Khevna, Hamilton-Fletcher Giles, Feng Chen, Sheng Diwei, Hudson Todd E, Rizzo John-Ross, Chan Kevin C

Annu Int Conf IEEE Eng Med Biol Soc. 2023 Jul;2023:1-4. doi: 10.1109/EMBC40787.2023.10340454.

Recent object detection models show promising advances in their architecture and performance, expanding potential applications for the benefit of persons with blindness or low vision (pBLV). However, object detection models are usually trained on generic data rather than on datasets that focus on the needs of pBLV. Hence, for applications that locate objects of interest to pBLV, object detection models need to be trained specifically for this purpose. Informed by prior interviews, questionnaires, and Microsoft's ORBIT research, we identified thirty-five objects pertinent to pBLV. We used this user-centric feedback to gather images of these objects from the Google Open Images V6 dataset. We subsequently trained a YOLOv5x model on this dataset to recognize these objects of interest. We demonstrate that the model can identify objects that previous generic models could not, such as those related to tasks of daily functioning, e.g., coffee mug, knife, fork, and glass. Crucially, we show that careful pruning of a dataset with severe class imbalances leads to a rapid, noticeable two-fold improvement in the overall performance of the model, as measured using the mean average precision at intersection over union thresholds from 0.5 to 0.95 (mAP50-95). Specifically, mAP50-95 improved from 0.14 to 0.36 on the seven least prevalent classes in the training dataset. Overall, we show that careful curation of training data can improve training speed and object detection outcomes. We show clear directions on effectively customizing training data to create models that focus on the desires and needs of pBLV.

Clinical Relevance: This work demonstrated the benefits of developing assistive AI technology customized to individual users or the wider BLV community.
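The abstract does not describe how the imbalanced dataset was pruned, so the following is only a minimal sketch of one plausible approach: greedily capping how many instances of each class are kept, while visiting images that contain rare classes first. The function name `prune_to_cap`, the `cap` parameter, and the toy file names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def prune_to_cap(image_labels, cap):
    """Greedy downsampling sketch (an assumption, not the authors' method).

    image_labels maps an image id to the list of class names annotated in it.
    An image is kept only if it contains at least one class whose kept
    instance count is still below `cap`.
    """
    # Instance counts per class over the full, unpruned dataset.
    totals = Counter(c for labels in image_labels.values() for c in labels)

    # Ignore unlabeled images, then visit images holding the rarest classes
    # first so that rare classes are filled before common ones.
    labeled = (k for k, v in image_labels.items() if v)
    order = sorted(
        labeled,
        key=lambda k: (min(totals[c] for c in image_labels[k]), k),
    )

    kept, counts = [], Counter()
    for img in order:
        labels = image_labels[img]
        if any(counts[c] < cap for c in labels):
            kept.append(img)
            # A kept image contributes all of its labels, so a common class
            # can end slightly above the cap when it co-occurs with a rare one.
            counts.update(labels)
    return kept, counts


if __name__ == "__main__":
    toy = {
        "a.jpg": ["person", "person", "mug"],
        "b.jpg": ["person"],
        "c.jpg": ["person"],
        "d.jpg": ["person", "fork"],
        "e.jpg": ["person"],
    }
    kept, counts = prune_to_cap(toy, cap=3)
    print(kept)
    print(dict(counts))
```

In this toy input, images that contain only the dominant class are dropped once that class reaches the cap, while every image containing a rare class is retained. A real pipeline would also need to rewrite the label files and keep the validation split untouched.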


