Cheema Maryam, Seifi Hasti, Fazli Pooyan
Arizona State University, Tempe, Arizona, USA.
DIS (Des Interact Syst Conf). 2025 Jul;2025:458-474. doi: 10.1145/3715336.3735685. Epub 2025 Jul 4.
Audio descriptions (AD) make videos accessible for blind and low vision (BLV) users by describing visual elements that cannot be understood from the main audio track. Creating AD, whether by professionals or novice describers, is time-consuming, and the resulting descriptions offer BLV viewers little customization or control over their length, content, and timing. To address this gap, we explore user-driven AI-generated descriptions, enabling BLV viewers to control both the timing and level of detail of the descriptions they receive. In a study, 20 BLV participants activated audio descriptions for seven different video genres with two levels of detail: concise and detailed. Our findings reveal how the preferred frequency and level of detail of ADs differ across videos, how participants experienced their sense of control with this style of AD delivery, and what its limitations are. We discuss the implications of these findings for the development of future AD tools for BLV users.