Romić Krešimir, Leventić Hrvoje, Habijan Marija, Galić Irena
Faculty of Electrical Engineering, Computer Science and Information Technology Osijek, Kneza Trpimira 2B, Osijek HR-31000, Croatia.
Data Brief. 2025 Jun 7;61:111755. doi: 10.1016/j.dib.2025.111755. eCollection 2025 Aug.
This article presents a new dataset for crosswalk segmentation targeting assistive technologies for visually impaired individuals. The dataset combines synthetic and real-world first-person view images with corresponding binary segmentation masks. The synthetic portion contains 3000 images generated using a fine-tuned Stable Diffusion model, with 1500 images created using a standard prompt ("a crosswalk image") and 1500 additional images incorporating various environmental conditions (sunny, cloudy, rainy, and night) through specialized prompts. The real-world component comprises 300 images extracted from chest-mounted smartphone video recordings of pedestrians approaching crosswalks, carefully distributed across different environmental conditions (120 sunny, 60 cloudy, 60 rainy, and 60 night images). To ensure diversity, each physical crosswalk location appears in at most two images from different approach directions. All images in both synthetic and real-world sets were manually annotated using a custom interface where annotators defined crosswalk regions as quadrilateral polygons, creating binary masks. The dataset is organized hierarchically by image source (synthetic/real-world) and environmental condition, with consistent subfolder structures for images and their corresponding masks. This dataset addresses the scarcity of publicly available crosswalk segmentation data with environmental diversity and has potential applications in developing and benchmarking computer vision algorithms for assistive navigation systems, investigating synthetic data augmentation efficacy, and advancing pedestrian safety technologies.
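The annotation step described above, where a quadrilateral drawn by an annotator becomes a binary mask, can be illustrated with a minimal sketch. The abstract does not describe how the authors' custom interface rasterizes polygons, so the approach below (an even-odd test on pixel centers, in plain Python) is an assumption, and the names `quad_to_mask` and `mask_area` are hypothetical.

```python
def quad_to_mask(quad, width, height):
    """Rasterize a quadrilateral into a binary mask (list of rows of 0/1).

    quad: four (x, y) vertices listed in order around the polygon,
          in pixel coordinates (hypothetical annotation format).
    A pixel is set to 1 when its center lies inside the polygon,
    using the even-odd (ray casting) rule.
    """
    if len(quad) != 4:
        raise ValueError("expected exactly four vertices")
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        py = row + 0.5  # y coordinate of the pixel center
        for col in range(width):
            px = col + 0.5  # x coordinate of the pixel center
            inside = False
            for i in range(4):
                x1, y1 = quad[i]
                x2, y2 = quad[(i + 1) % 4]
                # Only edges that straddle the horizontal line y = py can be crossed.
                if (y1 > py) != (y2 > py):
                    # x position where the edge crosses that horizontal line.
                    x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                    if px < x_cross:
                        inside = not inside
            mask[row][col] = 1 if inside else 0
    return mask


def mask_area(mask):
    """Count the pixels labeled as crosswalk."""
    return sum(sum(row) for row in mask)


# Example: a trapezoid, roughly how a crosswalk appears in a first-person view.
trapezoid = [(6, 2), (10, 2), (14, 7), (2, 7)]
for row in quad_to_mask(trapezoid, 16, 8):
    print("".join("#" if v else "." for v in row))
```

This per-pixel test is simple rather than fast; for full-resolution images, a library polygon fill would likely be a better choice.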