Suppr超能文献

用于眼镜检测的多样化数据集:扩展Flickr人脸高质量(FFHQ)数据集

Diverse Dataset for Eyeglasses Detection: Extending the Flickr-Faces-HQ (FFHQ) Dataset.

作者信息

Matuzevičius Dalius

机构信息

Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), 10105 Vilnius, Lithuania.

出版信息

Sensors (Basel). 2024 Dec 1;24(23):7697. doi: 10.3390/s24237697.

Abstract

Facial analysis is an important area of research in computer vision and machine learning, with applications spanning security, healthcare, and user interaction systems. The data-centric AI approach emphasizes the importance of high-quality, diverse, and well-annotated datasets in driving advancements in this field. However, current facial datasets, such as Flickr-Faces-HQ (FFHQ), lack detailed annotations for detecting facial accessories, particularly eyeglasses. This work addresses this limitation by extending the FFHQ dataset with precise bounding box annotations for eyeglasses detection, enhancing its utility for data-centric AI applications. The extended dataset comprises 70,000 images, including over 16,000 images containing eyewear, and it exceeds the CelebAMask-HQ dataset in size and diversity. A semi-automated protocol was employed to efficiently generate accurate bounding box annotations, minimizing the demand for extensive manual labeling. This enriched dataset serves as a valuable resource for training and benchmarking eyewear detection models. Additionally, the baseline benchmark results for eyeglasses detection were presented using deep learning methods, including YOLOv8 and MobileNetV3. The evaluation, conducted through cross-dataset validation, demonstrated the robustness of models trained on the extended FFHQ dataset with their superior performances over existing alternative CelebAMask-HQ. The extended dataset, which has been made publicly available, is expected to support future research and development in eyewear detection, contributing to advancements in facial analysis and related fields.

摘要

面部分析是计算机视觉和机器学习领域的一个重要研究方向,其应用涵盖安全、医疗保健和用户交互系统等领域。以数据为中心的人工智能方法强调高质量、多样化且标注良好的数据集对于推动该领域进步的重要性。然而,当前的面部数据集,如Flickr-Faces-HQ(FFHQ),缺乏用于检测面部配饰(尤其是眼镜)的详细标注。这项工作通过为眼镜检测添加精确的边界框标注来扩展FFHQ数据集,从而解决了这一局限性,增强了其在以数据为中心的人工智能应用中的实用性。扩展后的数据集包含70000张图像,其中超过16000张图像包含眼镜,在规模和多样性上超过了CelebAMask-HQ数据集。采用了一种半自动协议来高效生成准确的边界框标注,最大限度地减少了对大量手动标注的需求。这个丰富的数据集是训练和测试眼镜检测模型的宝贵资源。此外,还使用深度学习方法(包括YOLOv8和MobileNetV3)展示了眼镜检测的基线基准结果。通过跨数据集验证进行的评估表明,在扩展的FFHQ数据集上训练的模型具有鲁棒性,其性能优于现有的替代数据集CelebAMask-HQ。已公开提供的扩展数据集有望支持眼镜检测领域未来的研究与开发,为面部分析及相关领域的进步做出贡献。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01dc/11645010/ab31be0abff0/sensors-24-07697-g001.jpg

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验