Synergistic fusion: An integrated pipeline of CLAHE, YOLO models, and advanced super-resolution for enhanced thermal eye detection.

Author information

J Persiya, A Sasithradevi

Affiliations

School of Electronics Engineering, Vellore Institute of Technology Chennai, India.

Centre for Advanced Data Science, Vellore Institute of Technology Chennai, India.

Publication information

PLoS One. 2025 Jul 18;20(7):e0328227. doi: 10.1371/journal.pone.0328227. eCollection 2025.

DOI:10.1371/journal.pone.0328227

PMID:40679961

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12273955/

Abstract

Accurate eye detection in thermal images is essential for diverse applications, including biometrics, healthcare, driver monitoring, and human-computer interaction. However, the inherent limitations of thermal data, such as low resolution and poor contrast, often hinder this accuracy. This work addresses these challenges by proposing a novel, multifaceted approach that combines deep learning and image processing techniques. We first introduce a unique dataset of thermal facial images with meticulous eye-location annotations. To improve image clarity, we employ Contrast Limited Adaptive Histogram Equalization (CLAHE). Subsequently, we explore the effectiveness of advanced YOLO models (YOLOv8 and YOLOv9) for accurate eye detection. Our experiments reveal that YOLOv8 with CLAHE-enhanced images achieved the highest accuracy (precision and recall of 1, mAP50 of 0.995, and mAP50-95 of 0.801). The YOLOv9 model also demonstrated excellent performance, with a precision of 0.998, recall of 0.998, mAP50 of 0.995, and mAP50-95 of 0.753. Furthermore, to enhance the resolution of detected eye regions, we investigate various super-resolution techniques, ranging from traditional methods like bicubic interpolation to cutting-edge approaches like generative adversarial networks (BSRGAN, ESRGAN) and advanced models like Real-ESRGAN, SwinIR, and SwinIR-Large with ResShift. The performance of these techniques is evaluated using both objective and subjective quality measures. Overall, this work demonstrates the effectiveness of our proposed pipeline, which integrates image enhancement, deep learning, and super-resolution techniques. This synergistic fusion significantly improves the contrast, the accuracy of eye detection, and the overall resolution of thermal images, paving the way for potential applications across various fields.
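The gap between the reported mAP50 and mAP50-95 values is easier to read with the underlying overlap measure in view. The following is a minimal sketch, not the authors' evaluation code: it assumes the common COCO-style convention in which mAP50 counts a detection as correct at an intersection-over-union (IoU) of at least 0.5, while mAP50-95 averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The function names and the box coordinates are hypothetical and chosen only for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, truth):
    """IoU thresholds at which `pred` would count as a correct detection."""
    overlap = iou(pred, truth)
    return [t for t in THRESHOLDS if overlap >= t]


# Hypothetical eye boxes in pixels: the prediction is shifted by a few
# pixels relative to the annotation.
truth = (40, 30, 80, 50)
pred = (44, 32, 84, 52)

print(f"IoU = {iou(pred, truth):.3f}")
print("correct at thresholds:", thresholds_passed(pred, truth))
```

In this illustrative case the slightly shifted box counts as a hit at the 0.5 threshold but fails the stricter ones, which is the kind of behavior that lets a detector score near 1 on mAP50 while scoring lower on mAP50-95. Small targets such as eyes are especially sensitive to this, since a shift of a few pixels is a large fraction of the box.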
