ViT-HHO: Optimized vision transformer for diabetic retinopathy detection using Harris Hawk optimization.

Author information

Awasthi Vishal, Awasthi Namita, Kumar Hemant, Singh Shubhendra, Singh Prabal Pratap, Dixit Poonam, Agarwal Rashi

Affiliations

Department of Electronics & Communication Engineering, School of Engineering & Technology (UIET), Chhatrapati Shahu Ji Maharaj University, Kanpur, India.

Department of Computer Application, Allenhouse Business School, Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India.

Publication information

MethodsX. 2024 Oct 23;13:103018. doi: 10.1016/j.mex.2024.103018. eCollection 2024 Dec.

Diabetic retinopathy (DR) is a significant cause of vision impairment worldwide, which makes timely and precise detection important for preventing severe outcomes. This study presents an optimized Vision Transformer (ViT) model that incorporates Harris Hawk Optimization (HHO) to improve automated DR detection. The ViT architecture uses self-attention to capture both local and global features in retinal images, and HHO tunes key hyperparameters to maximize model performance. The proposed ViT-HHO model was evaluated on the APTOS-2019 and IDRiD datasets. On APTOS-2019 it achieved 99.83% accuracy, 99.78% sensitivity, 99.85% specificity, and 99.80% AUC-ROC, surpassing traditional CNNs and alternative optimization techniques. On IDRiD it generalized well, with an accuracy of 99.11% and an AUC-ROC of 99.12%. These results suggest that ViT-HHO could improve clinical DR detection with high precision and reliability.

• An optimized Vision Transformer (ViT) model was developed using HHO for improved detection of diabetic retinopathy (DR).

• The model was validated on the APTOS-2019 and IDRiD datasets, demonstrating superior accuracy and AUC-ROC metrics.

• The model's generalization and robustness were demonstrated through comprehensive performance evaluations.
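
The abstract states that HHO tunes key hyperparameters but does not list them or describe the search loop. As a rough illustration only, the sketch below implements a simplified HHO in plain Python: it keeps the exploration step and the soft and hard besiege steps, and omits the Levy-flight rapid dives of the full algorithm. The two hyperparameters (`log10_lr`, `dropout`), their bounds, and `toy_validation_error`, which stands in for training and validating a ViT, are hypothetical and not taken from the paper.

```python
import random


def hho_minimize(objective, bounds, n_hawks=10, n_iter=50, seed=0):
    """Simplified Harris Hawks Optimization (no Levy-flight dives)."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        # Keep every coordinate inside its [low, high] search range.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    hawks = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_hawks)]
    best, best_fit = None, float("inf")

    for t in range(n_iter + 1):
        # Track the "rabbit": the best candidate evaluated so far.
        for h in hawks:
            fit = objective(h)
            if fit < best_fit:
                best, best_fit = list(h), fit
        if t == n_iter:
            break  # final population has been evaluated

        mean = [sum(h[d] for h in hawks) / n_hawks for d in range(dim)]
        moved = []
        for x in hawks:
            # Escaping energy: its magnitude decays as iterations progress.
            energy = 2 * (2 * rng.random() - 1) * (1 - t / n_iter)
            if abs(energy) >= 1:
                # Exploration: perch relative to a random hawk or the flock mean.
                if rng.random() >= 0.5:
                    other = rng.choice(hawks)
                    r1, r2 = rng.random(), rng.random()
                    new = [o - r1 * abs(o - 2 * r2 * xi) for o, xi in zip(other, x)]
                else:
                    r3, r4 = rng.random(), rng.random()
                    new = [(b - m) - r3 * (lo + r4 * (hi - lo))
                           for b, m, (lo, hi) in zip(best, mean, bounds)]
            elif abs(energy) >= 0.5:
                # Soft besiege: circle the rabbit with a random jump strength.
                jump = 2 * (1 - rng.random())
                new = [(b - xi) - energy * abs(jump * b - xi) for b, xi in zip(best, x)]
            else:
                # Hard besiege: close in on the rabbit.
                new = [b - energy * abs(b - xi) for b, xi in zip(best, x)]
            moved.append(clip(new))
        hawks = moved

    return best, best_fit


# Hypothetical search space: log10(learning rate) and dropout rate.
BOUNDS = [(-5.0, -1.0), (0.0, 0.5)]


def toy_validation_error(params):
    # Stand-in for "train a ViT and return validation error" (assumption).
    log10_lr, dropout = params
    return (log10_lr + 3.0) ** 2 + (dropout - 0.1) ** 2


best_params, best_error = hho_minimize(toy_validation_error, BOUNDS, seed=1)
print(best_params, best_error)
```

In a real setting the objective would train the model with the candidate hyperparameters and return a validation loss, so each evaluation is expensive and the population size and iteration count would need to be chosen accordingly.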

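
The reported metrics (accuracy, sensitivity, specificity, AUC-ROC) have standard definitions, which the minimal sketch below spells out in plain Python. It assumes a binary DR / no-DR task with labels 1 and 0; the abstract does not say whether the paper computed its metrics per class or for a binary task, and the labels and scores here are made-up illustrative values, not data from the study.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from hard 0/1 predictions.

    Assumes both classes occur in y_true (otherwise a division by zero occurs).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
    }


def auc_roc(y_true, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative values only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = binary_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

Sensitivity and specificity depend on the chosen decision threshold, while AUC-ROC summarizes ranking quality across all thresholds, which is why the two kinds of figures are reported side by side.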