Mohammed Mohammed Aliy, Anzaku Esla Timothy, Ward Peter Kenneth, Levecke Bruno, Krishnamoorthy Janarthanan, De Neve Wesley, Van Hoecke Sofie
IDLab, Ghent University - imec, Ghent University, Ghent, Belgium.
School of Biomedical Engineering, Jimma Institute of Technology, Jimma University, Jimma, Ethiopia.
PLoS Negl Trop Dis. 2025 Jul 3;19(7):e0013234. doi: 10.1371/journal.pntd.0013234. eCollection 2025 Jul.
Soil-transmitted helminth (STH) and Schistosoma mansoni (S. mansoni) infections remain significant public health concerns in tropical and subtropical regions. Deep Convolutional Neural Networks (DCNNs) have already shown promising accuracy in identifying STH and S. mansoni eggs under in-distribution (ID) conditions, where training and test images come from the same setting. However, their performance in real-world, out-of-distribution (OOD) scenarios, characterized by variations in image capture devices and the appearance of previously unseen egg types, has not been thoroughly investigated. Assessing the robustness of DCNNs under these challenging conditions is crucial for ensuring their reliability in field diagnostics.
Our study addresses the gap in evaluating DCNNs for identifying STH and S. mansoni eggs by rigorously testing multiple variants of the You Only Look Once (YOLO) version 7 model under two OOD conditions: (i) a dataset shift due to a change in the image capture device, and (ii) a combination of this device change and the presence of two egg types not occurring during training. We adopted a 2 × 3 montage data augmentation strategy to enhance OOD generalization. Additionally, we used the Toolkit for Identifying object Detection Errors (TIDE) and Gradient-weighted Class Activation Mapping (Grad-CAM) to perform a comprehensive analysis of the results.
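As an illustration of the montage idea, the following is a minimal sketch and not the authors' implementation. It assumes that a 2 × 3 montage means tiling six equally sized images into 2 rows and 3 columns, and that annotations are pixel-coordinate boxes of the form (class_id, x1, y1, x2, y2). The function name `make_montage` and its parameters are hypothetical.

```python
import numpy as np


def make_montage(images, boxes_per_image, rows=2, cols=3):
    """Tile rows*cols equally sized images into one image and shift their boxes.

    Assumption (not from the abstract): each box is a tuple
    (class_id, x1, y1, x2, y2) in pixel coordinates of its own tile.
    """
    n_tiles = rows * cols
    if len(images) != n_tiles or len(boxes_per_image) != n_tiles:
        raise ValueError(f"expected {n_tiles} images and {n_tiles} box lists")
    if any(img.shape != images[0].shape for img in images):
        raise ValueError("all images must share the same shape")

    tile_h, tile_w = images[0].shape[:2]
    # Output canvas keeps any trailing channel dimension of the input tiles.
    canvas_shape = (rows * tile_h, cols * tile_w) + images[0].shape[2:]
    montage = np.zeros(canvas_shape, dtype=images[0].dtype)

    merged_boxes = []
    for idx, (img, boxes) in enumerate(zip(images, boxes_per_image)):
        r, c = divmod(idx, cols)          # row-major placement of tiles
        y0, x0 = r * tile_h, c * tile_w   # top-left corner of this tile
        montage[y0:y0 + tile_h, x0:x0 + tile_w] = img
        for class_id, x1, y1, x2, y2 in boxes:
            # Translate the box from tile coordinates to montage coordinates.
            merged_boxes.append((class_id, x1 + x0, y1 + y0, x2 + x0, y2 + y0))
    return montage, merged_boxes
```

Each montage then serves as a single training image whose label set is the union of the shifted boxes. Whether the tiles are resized before or after tiling is not stated in the abstract and is left out of this sketch.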
In ID settings, YOLOv7-E6E outperformed other models, achieving an F1-score of 97.47%. For the OOD scenario involving only a change in the image capture device, the 2 × 3 montage strategy significantly enhanced performance, increasing precision by 8%, recall by 14.85%, and mAP@IoU0.5 by 21.36%. However, for the more complex OOD scenario that involves both a change in the capture device and the introduction of two previously unseen egg types, the proposed augmentation technique, while beneficial, did not fully address the generalization challenges across all YOLOv7 variants, highlighting the necessity of testing beyond ID scenarios, on which state-of-the-art models predominantly focus.
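For readers less familiar with the reported metrics, the sketch below shows how intersection over union (IoU) and the F1-score are commonly computed for object detection. It is illustrative only: the abstract does not give the underlying detection counts or the evaluation code, so the function names and the (x1, y1, x2, y2) box format are assumptions. In a metric such as mAP@IoU0.5, a detection usually counts as a true positive when its IoU with a same-class ground-truth box is at least 0.5.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(true_pos, false_pos, false_neg):
    """F1 = harmonic mean of precision and recall, from detection counts."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because F1 is a harmonic mean, it stays high only when both precision and recall are high, which is why it is a convenient single number for comparing detector variants.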
CONCLUSIONS/SIGNIFICANCE: This study underscores the critical importance of utilizing comprehensive test sets and conducting rigorous OOD evaluations when designing machine learning solutions for STH, S. mansoni or any other helminth infections. Understanding the true capabilities of DCNNs in real-world settings depends on such thorough testing. Expanding AI-driven diagnostic assessments to account for the complexities encountered in the field is essential for creating robust tools that can significantly contribute to the global elimination of STH and S. mansoni infections as public health problems by 2030, a goal put forth by the World Health Organization.