Layana Castro Pablo E, Garví Antonio García, Sánchez-Salmerón Antonio-José
Universitat Politècnica de València, Instituto de Automática e Informática Industrial, Camino de Vera s/n, Edificio 8G Acceso D, Valencia, 46022, Valencia, Spain.
Heliyon. 2023 Mar 22;9(4):e14715. doi: 10.1016/j.heliyon.2023.e14715. eCollection 2023 Apr.
Pose estimation of worms in image sequences is challenging, and even more so in low-resolution images. Problems range from occlusions, loss of worm identity, and overlaps to aggregations that are too complex to resolve, even for the human eye. Neural networks, on the other hand, have shown good results in both low-resolution and high-resolution images. However, training a neural network model requires a very large and balanced dataset, which is sometimes impossible or too expensive to obtain. In this article, a novel method is proposed for predicting poses in cases of multi-worm aggregation and aggregation with noise. To solve this problem, we use an improved U-Net model capable of producing images of the next posture of each aggregated worm. This neural network model was trained and validated on a custom dataset generated with a synthetic image simulator, and subsequently tested on a dataset of real images. The results reached greater than 75% precision and Intersection over Union (IoU) values of 0.65.
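The two reported metrics can be illustrated with a minimal sketch, assuming they are computed pixel-wise on binary masks (predicted posture versus ground truth). The abstract does not state the exact evaluation protocol, so the function name `precision_and_iou`, the nested-list mask format, the toy values, and the convention of returning 0.0 for empty masks are all assumptions made for illustration.

```python
def precision_and_iou(pred, truth):
    """Pixel-wise precision and IoU for two same-shaped binary masks.

    Masks are nested lists of 0/1 values (an illustrative format,
    not the one used in the paper).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # predicted worm pixel that is a worm pixel
            elif p and not t:
                fp += 1      # predicted worm pixel that is background
            elif t and not p:
                fn += 1      # worm pixel the prediction missed
    # Assumed convention: return 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return precision, iou


# Toy 3x4 masks with made-up values.
pred = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
truth = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
precision, iou = precision_and_iou(pred, truth)
print(f"precision={precision:.2f}, IoU={iou:.2f}")
```

Precision penalizes only false positives, whereas IoU also penalizes missed pixels, so on the same prediction IoU is never higher than precision. This is consistent with the reported IoU (0.65) being lower than the reported precision (above 75%).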