Chotikkakamthorn Kittisak, Lie Wen-Nung, Ritthipravat Panrasee, Kusakunniran Worapan, Tuakta Pimchanok, Benjapornlert Paitoon
Department of Biomedical Engineering, Faculty of Engineering, Mahidol University, 999 Phutthamonthon 4 Road, Salaya, Nakhon Pathom, 73170, Thailand; Department of Electrical Engineering, College of Engineering, National Chung Cheng University, No. 168, Section 1, University Rd, Minxiong Township, Chia-Yi, 621301, Taiwan.
Department of Electrical Engineering, College of Engineering, National Chung Cheng University, No. 168, Section 1, University Rd, Minxiong Township, Chia-Yi, 621301, Taiwan.
Comput Biol Med. 2025 Sep;195:110620. doi: 10.1016/j.compbiomed.2025.110620. Epub 2025 Jun 21.
Telemedicine has recently enabled doctor-to-patient and doctor-to-doctor consultations that address long-standing barriers to in-person care: the COVID-19 pandemic, remote locations, long visit times, and dependence on family members for transportation. Nevertheless, few studies have applied telemedicine to measuring head movement, which is essential for activities of daily living and is degraded by aging, trauma, pain, and degenerative disease. In recent years, artificial intelligence, including vision-based methods, has been used to measure cervical range of motion (CROM). However, these methods suffer from significant measurement errors and require depth cameras. In contrast, recent deep-learning-based head pose estimation (HPE) networks have achieved higher accuracy than previous methods, which makes them attractive for CROM measurement in telemedicine. This study proposes applying a deep neural network that combines multi-level pyramidal feature extraction, a bi-directional Pyramidal Feature Aggregation Structure (PFAS) for feature fusion, a modified Atrous Spatial Pyramid Pooling (ASPP) module for spatial and channel feature enhancement, and a multi-bin classification and regression module to derive the Euler angles as the head pose parameters. We evaluated the proposed technique on public datasets (300W_LP, AFLW2000, and BIWI), achieving performance comparable to previous algorithms, with mean MAE (mean absolute error) values of 3.36°, 3.50°, and 2.16° under several evaluation protocols. For CROM measurement in telemedicine, our method achieved the lowest mean MAE, 3.73°, on a private medical dataset. Furthermore, it achieved a fast inference speed of 2.27 ms per image. Thus, for both traditional HPE problems and CROM measurement applications, our method offers accuracy, convenience, low computational requirements, and low operational costs (GitHub: https://github.com/nickuntitled/pyramid_based_HPE).
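As an illustration of how a multi-bin classification and regression output can be turned into an Euler angle, and of how a mean MAE over the three angles can be computed, here is a minimal sketch. It is not taken from the authors' repository. The binning (66 bins of 3°, with bin index i mapped to 3i − 99°) is an assumption borrowed from common HPE practice, and the names `decode_angle` and `mean_mae` are hypothetical.

```python
import math

# Assumed binning (hypothetical, not confirmed by the abstract):
# 66 bins of 3 degrees each, with bin index i mapped to 3*i - 99 degrees.
NUM_BINS = 66
BIN_WIDTH_DEG = 3.0
ANGLE_OFFSET_DEG = 99.0


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_angle(logits):
    """Convert per-bin logits for one Euler angle into degrees.

    Takes the expected bin index under the softmax distribution
    (a soft-argmax) and rescales it to degrees.
    """
    probs = softmax(logits)
    expected_bin = sum(i * p for i, p in enumerate(probs))
    return expected_bin * BIN_WIDTH_DEG - ANGLE_OFFSET_DEG


def mean_mae(pred_angles, true_angles):
    """Per-axis MAE and their mean, in degrees.

    Both arguments are equal-length lists of (yaw, pitch, roll) tuples.
    Returns ([mae_yaw, mae_pitch, mae_roll], mean_of_the_three).
    """
    n = len(pred_angles)
    per_axis = [
        sum(abs(p[k] - t[k]) for p, t in zip(pred_angles, true_angles)) / n
        for k in range(3)
    ]
    return per_axis, sum(per_axis) / 3.0


if __name__ == "__main__":
    # Logits sharply peaked at bin 33 decode to roughly 0 degrees.
    logits = [0.0] * NUM_BINS
    logits[33] = 50.0
    print(round(decode_angle(logits), 3))

    preds = [(10.0, 0.0, -5.0), (20.0, 5.0, 5.0)]
    truth = [(12.0, 0.0, -5.0), (17.0, 4.0, 8.0)]
    print(mean_mae(preds, truth))
```

Under this assumed scheme, the soft-argmax yields a continuous angle rather than a bin centre, which is why a classification head can be trained jointly with a regression loss on the decoded value. The bin count, bin width, and loss weighting actually used by the proposed network may differ.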