Department of Information Engineering, Università Politecnica delle Marche, Via Brecce Bianche 12, Ancona, 60121, Italy; AIDAPT S.r.l, Via Brecce Bianche 12, Ancona, 60121, Italy.
Department of Information Engineering, Università Politecnica delle Marche, Via Brecce Bianche 12, Ancona, 60121, Italy.
Comput Biol Med. 2023 Sep;163:107194. doi: 10.1016/j.compbiomed.2023.107194. Epub 2023 Jun 30.
Patients suffering from neurological diseases may develop dysarthria, a motor speech disorder affecting the execution of speech. Close and quantitative monitoring of dysarthria evolution is crucial for enabling clinicians to promptly implement patient management strategies and to maximize the effectiveness and efficiency of communication functions in terms of restoring, compensating or adjusting. In the clinical assessment of orofacial structures and functions, whether at rest or during speech and non-speech movements, the evaluation is usually qualitative and performed through visual observation.
To overcome the limitations of qualitative assessments, this work presents a store-and-forward self-service telemonitoring system that integrates, within its cloud architecture, a convolutional neural network (CNN) for analyzing video recordings acquired by individuals with dysarthria. This architecture, called facial landmark Mask RCNN, aims at locating facial landmarks as a prerequisite for assessing the orofacial functions related to speech and examining dysarthria evolution in neurological diseases.
When tested on the Toronto NeuroFace dataset, a publicly available annotated dataset of video recordings from patients with amyotrophic lateral sclerosis (ALS) and stroke, the proposed CNN achieved a normalized mean error of 1.79 in localizing the facial landmarks. We also tested our system in a real-life scenario on 11 bulbar-onset ALS subjects, obtaining promising outcomes in terms of facial landmark position estimation.
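For readers unfamiliar with the metric, the following is a minimal sketch of how a normalized mean error for landmark localization is commonly computed. It is not the authors' evaluation code. The abstract does not state the normalizing distance or whether 1.79 is a percentage, so the per-frame `norm` argument (for example an inter-ocular distance or a face bounding-box diagonal) and the percentage scaling are assumptions, and the function name is hypothetical.

```python
import numpy as np


def normalized_mean_error(pred, truth, norm):
    """Illustrative NME in percent (assumed convention, not from the source).

    pred, truth: arrays of shape (n_frames, n_landmarks, 2) with x, y coordinates.
    norm: array of shape (n_frames,), the assumed normalizing distance per frame.
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    norm = np.asarray(norm, dtype=float)

    # Euclidean distance between predicted and annotated position, per landmark.
    dist = np.linalg.norm(pred - truth, axis=-1)  # shape (n_frames, n_landmarks)

    # Average over landmarks, then divide by the frame's normalizing distance.
    per_frame = dist.mean(axis=1) / norm

    # Average over frames and express as a percentage.
    return 100.0 * per_frame.mean()


# Toy example: one frame, two landmarks, one of them off by a 3-4-5 offset.
truth = np.zeros((1, 2, 2))
pred = np.array([[[3.0, 4.0], [0.0, 0.0]]])
nme = normalized_mean_error(pred, truth, norm=[50.0])
```

In the toy example the landmark distances are 5 and 0 pixels, so the mean distance is 2.5 pixels; dividing by the assumed normalizing distance of 50 pixels gives 5%.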
This preliminary study represents a relevant step towards the use of remote tools to support clinicians in monitoring the evolution of dysarthria.