School of Computer Science, University of Nottingham, Nottingham NG8 1BB, UK.
Centre for Perinatal Research, School of Medicine, University of Nottingham, Nottingham NG7 2RD, UK.
Sensors (Basel). 2024 Oct 14;24(20):6619. doi: 10.3390/s24206619.
Neurodevelopment is a highly intricate process, and early detection of abnormalities is critical for optimizing outcomes through timely intervention. Accurate and cost-effective diagnostic methods for neurological disorders, particularly in infants, remain a significant challenge due to the heterogeneity of data and the variability in neurodevelopmental conditions. This study recruited twelve parent-infant pairs, with infants aged 3 to 12 months. Approximately 25 min of 2D video footage was captured, documenting natural play interactions between the infants and toys. We developed a novel, open-source method to classify and analyse infant movement patterns using deep learning, specifically a transformer-based fusion model that integrates multiple video features within a unified deep neural network. This approach significantly outperforms traditional methods that rely on individual video features, achieving an accuracy of over 90%. Furthermore, a sensitivity analysis revealed that pose estimation contributed far less to the model's output than the pre-trained transformer and convolutional neural network (CNN) components, providing key insights into the relative importance of the different feature sets. By providing a more robust, accurate and low-cost analysis of movement patterns, our work aims to enhance the early detection and potential prediction of neurodevelopmental delays, whilst also shedding light on how transformer-based fusion models combine diverse video features.
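The abstract does not say how the sensitivity analysis was carried out, so the following is only a minimal sketch of one common approach, stream ablation: zero out one feature stream at a time and measure how much the fused output changes. Everything in it is hypothetical and not taken from the paper: the function names `fused_score` and `stream_sensitivity`, the single linear layer standing in for the fusion network, the random inputs, and the weights, which are deliberately small for the pose stream so that the toy model mirrors the reported pattern.

```python
import math
import random


def fused_score(pose, transformer, cnn, weights):
    """Toy fusion classifier (assumption): concatenate the three feature
    vectors, apply one linear layer, and squash with a sigmoid."""
    features = pose + transformer + cnn  # list concatenation
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def stream_sensitivity(samples, weights):
    """Mean absolute change in the output when one stream is zeroed out."""
    totals = {"pose": 0.0, "transformer": 0.0, "cnn": 0.0}
    for pose, transformer, cnn in samples:
        base = fused_score(pose, transformer, cnn, weights)
        ablated = {
            "pose": fused_score([0.0] * len(pose), transformer, cnn, weights),
            "transformer": fused_score(pose, [0.0] * len(transformer), cnn, weights),
            "cnn": fused_score(pose, transformer, [0.0] * len(cnn), weights),
        }
        for name, score in ablated.items():
            totals[name] += abs(base - score)
    return {name: total / len(samples) for name, total in totals.items()}


rng = random.Random(0)  # fixed seed so the sketch is reproducible
dim = 4                 # hypothetical per-stream feature size

# Hypothetical weights: tiny for pose, larger for the other two streams.
weights = [0.05] * dim + [1.0] * dim + [1.0] * dim

# Synthetic (pose, transformer, cnn) feature triples, not real video data.
samples = [
    tuple([rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(3))
    for _ in range(200)
]

sens = stream_sensitivity(samples, weights)
for name, value in sens.items():
    print(f"{name}: {value:.4f}")
```

In this toy setting the pose stream yields the smallest sensitivity only because its weights were chosen to be small. With a trained model, the same ablation loop would be run on held-out clips using the real feature extractors.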