Isaev Dmitry Yu, Major Samantha, Carpenter Kimberly L H, Grapel Jordan, Chang Zhuoqing, Di Martino Matias, Carlson David, Dawson Geraldine, Sapiro Guillermo
Department of Electrical and Computer Engineering, Duke University, Durham, NC, USA.
Duke Center for Autism and Brain Development, Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA.
Sci Rep. 2025 Aug 22;15(1):30963. doi: 10.1038/s41598-025-10511-2.
Electroencephalography (EEG) recordings with visual stimuli require detailed coding to determine the periods of the participant's attention. Here we propose to do this with a supervised machine learning model and off-the-shelf video cameras only. We extract computer vision-based features such as head pose, gaze, and face landmarks from video of the participant, train the machine learning model (a multi-layer perceptron) on an initial dataset, and then adapt it with a small subset of data from a new participant. Using a sample of 23 autistic children with and without co-occurring ADHD (attention-deficit/hyperactivity disorder), aged 49-95 months, and training on an additional 2560 labeled frames (equivalent to 85.3 s of video) from a new participant, the median area under the receiver operating characteristic curve for inattention detection was 0.989 (IQR 0.984-0.993), and the median inter-rater reliability (Cohen's kappa) with a trained human annotator was 0.888. Agreement with human annotations for nine participants was in the 0.616-0.944 range. Our results demonstrate the feasibility of automatic tools for detecting inattention during EEG recordings and their potential to reduce the subjectivity and time burden of human attention coding. The tool for model adaptation and visualization of the computer vision features is made publicly available to the research community.
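The train-then-adapt scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the feature dimensions, synthetic data, and use of scikit-learn's `MLPClassifier` with `partial_fit` for the adaptation step are all assumptions for the sake of example.

```python
# Hedged sketch: train an MLP on base-cohort features (head pose, gaze,
# landmark statistics), then adapt it with a small labeled subset of
# frames from a new participant. All data below is synthetic.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n_features = 10  # assumed feature dimensionality (not from the paper)

# Synthetic "base cohort" frames: inattentive (0) vs attentive (1)
X_base = rng.normal(size=(5000, n_features))
y_base = (X_base[:, 0] + 0.5 * X_base[:, 1] > 0).astype(int)

clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=300, random_state=0)
clf.fit(X_base, y_base)

# Adaptation set: ~2560 labeled frames from a "new participant" whose
# feature distribution is shifted relative to the base cohort.
X_new = rng.normal(loc=0.3, size=(2560, n_features))
y_new = (X_new[:, 0] + 0.5 * X_new[:, 1] > 0.3).astype(int)
for _ in range(20):  # a few extra gradient passes on the adaptation data
    clf.partial_fit(X_new, y_new)

print(clf.score(X_new, y_new))
```

In practice the adaptation step would use held-out evaluation frames rather than the adaptation frames themselves; the point here is only the two-stage fit/adapt structure.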