Isaev Dmitry Yu, Tchapyjnikov Dmitry, Cotten C Michael, Tanaka David, Martinez Natalia, Bertran Martin, Sapiro Guillermo, Carlson David
Department of Biomedical Engineering, Duke University, Durham, NC, USA.
Department of Pediatrics, Department of Neurology, Duke University, Durham, NC, USA.
Proc Mach Learn Res. 2020 Aug;126:479-507.
Seizures are a common emergency in the neonatal intensive care unit (NICU) among newborns receiving therapeutic hypothermia for hypoxic-ischemic encephalopathy. The high incidence of seizures in this patient population necessitates continuous electroencephalographic (EEG) monitoring to detect and treat them. Because EEG recordings are reviewed only intermittently throughout the day, delays in seizure identification and treatment are inevitable. In recent years, work on neonatal seizure detection using deep learning algorithms has started gaining momentum. These algorithms face numerous challenges. First, the training data come from individual patients, each with a different level of label imbalance, since the seizure burden in NICU patients differs by several orders of magnitude. Second, seizures in neonates are usually localized to a subset of EEG channels, and annotating each channel separately is very time-consuming. Hence, models that use labels only per time period, and not per channel, are preferable. In this work, we assess how different deep learning models and data balancing methods influence learning in neonatal seizure detection from EEG. We propose a model that assigns a level of importance to each EEG channel, a proxy for whether that channel exhibits seizure activity, and we provide a quantitative assessment of how well this mechanism works. The model is portable to EEG devices with differing layouts without retraining, facilitating its potential deployment across different medical centers. We also provide a first assessment of how a deep learning model for neonatal seizure detection agrees with human rater decisions, an important milestone for deployment to clinical practice. We show that high area under the curve (AUC) values in a deep learning model do not necessarily correspond to agreement with a human expert, and that such algorithms still need further refinement for optimal seizure discrimination.
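The closing claim, that a high AUC need not imply agreement with a human expert, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the abstract does not say which agreement statistic was used, so Cohen's kappa stands in for it here, and the labels, scores, and thresholds are invented toy values rather than data from the study. AUC depends only on how the scores rank seizure windows against non-seizure windows, whereas agreement depends on the binary decisions produced at a chosen threshold.

```python
# Illustrative sketch only: toy data and Cohen's kappa are assumptions,
# not the data or the agreement measure used in the paper.

def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two binary label sequences of equal length."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_a = sum(rater_a) / n  # fraction of windows rater A marks as seizure
    p_b = sum(rater_b) / n  # fraction of windows rater B marks as seizure
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both raters are constant on the same class; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


def decide(scores, threshold):
    """Turn continuous scores into binary seizure / no-seizure decisions."""
    return [1 if s >= threshold else 0 for s in scores]


# Hypothetical expert labels for 10 EEG windows (1 = seizure), imbalanced.
expert = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# Hypothetical model scores: perfectly ranked, but all above 0.5.
scores = [0.55, 0.60, 0.62, 0.65, 0.70, 0.72, 0.75, 0.80, 0.90, 0.95]

auc = roc_auc(expert, scores)
kappa_default = cohen_kappa(expert, decide(scores, 0.5))
kappa_tuned = cohen_kappa(expert, decide(scores, 0.85))

print(f"AUC = {auc:.2f}")
print(f"kappa at threshold 0.50 = {kappa_default:.2f}")
print(f"kappa at threshold 0.85 = {kappa_tuned:.2f}")
```

In this toy case the ranking is perfect, yet the default threshold flags every window as a seizure, so agreement with the expert is no better than chance. Only a better-calibrated threshold recovers full agreement, which is one reason ranking metrics alone may not predict agreement with a human rater.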