Vanabelle Paul, De Handschutter Pierre, El Tahry Riëm, Benjelloun Mohammed, Boukhebouze Mohamed
Data Science Department, Centre of Excellence in Information and Communication Technologies, Charleroi 6041, Belgium.
Computer Science Unit, Faculty of Engineering, University of Mons, Mons 7000, Belgium.
J Biomed Res. 2019 Aug 30;34(3):228-239. doi: 10.7555/JBR.33.20190016.
We address automated seizure detection from clinical electroencephalograms (EEG) using machine learning algorithms on the Temple University Hospital EEG Seizure Corpus (TUSZ). Performance on this complex data set still falls short of expectations. The purpose of this work is to determine to what extent a larger amount of data can improve performance. Two methods are explored: a standard partitioning of a recent, larger version of the TUSZ, and a leave-one-out approach used to increase the amount of data in the training set. XGBoost, a fast implementation of the gradient boosting classifier, is the ideal algorithm for these tasks. The performance obtained is in the range of what has been reported so far in the literature with deep learning models. We interpret our results by identifying the most relevant features and analyzing performance by seizure type. We show that generalized seizures tend to be far better predicted than focal ones. We also observe that some EEG channels and features are more important than others for distinguishing seizure from background.
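The leave-one-out idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which unit is held out, so the code below assumes it is a patient, and the function name `leave_one_group_out` and the toy labels are hypothetical. Each fold trains on every window except those of one patient, which leaves more data for training than a fixed train/test partition.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_one_group_out(
    groups: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_group, train_indices, test_indices), one fold per group."""
    # Collect the sample indices belonging to each group (assumed: one group per patient).
    by_group: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)

    # Hold out each group in turn; all remaining samples form the training set.
    for held_out, test_idx in by_group.items():
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx


# Hypothetical toy data: one patient label per EEG window.
patient_of_window = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(leave_one_group_out(patient_of_window))
for held_out, train_idx, test_idx in folds:
    print(held_out, len(train_idx), len(test_idx))
```

In a full pipeline, a gradient boosting classifier such as XGBoost would be fitted on the features of `train_idx` and evaluated on `test_idx` in each fold; that step is omitted here because the abstract gives no feature or hyperparameter details.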