IEEE Trans Biomed Eng. 2018 Mar;65(3):502-510. doi: 10.1109/TBME.2017.2700086. Epub 2017 May 2.
Key issues in epilepsy seizure prediction research are (1) the reproducibility of results and (2) the inability to compare multiple approaches directly. To overcome these problems, a seizure prediction challenge was organized on Kaggle.com. It aimed to establish benchmarks on a dataset with predefined training, validation, and test sets. Our main objective is to analyze the competition format and to propose improvements that would facilitate a better comparison of algorithms. Our second objective is to present a novel deep learning approach to seizure prediction and to compare it with other commonly used methods using patient-centered metrics.
We used the competition's datasets to illustrate the effects of data contamination. With improved data partitions, we then compared three types of models with respect to different objectives.
We found that correct selection of test samples is crucial when evaluating the performance of seizure forecasting models. Moreover, we showed that models that achieve state-of-the-art performance on the commonly used AUC, sensitivity, and specificity metrics may not yet be suitable for practical use because of their low precision scores.
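The gap between good sensitivity and specificity and poor precision can be illustrated with a short calculation. The sketch below is not taken from the paper: the function name `precision_from_rates` and the example rates are hypothetical, chosen only to show how a rare positive class (preictal segments) pulls precision down even when the other metrics look strong.

```python
def precision_from_rates(sensitivity, specificity, prevalence):
    """Return precision (positive predictive value) implied by the given rates.

    sensitivity: fraction of true positives that are detected
    specificity: fraction of true negatives that are correctly rejected
    prevalence:  fraction of all samples that are truly positive
    All three arguments are illustrative inputs, not values from the paper.
    """
    # Expected share of all samples that are true positives.
    true_pos = sensitivity * prevalence
    # Expected share of all samples that are false positives.
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive predictions; precision is undefined")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical case: 90% sensitivity and specificity, 1% positive samples.
    print(precision_from_rates(0.9, 0.9, 0.01))
```

With these assumed numbers, about eleven of every twelve alarms would be false, which is the kind of behavior that limits practical use.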
Correlation between the validation and test datasets used in the competition limited its scientific value.
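One common way to reduce this kind of correlation between partitions is to split by group rather than by individual sample. The sketch below is an assumption-laden illustration, not the partitioning scheme used in the paper: it supposes each sample carries a hypothetical group identifier (for example, the recording it was cut from), and it keeps every sample from one group in the same partition.

```python
import random


def split_by_group(group_ids, test_fraction=0.25, seed=0):
    """Split sample indices so that no group appears in both partitions.

    group_ids: one (hypothetical) group label per sample, e.g. a recording ID
    Returns (train_indices, test_indices).
    """
    unique_groups = sorted(set(group_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_groups)
    # Hold out at least one whole group for testing.
    n_test = max(1, round(len(unique_groups) * test_fraction))
    test_groups = set(unique_groups[:n_test])
    train_idx = [i for i, g in enumerate(group_ids) if g not in test_groups]
    test_idx = [i for i, g in enumerate(group_ids) if g in test_groups]
    return train_idx, test_idx


if __name__ == "__main__":
    groups = ["rec1", "rec1", "rec2", "rec2", "rec3", "rec3", "rec4", "rec4"]
    train, test = split_by_group(groups)
    print("train:", train, "test:", test)
```

Splitting at the group level trades some flexibility in partition sizes for the guarantee that closely related samples never straddle the train/test boundary.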
Our findings provide guidelines that allow for a more objective evaluation of seizure prediction models.