Choi Junggu, Kwon Seohyun, Park Sohyun, Han Sanghoon
Yonsei Graduate Program in Cognitive Science, Yonsei University, Seoul, Republic of Korea.
Munice Inc., Vienna, VA, USA.
Digit Health. 2023 Mar 14;9:20552076231163783. doi: 10.1177/20552076231163783. eCollection 2023 Jan-Dec.
Sleep stage identification is critical in multiple areas (e.g. medicine or psychology) for diagnosing sleep-related disorders. Previous studies have reported that the performance of machine learning algorithms in sleep stage classification can vary depending on the biosignals and feature-extraction processes used.
To compare as many conditions as possible, 414 experimental conditions were evaluated, formed by combining different biosignals, signal lengths, and window lengths. Five polysomnography biosignals (i.e. electrocardiogram (ECG), electroencephalogram (EEG), electromyogram (EMG), left electrooculogram, and right electrooculogram) were used to identify the optimal signal combinations for classification. In addition, three signal-length conditions and six window-length conditions were applied. The validity of each condition was assessed via the classification performance of XGBoost classifiers trained using 10-fold cross-validation. Furthermore, feature importance results were examined to validate the experimental results in terms of model explanation.
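The relationship between signal length and window length can be illustrated with a minimal sketch. It assumes non-overlapping windows, a hypothetical 100 Hz sampling rate, and four simple summary statistics per window; none of these details is stated in the abstract, and the function names `split_windows` and `window_features` are illustrative only.

```python
from statistics import mean, pstdev


def split_windows(signal, fs_hz, window_s):
    """Split a 1-D signal into consecutive, non-overlapping windows.

    Non-overlapping windows are an assumption made for illustration.
    Trailing samples that do not fill a whole window are dropped.
    """
    size = int(fs_hz * window_s)
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    n_windows = len(signal) // size
    return [signal[i * size:(i + 1) * size] for i in range(n_windows)]


def window_features(signal, fs_hz, window_s):
    """Concatenate simple per-window statistics into one feature vector.

    The statistics (mean, standard deviation, min, max) are placeholders,
    not the features used in the study.
    """
    features = []
    for window in split_windows(signal, fs_hz, window_s):
        features.extend([mean(window), pstdev(window), min(window), max(window)])
    return features


# Synthetic 120 s segment at an assumed 100 Hz sampling rate.
fs_hz = 100
segment = [((i % 50) - 25) / 25.0 for i in range(120 * fs_hz)]

# A 40 s window over a 120 s segment yields three windows.
windows = split_windows(segment, fs_hz, window_s=40)
features = window_features(segment, fs_hz, window_s=40)
print(len(windows), len(features))
```

Under these assumptions, a feature vector of this kind, built per signal and concatenated across the chosen signal combination, would form one row of the classifier's input.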
The combination of EEG + EMG + ECG with a 40 s window and a 120 s signal length yielded the best classification performance (precision: 0.853, recall: 0.855, F1-score: 0.853, and accuracy: 0.853). Both the comparison across conditions and the feature importance results indicated that EEG signals were relatively more important for classification in the present study.
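For readers unfamiliar with the four reported metrics, the following sketch shows one common way to compute them for a multi-class problem. The abstract does not state the averaging scheme. Macro averaging is assumed here because support-weighted recall always equals accuracy, whereas the reported recall (0.855) and accuracy (0.853) differ. The stage labels and predictions are made up for illustration.

```python
def macro_metrics(y_true, y_pred):
    """Compute macro-averaged precision, recall, F1, and overall accuracy.

    Macro averaging (unweighted mean over classes) is an assumption;
    the averaging scheme is not given in the abstract.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
    }


# Hypothetical stage labels, used only to exercise the function.
y_true = ["W", "W", "N1", "N2", "N2", "REM"]
y_pred = ["W", "N1", "N1", "N2", "N2", "N2"]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```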
We determined the optimal biosignal and window conditions for the feature-extraction process in machine learning-based sleep stage classification. Our experimental results can guide researchers conducting related studies in the future. To generalize our results, more diverse methodologies and conditions should be applied in future studies.