Gao Vance, Turek Fred, Vitaterna Martha
Center for Sleep and Circadian Biology, Northwestern University, Department of Neurobiology, 2205 Tech Drive Hogan 2-160, Evanston, IL 60208, United States.
J Neurosci Methods. 2016 May 1;264:33-39. doi: 10.1016/j.jneumeth.2016.02.016. Epub 2016 Feb 27.
Electroencephalogram (EEG) and electromyogram (EMG) recordings are often used in rodents to study sleep architecture and sleep-associated neural activity. These recordings must be scored to designate what sleep/wake state the animal is in at each time point. Manual sleep-scoring is very time-consuming, so machine-learning classifier algorithms have been used to automate scoring.
Instead of using a single classifier, we implement a multiple classifier system built from six base classifiers: decision tree, k-nearest neighbors, naïve Bayes, support vector machine, neural net, and linear discriminant analysis. The decision tree and k-nearest neighbors classifiers were extended into ensemble classifiers using bagging and the random subspace method. Confidence scores from each classifier were combined to determine the final classification. Ambiguous epochs can be rejected and left for a human to classify.
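The combine-then-reject step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how confidence scores are combined, so a plain average is assumed here. The three-state label set and the names `combine_scores` and `score_recording` are likewise hypothetical.

```python
from typing import List, Optional, Sequence

# Assumed label set; the abstract does not list the states being scored.
STATES = ("wake", "NREM", "REM")


def combine_scores(per_classifier: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-state confidence scores across classifiers.

    Averaging is an assumption; the abstract only says scores are combined.
    """
    n = len(per_classifier)
    return [sum(scores[i] for scores in per_classifier) / n
            for i in range(len(STATES))]


def score_recording(
    epochs: Sequence[Sequence[Sequence[float]]],
    reject_fraction: float = 0.1,
) -> List[Optional[str]]:
    """Label every epoch, leaving the least confident fraction as None.

    `epochs[e][c][s]` is the confidence classifier `c` gives state `s`
    for epoch `e`. Rejected epochs are returned as None so that a human
    scorer can fill them in.
    """
    combined = [combine_scores(epoch) for epoch in epochs]
    confidence = [max(scores) for scores in combined]

    # Reject the epochs whose winning state has the lowest combined score.
    n_reject = int(len(epochs) * reject_fraction)
    by_confidence = sorted(range(len(epochs)), key=lambda i: confidence[i])
    rejected = set(by_confidence[:n_reject])

    return [
        None if i in rejected else STATES[scores.index(max(scores))]
        for i, scores in enumerate(combined)
    ]


# Two epochs scored by three classifiers: one clear, one ambiguous.
clear = [[0.9, 0.05, 0.05], [0.8, 0.15, 0.05], [0.7, 0.2, 0.1]]
ambiguous = [[0.5, 0.4, 0.1], [0.3, 0.6, 0.1], [0.45, 0.45, 0.1]]
labels = score_recording([clear, ambiguous], reject_fraction=0.5)
```

With half of the epochs rejected, the clear epoch is labeled wake and the ambiguous one is left unlabeled for manual scoring. Raising `reject_fraction` trades coverage for accuracy, which is the trade-off the reported rejection results describe.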
Support vector machine was the most accurate base classifier, with an error rate of 0.054. The multiple classifier system reduced the error rate to 0.049, which was not significantly different from a second human scorer. When 10% of epochs were rejected, the remaining epochs' error rate dropped to 0.018.
COMPARISON WITH EXISTING METHOD(S): Compared with the most accurate single classifier (support vector machine), the multiple classifier reduced errors by 9.4%. The multiple classifier surpassed the accuracy of a second human scorer after rejecting only 2% of epochs.
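The reduction quoted above is relative to the baseline error rate, not a difference in percentage points. A short sketch of that arithmetic, using the rounded error rates given in this abstract (the function name `relative_error_reduction` is hypothetical):

```python
def relative_error_reduction(baseline: float, improved: float) -> float:
    """Fraction of the baseline's errors removed by the improved method."""
    return (baseline - improved) / baseline


# Rounded error rates from the abstract: support vector machine
# versus the multiple classifier system.
reduction = relative_error_reduction(0.054, 0.049)
print(f"{reduction:.1%}")
```

The rounded rates give roughly 9.3%. The small gap from the reported 9.4% is presumably because the published figure was computed from unrounded error rates, which is an assumption rather than something the abstract states.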
Multiple classifier systems are an effective way to increase automated sleep scoring accuracy. Improvements in autoscoring will allow sleep researchers to increase sample sizes and recording lengths, opening new experimental possibilities.