Exceeding chance level by chance: The caveat of theoretical chance levels in brain signal classification and statistical assessment of decoding accuracy.

Author information

Combrisson Etienne, Jerbi Karim

Affiliations

DYCOG Lab, Lyon Neuroscience Research Center, INSERM U1028, UMR 5292, University Lyon I, Lyon, France; Center of Research and Innovation in Sport, Mental Processes and Motor Performance, University of Lyon I, Lyon, France.

DYCOG Lab, Lyon Neuroscience Research Center, INSERM U1028, UMR 5292, University Lyon I, Lyon, France; Psychology Department, University of Montreal, QC, Canada.

Publication information

J Neurosci Methods. 2015 Jul 30;250:126-36. doi: 10.1016/j.jneumeth.2015.01.010. Epub 2015 Jan 14.

DOI:10.1016/j.jneumeth.2015.01.010

PMID:25596422

Abstract

Machine learning techniques are increasingly used in neuroscience to classify brain signals. Decoding performance is reflected by how much the classification results depart from the rate achieved by purely random classification. In a 2-class or 4-class classification problem, the chance levels are thus 50% or 25% respectively. However, such thresholds hold for an infinite number of data samples but not for small data sets. While this limitation is widely recognized in the machine learning field, it is unfortunately sometimes still overlooked or ignored in the emerging field of brain signal classification. Incidentally, this field is often faced with the difficulty of low sample size. In this study we demonstrate how applying signal classification to Gaussian random signals can yield decoding accuracies of up to 70% or higher in two-class decoding with small sample sets. Most importantly, we provide a thorough quantification of the severity and the parameters affecting this limitation using simulations in which we manipulate sample size, class number, cross-validation parameters (k-fold, leave-one-out and repetition number) and classifier type (Linear-Discriminant Analysis, Naïve Bayesian and Support Vector Machine). In addition to raising a red flag of caution, we illustrate the use of analytical and empirical solutions (binomial formula and permutation tests) that tackle the problem by providing statistical significance levels (p-values) for the decoding accuracy, taking sample size into account. Finally, we illustrate the relevance of our simulations and statistical tests on real brain data by assessing noise-level classifications in Magnetoencephalography (MEG) and intracranial EEG (iEEG) baseline recordings.
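The binomial formula and permutation test mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`binomial_p_value`, `chance_threshold`, `permutation_p_value`) and the `score_fn` callback are hypothetical, and the sketch assumes balanced classes and independent test samples, so that guessing succeeds with probability 1 / (number of classes).

```python
import random
from math import comb
from typing import Callable, List, Optional


def binomial_p_value(n_correct: int, n_samples: int, n_classes: int) -> float:
    """P(X >= n_correct) for X ~ Binomial(n_samples, 1 / n_classes).

    This is the probability of reaching at least the observed number of
    correct classifications by guessing alone.
    """
    p = 1.0 / n_classes
    return sum(
        comb(n_samples, i) * p**i * (1 - p) ** (n_samples - i)
        for i in range(n_correct, n_samples + 1)
    )


def chance_threshold(n_samples: int, n_classes: int, alpha: float = 0.05) -> float:
    """Accuracy (as a fraction) that must be exceeded for significance at alpha.

    Returns k / n_samples for the smallest k whose cumulative binomial
    probability P(X <= k) reaches 1 - alpha under the guessing hypothesis.
    """
    p = 1.0 / n_classes
    cdf = 0.0
    for k in range(n_samples + 1):
        cdf += comb(n_samples, k) * p**k * (1 - p) ** (n_samples - k)
        if cdf >= 1 - alpha:
            return k / n_samples
    return 1.0


def permutation_p_value(
    score_fn: Callable[[List[int]], float],
    labels: List[int],
    n_permutations: int = 1000,
    rng: Optional[random.Random] = None,
) -> float:
    """Empirical p-value of score_fn(labels) against label-shuffled scores.

    score_fn is a hypothetical user-supplied callback. In a real analysis it
    would retrain and cross-validate the classifier on the labels it is given
    and return the resulting decoding accuracy.
    """
    rng = rng or random.Random()
    observed = score_fn(labels)
    shuffled = list(labels)
    n_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if score_fn(shuffled) >= observed:
            n_as_good += 1
    # Counting the observed labelling itself keeps the p-value above zero.
    return (n_as_good + 1) / (n_permutations + 1)
```

For example, `chance_threshold(40, 2)` evaluates to 0.625: with 40 test samples, a two-class accuracy has to exceed 62.5%, not 50%, to be significant at α = 0.05. The threshold moves toward the theoretical 50% as the sample size grows.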

Similar articles

Automated model selection in covariance estimation and spatial whitening of MEG and EEG signals.

Neuroimage. 2015 Mar;108:328-42. doi: 10.1016/j.neuroimage.2014.12.040. Epub 2014 Dec 23.

Multivariate pattern analysis for MEG: A comparison of dissimilarity measures.

Neuroimage. 2018 Jun;173:434-447. doi: 10.1016/j.neuroimage.2018.02.044. Epub 2018 Feb 27.

Classification methods for ongoing EEG and MEG signals.

Biol Res. 2007;40(4):415-37. Epub 2008 May 28.

Channel selection and classification of electroencephalogram signals: an artificial neural network and genetic algorithm-based approach.

Artif Intell Med. 2012 Jun;55(2):117-26. doi: 10.1016/j.artmed.2012.02.001. Epub 2012 Apr 12.

Learning machines and sleeping brains: Automatic sleep stage classification using decision-tree multi-class support vector machines.

J Neurosci Methods. 2015 Jul 30;250:94-105. doi: 10.1016/j.jneumeth.2015.01.022. Epub 2015 Jan 25.

Designing a robust feature extraction method based on optimum allocation and principal component analysis for epileptic EEG signal classification.

Comput Methods Programs Biomed. 2015 Apr;119(1):29-42. doi: 10.1016/j.cmpb.2015.01.002. Epub 2015 Jan 30.

Clustering linear discriminant analysis for MEG-based brain computer interfaces.

IEEE Trans Neural Syst Rehabil Eng. 2011 Jun;19(3):221-31. doi: 10.1109/TNSRE.2011.2116125. Epub 2011 Feb 22.

Improved multi-unit decoding at the brain-machine interface using population temporal linear filtering.

J Neural Eng. 2010 Aug;7(4):046012. doi: 10.1088/1741-2560/7/4/046012. Epub 2010 Jul 19.

Comparative analysis of classifiers for developing an adaptive computer-assisted EEG analysis system for diagnosing epilepsy.

Biomed Res Int. 2015;2015:638036. doi: 10.1155/2015/638036. Epub 2015 Mar 5.

Cited by

Machine Learning-Based Alexithymia Assessment Using Resting-State Default Mode Network Functional Connectivity.

Sensors (Basel). 2025 Sep 4;25(17):5515. doi: 10.3390/s25175515.

Where do I go? Decoding temporal neural dynamics of scene processing and visuospatial memory interactions using convolutional neural networks.

J Vis. 2025 Aug 1;25(10):15. doi: 10.1167/jov.25.10.15.

GPT-based normative models of brain sMRI correlate with dimensional psychopathology.

Imaging Neurosci (Camb). 2024 Jun 26;2. doi: 10.1162/imag_a_00204. eCollection 2024.

Robust discrimination of multiple naturalistic same-hand movements from MEG signals with convolutional neural networks.

Imaging Neurosci (Camb). 2024 May 20;2. doi: 10.1162/imag_a_00178. eCollection 2024.

Frequency tagging of spatial attention using periliminal flickers.

Imaging Neurosci (Camb). 2024 Jul 12;2. doi: 10.1162/imag_a_00223. eCollection 2024.

How low can you go: evaluating electrode reduction methods for EEG-based speech imagery BCIs.

Front Neuroergon. 2025 Jul 2;6:1578586. doi: 10.3389/fnrgo.2025.1578586. eCollection 2025.

From pronounced to imagined: improving speech decoding with multi-condition EEG data.

Front Neuroinform. 2025 Jun 27;19:1583428. doi: 10.3389/fninf.2025.1583428. eCollection 2025.

Neuroimage Rep. 2021 Oct 12;1(4):100058. doi: 10.1016/j.ynirp.2021.100058. eCollection 2021 Dec.

On the relative importance of attention and response selection processes for multi-component behavior - Evidence from EEG-based deep learning.

Neuroimage Rep. 2022 Aug 4;2(3):100118. doi: 10.1016/j.ynirp.2022.100118. eCollection 2022 Sep.

A multi-day and high-quality EEG dataset for motor imagery brain-computer interface.

Sci Data. 2025 Mar 23;12(1):488. doi: 10.1038/s41597-025-04826-y.
