Yotsutsuji Sunao, Lei Miaomei, Akama Hiroyuki
School of Life Science and Technology, Tokyo Institute of Technology, Tokyo, Japan.
Ex-Graduate School of Science and Technology, Tokyo Institute of Technology, Tokyo, Japan.
Front Neuroinform. 2021 Feb 12;15:577451. doi: 10.3389/fninf.2021.577451. eCollection 2021.
Recently, several deep learning methods have been applied to decoding in task-related fMRI, and their advantages have been exploited in a variety of ways. However, this paradigm is sometimes problematic because deep learning is difficult to apply to high-dimensional data under small-sample-size conditions. The difficulty of gathering enough data from fMRI experiments with complicated designs and tasks to develop predictive machine learning models with multiple layers is well recognized. Group-level multi-voxel pattern analysis with small sample sizes results in low statistical power and large errors in accuracy evaluation; failure in such instances is ascribed to individual variability, which risks information leakage, a particular issue when dealing with a limited number of subjects. In this study, using a small fMRI dataset on bilingual language switching in a property generation task, we evaluated the relative fit of different deep learning models, incorporating moderate split methods to control the amount of information leakage. Our results indicated that using the session shuffle split as the data folding method, together with the multichannel 2D convolutional neural network (M2DCNN) classifier, yielded the best authentic classification accuracy, outperforming the 3D convolutional neural network (3DCNN). In this manuscript, we discuss the tolerability of within-subject or within-session information leakage, whose impact is generally considered small but is complex and essentially unknown; this requires clarification in future studies.
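The idea of choosing a split method to control information leakage can be illustrated with a group-aware shuffle split. The sketch below is not the authors' implementation: the abstract does not define the session shuffle split in detail, so the function `group_shuffle_split`, the toy layout (3 subjects, 2 sessions each, 4 samples per session), and the interpretation that a session-level split holds out whole sessions while a subject-level split holds out whole subjects are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def group_shuffle_split(groups, test_fraction=0.25, seed=0):
    """Split sample indices so that no group appears in both train and test.

    `groups` holds one hashable, sortable label per sample (for example a
    subject ID, or a (subject, session) pair). Hypothetical helper written
    for this illustration only.
    """
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)

    unique = sorted(by_group)            # deterministic order before shuffling
    random.Random(seed).shuffle(unique)  # shuffle whole groups, not samples

    n_test = max(1, round(len(unique) * test_fraction))
    test = sorted(i for g in unique[:n_test] for i in by_group[g])
    train = sorted(i for g in unique[n_test:] for i in by_group[g])
    return train, test


# Assumed toy layout: 3 subjects x 2 sessions x 4 samples = 24 samples.
subjects = [s for s in range(3) for _ in range(2) for _ in range(4)]
sessions = [(s, r) for s in range(3) for r in range(2) for _ in range(4)]

# Subject-level split: held-out subjects never appear in training.
train_subj, test_subj = group_shuffle_split(subjects, test_fraction=1 / 3)

# Session-level split: held-out sessions never appear in training, but a
# subject may contribute other sessions to training (within-subject leakage).
train_sess, test_sess = group_shuffle_split(sessions, test_fraction=1 / 3)
```

Under these assumptions, the subject-level split rules out both within-subject and within-session leakage, whereas the session-level split rules out only within-session leakage. Splitting individual samples at random would permit both kinds.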