oFVSD: a Python package of optimized forward variable selection decoder for high-dimensional neuroimaging data.

Author information

Dang Tung, Fermin Alan S R, Machizawa Maro G

Affiliations

Center for Brain, Mind, and KANSEI Sciences Research, Hiroshima University, Hiroshima, Japan.

Graduate School of Agricultural and Life Sciences, The University of Tokyo, Tokyo, Japan.

Publication information

Front Neuroinform. 2023 Sep 26;17:1266713. doi: 10.3389/fninf.2023.1266713. eCollection 2023.

DOI:10.3389/fninf.2023.1266713

PMID:37829329

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10566623/

Abstract

The complexity and high dimensionality of neuroimaging data pose problems for decoding information with machine learning (ML) models because the number of features is often much larger than the number of observations. Feature selection is a crucial step in identifying meaningful target features for decoding; however, optimizing feature selection on such high-dimensional neuroimaging data has been challenging with conventional ML models. Here, we introduce an efficient, high-performance decoding package that combines a forward variable selection (FVS) algorithm with hyperparameter optimization and automatically identifies the best feature pairs for both classification and regression models; a total of 18 ML models are implemented by default. First, the FVS algorithm evaluates goodness-of-fit across the different models using k-fold cross-validation, identifying the best subset of features for each model according to a predefined criterion. Next, the hyperparameters of each ML model are optimized at each forward iteration. The final output reports, for each model, the optimized number of selected features (brain regions of interest) together with its accuracy. Furthermore, the toolbox can be executed in a parallel environment for efficient computation on a typical personal computer. Using the optimized forward variable selection decoder (oFVSD) pipeline, we verified its effectiveness for sex classification and age range regression on 1,113 structural magnetic resonance imaging (MRI) datasets. We compared oFVSD against two counterparts: ML models without the FVS algorithm and ML models using the Boruta algorithm for variable selection. Across all ML models, oFVSD significantly outperformed the models without FVS (an increase of approximately 0.20 in the correlation coefficient, r, for regression models and of 8% for classification models, on average) and the models with Boruta variable selection (an improvement of approximately 0.07 for regression models and 4% for classification models). Furthermore, we confirmed that parallel computation considerably reduced the computational burden for the high-dimensional MRI data. Altogether, the oFVSD toolbox efficiently and effectively improves the performance of both classification and regression ML models, as demonstrated in a use case on MRI datasets. Given its flexibility, oFVSD has the potential to be applied to many other neuroimaging modalities. As an open-source, freely available Python package, it is a valuable toolbox for research communities seeking improved decoding accuracy.
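The two-step procedure described in the abstract (forward selection scored by k-fold cross-validation, with hyperparameters re-tuned at each forward iteration) can be illustrated with a minimal sketch. This is not the oFVSD API: the function names, the k-nearest-neighbour classifier, the accuracy criterion, the hyperparameter grid, the stopping rule, and the synthetic data are all assumptions chosen so that the example runs with the Python standard library alone.

```python
import random


def knn_predict(train_X, train_y, x, features, k):
    """Majority vote among the k nearest training rows, using only `features`."""
    dists = sorted(
        (sum((row[f] - x[f]) ** 2 for f in features), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def cv_accuracy(X, y, features, k, n_folds=5):
    """Mean k-fold cross-validated accuracy for one feature subset and one k."""
    fold_scores = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        hits = sum(
            knn_predict(train_X, train_y, X[i], features, k) == y[i]
            for i in test_idx
        )
        fold_scores.append(hits / len(test_idx))
    return sum(fold_scores) / n_folds


def forward_variable_selection(X, y, k_grid=(1, 3, 5), n_folds=5):
    """Greedy forward selection; the hyperparameter k is re-tuned at every iteration."""
    selected, best_score, best_k = [], 0.0, None
    remaining = list(range(len(X[0])))
    while remaining:
        # Try adding each remaining feature, with every hyperparameter value.
        trials = [
            (cv_accuracy(X, y, selected + [f], k, n_folds), f, k)
            for f in remaining
            for k in k_grid
        ]
        score, feature, k = max(trials, key=lambda t: t[0])
        if score <= best_score:
            break  # assumed stopping rule: no candidate improves the criterion
        selected.append(feature)
        remaining.remove(feature)
        best_score, best_k = score, k
    return selected, best_score, best_k


# Synthetic data (hypothetical): feature 0 separates the classes, features 1-5 are noise.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [[8.0 * label + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(5)] for label in y]

selected, score, best_k = forward_variable_selection(X, y)
print("selected features:", selected, "CV accuracy:", round(score, 3), "k:", best_k)
```

Within one forward iteration the candidate evaluations are independent of each other, so they could in principle be distributed across worker processes. That is one plausible place for the parallel execution the abstract mentions, although the sketch above runs them sequentially.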


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad0b/10566623/0140916435ea/fninf-17-1266713-g001.jpg
