TabMixer: advancing tabular data analysis with an enhanced MLP-mixer approach.

Author information

Eslamian Ali, Cheng Qiang

Affiliations

Department of Computer Science, University of Kentucky, 329 Rose Street, Lexington, Kentucky 40506, USA.

Institute for Biomedical Informatics, University of Kentucky, 800 Rose Street, Lexington, Kentucky 40506, USA.

Publication information

Pattern Anal Appl. 2025 Jun;28(2). doi: 10.1007/s10044-025-01423-y. Epub 2025 Feb 21.

Abstract

Tabular data, prevalent in relational databases and spreadsheets, is fundamental across fields like healthcare, engineering, and finance. Despite significant advances in tabular data learning, critical challenges remain: handling missing values, addressing class imbalance, enabling transfer learning, and facilitating feature incremental learning beyond traditional supervised paradigms. We introduce TabMixer, an innovative model that enhances the multilayer perceptron (MLP) mixer architecture to address these challenges. TabMixer incorporates a self-attention mechanism, making it versatile across various learning scenarios including supervised learning, transfer learning, and feature incremental learning. Extensive experiments on eight public datasets demonstrate TabMixer's superior performance over existing state-of-the-art methods. Notably, TabMixer achieved substantial improvements in ANOVA AUC across all scenarios: a 4% increase in supervised learning (0.840 to 0.881), 8% in transfer learning (0.803 to 0.872), and 4% in feature incremental learning (0.806 to 0.843). TabMixer demonstrates high computational efficiency and scalability through reduced floating-point operations and learnable parameters. Moreover, it exhibits strong resilience to missing values and class imbalances through both its architectural design and optional preprocessing enhancements. These results establish TabMixer as a promising model for tabular data analysis and a valuable tool for diverse applications.
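The abstract reports each gain as a rounded percentage next to the underlying AUC values. As a minimal sketch, the arithmetic below recomputes both the absolute gain (in percentage points) and the relative gain (in percent) from those reported values. Only the six AUC figures come from the abstract; the names `reported_auc` and `auc_gains` are illustrative, and the abstract does not say which of the two conventions its percentages follow.

```python
# AUC values as reported in the abstract: (baseline, TabMixer).
reported_auc = {
    "supervised learning": (0.840, 0.881),
    "transfer learning": (0.803, 0.872),
    "feature incremental learning": (0.806, 0.843),
}


def auc_gains(baseline, improved):
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute_points = (improved - baseline) * 100
    relative_percent = (improved - baseline) / baseline * 100
    return absolute_points, relative_percent


# Hypothetical helper structure: scenario name -> (absolute, relative) gain.
gains = {name: auc_gains(*pair) for name, pair in reported_auc.items()}

for scenario, (abs_pts, rel_pct) in gains.items():
    print(f"{scenario}: +{abs_pts:.1f} points absolute, +{rel_pct:.1f}% relative")
```

Under this reading, the absolute gains come to roughly 4.1, 6.9, and 3.7 percentage points, and the relative gains to roughly 4.9%, 8.6%, and 4.6%, so the rounded figures in the abstract are best read alongside the raw AUC values it quotes.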
