Deep Neural Networks and Tabular Data: A Survey.

Authors

Borisov Vadim, Leemann Tobias, Seßler Kathrin, Haug Johannes, Pawelczyk Martin, Kasneci Gjergji

Publication information

IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):7499-7519. doi: 10.1109/TNNLS.2022.3229161. Epub 2024 Jun 3.

DOI: 10.1109/TNNLS.2022.3229161
PMID: 37015381
Abstract

Heterogeneous tabular data are the most commonly used form of data and are essential for numerous critical and computationally demanding applications. On homogeneous datasets, deep neural networks have repeatedly shown excellent performance and have therefore been widely adopted. However, their adaptation to tabular data for inference or data generation tasks remains highly challenging. To facilitate further progress in the field, this work provides an overview of state-of-the-art deep learning methods for tabular data. We categorize these methods into three groups: data transformations, specialized architectures, and regularization models. For each of these groups, our work offers a comprehensive overview of the main approaches. Moreover, we discuss deep learning approaches for generating tabular data and also provide an overview of strategies for explaining deep models on tabular data. Thus, our first contribution is to address the main research streams and existing methodologies in the mentioned areas while highlighting relevant challenges and open research questions. Our second contribution is to provide an empirical comparison of traditional machine learning methods with 11 deep learning approaches across five popular real-world tabular datasets of different sizes and with different learning objectives. Our results, which we have made publicly available as competitive benchmarks, indicate that algorithms based on gradient-boosted tree ensembles still mostly outperform deep learning models on supervised learning tasks, suggesting that the research progress on competitive deep learning models for tabular data is stagnating. To the best of our knowledge, this is the first in-depth overview of deep learning approaches for tabular data; as such, this work can serve as a valuable starting point to guide researchers and practitioners interested in deep learning with tabular data.
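
The first of the three groups named in the abstract, data transformations, can be illustrated with a small sketch. It is not taken from the survey: the function names `fit_encoder` and `transform`, the column names, and the two toy rows are all hypothetical. It shows one common way to turn a heterogeneous row (numeric and categorical fields) into the homogeneous numeric vector a neural network expects, by standardizing numeric columns and one-hot encoding categorical ones.

```python
from statistics import mean, pstdev


def fit_encoder(rows, numeric_cols, categorical_cols):
    """Learn per-column statistics and category vocabularies (hypothetical helper)."""
    stats = {}
    for col in numeric_cols:
        values = [row[col] for row in rows]
        # Fall back to 1.0 so a constant column does not cause division by zero.
        stats[col] = (mean(values), pstdev(values) or 1.0)
    vocab = {col: sorted({row[col] for row in rows}) for col in categorical_cols}
    return stats, vocab


def transform(row, stats, vocab):
    """Map one mixed-type row to a flat list of floats."""
    vector = []
    for col, (mu, sigma) in stats.items():
        # Standardize numeric features to zero mean and unit variance.
        vector.append((row[col] - mu) / sigma)
    for col, categories in vocab.items():
        # One-hot encode; a category unseen during fitting yields an all-zero block.
        vector.extend(1.0 if row[col] == c else 0.0 for c in categories)
    return vector


# Toy data, invented purely for illustration.
rows = [
    {"age": 30, "income": 40000.0, "city": "Berlin"},
    {"age": 50, "income": 60000.0, "city": "Munich"},
]
stats, vocab = fit_encoder(rows, ["age", "income"], ["city"])
encoded = [transform(r, stats, vocab) for r in rows]
print(encoded)
```

One-hot encoding is only one option among the transformations this kind of work covers; it keeps the sketch dependency-free, but it grows the vector with every distinct category, which is one reason tabular data remains harder for neural networks than homogeneous inputs.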

