Deterministic Autoencoder using Wasserstein loss for tabular data generation.

Authors

Wang Alex X, Nguyen Binh P

Affiliations

School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6012, New Zealand.

School of Mathematics and Statistics, Victoria University of Wellington, Wellington 6012, New Zealand; Faculty of Information Technology, Ho Chi Minh City Open University, 97 Vo Van Tan, District 3, Ho Chi Minh City 70000, Viet Nam.

Publication information

Neural Netw. 2025 May;185:107208. doi: 10.1016/j.neunet.2025.107208. Epub 2025 Jan 29.

DOI: 10.1016/j.neunet.2025.107208
PMID: 39893805

Abstract

Tabular data generation is a complex task due to its distinctive characteristics and inherent complexities. While Variational Autoencoders have been adapted from the computer vision domain for tabular data synthesis, their reliance on non-deterministic latent space regularization introduces limitations. The stochastic nature of Variational Autoencoders can contribute to collapsed posteriors, yielding suboptimal outcomes and limiting control over the latent space. This characteristic also constrains the exploration of latent space interpolation. To address these challenges, we present the Tabular Wasserstein Autoencoder (TWAE), leveraging the deterministic encoding mechanism of Wasserstein Autoencoders. This characteristic facilitates a deterministic mapping of inputs to latent codes, enhancing the stability and expressiveness of our model's latent space. This, in turn, enables seamless integration with shallow interpolation mechanisms like the synthetic minority over-sampling technique (SMOTE) within the data generation process via deep learning. Specifically, TWAE is trained once to establish a low-dimensional representation of real data, and various latent interpolation methods efficiently generate synthetic latent points, achieving a balance between accuracy and efficiency. Extensive experiments consistently demonstrate TWAE's superiority, showcasing its versatility across diverse feature types and dataset sizes. This innovative approach, combining WAE principles with shallow interpolation, effectively leverages SMOTE's advantages, establishing TWAE as a robust solution for complex tabular data synthesis.
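
The latent interpolation step described in the abstract can be illustrated with a minimal sketch. It assumes the real rows have already been mapped to latent codes by a trained deterministic encoder; here random vectors stand in for those codes, and the function name `smote_latent`, the neighbour count `k`, and the seed are hypothetical choices for illustration, not details taken from the paper. Decoding the synthetic latent points back to table rows is omitted.

```python
import numpy as np

def smote_latent(latents, n_new, k=5, seed=0):
    """SMOTE-style interpolation: move a random fraction of the way from a
    sampled latent code towards one of its k nearest neighbours."""
    rng = np.random.default_rng(seed)
    z = np.asarray(latents, dtype=float)
    n = len(z)
    k = min(k, n - 1)
    # Pairwise squared Euclidean distances; a point is not its own neighbour.
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    neighbours = np.argsort(d2, axis=1)[:, :k]
    # Pick a base point and one of its neighbours for each synthetic sample.
    base = rng.integers(0, n, size=n_new)
    pick = neighbours[base, rng.integers(0, k, size=n_new)]
    # Interpolation weight in [0, 1) per synthetic sample.
    gap = rng.random((n_new, 1))
    return z[base] + gap * (z[pick] - z[base])

# Stand-in for encoder output: 200 real rows mapped to 8-dimensional codes.
real_latents = np.random.default_rng(1).normal(size=(200, 8))
synthetic_latents = smote_latent(real_latents, n_new=50)
print(synthetic_latents.shape)
```

Because each synthetic point is a convex combination of two real latent codes, it stays inside the region the encoder actually populated, which is why a deterministic encoding makes this kind of shallow interpolation straightforward.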

Similar articles

1
Deterministic Autoencoder using Wasserstein loss for tabular data generation.
Neural Netw. 2025 May;185:107208. doi: 10.1016/j.neunet.2025.107208. Epub 2025 Jan 29.
2
Leveraging the variational Bayes autoencoder for survival analysis.
Sci Rep. 2024 Oct 19;14(1):24567. doi: 10.1038/s41598-024-76047-z.
3
Latent space autoencoder generative adversarial model for retinal image synthesis and vessel segmentation.
BMC Med Imaging. 2025 May 5;25(1):149. doi: 10.1186/s12880-025-01694-1.
4
Hybrid deep learning for computational precision in cardiac MRI segmentation: Integrating Autoencoders, CNNs, and RNNs for enhanced structural analysis.
Comput Biol Med. 2025 Mar;186:109597. doi: 10.1016/j.compbiomed.2024.109597. Epub 2025 Jan 1.
5
Deep clustering analysis via variational autoencoder with Gamma mixture latent embeddings.
Neural Netw. 2025 Mar;183:106979. doi: 10.1016/j.neunet.2024.106979. Epub 2024 Dec 4.
6
Enhancing parkinson disease detection through feature based deep learning with autoencoders and neural networks.
Sci Rep. 2025 Mar 13;15(1):8624. doi: 10.1038/s41598-025-88293-w.
7
Achieving deep clustering through the use of variational autoencoders and similarity-based loss.
Math Biosci Eng. 2022 Jul 22;19(10):10344-10360. doi: 10.3934/mbe.2022484.
8
Utility-based Analysis of Statistical Approaches and Deep Learning Models for Synthetic Data Generation With Focus on Correlation Structures: Algorithm Development and Validation.
JMIR AI. 2025 Mar 20;4:e65729. doi: 10.2196/65729.
9
Pixel-Wise Wasserstein Autoencoder for Highly Generative Dehazing.
IEEE Trans Image Process. 2021;30:5452-5462. doi: 10.1109/TIP.2021.3084743. Epub 2021 Jun 9.
10
Synthetic Lung Ultrasound Data Generation Using Autoencoder With Generative Adversarial Network.
IEEE Trans Ultrason Ferroelectr Freq Control. 2025 May;72(5):624-635. doi: 10.1109/TUFFC.2025.3555447. Epub 2025 May 7.

Cited by

1
Integration of metaheuristic based feature selection with ensemble representation learning models for privacy aware cyberattack detection in IoT environments.
Sci Rep. 2025 Jul 2;15(1):22887. doi: 10.1038/s41598-025-05545-5.
2
Customizable pattern synthesis: a deep generative approach for lantern designs.
PeerJ Comput Sci. 2025 Mar 7;11:e2732. doi: 10.7717/peerj-cs.2732. eCollection 2025.