Data Augmentation in High Dimensional Low Sample Size Setting Using a Geometry-Based Variational Autoencoder.

Authors

Chadebec Clement, Thibeau-Sutre Elina, Burgos Ninon, Allassonniere Stephanie

Publication information

IEEE Trans Pattern Anal Mach Intell. 2023 Mar;45(3):2879-2896. doi: 10.1109/TPAMI.2022.3185773. Epub 2023 Feb 3.

Abstract

In this paper, we propose a new method to perform data augmentation reliably in the High Dimensional Low Sample Size (HDLSS) setting using a geometry-based variational autoencoder (VAE). Our approach combines 1) a new VAE model whose latent space is modeled as a Riemannian manifold and which combines Riemannian metric learning and normalizing flows, and 2) a new generation scheme that produces more meaningful samples, especially for small data sets. The method is tested in a wide experimental study that stresses its robustness to data sets, classifiers, and training sample size. It is also validated on a medical imaging classification task on the challenging ADNI database, where a small number of 3D brain magnetic resonance images (MRIs) are considered and augmented using the proposed VAE framework. In each case, the proposed method yields a significant and reliable gain in the classification metrics. For instance, balanced accuracy jumps from 66.3% to 74.3% for a state-of-the-art convolutional neural network classifier trained with 50 MRIs of cognitively normal (CN) and 50 of Alzheimer's disease (AD) patients, and from 77.7% to 86.3% when trained with 243 CN and 210 AD, while greatly improving sensitivity and specificity.
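
The abstract reports balanced accuracy, sensitivity, and specificity without stating their formulas. As a minimal sketch, assuming the standard binary-classification definitions (balanced accuracy as the mean of sensitivity and specificity), the metrics could be computed as below. The function name `balanced_accuracy` and the toy labels are hypothetical illustrations, not the authors' code or data.

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Return (balanced accuracy, sensitivity, specificity) for binary labels.

    Assumes the standard definitions; the paper's exact evaluation
    code is not given in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return (sensitivity + specificity) / 2, sensitivity, specificity


# Made-up toy labels for illustration only: 1 = AD, 0 = CN.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
bal_acc, sens, spec = balanced_accuracy(y_true, y_pred)
print(f"balanced accuracy={bal_acc:.3f}, sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Unlike plain accuracy, this average weights the two classes equally, which matters when class sizes differ, as with the 243 CN and 210 AD training set mentioned above.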
