Biomarkers identification for Schizophrenia via VAE and GSDAE-based data augmentation.

Affiliations

School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, 710049, China.

School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, 710049, China; Department of Mathematics and Statistics, University of Ottawa, Ottawa, K7L 3P7, Canada.

Publication information

Comput Biol Med. 2022 Jul;146:105603. doi: 10.1016/j.compbiomed.2022.105603. Epub 2022 May 13.

Abstract

Deep learning has made great progress in analyzing MRI data, but MRI datasets that are high-dimensional with small sample sizes (HDSSS) severely limit biomarker identification. Few-shot learning has been proposed to address such problems, and data augmentation is one of its typical methods. The variational auto-encoder (VAE) is a generative method based on variational Bayesian inference that can be used for data augmentation. The graph regularized sparse deep autoencoder (GSDAE) can reconstruct sparse samples while preserving the manifold structure of the data, which greatly facilitates biomarker selection. To generate better HDSSS data for biomarker identification, this paper proposes a data augmentation method based on VAE and GSDAE, termed GS-VDAE. Instead of using the final outputs of GSDAE, the proposed model embeds the generation procedure into GSDAE. The augmented samples are therefore rooted in the significant features extracted from the original samples, which ensures that the newly generated samples retain the most significant characteristics of the originals. The classification accuracy of samples generated directly from VAE is 0.74, whereas that of samples generated from GS-VDAE is 0.84, which demonstrates the validity of the model. Additionally, a regression-based feature selection method with truncated nuclear norm regularization is used for biomarker selection. The biomarker selection results on schizophrenia data show that the augmented samples obtained by the proposed method achieve higher classification accuracy with fewer top-ranked features than the original samples, which further demonstrates the validity of the model.
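The abstract names three standard building blocks: VAE sampling, graph regularization, and the truncated nuclear norm. The sketch below illustrates each in NumPy using their textbook definitions. It is not the authors' GS-VDAE implementation, whose architecture and loss weights the abstract does not give; all function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """VAE reparameterization: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Per-sample KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=1)

def graph_laplacian_penalty(H, W):
    """Graph regularizer tr(H^T L H) with L = D - W.

    For a symmetric affinity matrix W this equals
    0.5 * sum_ij W_ij * ||h_i - h_j||^2, so neighbouring samples
    are encouraged to keep similar hidden codes (rows of H).
    """
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(H.T @ L @ H))

def truncated_nuclear_norm(M, r):
    """Sum of the singular values of M, excluding the r largest."""
    s = np.linalg.svd(M, compute_uv=False)  # sorted in descending order
    return float(s[r:].sum())
```

In a VAE-style augmentation loop, new samples would be obtained by passing draws from `reparameterize` through a trained decoder. The two penalty functions are the kind of terms that would be added to a training or feature-selection objective. How GS-VDAE combines them is described in the paper itself, not here.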
