

Tensor-tensor algebra for optimal representation and compression of multiway data.

Affiliations

Department of Mathematics, Tufts University, Medford, MA 02155.

Mathematics of AI, IBM Research, Yorktown Heights, NY 10598.

Publication information

Proc Natl Acad Sci U S A. 2021 Jul 13;118(28). doi: 10.1073/pnas.2015851118.

Abstract

With the advent of machine learning and its overarching pervasiveness it is imperative to devise ways to represent large datasets efficiently while distilling intrinsic features necessary for subsequent analysis. The primary workhorse used in data dimensionality reduction and feature extraction has been the matrix singular value decomposition (SVD), which presupposes that data have been arranged in matrix format. A primary goal in this study is to show that high-dimensional datasets are more compressible when treated as tensors (i.e., multiway arrays) and compressed via tensor-SVDs under the tensor-tensor product constructs and its generalizations. We begin by proving Eckart-Young optimality results for families of tensor-SVDs under two different truncation strategies. Since such optimality properties can be proven in both matrix and tensor-based algebras, a fundamental question arises: Does the tensor construct subsume the matrix construct in terms of representation efficiency? The answer is positive, as proven by showing that a tensor-tensor representation of an equal dimensional spanning space can be superior to its matrix counterpart. We then use these optimality results to investigate how the compressed representation provided by the truncated tensor SVD is related both theoretically and empirically to its two closest tensor-based analogs, the truncated high-order SVD and the truncated tensor-train SVD.
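To make the tensor-SVD idea concrete, the following is a minimal NumPy sketch of the truncated t-SVD under the standard t-product, where a third-order tensor is moved into the Fourier domain along its third mode, each frontal slice is compressed with a truncated matrix SVD, and the result is transformed back. The function name and test tensor are illustrative assumptions; the paper's more general family of ⋆M-products (arbitrary invertible transforms) is not covered here.

```python
import numpy as np

def tsvd_truncate(A, k):
    """Rank-k truncated t-SVD of a third-order tensor A (n1 x n2 x n3)
    under the t-product: FFT along mode 3, facewise truncated SVDs,
    then inverse FFT. A sketch, not the paper's full *_M construction."""
    n1, n2, n3 = A.shape
    Ahat = np.fft.fft(A, axis=2)           # transform-domain slices
    Bhat = np.zeros_like(Ahat)             # complex, same shape as Ahat
    for i in range(n3):                    # truncated SVD of each frontal slice
        U, s, Vh = np.linalg.svd(Ahat[:, :, i], full_matrices=False)
        Bhat[:, :, i] = (U[:, :k] * s[:k]) @ Vh[:k, :]
    return np.real(np.fft.ifft(Bhat, axis=2))

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8, 8))
Ak = tsvd_truncate(A, 4)                   # best t-rank-4 approximation
err = np.linalg.norm(A - Ak)               # Frobenius reconstruction error
```

By the Eckart-Young-type results discussed in the abstract, this facewise truncation is optimal in the Frobenius norm among tensors of t-rank at most k, which is why the reconstruction error decreases monotonically as k grows.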


Similar articles

1
Optimal High-order Tensor SVD via Tensor-Train Orthogonal Iteration.
IEEE Trans Inf Theory. 2022 Jun;68(6):3991-4019. doi: 10.1109/tit.2022.3152733. Epub 2022 Feb 18.
2
Stable tensor neural networks for efficient deep learning.
Front Big Data. 2024 May 30;7:1363978. doi: 10.3389/fdata.2024.1363978. eCollection 2024.
3
Efficient enhancement of low-rank tensor completion via thin QR decomposition.
Front Big Data. 2024 Jul 2;7:1382144. doi: 10.3389/fdata.2024.1382144. eCollection 2024.
4
Optimal Sparse Singular Value Decomposition for High-Dimensional High-Order Data.
J Am Stat Assoc. 2019;114(528):1708-1725. doi: 10.1080/01621459.2018.1527227. Epub 2019 Mar 20.
5
Low-Rank Tensor Function Representation for Multi-Dimensional Data Recovery.
IEEE Trans Pattern Anal Mach Intell. 2024 May;46(5):3351-3369. doi: 10.1109/TPAMI.2023.3341688. Epub 2024 Apr 3.

Cited by

1
Stable tensor neural networks for efficient deep learning.
Front Big Data. 2024 May 30;7:1363978. doi: 10.3389/fdata.2024.1363978. eCollection 2024.
2
Dimensionality reduction of longitudinal 'omics data using modern tensor factorizations.
PLoS Comput Biol. 2022 Jul 15;18(7):e1010212. doi: 10.1371/journal.pcbi.1010212. eCollection 2022 Jul.

References cited in this article

1
TTHRESH: Tensor Compression for Multidimensional Visual Data.
IEEE Trans Vis Comput Graph. 2020 Sep;26(9):2891-2903. doi: 10.1109/TVCG.2019.2904063. Epub 2019 Mar 8.
