Department of Mathematics, Tufts University, Medford, MA 02155.
Mathematics of AI, IBM Research, Yorktown Heights, NY 10598.
Proc Natl Acad Sci U S A. 2021 Jul 13;118(28). doi: 10.1073/pnas.2015851118.
With the advent of machine learning and its growing pervasiveness, it is imperative to devise ways to represent large datasets efficiently while distilling the intrinsic features necessary for subsequent analysis. The primary workhorse for data dimensionality reduction and feature extraction has been the matrix singular value decomposition (SVD), which presupposes that data have been arranged in matrix format. A primary goal of this study is to show that high-dimensional datasets are more compressible when treated as tensors (i.e., multiway arrays) and compressed via tensor-SVDs under the tensor-tensor product construct and its generalizations. We begin by proving Eckart-Young optimality results for families of tensor-SVDs under two different truncation strategies. Because such optimality properties can be proven in both matrix- and tensor-based algebras, a fundamental question arises: Does the tensor construct subsume the matrix construct in terms of representational efficiency? The answer is affirmative, as we prove by showing that a tensor-tensor representation of a spanning space of equal dimension can be superior to its matrix counterpart. We then use these optimality results to investigate, both theoretically and empirically, how the compressed representation provided by the truncated tensor-SVD relates to its two closest tensor-based analogs: the truncated higher-order SVD (HOSVD) and the truncated tensor-train SVD (TT-SVD).
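For context, the tensor results above generalize the classical Eckart-Young theorem for matrices, a standard background fact not stated in the abstract itself: if A has SVD A = Σ_i σ_i u_i v_iᵀ with σ_1 ≥ σ_2 ≥ ⋯, then the k-term truncation

\[ A_k = \sum_{i=1}^{k} \sigma_i\, u_i v_i^{T} \quad\text{satisfies}\quad \|A - A_k\|_F = \min_{\operatorname{rank}(B)\le k} \|A - B\|_F . \]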
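To make the tensor-SVD construction concrete, below is a minimal NumPy sketch of a k-term truncated t-SVD of a third-order tensor under the tensor-tensor product (t-product), which is computed via matrix SVDs of frontal slices after an FFT along the third mode. This is an illustrative sketch, not code from the paper; the function name truncated_tsvd, the random test tensor, and the side-by-side comparison with a truncated matrix SVD of a mode-1 unfolding are our own assumptions.

```python
import numpy as np

def truncated_tsvd(A, k):
    """k-term truncated t-SVD of a 3-way array A (n1 x n2 x n3) under the
    t-product: FFT along the third mode, truncated matrix SVD of each
    frontal slice in the Fourier domain, then inverse FFT. Illustrative
    sketch only, not the paper's code."""
    n1, n2, n3 = A.shape
    Ahat = np.fft.fft(A, axis=2)                 # tubes -> Fourier domain
    Akhat = np.empty_like(Ahat)
    for i in range(n3):                          # slice-wise truncated SVD
        U, s, Vh = np.linalg.svd(Ahat[:, :, i], full_matrices=False)
        Akhat[:, :, i] = (U[:, :k] * s[:k]) @ Vh[:k, :]
    return np.real(np.fft.ifft(Akhat, axis=2))  # back to spatial domain

# Hypothetical usage on random data, compared with a rank-k truncated
# matrix SVD of the mode-1 unfolding of the same tensor.
rng = np.random.default_rng(0)
A = rng.standard_normal((32, 32, 16))
k = 5
Ak = truncated_tsvd(A, k)
M = A.reshape(32, 32 * 16)                       # mode-1 unfolding
U, s, Vh = np.linalg.svd(M, full_matrices=False)
Mk = (U[:, :k] * s[:k]) @ Vh[:k, :]
print("t-SVD error:", np.linalg.norm(A - Ak))
print("matrix-SVD error:", np.linalg.norm(M - Mk))
```

Note that the two approximations have different storage footprints (the truncated t-SVD keeps k terms per Fourier slice), so the printed errors illustrate qualitative behavior rather than an equal-storage benchmark; the paper's optimality and compressibility results are what make the comparison precise.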