
Scalable non-negative matrix tri-factorization.

Authors

Čopar Andrej, Žitnik Marinka, Zupan Blaž

Affiliations

Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia.

Department of Computer Science, Stanford University, Stanford, 94305 CA USA.

Publication information

BioData Min. 2017 Dec 29;10:41. doi: 10.1186/s13040-017-0160-6. eCollection 2017.

DOI:10.1186/s13040-017-0160-6
PMID:29299064
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5746986/
Abstract

BACKGROUND

Matrix factorization is a well established pattern discovery tool that has seen numerous applications in biomedical data analytics, such as gene expression co-clustering, patient stratification, and gene-disease association mining. Matrix factorization learns a latent data model that takes a data matrix and transforms it into a latent feature space enabling generalization, noise removal and feature discovery. However, factorization algorithms are numerically intensive, and hence there is a pressing challenge to scale current algorithms to work with large datasets. Our focus in this paper is matrix tri-factorization, a popular method that is not limited by the assumption of standard matrix factorization about data residing in one latent space. Matrix tri-factorization solves this by inferring a separate latent space for each dimension in a data matrix, and a latent mapping of interactions between the inferred spaces, making the approach particularly suitable for biomedical data mining.
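To make the tri-factorization model concrete, the following is a minimal sketch of factorizing a non-negative matrix X into three non-negative factors, X ≈ U S Vᵀ, where U holds the latent space of the rows, V the latent space of the columns, and S the mapping between the two. The abstract does not state which update rules the authors use; the sketch assumes the standard multiplicative updates for the squared Frobenius error. The function name `nmtf` and the parameters `k1`, `k2`, `n_iter`, and `eps` are illustrative, not taken from the paper.

```python
import numpy as np

def nmtf(X, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Sketch of non-negative matrix tri-factorization X ~ U @ S @ V.T.

    Assumes multiplicative updates for the squared Frobenius error;
    this is an illustration, not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    U = rng.random((n, k1))   # row latent space (n x k1)
    S = rng.random((k1, k2))  # interaction between the two latent spaces
    V = rng.random((m, k2))   # column latent space (m x k2)
    errors = []
    for _ in range(n_iter):
        # Each factor is rescaled element-wise, which keeps it non-negative.
        U *= (X @ V @ S.T) / (U @ (S @ (V.T @ V) @ S.T) + eps)
        V *= (X.T @ U @ S) / (V @ (S.T @ (U.T @ U) @ S) + eps)
        S *= (U.T @ X @ V) / ((U.T @ U) @ S @ (V.T @ V) + eps)
        errors.append(np.linalg.norm(X - U @ S @ V.T))
    return U, S, V, errors

if __name__ == "__main__":
    X = np.random.default_rng(1).random((30, 20))
    U, S, V, errors = nmtf(X, k1=4, k2=3, n_iter=50)
    print("error after first / last iteration:", errors[0], errors[-1])
```

Because `k1` and `k2` may differ, rows and columns each get their own latent dimension, which is the property the abstract contrasts with standard two-factor factorization.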

RESULTS

We developed a block-wise approach for latent factor learning in matrix tri-factorization. The approach partitions a data matrix into disjoint submatrices that are treated independently and fed into a parallel factorization system. An appealing property of the proposed approach is its mathematical equivalence with serial matrix tri-factorization. In a study on large biomedical datasets we show that our approach scales well on multi-processor and multi-GPU architectures. On a four-GPU system we demonstrate that our approach can be more than 100-times faster than its single-processor counterpart.
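The equivalence between block-wise and serial computation can be illustrated on a single factor update. The sketch below assumes the multiplicative update for U under the squared Frobenius error (the abstract does not give the update rules) and a simple grid partition of X into row and column blocks; the function names and the `row_splits` / `col_splits` parameters are hypothetical. Each block product could run on a separate processor or GPU, with the partial results summed afterwards.

```python
import numpy as np

def serial_update_U(X, U, S, V, eps=1e-9):
    """One multiplicative update of U computed on the full matrices."""
    return U * (X @ V @ S.T) / (U @ (S @ (V.T @ V) @ S.T) + eps)

def blockwise_update_U(X, U, S, V, row_splits, col_splits, eps=1e-9):
    """The same update assembled from disjoint blocks of X, U and V.

    row_splits / col_splits are index positions at which rows and
    columns are cut (hypothetical parameters for this sketch).
    """
    row_blocks = np.split(np.arange(X.shape[0]), row_splits)
    col_blocks = np.split(np.arange(X.shape[1]), col_splits)
    # V.T @ V is a sum of small per-block products (a reduction step).
    VtV = sum(V[c].T @ V[c] for c in col_blocks)
    SVtVSt = S @ VtV @ S.T
    new_U = np.empty_like(U)
    for r in row_blocks:
        # Row block r of X @ V is a sum over the column blocks of X.
        XV_r = sum(X[np.ix_(r, c)] @ V[c] for c in col_blocks)
        new_U[r] = U[r] * (XV_r @ S.T) / (U[r] @ SVtVSt + eps)
    return new_U

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, U = rng.random((12, 10)), rng.random((12, 3))
    S, V = rng.random((3, 4)), rng.random((10, 4))
    a = serial_update_U(X, U, S, V)
    b = blockwise_update_U(X, U, S, V, row_splits=[4, 8], col_splits=[5])
    print("block-wise matches serial:", np.allclose(a, b))
```

The two results agree up to floating-point rounding because matrix products decompose exactly into sums over blocks; the updates for V and S can be split in the same way.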

CONCLUSIONS

A general approach for scaling non-negative matrix tri-factorization is proposed. The approach is especially useful for parallel matrix factorization implemented in a multi-GPU environment. We expect the new approach will be useful in emerging procedures for latent factor analysis, notably for data integration, where many large data matrices need to be collectively factorized.
