Data Fusion by Matrix Factorization.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2015 Jan;37(1):41-53. doi: 10.1109/TPAMI.2014.2343973.

Abstract

For most problems in science and engineering we can obtain data sets that describe the observed system from various perspectives and record the behavior of its individual components. Heterogeneous data sets can be collectively mined by data fusion. Fusion can focus on a specific target relation and exploit directly associated data together with contextual data and data about the system's constraints. In this paper we describe a data fusion approach with penalized matrix tri-factorization (DFMF) that simultaneously factorizes data matrices to reveal hidden associations. The approach can directly consider any data that can be expressed in a matrix, including those from feature-based representations, ontologies, associations and networks. We demonstrate the utility of DFMF for a gene function prediction task with eleven different data sources and for the prediction of pharmacologic actions by fusing six data sources. Our data fusion algorithm compares favorably to alternative data integration approaches and achieves higher accuracy than can be obtained from any single data source alone.
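To make the idea of simultaneous tri-factorization concrete, the following is a minimal sketch and not the DFMF algorithm from the paper. It assumes two nonnegative relation matrices, `R12` (objects of type 1 by type 2) and `R13` (type 1 by type 3), and approximates them as `G1 @ S12 @ G2.T` and `G1 @ S13 @ G3.T`, so that the factor `G1` is shared between both relations. It uses standard nonnegative multiplicative updates, keeps every factor nonnegative, and omits the penalty (constraint) terms mentioned in the abstract. The function name `fuse_two_relations`, the ranks `k1`, `k2`, `k3`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def fuse_two_relations(R12, R13, k1, k2, k3, n_iter=200, seed=0, eps=1e-9):
    """Jointly factorize R12 ~ G1 S12 G2^T and R13 ~ G1 S13 G3^T.

    Illustrative simplification: all factors are kept nonnegative and
    no penalty terms are used. G1 is shared by both relations.
    """
    rng = np.random.default_rng(seed)
    n1, n2 = R12.shape
    n3 = R13.shape[1]
    G1 = rng.random((n1, k1))
    G2 = rng.random((n2, k2))
    G3 = rng.random((n3, k3))
    S12 = rng.random((k1, k2))
    S13 = rng.random((k1, k3))
    errors = []
    for _ in range(n_iter):
        # Shared factor: both relations contribute to its update.
        num = R12 @ G2 @ S12.T + R13 @ G3 @ S13.T
        den = G1 @ (S12 @ G2.T @ G2 @ S12.T + S13 @ G3.T @ G3 @ S13.T) + eps
        G1 *= num / den
        # Factors that each appear in only one relation.
        G2 *= (R12.T @ G1 @ S12) / (G2 @ (S12.T @ G1.T @ G1 @ S12) + eps)
        G3 *= (R13.T @ G1 @ S13) / (G3 @ (S13.T @ G1.T @ G1 @ S13) + eps)
        # Middle factors linking the latent spaces of two object types.
        S12 *= (G1.T @ R12 @ G2) / (G1.T @ G1 @ S12 @ G2.T @ G2 + eps)
        S13 *= (G1.T @ R13 @ G3) / (G1.T @ G1 @ S13 @ G3.T @ G3 + eps)
        # Total squared reconstruction error over both relations.
        err = (np.linalg.norm(R12 - G1 @ S12 @ G2.T) ** 2
               + np.linalg.norm(R13 - G1 @ S13 @ G3.T) ** 2)
        errors.append(err)
    return G1, G2, G3, S12, S13, errors


# Synthetic example: both relations depend on the same hidden structure A.
rng = np.random.default_rng(1)
A = rng.random((30, 3))
R12 = A @ rng.random((3, 20))
R13 = A @ rng.random((3, 15))

G1, G2, G3, S12, S13, errors = fuse_two_relations(R12, R13, k1=3, k2=3, k3=3)
R12_hat = G1 @ S12 @ G2.T  # reconstructed (predicted) association scores
print(errors[0], errors[-1])
```

Because `G1` is updated from both relations at once, information in `R13` can influence the reconstruction `R12_hat` of the target relation, which is the intuition behind fusing directly associated and contextual data.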

Similar articles

Weighted deep factorizing heterogeneous molecular network for genome-phenome association prediction.

Methods. 2022 Sep;205:18-28. doi: 10.1016/j.ymeth.2022.05.008. Epub 2022 Jun 8.

Analysis of data fusion methods in virtual screening: theoretical model.

J Chem Inf Model. 2006 Nov-Dec;46(6):2193-205. doi: 10.1021/ci049615w.

Matrix factorization algorithms for the identification of muscle synergies: evaluation on simulated and experimental data sets.

J Neurophysiol. 2006 Apr;95(4):2199-212. doi: 10.1152/jn.00222.2005. Epub 2006 Jan 4.

Protein Sub-Nuclear Localization Based on Effective Fusion Representations and Dimension Reduction Algorithm LDA.

Int J Mol Sci. 2015 Dec 19;16(12):30343-61. doi: 10.3390/ijms161226237.

Scalable non-negative matrix tri-factorization.

BioData Min. 2017 Dec 29;10:41. doi: 10.1186/s13040-017-0160-6. eCollection 2017.

Nonnegative local coordinate factorization for image representation.

IEEE Trans Image Process. 2013 Mar;22(3):969-79. doi: 10.1109/TIP.2012.2224357. Epub 2012 Oct 12.

Partially shared latent factor learning with multiview data.

IEEE Trans Neural Netw Learn Syst. 2015 Jun;26(6):1233-46. doi: 10.1109/TNNLS.2014.2335234. Epub 2014 Jul 28.

Accelerated multiplicative updates and hierarchical ALS algorithms for nonnegative matrix factorization.

Neural Comput. 2012 Apr;24(4):1085-105. doi: 10.1162/NECO_a_00256. Epub 2011 Dec 14.

Recursive inverse factorization.

J Chem Phys. 2008 Mar 14;128(10):104105. doi: 10.1063/1.2884921.

Cited by

Integrative analysis of multi-omics data reveals importance of collagen and the PI3K AKT signalling pathway in CAKUT.

Sci Rep. 2024 Sep 5;14(1):20731. doi: 10.1038/s41598-024-71721-8.

Proteomics appending a complementary dimension to precision oncotherapy.

Comput Struct Biotechnol J. 2024 Apr 20;23:1725-1739. doi: 10.1016/j.csbj.2024.04.044. eCollection 2024 Dec.

Gene Target Prediction of Environmental Chemicals Using Coupled Matrix-Matrix Completion.

Environ Sci Technol. 2024 Apr 2;58(13):5889-5898. doi: 10.1021/acs.est.4c00458. Epub 2024 Mar 19.

An in-depth comparison of linear and non-linear joint embedding methods for bulk and single-cell multi-omics.

Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad416.

Hetnet connectivity search provides rapid insights into how biomedical entities are related.

Gigascience. 2022 Dec 28;12. doi: 10.1093/gigascience/giad047. Epub 2023 Jul 28.

Nonlinear data fusion over Entity-Relation graphs for Drug-Target Interaction prediction.

Bioinformatics. 2023 Jun 1;39(6). doi: 10.1093/bioinformatics/btad348.

3PNMF-MKL: A non-negative matrix factorization-based multiple kernel learning method for multi-modal data integration and its application to gene signature detection.

Front Genet. 2023 Feb 14;14:1095330. doi: 10.3389/fgene.2023.1095330. eCollection 2023.

Hetnet connectivity search provides rapid insights into how two biomedical entities are related.

bioRxiv. 2023 Jan 7:2023.01.05.522941. doi: 10.1101/2023.01.05.522941.

Heterogeneous data integration methods for patient similarity networks.

Brief Bioinform. 2022 Jul 18;23(4). doi: 10.1093/bib/bbac207.

A Novel Patient Similarity Network (PSN) Framework Based on Multi-Model Deep Learning for Precision Medicine.

J Pers Med. 2022 May 10;12(5):768. doi: 10.3390/jpm12050768.
