Linked matrix factorization.

Authors

O'Connell Michael J, Lock Eric F

Affiliations

Department of Statistics, Miami University, Oxford, Ohio 45056.

Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota 55455.

Publication

Biometrics. 2019 Jun;75(2):582-592. doi: 10.1111/biom.13010. Epub 2019 Apr 2.

Abstract

Several recent methods address the dimension reduction and decomposition of linked high-content data matrices. Typically, these methods consider one dimension, rows or columns, that is shared among the matrices. This shared dimension may represent common features measured for different sample sets (horizontal integration) or a common sample set with features from different platforms (vertical integration). We introduce an approach for simultaneous horizontal and vertical integration, Linked Matrix Factorization (LMF), for the general case where some matrices share rows (e.g., features) and some share columns (e.g., samples). Our motivating application is a cytotoxicity study with accompanying genomic and molecular chemical attribute data. The toxicity matrix (cell lines × chemicals) shares samples with a genotype matrix (cell lines × SNPs) and shares features with a molecular attribute matrix (chemicals × attributes). LMF gives a unified low-rank factorization of these three matrices, which allows for the decomposition of systematic variation that is shared and systematic variation that is specific to each matrix. This allows for efficient dimension reduction, exploratory visualization, and the imputation of missing data even when entire rows or columns are missing. We present theoretical results concerning the uniqueness, identifiability, and minimal parametrization of LMF, and evaluate it with extensive simulation studies.
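To make the idea of a unified low-rank factorization concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the model form or the fitting algorithm, so everything here is an assumption for illustration: the toxicity matrix X (cell lines × chemicals) is modeled as U Vᵀ, the genotype matrix Y (cell lines × SNPs) as U Aᵀ, and the attribute matrix Z (chemicals × attributes) as V Bᵀ, so that U is shared through the cell lines and V through the chemicals. The factors are fit by plain alternating least squares, and matrix-specific structure is not modeled. The names `lmf_als_sketch`, `predict_new_chemical`, `A`, and `B` are hypothetical.

```python
import numpy as np


def joint_loss(X, Y, Z, U, V, A, B):
    """Sum of squared residuals over the three linked matrices."""
    return (np.sum((X - U @ V.T) ** 2)
            + np.sum((Y - U @ A.T) ** 2)
            + np.sum((Z - V @ B.T) ** 2))


def lmf_als_sketch(X, Y, Z, rank, n_iter=50, seed=0):
    """Alternating least squares for X ~ U V^T, Y ~ U A^T, Z ~ V B^T.

    Assumed shapes: X is (m, n), Y is (m, p), Z is (n, q).
    Illustrative only: shared factors, no matrix-specific terms.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    p, q = Y.shape[1], Z.shape[1]
    U = rng.standard_normal((m, rank))
    V = rng.standard_normal((n, rank))
    A = rng.standard_normal((p, rank))
    B = rng.standard_normal((q, rank))
    history = [joint_loss(X, Y, Z, U, V, A, B)]
    for _ in range(n_iter):
        # U appears in X and Y: solve [X Y]^T ~ [V; A] U^T for U.
        U = np.linalg.lstsq(np.vstack([V, A]), np.hstack([X, Y]).T,
                            rcond=None)[0].T
        # V appears in X and Z: solve [X^T Z]^T ~ [U; B] V^T for V.
        V = np.linalg.lstsq(np.vstack([U, B]), np.hstack([X.T, Z]).T,
                            rcond=None)[0].T
        # A appears only in Y: solve Y ~ U A^T for A.
        A = np.linalg.lstsq(U, Y, rcond=None)[0].T
        # B appears only in Z: solve Z ~ V B^T for B.
        B = np.linalg.lstsq(V, Z, rcond=None)[0].T
        history.append(joint_loss(X, Y, Z, U, V, A, B))
    return U, V, A, B, history


def predict_new_chemical(U, B, z_new):
    """Predict a toxicity column for a chemical seen only through its attributes.

    Estimates the chemical's factor row from z_new ~ B v, then returns U v.
    """
    v_new = np.linalg.lstsq(B, z_new, rcond=None)[0]
    return U @ v_new


# Synthetic example: 20 cell lines, 15 chemicals, 30 SNPs, 10 attributes.
rng = np.random.default_rng(1)
U0, V0 = rng.standard_normal((20, 3)), rng.standard_normal((15, 3))
A0, B0 = rng.standard_normal((30, 3)), rng.standard_normal((10, 3))
X = U0 @ V0.T + 0.1 * rng.standard_normal((20, 15))
Y = U0 @ A0.T + 0.1 * rng.standard_normal((20, 30))
Z = V0 @ B0.T + 0.1 * rng.standard_normal((15, 10))

U, V, A, B, history = lmf_als_sketch(X, Y, Z, rank=3)
x_pred = predict_new_chemical(U, B, Z[0])  # one value per cell line
```

Each update solves its least-squares subproblem exactly, so the joint loss cannot increase from one sweep to the next. The `predict_new_chemical` helper illustrates, under the same assumed model, how an entirely missing column of the toxicity matrix could be imputed from the chemical's attribute row alone.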
