Meta-analytic principal component analysis in integrative omics application.

Affiliations

Department of Statistics, Keimyung University, Daegu 42601, South Korea.

Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.

Publication information

Bioinformatics. 2018 Apr 15;34(8):1321-1328. doi: 10.1093/bioinformatics/btx765.

DOI: 10.1093/bioinformatics/btx765

PMID: 29186328

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5905607/

Abstract

MOTIVATION

With the widespread use of microarrays and massively parallel sequencing, numerous high-throughput omics datasets have become available in the public domain. Integrating the abundant information across these datasets is critical to elucidating biological mechanisms. Because the data are high-dimensional, methods such as principal component analysis (PCA) have been widely applied for effective dimension reduction and exploratory visualization.

RESULTS

In this article, we combine multiple omics datasets that address identical or similar biological hypotheses and introduce two variations of a meta-analytic framework for PCA, namely MetaPCA. Regularization is further incorporated to facilitate sparse feature selection in MetaPCA. We apply MetaPCA and sparse MetaPCA to simulations, to three transcriptomic meta-analysis studies (yeast cell cycle, prostate cancer and mouse metabolism) and to a TCGA pan-cancer methylation study. The results show that the proposed framework improves accuracy, robustness and exploratory visualization.

AVAILABILITY AND IMPLEMENTATION

An R package, MetaPCA, is available online (http://tsenglab.biostat.pitt.edu/software.htm).

CONTACT

ctseng@pitt.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
