G. S. Sanyal School of Telecommunications, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India; Computational Biology & Bioinformatics, Duke University, Durham, NC 27705, USA; Quintiles, Durham, NC 27703, USA; and Electrical & Computer Engineering Department, Duke University, Durham, NC 27705, USA.
Bioinformatics. 2014 May 15;30(10):1370-6. doi: 10.1093/bioinformatics/btu064. Epub 2014 Jan 30.
A non-parametric Bayesian factor model is proposed for joint analysis of multi-platform genomics data. The approach factorizes the latent space (feature space) into a component shared across data types and a component specific to each data type, with the dimensionality of each component inferred via a beta-Bernoulli process. The approach is demonstrated on ovarian cancer patient data by jointly analyzing gene expression with copy number variation, and gene expression with methylation, showing that the model can potentially uncover key drivers related to cancer.
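The shared-plus-specific factorization can be illustrated with a small generative sketch. This is not the authors' MATLAB implementation and performs no inference: it only draws synthetic data from an assumed model of the form X_m = A_m diag(z_shared) S_shared + B_m diag(z_m) S_m + noise, where the binary indicators z come from a finite (truncated) beta-Bernoulli prior. The function names, the Gaussian priors on loadings and scores, the truncation level `k_max`, and the hyperparameters `a` and `b` are all assumptions chosen for illustration.

```python
import random


def sample_indicators(k_max, a, b, rng):
    """Finite beta-Bernoulli draw: pi_k ~ Beta(a/K, b(K-1)/K), z_k ~ Bernoulli(pi_k)."""
    z = []
    for _ in range(k_max):
        pi_k = rng.betavariate(a / k_max, b * (k_max - 1) / k_max)
        z.append(1 if rng.random() < pi_k else 0)
    return z


def gaussian_matrix(rows, cols, rng, scale=1.0):
    """rows x cols matrix (list of lists) of independent N(0, scale^2) draws."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def masked_matmul(loadings, mask, scores):
    """Compute loadings (p x K) @ diag(mask) @ scores (K x n), skipping inactive factors."""
    n = len(scores[0])
    active = [k for k, on in enumerate(mask) if on]
    return [[sum(row[k] * scores[k][j] for k in active) for j in range(n)]
            for row in loadings]


def simulate_joint(n, dims, k_max=10, a=2.0, b=1.0, noise_sd=0.1, seed=0):
    """Simulate one data matrix per platform (hypothetical model, see lead-in).

    n    : number of samples (patients), shared by all platforms
    dims : feature count per platform, e.g. [genes, copy-number probes]
    Returns (data, z_shared, z_specific).
    """
    rng = random.Random(seed)
    # Shared factors: one indicator vector and one score matrix for all platforms.
    z_shared = sample_indicators(k_max, a, b, rng)
    s_shared = gaussian_matrix(k_max, n, rng)
    data, z_specific = [], []
    for p in dims:
        # Platform-specific factors: separate indicators and scores per platform.
        z_m = sample_indicators(k_max, a, b, rng)
        s_m = gaussian_matrix(k_max, n, rng)
        shared_part = masked_matmul(gaussian_matrix(p, k_max, rng), z_shared, s_shared)
        specific_part = masked_matmul(gaussian_matrix(p, k_max, rng), z_m, s_m)
        x = [[shared_part[i][j] + specific_part[i][j] + rng.gauss(0.0, noise_sd)
              for j in range(n)] for i in range(p)]
        data.append(x)
        z_specific.append(z_m)
    return data, z_shared, z_specific


if __name__ == "__main__":
    data, z_shared, z_specific = simulate_joint(n=5, dims=[4, 3], seed=1)
    print("active shared factors:", sum(z_shared))
    print("active specific factors per platform:", [sum(z) for z in z_specific])
```

In this sketch the number of nonzero entries in each indicator vector plays the role of the inferred dimensionality: the truncation `k_max` is only an upper bound, and the beta-Bernoulli prior decides how many shared and platform-specific factors are actually switched on.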
The source code for this model is written in MATLAB and has been made publicly available at https://sites.google.com/site/jointgenomics/.
Supplementary data are available at Bioinformatics online.