Nieto-Barajas Luis, Ji Yuan, Baladandayuthapani Veerabhadran
Department of Statistics, ITAM, Rio Hondo 1, Progreso Tizapan, 01080 Mexico, D.F. Mexico.
Biomedical Informatics, NorthShore University HealthSystem and University of Chicago, 1001 University Place, Evanston, Illinois 60201, USA.
Braz J Probab Stat. 2016 Aug;30(3):345-365. doi: 10.1214/15-bjps283. Epub 2016 Jul 29.
We propose a two-step method for the analysis of copy number data. We first define the partitions of genome aberrations and then, conditional on these partitions, introduce a semiparametric Bayesian model for the analysis of multiple samples from patients with different subtypes of a disease. While the biological interest lies in identifying regions of differential copy number across disease subtypes, our model also includes sample-specific random effects that account for copy number alterations between different samples within the same disease subtype. We model the subtype and sample-specific effects using a random effects mixture model. The subtype main effects are characterized by a mixture distribution whose components are assigned Dirichlet process priors. The performance of the proposed model is examined using simulated data as well as a breast cancer genomic data set.
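The model structure described in the abstract can be illustrated with a small generative sketch. This is not the authors' model or their inference procedure: it only simulates data in which subtype main effects are drawn from a single truncated stick-breaking approximation to a Dirichlet process and sample-specific random effects are added on top. The partition from the first step is assumed to be given as a fixed number of regions, and every function name, parameter, and numeric value below (`stick_breaking_weights`, `simulate_copy_number`, `alpha`, `n_atoms`, the Gaussian scales) is a hypothetical choice made for illustration.

```python
import numpy as np


def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process (concentration alpha)."""
    betas = rng.beta(1.0, alpha, size=n_atoms)
    betas[-1] = 1.0  # close the stick so the truncated weights sum to one
    # Length of stick remaining before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining


def simulate_copy_number(n_regions=50, n_subtypes=3, n_samples_per_subtype=4,
                         alpha=1.0, n_atoms=20, seed=0):
    """Simulate region-level copy number values (illustrative assumptions only)."""
    rng = np.random.default_rng(seed)

    # Assumption: the partition of the genome into regions is already fixed.
    weights = stick_breaking_weights(alpha, n_atoms, rng)
    atoms = rng.normal(0.0, 0.5, size=n_atoms)  # candidate effect sizes

    # Subtype main effects: each (subtype, region) pair picks one atom, so
    # regions and subtypes can share the same effect value (clustering).
    labels = rng.choice(n_atoms, size=(n_subtypes, n_regions), p=weights)
    subtype_effects = atoms[labels]

    # Sample-specific random effects within each subtype, plus observation noise.
    shape = (n_subtypes, n_samples_per_subtype, n_regions)
    sample_effects = rng.normal(0.0, 0.1, size=shape)
    noise = rng.normal(0.0, 0.05, size=shape)

    y = subtype_effects[:, None, :] + sample_effects + noise
    return y, subtype_effects, weights


if __name__ == "__main__":
    y, subtype_effects, weights = simulate_copy_number()
    print(y.shape)  # (subtypes, samples per subtype, regions)
```

Because the atoms are shared, several regions or subtypes can take exactly the same main effect, which is the kind of clustering behaviour a Dirichlet process prior induces. Fitting such a model to real data would require a posterior sampler, which this sketch does not attempt.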