A semiparametric Bayesian model for comparing DNA copy numbers.

Author information

Nieto-Barajas Luis, Ji Yuan, Baladandayuthapani Veerabhadran

Affiliations

Department of Statistics, ITAM, Rio Hondo 1, Progreso Tizapan, 01080 Mexico, D.F. Mexico.

Biomedical Informatics, NorthShore University HealthSystem and University of Chicago, 1001 University Place, Evanston, Illinois 60201, USA.

Publication information

Braz J Probab Stat. 2016 Aug;30(3):345-365. doi: 10.1214/15-bjps283. Epub 2016 Jul 29.

DOI:10.1214/15-bjps283

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10552905/

Abstract

We propose a two-step method for the analysis of copy number data. We first define the partitions of genome aberrations and conditional on the partitions we introduce a semiparametric Bayesian model for the analysis of multiple samples from patients with different subtypes of a disease. While the biological interest is to identify regions of differential copy numbers across disease subtypes, our model also includes sample-specific random effects that account for copy number alterations between different samples in the same disease subtype. We model the subtype and sample-specific effects using a random effects mixture model. The subtype's main effects are characterized by a mixture distribution whose components are assigned Dirichlet process priors. The performance of the proposed model is examined using simulated data as well as a breast cancer genomic data set.
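The generative structure described in the abstract can be illustrated with a small forward simulation. This is only a sketch under stated assumptions, not the authors' model or inference procedure: it uses a single truncated stick-breaking Dirichlet process with a normal base measure for the subtype main effects (the paper describes a mixture whose components receive Dirichlet process priors), normal sample-specific random effects, and normal noise. All function names, parameter names, and default values below are hypothetical.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process
    with concentration parameter alpha (assumed value, not from the paper)."""
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction of the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last atom absorbs the leftover mass
    return weights


def draw_dp_effects(n, alpha, truncation, base_sd, rng):
    """Draw n effects from a truncated DP whose base measure is N(0, base_sd^2).
    Ties among the draws are what induce clustering of regions."""
    weights = stick_breaking_weights(alpha, truncation, rng)
    atoms = [rng.gauss(0.0, base_sd) for _ in range(truncation)]
    return rng.choices(atoms, weights=weights, k=n)


def simulate(n_regions, samples_per_subtype, rng, alpha=1.0, truncation=20,
             effect_sd=0.5, sample_sd=0.1, noise_sd=0.05):
    """Simulate y[subtype][sample][region] =
    subtype main effect (per region) + sample random effect + noise.
    The regions stand in for the pre-defined partitions of step one."""
    effects, data = [], []
    for n_samples in samples_per_subtype:
        subtype_effects = draw_dp_effects(
            n_regions, alpha, truncation, effect_sd, rng)
        effects.append(subtype_effects)
        subtype_data = []
        for _ in range(n_samples):
            sample_effect = rng.gauss(0.0, sample_sd)  # shared across regions
            subtype_data.append(
                [mu + sample_effect + rng.gauss(0.0, noise_sd)
                 for mu in subtype_effects])
        data.append(subtype_data)
    return effects, data


if __name__ == "__main__":
    rng = random.Random(2016)
    effects, data = simulate(n_regions=8, samples_per_subtype=[4, 3], rng=rng)
    # Regions where the two subtypes drew different main effects are the
    # simulated analogue of differential copy number regions.
    differing = [r for r in range(8) if effects[0][r] != effects[1][r]]
    print("regions with differing subtype effects:", differing)
```

The sketch only generates data. Fitting the model as the abstract describes would require posterior inference over the subtype and sample-specific effects, which is not attempted here.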
