Guha Subharup, Ji Yuan, Baladandayuthapani Veerabhadran
Department of Statistics, University of Missouri, Columbia, MO, USA.
Center for Biomedical Informatics, North Shore University Health System, Evanston, IL, USA; Department of Health Studies, The University of Chicago, IL, USA.
Cancer Inform. 2014 Oct 1;13(Suppl 2):83-91. doi: 10.4137/CIN.S13785. eCollection 2014.
DNA copy number variations (CNVs) have been shown to be associated with cancer development and progression. Detecting these CNVs has the potential to advance the basic understanding and treatment of many types of cancer, and can play a role in the discovery and development of molecular-based personalized cancer therapies. Array-based comparative genomic hybridization (aCGH) is one of the most common high-resolution chromosomal microarray technologies; it assays DNA CNVs across the whole genome in a single experiment. In this article we propose methods that use aCGH profiles to predict disease states. We employ a Bayesian classification model that treats disease state as the outcome and aCGH profiles as covariates, in order to identify regions of the genome significantly associated with disease subclasses. We propose a principled two-stage method. In the first stage, we make inferences on the underlying copy number states associated with the aCGH emissions, using hidden Markov model (HMM) formulations to account for serial dependencies between neighboring probes. In the second stage, we infer associations with disease outcomes, conditional on the copy number states, using Bayesian linear variable selection procedures. The selected probes and their effects are parameters that can be used to predict the disease categories of additional individuals on the basis of their aCGH profiles. Using simulated datasets, we investigate the method's accuracy in predicting disease category. Our methodology is motivated by and applied to a breast cancer dataset consisting of aCGH profiles assayed on patients from multiple disease subtypes.
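To make the first stage concrete, the following is a minimal sketch of how an HMM can assign a copy number state to each probe while borrowing strength from neighboring probes. It is not the authors' procedure: the abstract describes Bayesian inference on the hidden states, whereas this sketch substitutes plain Viterbi decoding with fixed parameters. The three state names, the emission means, the standard deviation, the transition probability, and the function names are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical copy number states and emission means (log2 ratios).
# All numeric values below are illustrative assumptions, not from the article.
STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.5}
SIGMA = 0.2      # assumed common emission standard deviation
STAY_PROB = 0.9  # assumed probability that the next probe keeps the same state


def log_gaussian(x, mean, sigma):
    """Log density of a normal distribution evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def log_transition(prev, curr):
    """Log transition probability of a sticky chain; the rest is split evenly."""
    if prev == curr:
        return math.log(STAY_PROB)
    return math.log((1 - STAY_PROB) / (len(STATES) - 1))


def viterbi(log_ratios):
    """Return the most probable state sequence for one aCGH profile."""
    if not log_ratios:
        return []
    # Uniform initial distribution over the states.
    score = {
        s: -math.log(len(STATES)) + log_gaussian(log_ratios[0], MEANS[s], SIGMA)
        for s in STATES
    }
    backpointers = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for curr in STATES:
            # Best predecessor state for reaching `curr` at this probe.
            best_prev = max(STATES, key=lambda p: score[p] + log_transition(p, curr))
            new_score[curr] = (
                score[best_prev]
                + log_transition(best_prev, curr)
                + log_gaussian(x, MEANS[curr], SIGMA)
            )
            pointers[curr] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace the best path back from the best final state.
    path = [max(STATES, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# A made-up profile: neutral probes, then a gained segment, then a lost segment.
profile = [0.02, -0.05, 0.48, 0.55, 0.51, -0.03, -0.47, -0.52]
print(viterbi(profile))
```

Because transitions between states are penalized, a single noisy probe inside a neutral segment tends to stay labeled neutral, which is the practical meaning of accounting for serial dependence. In the two-stage method described above, the decoded states (rather than the raw log ratios) would then serve as covariates for the variable selection step, which this sketch does not attempt.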