Department of Molecular, Cell, and Developmental Biology, University of California, Los Angeles, CA, USA.
Department of Molecular and Medical Pharmacology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA.
Sci Rep. 2019 Aug 16;9(1):11944. doi: 10.1038/s41598-019-48302-1.
Aneuploidy, defined as abnormal chromosome number or somatic DNA copy number, is a characteristic of many aggressive tumors and is thought to drive tumorigenesis. Gene expression-aneuploidy association studies have previously been conducted to explore cellular mechanisms associated with aneuploidy. However, in an observational setting, gene expression is influenced by many factors that can act as confounders between gene expression and aneuploidy, leading to spurious correlations between the two variables. These factors include known confounders such as sample purity or batch effects, as well as gene co-regulation, which induces correlations between the expression of causal genes and non-causal genes. We use a linear mixed-effects model (LMM) to account for the confounding effects of tumor purity and gene co-regulation on gene expression-aneuploidy associations. When the model is applied to patient tumor data across diverse tumor types, we observe that it both accounts for the impact of purity on aneuploidy measurements and identifies a new association between histone gene expression and aneuploidy.
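To make the idea of an LMM with purity and co-regulation terms concrete, the following is a minimal sketch on simulated data, not the model specification used in the study. It assumes (none of this is stated in the abstract) that aneuploidy is the response, that purity and the tested gene enter as fixed effects, and that co-regulation is captured by a random effect whose covariance is a sample-by-sample expression similarity matrix `K`. The helper `fit_lmm`, the simulated variables, and the grid search over the variance ratio are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_lmm(y, X, K, deltas=None):
    """Maximum-likelihood fit of y = X b + u + e,
    with u ~ N(0, sg * K) and e ~ N(0, sg * delta * I).
    delta (residual-to-random-effect variance ratio) is chosen by grid search.
    Illustrative helper, not taken from the paper."""
    if deltas is None:
        deltas = np.logspace(-3, 3, 61)
    n = len(y)
    S, U = np.linalg.eigh(K)            # K = U diag(S) U^T
    S = np.clip(S, 0.0, None)           # guard against tiny negative eigenvalues
    yr, Xr = U.T @ y, U.T @ X           # rotation makes the covariance diagonal
    best = None
    for delta in deltas:
        w = 1.0 / (S + delta)           # inverse of the diagonal covariance (up to sg)
        XtWX = Xr.T @ (w[:, None] * Xr)
        beta = np.linalg.solve(XtWX, Xr.T @ (w * yr))   # generalized least squares
        resid = yr - Xr @ beta
        sg = np.sum(w * resid ** 2) / n                 # ML estimate of sg
        loglik = -0.5 * (n * np.log(2 * np.pi * sg) + np.sum(np.log(S + delta)) + n)
        if best is None or loglik > best[0]:
            se = np.sqrt(sg * np.diag(np.linalg.inv(XtWX)))
            best = (loglik, beta, se, delta)
    return best[1], best[2], best[3]

# Simulated data (assumption: stands in for tumor samples, not real measurements).
rng = np.random.default_rng(0)
n, m = 200, 50                                    # samples, genes
purity = rng.uniform(0.3, 1.0, size=n)
factor = rng.normal(size=n)                       # shared regulator driving co-regulation
loadings = rng.normal(size=m)
expr = np.outer(factor, loadings) + 0.5 * purity[:, None] + rng.normal(size=(n, m))
aneuploidy = 0.8 * factor - 1.0 * purity + rng.normal(scale=0.5, size=n)

# Sample-by-sample similarity from standardized expression (assumed form of K).
Z = (expr - expr.mean(axis=0)) / expr.std(axis=0)
K = Z @ Z.T / m

# Fixed effects: intercept, purity, and the expression of the gene under test.
X = np.column_stack([np.ones(n), purity, Z[:, 0]])
beta, se, delta = fit_lmm(aneuploidy, X, K)
print("gene effect:", beta[2], "standard error:", se[2], "delta:", delta)
```

In this sketch the purity column absorbs the purity-driven part of the aneuploidy signal, while the random effect absorbs variation shared across co-regulated genes, so the coefficient on the tested gene reflects what remains after both adjustments. Rotating by the eigenvectors of `K` is only a convenience that keeps each step of the grid search cheap.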