California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, California, United States of America.
PLoS Comput Biol. 2010 Sep 30;6(9):e1000952. doi: 10.1371/journal.pcbi.1000952.
Mammalian gene expression patterns, and their variability across populations of cells, are regulated by factors specific to each gene in concert with its surrounding cellular and genomic environment. Lentiviruses such as HIV integrate their genomes into semi-random genomic locations in the cells they infect, and the resulting viral gene expression provides a natural system to dissect the contributions of genomic environment to transcriptional regulation. Previously, we showed that expression heterogeneity and its modulation by specific host factors at HIV integration sites are key determinants of infected-cell fate and a possible source of latent infections. Here, we assess the integration context dependence of expression heterogeneity from diverse single integrations of an HIV-promoter/GFP-reporter cassette in Jurkat T-cells. Systematically fitting a stochastic model of gene expression to our data reveals an underlying transcriptional dynamic, by which multiple transcripts are produced during short, infrequent bursts, that quantitatively accounts for the wide, highly skewed protein expression distributions observed in each of our clonal cell populations. Interestingly, we find that the size of transcriptional bursts is the primary systematic covariate over integration sites, varying from a few to tens of transcripts across integration sites, and correlating well with mean expression. In contrast, burst frequencies are scattered about a typical value of several per cell-division time and demonstrate little correlation with the clonal means. This pattern of modulation generates consistently noisy distributions over the sampled integration positions, with large expression variability relative to the mean maintained even for the most productive integrations, and could contribute to specifying heterogeneous, integration-site-dependent viral production patterns in HIV-infected cells. Genomic environment thus emerges as a significant control parameter for gene expression variation that may contribute to structuring mammalian genomes, as well as be exploited for survival by integrating viruses.
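The link between burst size, burst frequency, and noise described above can be illustrated with a minimal sketch. It is not the model fitted in the paper; it assumes the textbook bursty limit of stochastic gene expression, in which bursts arrive as a Poisson process, each burst produces a geometrically distributed number of transcripts, and transcripts decay with first-order kinetics. Under those assumptions the stationary transcript count is negative binomial, with mean (burst frequency / decay rate) × burst size and Fano factor 1 + burst size. The function name `burst_model_moments` and all parameter values are hypothetical, chosen only to mirror the qualitative ranges mentioned in the abstract.

```python
def burst_model_moments(burst_size, burst_freq, decay_rate=1.0):
    """Stationary moments of transcript number in the bursty limit.

    Assumptions (illustrative, not taken from the paper):
      - bursts arrive as a Poisson process with rate `burst_freq`
      - each burst yields a geometric number of transcripts, mean `burst_size`
      - transcripts decay at first-order rate `decay_rate`
    `burst_freq` and `decay_rate` must share the same time unit.
    """
    bursts_per_lifetime = burst_freq / decay_rate
    mean = bursts_per_lifetime * burst_size   # negative-binomial mean
    fano = 1.0 + burst_size                   # variance / mean
    cv2 = fano / mean                         # squared coefficient of variation
    return mean, fano, cv2


if __name__ == "__main__":
    # Hypothetical clones: burst frequency held fixed, burst size varied
    # from a few to tens of transcripts.
    for b in (2, 5, 10, 20, 40):
        mean, fano, cv2 = burst_model_moments(b, burst_freq=3.0)
        print(f"burst size={b:3d}  mean={mean:6.1f}  Fano={fano:5.1f}  CV^2={cv2:.3f}")
```

In this sketch CV² = 1/a + 1/(a·b), where a is the number of bursts per transcript lifetime and b is the mean burst size. Raising b increases the mean, but CV² never falls below 1/a, which is consistent with the observation that variability relative to the mean stays large even for the most productive integrations when expression is tuned mainly through burst size.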