Nadig Ajay, Replogle Joseph M, Pogson Angela N, McCarroll Steven A, Weissman Jonathan S, Robinson Elise B, O'Connor Luke J
Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
bioRxiv. 2024 Jul 3:2024.07.03.601903. doi: 10.1101/2024.07.03.601903.
Single-cell CRISPR screens such as Perturb-seq enable transcriptomic profiling of genetic perturbations at scale. However, the data produced by these screens are often noisy due to cost and technical constraints, limiting power to detect true effects with conventional differential expression analyses. Here, we introduce TRanscriptome-wide Analysis of Differential Expression (TRADE), a statistical framework that estimates the transcriptome-wide distribution of true differential expression effects from noisy gene-level measurements. Within TRADE, we derive multiple novel, interpretable statistical metrics, including the "transcriptome-wide impact", an estimator of the overall transcriptional effect of a perturbation that is stable across sampling depths. We analyze new and published large-scale Perturb-seq datasets to show that many true transcriptional effects are not statistically significant individually, but are detectable in aggregate with TRADE. In a genome-scale Perturb-seq screen, we find that a typical gene perturbation affects an estimated 45 genes, whereas a typical essential gene perturbation affects over 500 genes. An advantage of our approach is its ability to compare the transcriptomic effects of genetic perturbations across contexts and dosages despite differences in power. We use this ability to identify perturbations with cell-type-dependent effects and to find examples of perturbations where transcriptional responses are not only larger in magnitude, but also qualitatively different, as a function of dosage. Lastly, we extend our analysis to case/control comparisons of gene expression for neuropsychiatric conditions, finding that transcriptomic effect correlations are greater than genetic correlations for these diagnoses. TRADE lays an analytic foundation for the systematic comparison of genetic perturbation atlases, as well as differential expression experiments more broadly.
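The idea that true effects can be undetectable gene by gene yet measurable in aggregate can be illustrated with a small simulation. The sketch below is not TRADE's estimator, which the abstract does not specify; it swaps in a simple method-of-moments correction (mean squared estimate minus mean squared standard error) on simulated data. All function names, parameter values, and the significance threshold are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_de(n_genes=10000, n_affected=500, effect_sd=0.3, se=0.3, seed=0):
    """Simulate noisy per-gene effect estimates (all values are illustrative)."""
    rng = np.random.default_rng(seed)
    true_effects = np.zeros(n_genes)
    affected = rng.choice(n_genes, size=n_affected, replace=False)
    true_effects[affected] = rng.normal(0.0, effect_sd, size=n_affected)
    std_errors = np.full(n_genes, se)
    # Observed estimate = true effect + sampling noise with known standard error.
    estimates = true_effects + rng.normal(0.0, std_errors)
    return true_effects, estimates, std_errors


def moment_impact(estimates, std_errors):
    """Method-of-moments estimate of the mean squared true effect.

    Assumption: E[estimate^2] = E[true^2] + E[se^2], so subtracting the mean
    squared standard error removes the noise contribution on average.
    """
    return float(np.mean(estimates**2) - np.mean(std_errors**2))


true_effects, estimates, std_errors = simulate_de()

true_impact = float(np.mean(true_effects**2))        # known only in simulation
naive_impact = float(np.mean(estimates**2))          # inflated by sampling noise
corrected_impact = moment_impact(estimates, std_errors)

# Gene-level testing with an illustrative stringent threshold on |z|.
z_scores = estimates / std_errors
n_significant = int(np.sum(np.abs(z_scores) > 4.5))

print(f"genes truly affected:       {int(np.sum(true_effects != 0))}")
print(f"genes individually signif.: {n_significant}")
print(f"true impact:                {true_impact:.4f}")
print(f"naive impact:               {naive_impact:.4f}")
print(f"noise-corrected impact:     {corrected_impact:.4f}")
```

In this toy setting the naive summary is dominated by noise, while the corrected summary tracks the true aggregate effect even though very few genes pass the per-gene threshold. A distribution-based approach such as the one the abstract describes goes further by estimating the full distribution of effects rather than a single moment.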