Exponential family measurement error models for single-cell CRISPR screens.

Author information

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Building 2 435, 655 Huntington Ave, Boston, MA 02115, United States.

Department of Statistics and Data Science, Carnegie Mellon University, Baker Hall 228B, 4909 Frew St, Pittsburgh, PA 15213, United States.

Publication information

Biostatistics. 2024 Oct 1;25(4):1254-1272. doi: 10.1093/biostatistics/kxae010.

Abstract

CRISPR genome engineering and single-cell RNA sequencing have accelerated biological discovery. Single-cell CRISPR screens unite these two technologies, linking genetic perturbations in individual cells to changes in gene expression and illuminating regulatory networks underlying diseases. Despite their promise, single-cell CRISPR screens present considerable statistical challenges. We demonstrate through theoretical and real data analyses that a standard method for estimation and inference in single-cell CRISPR screens, "thresholded regression," exhibits attenuation bias and a bias-variance tradeoff as a function of an intrinsic, challenging-to-select tuning parameter. To overcome these difficulties, we introduce GLM-EIV ("GLM-based errors-in-variables"), a new method for single-cell CRISPR screen analysis. GLM-EIV extends the classical errors-in-variables model to responses and noisy predictors that are exponential family-distributed and potentially impacted by the same set of confounding variables. We develop a computational infrastructure to deploy GLM-EIV across hundreds of processors on clouds (e.g., Microsoft Azure) and high-performance clusters. Leveraging this infrastructure, we apply GLM-EIV to analyze two recent, large-scale, single-cell CRISPR screen datasets, yielding several new insights.
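
The attenuation bias and the threshold-dependent bias-variance tradeoff described above can be illustrated with a small simulation. The sketch below is not GLM-EIV and is not taken from the paper: it assumes a simplified Poisson model with no confounders, and every parameter value (`n`, `pi`, `beta0`, `beta1`, `gamma0`, `gamma1`) and the helper `log_fold_change` are hypothetical choices made only for illustration. It imputes the latent perturbation indicator by thresholding the gRNA count and compares the resulting effect estimate with an oracle estimate that uses the true indicator.

```python
import numpy as np

# Illustrative (assumed) parameter values; none come from the paper.
n = 50_000                                  # number of cells
pi = 0.1                                    # probability a cell is truly perturbed
beta0, beta1 = np.log(5.0), np.log(0.5)     # gene expression: baseline, true effect
gamma0, gamma1 = np.log(0.5), np.log(6.0)   # gRNA counts: background, perturbation

rng = np.random.default_rng(0)
p = rng.random(n) < pi                        # latent perturbation indicator
m = rng.poisson(np.exp(beta0 + beta1 * p))    # gene expression counts
g = rng.poisson(np.exp(gamma0 + gamma1 * p))  # gRNA counts (noisy proxy for p)


def log_fold_change(counts, indicator):
    """Slope of a Poisson regression on one binary covariate.

    With an intercept and a single binary predictor, the maximum
    likelihood slope equals the log ratio of the two group means.
    """
    return np.log(counts[indicator].mean() / counts[~indicator].mean())


oracle = log_fold_change(m, p)         # uses the true (unobserved) indicator
est_low = log_fold_change(m, g >= 1)   # low threshold: many misclassified cells
est_high = log_fold_change(m, g >= 5)  # high threshold: few cells called perturbed

print(f"true effect        : {beta1:.3f}")
print(f"oracle estimate    : {oracle:.3f}")
print(f"threshold 1        : {est_low:.3f} ({(g >= 1).sum()} cells called perturbed)")
print(f"threshold 5        : {est_high:.3f} ({(g >= 5).sum()} cells called perturbed)")
```

Under these assumed settings, a low threshold labels many unperturbed cells as perturbed, which pulls the estimate toward zero. A high threshold reduces that contamination but leaves far fewer cells in the perturbed group, so the estimate becomes noisier.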
