Kim Dong Jun, Joh Christine Suh Yun, Jeong So Young, Kim Yong Jun, Koh Seong Joon, Kim Hyun Je
Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, 03080, Republic of Korea.
Department of Microbiology and Immunology, Seoul National University College of Medicine, Seoul, 03080, Republic of Korea.
Genomics Inform. 2025 May 20;23(1):14. doi: 10.1186/s44342-025-00043-6.
In single-cell RNA sequencing (scRNA-seq) data, high expression of non-variable RNAs is a common problem, arising from organ traits or poor sample quality. Computational methods such as SoupX (Young (Gigascience 9:giaa151, 2020)) have been used to address this problem, but they may remove biologically relevant data. This study presents a clustered regularly interspaced short palindromic repeats (CRISPR)-Cas9-based method that selectively removes non-variable RNAs. We applied this approach to scRNA-seq data from human intestinal tissues of 17 patients. By targeting non-variable genes, including ribosomal and mitochondrial RNAs, CRISPR-Cas9 treatment effectively reduced their expression, outperforming computational methods in both the number of genes removed and the extent of removal. The CRISPR-Cas9-treated samples, sequenced at half the depth of untreated samples, maintained comparable sequencing quality and saturation, demonstrating that this approach can reduce sequencing costs while preserving data quality. Cell type composition and gene expression patterns remained consistent between treated and original datasets, with no unintended gene deletions. Overall, our findings suggest that the CRISPR-Cas9-based method offers a cost-effective solution for improving scRNA-seq data quality, particularly for tissues with high levels of non-variable RNAs, without compromising biological integrity.