Differences in molecular sampling and data processing explain variation among single-cell and single-nucleus RNA-seq experiments.

Affiliations

Department of Biomedical Informatics, University of Utah, Salt Lake City, Utah 84108, USA.

Seoul National University, College of Veterinary Medicine, Seoul, 08826, South Korea.

Publication information

Genome Res. 2024 Mar 20;34(2):179-188. doi: 10.1101/gr.278253.123.

Abstract

A mechanistic understanding of the biological and technical factors that impact transcript measurements is essential to designing and analyzing single-cell and single-nucleus RNA sequencing experiments. Nuclei contain the same pre-mRNA population as cells, but they contain a small subset of the mRNAs. Nonetheless, early studies argued that single-nucleus analysis yielded results comparable to cellular samples if pre-mRNA measurements were included. However, typical workflows do not distinguish between pre-mRNA and mRNA when estimating gene expression, and variation in their relative abundances across cell types has received limited attention. These gaps are especially important given that incorporating pre-mRNA has become commonplace for both assays, despite known gene length bias in pre-mRNA capture. Here, we reanalyze public data sets from mouse and human to describe the mechanisms and contrasting effects of mRNA and pre-mRNA sampling on gene expression and marker gene selection in single-cell and single-nucleus RNA-seq. We show that pre-mRNA levels vary considerably among cell types, which mediates the degree of gene length bias and limits the generalizability of a recently published normalization method intended to correct for this bias. As an alternative, we repurpose an existing post hoc gene length-based correction method from conventional RNA-seq gene set enrichment analysis. Finally, we show that inclusion of pre-mRNA in bioinformatic processing can impart a larger effect than assay choice itself, which is pivotal to the effective reuse of existing data. These analyses advance our understanding of the sources of variation in single-cell and single-nucleus RNA-seq experiments and provide useful guidance for future studies.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0a56/10984380/afcbb67d5e45/179f01.jpg
