Nancy E. and Peter C. Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, 14853, USA.
Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY, 14853, USA.
Nat Commun. 2021 Apr 12;12(1):2158. doi: 10.1038/s41467-021-22496-3.
Conventional scRNA-seq expression analyses rely on the availability of a high-quality genome annotation. Yet, as we show here with scRNA-seq experiments and analyses spanning human, mouse, chicken, mole rat, lemur and sea urchin, genome annotations are often incomplete, particularly for organisms that are not routinely studied. To overcome this hurdle, we created an scRNA-seq analysis routine that recovers biologically relevant transcriptional activity beyond the scope of the best available genome annotation by performing scRNA-seq analysis on any region of the genome for which transcriptional products are detected. Our tool generates a single-cell expression matrix for all transcriptionally active regions (TARs), performs single-cell TAR expression analysis to identify biologically significant TARs, and then annotates those TARs using gene homology analysis. This procedure uses single-cell expression analyses as a filter to direct annotation efforts to biologically significant transcripts and thereby uncovers biology that scRNA-seq analysis would otherwise miss.
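The first steps of the routine (finding TARs, flagging those outside the annotation, and counting reads per cell) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how TARs are called, so merging reads separated by a small gap is a stand-in assumption, and the function names (`call_tars`, `is_annotated`, `tar_count_matrix`), the parameters `max_gap` and `min_reads`, and the toy reads are all hypothetical.

```python
from collections import defaultdict

# Reads are (chrom, start, end, cell_barcode) tuples with half-open coordinates.

def call_tars(reads, max_gap=500, min_reads=2):
    """Merge reads into regions when the gap between them is <= max_gap.

    Assumption: a simple gap-based merge stands in for whatever TAR
    caller a real pipeline would use.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, _cell in reads:
        by_chrom[chrom].append((start, end))

    tars = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        n_reads = 0
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is not None and start - cur_end <= max_gap:
                # Read is close enough: extend the current region.
                cur_end = max(cur_end, end)
                n_reads += 1
                continue
            # Close the previous region (if any) and start a new one.
            if cur_start is not None and n_reads >= min_reads:
                tars.append((chrom, cur_start, cur_end))
            cur_start, cur_end, n_reads = start, end, 1
        if cur_start is not None and n_reads >= min_reads:
            tars.append((chrom, cur_start, cur_end))
    return tars


def is_annotated(tar, genes):
    """Return True if the TAR overlaps any annotated gene interval."""
    chrom, start, end = tar
    return any(
        g_chrom == chrom and g_start < end and start < g_end
        for g_chrom, g_start, g_end in genes
    )


def tar_count_matrix(reads, tars):
    """Count reads per TAR per cell, returned as {tar: {cell: count}}."""
    counts = defaultdict(lambda: defaultdict(int))
    for chrom, start, end, cell in reads:
        for tar in tars:
            t_chrom, t_start, t_end = tar
            if chrom == t_chrom and t_start <= start and end <= t_end:
                counts[tar][cell] += 1
                break
    return {tar: dict(cells) for tar, cells in counts.items()}


# Toy data (hypothetical): one annotated gene and five reads from two cells.
reads = [
    ("chr1", 100, 150, "A"),
    ("chr1", 140, 200, "B"),
    ("chr1", 5000, 5050, "A"),
    ("chr1", 5040, 5100, "A"),
    ("chr1", 20000, 20050, "B"),
]
genes = [("chr1", 0, 1000)]

tars = call_tars(reads)
unannotated = [t for t in tars if not is_annotated(t, genes)]
matrix = tar_count_matrix(reads, tars)
```

In this toy case the region at chr1:5000-5100 lies outside the annotated gene and is expressed only in cell A, which is the kind of cell-specific, unannotated signal that the differential expression and homology steps described above would then prioritize.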