Evaluation of STAR and Kallisto on Single Cell RNA-Seq Data Alignment.

Authors

Du Yuheng, Huang Qianhui, Arisdakessian Cedric, Garmire Lana X

Affiliations

Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48105.

University of Hawaii at Manoa, Department of Information and Computer Science, Honolulu, HI, 96816.

Publication

G3 (Bethesda). 2020 May 4;10(5):1775-1783. doi: 10.1534/g3.120.401160.

Abstract

Alignment of scRNA-Seq data is the first and one of the most critical steps of the scRNA-Seq analysis workflow, so the choice of aligner is of paramount importance. Recently, STAR, an alignment method, and Kallisto, a pseudoalignment method, have both gained wide popularity in the single-cell sequencing field. However, an unbiased third-party comparison of these two methods on scRNA-Seq data is lacking. Here we conduct a systematic comparison of them on a variety of Drop-seq, Fluidigm, and 10x Genomics data, in terms of gene abundance, alignment accuracy, computational speed, and memory use. We observe that STAR globally detects more genes and yields higher gene-expression values than Kallisto, as well as Bowtie2, another popular alignment method for bulk RNA-Seq. STAR also yields higher correlations of the Gini index for the genes with RNA-FISH validation results. Using 10x Genomics PBMC 3K scRNA-Seq and mouse cortex single-nuclei RNA-Seq data, STAR shows similar or better cell-type annotation results by detecting a larger subset of known gene markers. However, the gain in accuracy and gene abundance from STAR alignment comes at the price of significantly slower computation (4-fold) and higher memory use (7.7-fold) compared to Kallisto.
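The abstract compares aligners partly through the Gini index of per-gene expression, validated against RNA-FISH. The abstract does not give the exact formula or preprocessing used, so the following is only a minimal sketch of the standard Gini index for one gene's non-negative expression values across cells. The function name `gini_index` and the example inputs are illustrative assumptions, not taken from the paper.

```python
def gini_index(values):
    """Standard Gini index of non-negative values (0 = perfectly even).

    Assumption: `values` holds one gene's expression across cells.
    This is the textbook rank-based formula, not necessarily the
    exact procedure used by the authors.
    """
    x = sorted(float(v) for v in values)  # ascending order
    n = len(x)
    total = sum(x)
    if n == 0 or total == 0:
        return 0.0  # convention chosen here for empty or all-zero input
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * value for rank, value in enumerate(x, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical toy inputs: evenly expressed gene vs. one expressed in a single cell
even_gene = gini_index([1, 1, 1, 1])
sparse_gene = gini_index([0, 0, 0, 1])
```

Under this definition, a gene expressed evenly across cells scores 0, and a gene concentrated in few cells scores closer to 1. Per-gene values computed from each aligner's counts could then be correlated with the corresponding RNA-FISH values.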

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3196/7202009/84798c5fb7c5/1775f1.jpg