Dobin Alexander, Gingeras Thomas R
Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
Curr Protoc Bioinformatics. 2015 Sep 3;51:11.14.1-11.14.19. doi: 10.1002/0471250953.bi1114s51.
Mapping large sets of high-throughput sequencing reads to a reference genome is one of the foundational steps in RNA-seq data analysis. The STAR software package performs this task with high accuracy and speed. In addition to detecting annotated and novel splice junctions, STAR is capable of discovering more complex RNA sequence arrangements, such as chimeric and circular RNA. STAR can align spliced sequences of any length with moderate error rates, providing scalability for emerging sequencing technologies. STAR generates output files that can be used for many downstream analyses, such as transcript/gene expression quantification, differential gene expression, novel isoform reconstruction, and signal visualization. In this unit, we describe computational protocols that produce various output files, use different RNA-seq data types, and utilize different mapping strategies. STAR is open-source software that can be run on Unix, Linux, or Mac OS X systems.