MOE Key Laboratory of Bioinformatics, Bioinformatics Division and Center for Synthetic and Systems Biology, TNLIST, Department of Automation, Tsinghua University, Beijing 100084, China.
Gene. 2013 Apr 10;518(1):164-70. doi: 10.1016/j.gene.2012.11.045. Epub 2012 Dec 8.
Recent studies have revealed that most human genes undergo alternative splicing and can produce multiple transcript isoforms. Differences in the relative abundance of a gene's isoforms can have significant biological consequences. Identifying genes that are differentially spliced between two groups of RNA-sequencing samples is an important basic task in studying transcriptomes with next-generation sequencing technology. We use the negative binomial (NB) distribution to model sequencing reads on exons, and propose an NB-statistic that detects differentially spliced genes between two groups of samples by comparing read counts on all exons. The method offers an exon-based alternative to isoform-based approaches for this task. It does not require information about isoform composition, nor does it need to estimate isoform expression. Experiments on simulated data and on real RNA-seq data from human kidney and liver samples demonstrated the method's good performance and applicability. It can also detect previously unknown alternative splicing events and highlight the exons that are most likely differentially spliced between the compared samples. We developed an NB-statistic method that can detect differentially spliced genes between two groups of samples without using prior knowledge of alternative splicing annotation. It does not need to infer isoform structure or to estimate isoform expression. It is a useful method designed for comparing two groups of RNA-seq samples. Besides identifying differentially spliced genes, the method can highlight the exons that contribute the most to the differential splicing. We developed a software tool for the presented method, called DSGseq, which is available at http://bioinfo.au.tsinghua.edu.cn/software/DSGseq.
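The exon-based comparison described in the abstract can be illustrated with a rough sketch. The abstract does not define the NB-statistic itself, so the code below is not the DSGseq statistic. It is a simplified stand-in that compares each exon's share of a gene's reads between two groups and scales the difference by a negative binomial variance, `mu + dispersion * mu^2`. The function names, the fixed `dispersion` value, and the choice to treat the gene's total read count as fixed are all assumptions made for illustration.

```python
from statistics import mean


def exon_usage(sample):
    """Fraction of a gene's reads that fall on each exon in one sample."""
    total = sum(sample)
    if total == 0:
        return [0.0] * len(sample)
    return [count / total for count in sample]


def nb_variance(mu, dispersion):
    """Variance of a negative binomial with mean mu: mu + dispersion * mu^2."""
    return mu + dispersion * mu * mu


def group_summary(group, exon, dispersion):
    """Mean usage of one exon in a group and an approximate variance of that mean."""
    usage = mean(exon_usage(sample)[exon] for sample in group)
    mu = mean(sample[exon] for sample in group)
    total = mean(sum(sample) for sample in group)
    if total == 0:
        return usage, 0.0
    # Approximation (assumption): treat the gene total as fixed, so the
    # variance of the usage is the NB count variance divided by total^2.
    variance = nb_variance(mu, dispersion) / (total * total) / len(group)
    return usage, variance


def exon_scores(group_a, group_b, dispersion=0.1):
    """Illustrative per-exon scores; larger means a bigger usage difference.

    group_a, group_b: lists of samples, each a list of read counts per exon.
    dispersion: hypothetical fixed NB dispersion (not estimated from data).
    """
    scores = []
    for exon in range(len(group_a[0])):
        usage_a, var_a = group_summary(group_a, exon, dispersion)
        usage_b, var_b = group_summary(group_b, exon, dispersion)
        denom = var_a + var_b
        diff = usage_a - usage_b
        scores.append(diff * diff / denom if denom > 0 else 0.0)
    return scores


# Toy data: three exons, two samples per group; the second exon is
# mostly skipped in group B.
kidney_like = [[100, 100, 100], [110, 90, 100]]
liver_like = [[100, 20, 100], [90, 25, 110]]

scores = exon_scores(kidney_like, liver_like)
top_exon = max(range(len(scores)), key=scores.__getitem__)
print("per-exon scores:", scores)
print("exon with the largest score (0-based):", top_exon)
```

A gene-level score could be formed by averaging the per-exon scores, and ranking exons by score mirrors the idea of highlighting the exons that contribute most to the differential splicing. A real analysis would estimate the dispersion from the data and assess significance, which this sketch does not attempt.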