European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK.
Departament de Genètica, Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain.
Nat Methods. 2013 Dec;10(12):1177-84. doi: 10.1038/nmeth.2714. Epub 2013 Nov 3.
We evaluated 25 protocol variants of 14 independent computational methods for exon identification, transcript reconstruction and expression-level quantification from RNA-seq data. Our results show that most algorithms are able to identify discrete transcript components with high success rates but that assembly of complete isoform structures poses a major challenge even when all constituent elements are identified. Expression-level estimates also varied widely across methods, even when based on similar transcript models. Consequently, the complexity of higher eukaryotic genomes imposes severe limitations on transcript recall and splice product discrimination that are likely to remain limiting factors for the analysis of current-generation RNA-seq data.
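The gap between identifying transcript components and assembling complete isoforms can be illustrated with a toy calculation. The sketch below is not the evaluation procedure used in the study. It assumes exact coordinate matching, and the function names `exon_recall` and `transcript_recall` as well as the coordinates are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

Exon = Tuple[int, int]          # (start, end), hypothetical coordinates
Transcript = Tuple[Exon, ...]   # ordered exon chain of one isoform


def exon_recall(reference: List[Transcript], predicted: List[Transcript]) -> float:
    """Fraction of reference exons that appear in any predicted transcript."""
    ref_exons = {exon for transcript in reference for exon in transcript}
    pred_exons = {exon for transcript in predicted for exon in transcript}
    return len(ref_exons & pred_exons) / len(ref_exons)


def transcript_recall(reference: List[Transcript], predicted: List[Transcript]) -> float:
    """Fraction of reference isoforms whose full exon chain is predicted exactly."""
    ref_set = set(reference)
    return len(ref_set & set(predicted)) / len(ref_set)


# Toy gene with four exons and two annotated isoforms (assumed data).
A, B, C, D = (100, 200), (300, 400), (500, 600), (700, 800)
reference = [(A, B, D), (A, C, D)]

# A hypothetical assembler finds every exon but joins them incorrectly.
predicted = [(A, B, C, D), (A, D)]

print(exon_recall(reference, predicted))
print(transcript_recall(reference, predicted))
```

In this toy case every reference exon is recovered, yet neither annotated isoform is reconstructed, which is the kind of discrepancy the abstract describes.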